
apfelmus wrote:
Grzegorz Chrupala wrote:
split "<DOC>" . words . map toLower = (:[]) . words . map toLower
Since you converted everything to lowercase, the string "<DOC>" will never appear in the text, resulting in a single huge document.
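A minimal sketch of the effect, for anyone who wants to try it in GHCi. The definition of split from the original program is not shown in this thread, so the one below is a hypothetical stand-in that breaks a list at every element equal to the separator; sample, buggy and fixed are likewise names made up for illustration.

```haskell
import Data.Char (toLower)

-- Hypothetical stand-in for the thread's `split` (an assumption, not the
-- original code): break a list at every occurrence of the separator and
-- drop the separator itself.
split :: Eq a => a -> [a] -> [[a]]
split sep xs = case break (== sep) xs of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : split sep rest

-- Made-up input containing two document markers.
sample :: String
sample = "<DOC> one two <DOC> three"

-- The pipeline from the thread: every word is lowercased before the
-- split, so the uppercase separator can never match.
buggy :: String -> [[String]]
buggy = split "<DOC>" . words . map toLower

-- One possible repair: compare against the lowercased separator.
fixed :: String -> [[String]]
fixed = split "<doc>" . words . map toLower
```

With these definitions, buggy sample is the one-element list [words (map toLower sample)], while fixed sample is [[],["one","two"],["three"]]. The leading empty chunk appears because the sample starts with a separator.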
Oops, that should have been obvious, sorry for the dumb question. Thanks,
No problem, it was not obvious to me and I had fun trying to figure it out :)
Speaking of not obvious: Haskell's type system catches a lot of bugs -- but still gives no help with this particular 'problem'. But one can easily imagine an extension to a type system which could have detected that "<DOC>" can never occur in the result of words . map toLower, and then with a bit more work [type-level Nat], the type of the full expression could have encoded that the result is always going to be of length 1. That would surely have been a good hint that something non-trivial was going on.

Whether a Haskell-friendly type system extension could be created/implemented which would cover this example, I don't know. However, I have had a lot of fun with the underlying idea: anytime someone encounters a bug in their code (and relates the debugging story on haskell-cafe), try to imagine how the type system could be extended to automate that. In most cases, I don't mean to have the type system reject the code, but rather to have an inferred type that would make it obvious that the code did not behave as expected.

Jacques