On Wed, Dec 11, 2013 at 1:37 PM, Joerg Fritsch <fritsch@joerg.cc> wrote:

I have the following code snippet:

import System.IO
import Data.String.Utils

main = withFile "test.txt" ReadMode $ \handle -> do
           xs <- getwords handle
           sequence_ $ map putStrLn (escapeRe xs)

getwords :: Handle -> IO [String]
getwords h = hGetContents h >>= return . words

What I want to do there is to take e.g. “word,” or “word!” etc. and arrive at “word”. I understand that escapeRe may do this. However, I always get some sort of mismatch errors like this:

test.hs:6:38:
    Couldn't match type `Char' with `[Char]'
    Expected type: [String]
      Actual type: String
    In the return type of a call of `escapeRe'
    In the second argument of `map', namely `(escapeRe xs)'
    In the second argument of `($)', namely
      `map putStrLn (escapeRe xs)'

test.hs:6:47:
    Couldn't match type `[Char]' with `Char'
    Expected type: String
      Actual type: [String]
    In the first argument of `escapeRe', namely `xs'
    In the second argument of `map', namely `(escapeRe xs)'
    In the second argument of `($)', namely
      `map putStrLn (escapeRe xs)'

Now I have three questions:

1.      Is escapeRe the right function to use here?

`escapeRe` is not the correct function to use. That is the function you would use if you were trying to create a regular expression to match the given input, but this is not at all what you are doing.
 

2.      What do I do wrong?


Well, the types are wrong because `escapeRe` takes a single `String`, while `xs` is a `[String]`. You wrote `sequence_ $ map putStrLn (escapeRe xs)` where you meant `sequence_ $ map (putStrLn . escapeRe) xs`. Note that `sequence_ $ map f xs` can be written as `mapM_ f xs`, which is shorter and clearer. This is what I would write:

  mapM_ (putStrLn . escapeRe) xs
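
To make the type fix concrete, here is a minimal sketch that compiles without MissingH. `perWord` is a hypothetical stand-in for `escapeRe`; any `String -> String` function behaves the same way as far as the types are concerned. The point is that the per-word function is composed with `putStrLn` and then mapped over the list, rather than applied to the whole list.

```haskell
import Data.Char (toUpper)

-- Hypothetical stand-in for escapeRe: any String -> String function
-- fits in the same position.
perWord :: String -> String
perWord = map toUpper

-- Apply perWord to each word, then print it.
printAll :: [String] -> IO ()
printAll = mapM_ (putStrLn . perWord)

-- The longer spelling from the original code, with the composition fixed.
-- It has the same type and the same behaviour as printAll.
printAll' :: [String] -> IO ()
printAll' xs = sequence_ $ map (putStrLn . perWord) xs

main :: IO ()
main = do
  printAll  ["word,", "word!"]
  printAll' ["word,", "word!"]
```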
 
That said, `escapeRe` is not at all useful for what you are trying to do. You should probably use `filter` (from the Prelude) together with `isAlphaNum` from Data.Char.
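
A minimal sketch of the `filter`/`isAlphaNum` approach, still using plain `String`. The names `stripPunct` and `cleanWords` are made up for illustration, and dropping words that become empty after filtering is an assumption about what you want, not something your question specifies.

```haskell
import Data.Char (isAlphaNum)

-- Keep only the alphanumeric characters of a single word,
-- so "word," and "word!" both become "word".
stripPunct :: String -> String
stripPunct = filter isAlphaNum

-- Split the input on whitespace, clean each word, and drop words
-- that consisted only of punctuation (assumed to be unwanted).
cleanWords :: String -> [String]
cleanWords = filter (not . null) . map stripPunct . words

main :: IO ()
main = do
  contents <- readFile "test.txt"
  mapM_ putStrLn (cleanWords contents)
```

Note that `isAlphaNum` is Unicode-aware, so accented letters and non-ASCII digits are kept as well.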

3.      I read in the Real World Haskell book that actually all these file/string operations are very very slow. The recommendation is to work with bytestrings instead. Is there any (fast) way to strip non-alphanumericals from bytestrings?

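This is true. `String` is a linked list of `Char`, which makes it slow and memory-hungry for large inputs, so you should use Text or ByteString for performance. Text is probably more appropriate for your use case, since you are processing human-readable text rather than raw bytes. You can efficiently solve this exercise with functionality from Data.Char, Data.Text, and Data.Text.IO.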
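
As a rough sketch of what that could look like with the `text` package: `cleanWords` is again a hypothetical name, and discarding words that end up empty is an assumption. `T.words`, `T.filter`, `T.null`, `TIO.readFile` and `TIO.putStrLn` are the Text counterparts of the String functions used in your program.

```haskell
import Data.Char (isAlphaNum)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Split on whitespace, keep only alphanumeric characters in each word,
-- and drop words that are empty afterwards (assumed to be unwanted).
cleanWords :: T.Text -> [T.Text]
cleanWords = filter (not . T.null) . map (T.filter isAlphaNum) . T.words

main :: IO ()
main = do
  contents <- TIO.readFile "test.txt"  -- strict read of the whole file
  mapM_ TIO.putStrLn (cleanWords contents)
```

`TIO.readFile` reads the file strictly and decodes it using the current locale, so there is no need for `withFile` or an explicit handle here.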
 
Note that this sort of question might be more appropriate for haskell-beginners: http://www.haskell.org/mailman/listinfo/beginners

-bob