Re: [Haskell-beginners] Seeking some clarification...

6 Jun 2011

On Sunday, 5 June 2011, 20:46, Sean Charles wrote:
...
I have a CSV file containing some data about trains: stations from, to,
times etc. and I wanted to 'learn some more Haskell' and, to my astonishment,
I have gotten thus far but I am not sure *why* it works or *how* I got
there! LMAO Here is the relevant code ...
====>
import Data.List (nub)
import Text.CSV (Record, parseCSVFromFile)

trains :: String -> IO ()
trains csvfile = do
      legodata <- parseCSVFromFile csvfile
      case legodata of
          -- named err rather than error, so Prelude's error is not shadowed
          Left err -> print err
          Right legodata -> mapM_ putStrLn (trainCodes legodata)

-- Assumes row 1 contains cell header information
-- Note: the train-code is always the third cell
trainCodes :: [Record] -> [String]
trainCodes = nub . map (!! 2) . tail
That'll bomb of course on malformed input ((!! 2) fails on a row with fewer
than three cells, and tail fails on an empty list), but that's probably okay
in this scenario.
...
====>
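One possible way to harden trainCodes against the malformed input mentioned
above, as a sketch only: drop 1 tolerates an empty list where tail would fail,
and a pattern match skips rows with fewer than three cells where (!! 2) would
fail. The name safeTrainCodes and the local Record synonym (standing in for
the one from Text.CSV) are assumptions made for illustration.

```haskell
import Data.List (nub)
import Data.Maybe (mapMaybe)

-- Local stand-in for Text.CSV's Record, so the sketch needs no extra package.
type Record = [String]

-- Hypothetical total variant of trainCodes: never fails on short input.
safeTrainCodes :: [Record] -> [String]
safeTrainCodes = nub . mapMaybe thirdCell . drop 1
  where
    thirdCell (_ : _ : code : _) = Just code  -- row has at least three cells
    thirdCell _                  = Nothing    -- short row: skip it
```

Silently skipping short rows is a design choice; whether that or a loud
failure is preferable depends on how much the input file can be trusted.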
I was chuffed with writing the trainCodes as a point-free function,
that sort of thing is getting a little easier to work with but I still
have real head-banging frustrations sometimes with seemingly simple
things, like looping and just printing stuff out, despite having taught
myself LISP six years ago and Erlang in recent years! I quit!! I really do!!!
My confusion arises over: mapM_ putStrLn (trainCodes legodata)
Given that: mapM_ :: Monad m => (a -> m b) -> [a] -> m ()
Here's how I read it: For any m that is a Monad, mapM_ takes a function
that "takes an 'a' and returns it wrapped in a monad",
Not the 'a' itself, but a value of some type b based on it (the mapM_'ed
function has type (a -> m b)).
"A value wrapped in a monad" is kind of a skewed picture; it doesn't really do
justice to State or Cont, for example.

mapM_ is the composition of

map :: (a -> c) -> [a] -> [c],

restricted to types c = m b for some Monad m (that part produces a list
[m b]), and

sequence_ :: (Monad m) => [m b] -> m ()

sequence_ "runs" all the actions in the list and discards their results.
If you want to collect the results, there's

sequence :: (Monad m) => [m b] -> m [b]

and the composition of sequence and map,

mapM :: (Monad m) => (a -> m b) -> [a] -> m [b]
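
The two compositions described above can be written out directly. This is a
sketch of the idea, not how the base library defines these functions; the
names myMapM_ and myMapM are made up here to avoid clashing with the Prelude.

```haskell
-- map builds the list of actions, sequence_ runs them and discards results.
myMapM_ :: Monad m => (a -> m b) -> [a] -> m ()
myMapM_ f = sequence_ . map f

-- Same shape, but sequence collects the results into a list.
myMapM :: Monad m => (a -> m b) -> [a] -> m [b]
myMapM f = sequence . map f
```

With these definitions, myMapM_ putStrLn ["a", "b"] prints each string on its
own line, just as mapM_ putStrLn does.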
...
a "list of a's" and returns a "monad containing 'unit'",
the empty-list LISP-() undefined voidy thing.
Given that: putStrLn :: String -> IO (), this means that 'a' is String and
'm b' is IO () and my list of [a] is the result of calling
'trainCodes legodata'.
Right.
...
trainCodes = nub . map (!! 2) . tail
legodata is [Record] (from Text.CSV) and so, 'tail' removes the header row
from the data, 'map (!! 2)' extracts the third field from each row and
finally 'nub' removes the duplicates. Thus the return type from trainCodes
is [String].
Yup.
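
To make the three stages concrete, here is the pipeline applied one step at a
time. The sample rows are invented for illustration, and Record is declared
locally as a stand-in for the synonym from Text.CSV.

```haskell
import Data.List (nub)

-- Local stand-in for Text.CSV's Record.
type Record = [String]

-- Made-up sample data, header row first.
sample :: [Record]
sample =
  [ ["from", "to", "code"]
  , ["London", "Leeds", "LL1"]
  , ["Leeds", "York", "LY2"]
  , ["London", "Leeds", "LL1"]
  ]

-- Stage 1: drop the header row.
withoutHeader :: [Record]
withoutHeader = tail sample

-- Stage 2: take the third cell of every row.
thirdCells :: [String]
thirdCells = map (!! 2) withoutHeader   -- ["LL1","LY2","LL1"]

-- Stage 3: remove duplicates, keeping first occurrences.
uniqueCodes :: [String]
uniqueCodes = nub thirdCells            -- ["LL1","LY2"]
```

Composing the stages right to left gives back nub . map (!! 2) . tail.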
...
Gluing it all together:
mapM_ :: Monad m => (a -> m b) -> [a] -> m ()
putStrLn :: String -> IO ()
trainCodes :: [Record] -> [String]
the type of my call then would seem to be:
String -> IO () -> [String] -> IO ()
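Missing parentheses: it's (String -> IO ()) -> [String] -> IO (). Since ->
associates to the right, the unparenthesised version would read as
String -> (IO () -> ([String] -> IO ())), a function of three arguments.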
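
The compiler can confirm that instantiation. In this sketch the names
mapMLines_ and printLines are hypothetical; the point is only that mapM_
accepts the specialised signature with a = String, m = IO and b = ().

```haskell
-- mapM_ with its type variables fixed: a = String, m = IO, b = ().
mapMLines_ :: (String -> IO ()) -> [String] -> IO ()
mapMLines_ = mapM_

-- Supplying putStrLn as the first argument leaves [String] -> IO ().
printLines :: [String] -> IO ()
printLines = mapMLines_ putStrLn
```

Dropping the parentheses from the first signature makes the definition fail to
type-check, which is a quick way to see that they matter.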
...
"putStrLn" -> (trainCodes legodata) -> IO ()
which means that not only have I got the types correct for the call but
the result type of 'IO ()' also satisfies the type return for my function
and hence it executes from 'main' (where it is called from) with no issues.
So, am I finally beginning to get a grip on it all?
Looks quite so.
...
This list is a constant source
of education and I don't post very often as you guys give me far too
much stuff to be reading all the time! :)
I am using Text.CSV to read my file and all I wanted to do was to output
a list of unique codes from column three of the spreadsheet data, one
per line, so that I can use this Haskell as part of a bigger 'bash'
script.
And that's what your code does :)
...
Any detailed explanations that might help me better understand my
solution would be welcome; right now I feel I 'just got lucky' although
there must be a glimmer of understanding somewhere! LOL
ToDos:
1. parse file to get a list of rows -- parseCSVFromFile
2. remove header row -- tail
3. extract the field(s) of interest -- (!! 2) for one, map (!! 2) for the list
4. remove duplicates -- nub
5. output -- mapM_ putStrLn

1. is delegated to a library function, how that works need not concern us 
at the moment

2. should be clear

3. also clear

4. library, you need not care how it does what it does (unless performance
becomes an issue: nub is O(n^2) because Eq is the only constraint it can work
with. If that's too slow, use a faster variant that exploits the stronger
constraints available in your case; an Ord constraint gives easy O(n*log n)
implementations [using Data.Set, for example]; in a few special cases O(n)
is possible)

5. putStrLn is clear, for mapM_ see above
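
For item 4, one way an Ord-based replacement for nub might look, using
Data.Set from the containers package. The name ordNub is chosen here for
illustration and is not something the original code uses; treat this as a
sketch rather than a tuned implementation.

```haskell
import qualified Data.Set as Set

-- Remove duplicates in O(n * log n), keeping the first occurrence of each
-- element in its original position (the same order nub produces).
ordNub :: Ord a => [a] -> [a]
ordNub = go Set.empty
  where
    go _ [] = []
    go seen (x : xs)
      | x `Set.member` seen = go seen xs                     -- already emitted
      | otherwise           = x : go (Set.insert x seen) xs  -- first sighting
```

Set.toList . Set.fromList is shorter, but it returns the elements sorted
rather than in order of first appearance, which may or may not matter for the
output of the script.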

1. and 5. involve IO (reading a file and printing to stdout, respectively);
2., 3. and 4. operate only on data, so those steps can be combined into a
pipeline like you did.
...
Thanks,
Sean.
PS: Phew!

Daniel Fischer