
Hi all -

In the process of learning Haskell I want to do some simple data summarization. (By the way, I'm looking at putting any submitted code for this in the "cookbook" section of the Haskell wiki. In my opinion it would be very useful there as a "next step" up from just reading in a file and printing it out.)

This would involve reading in a delimited file like this (just a contrived example of how many books some people own):

    Name,Gender,Age,Ethnicity,Books
    Mary,F,14,NZ European, 11
    Brian,M,13,NZ European, 6
    Josh,M,12,NZ European, 14
    Regan,M,14,NZ Maori, 9
    Helen,F,15,NZ Maori, 17
    Anna,F,14,NZ European, 16
    Jess,F,14,NZ Maori, 21
    ....

and doing some operations on it. As you can see, the file has column headings. I prefer to be able to manipulate data with headings, as that is what I do a lot of at work, using another programming language.

I've tried to break the problem down into small parts as follows.

a) Read the file into a list of pairs. The first element of each pair would be the column heading. The second would be a list containing the data for that column. For example, ("Name", ["Mary", "Brian", "Josh", "Regan", ..... ])

b) Select a numeric variable to summarise ("Books" in this example).

c) Do a fold to summarize the variable. I think a left fold would be the one to use here, but I may be wrong....

After looking through previous postings on this list, I found some code which is somewhat similar to what I'm after (although the data it was crunching is very different). This is what I've come up with so far:

    summarize [] = []
    summarize ls = let byvariable = head ls
                       numeric_variable = last ls
                       sum = foldl (+) 0 $ numeric_variable
                   in (byvariable, sum) : sum ls

    main = interact (unlines . map show . summarize . lines)

I think this might be a useful start, but I still need to read the data into a list of pairs as described in (a), and I'm unsure how to do that.

Many thanks in advance for any help received.
As mentioned, I'm sure that examples like this could be very useful to other beginners, so I'm keen to make sure that any help given gets as much use as possible (by putting the code on the Haskell wiki).

- Andy
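
For anyone following along, here is a minimal sketch of how steps (a) to (c) might fit together. It is not a definitive solution, and it rests on several assumptions: the delimiter is a comma, fields are never quoted, and the numeric column holds whole numbers. The names splitOn, trim, readColumns, sumColumn and sumBy are hypothetical helpers defined here for illustration, not library functions. Only modules from base are used.

```haskell
module Main where

import Data.Char (isSpace)
import Data.List (dropWhileEnd, foldl', transpose)
import Text.Read (readMaybe)

-- | A column: its heading paired with every value found under it.
type Column = (String, [String])

-- | Split a line on a delimiter (assumption: no quoted fields).
splitOn :: Char -> String -> [String]
splitOn delim s = case break (== delim) s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitOn delim rest

-- | Drop leading and trailing whitespace, e.g. " 11" becomes "11".
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Step (a): file contents to a list of (heading, values) pairs.
-- Blank lines are skipped; the first remaining line is the header.
readColumns :: String -> [Column]
readColumns contents =
  case map (map trim . splitOn ',') (filter (not . all isSpace) (lines contents)) of
    []              -> []
    (header : rows) -> zip header (transpose rows ++ repeat [])

-- | Steps (b) and (c): pick a column by heading and sum it with a
-- strict left fold. Nothing if the column is missing or not numeric.
sumColumn :: String -> [Column] -> Maybe Int
sumColumn name columns = do
  values  <- lookup name columns
  numbers <- mapM readMaybe values
  pure (foldl' (+) 0 numbers)

-- | Sum one column grouped by the values of another, keeping the
-- groups in order of first appearance.
sumBy :: String -> String -> [Column] -> Maybe [(String, Int)]
sumBy groupName valueName columns = do
  keys    <- lookup groupName columns
  values  <- lookup valueName columns
  numbers <- mapM readMaybe values
  pure (foldl' addTo [] (zip keys numbers))
  where
    addTo acc (k, n) = case lookup k acc of
      Nothing -> acc ++ [(k, n)]
      Just _  -> [ (k', if k' == k then t + n else t) | (k', t) <- acc ]

main :: IO ()
main = do
  contents <- getContents
  let columns = readColumns contents
  print (sumColumn "Books" columns)
  print (sumBy "Gender" "Books" columns)
```

If the seven data rows shown in the post (without the trailing "....") are saved in a file and piped to the program on standard input, it should print Just 94 followed by Just [("F",65),("M",29)]. The strict foldl' is used rather than foldl so that the running total is evaluated as it goes instead of building up a chain of unevaluated additions; for a file this small either would work.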