
On Wednesday 08 September 2010 15:31:19, Lorenzo Isella wrote:
Dear All, I must be stuck on something pretty basic (I am struggling badly with I/O). Let us assume you have a rather simple file mydata.dat (3 columns of integer numbers), see below.
1246191122 1336 1337
1246191142 1336 1337
1246191162 1336 1337
1246191182 1336 1337
1246191202 1336 1337
1246191222 1336 1337
1246191242 1336 1337
1246191262 1336 1337
1246191282 1336 1337
1246191302 1336 1337
1246191322 1336 1337
1246191342 1336 1337
1246191362 1336 1337
1246191382 1336 1337
1246191402 1336 1337
1246191422 1336 1337
Now, my intended pipeline could be
read file as string --> convert to list of integers --> pass it to hmatrix (or try to convert it into a matrix/array). Leaving aside the last step, I can easily do something like
let dat=readFile "mydata.dat"
in the interactive shell and get a string,
Not quite. `dat' is the IO-action that reads the file, of type (IO String) and not a String. In a programme, you'd do something like

    main = do
        ... -- argument parsing perhaps
        txt <- readFile "mydata.dat"
        let dat = convert txt
        doSomething with dat
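To make the difference between the action and its result concrete, here is a minimal sketch. The name `loadInts` is a hypothetical helper of mine, and `convert` is given one particular type (a flat list of Integers) purely for illustration.

```haskell
-- Pure conversion step: split on whitespace, then read each token as an Integer.
-- (Assumes every token is a valid integer; `read` throws an error otherwise.)
convert :: String -> [Integer]
convert = map read . words

-- Hypothetical helper: run the "read file, then convert" pipeline for a given path.
-- The String only exists inside the do-block, after `<-` has run the action.
loadInts :: FilePath -> IO [Integer]
loadInts path = do
    txt <- readFile path      -- txt :: String, the file contents
    return (convert txt)      -- wrap the pure result back into IO
```

In the interactive shell the same distinction applies: `txt <- readFile "mydata.dat"` binds the String, whereas `let dat = readFile "mydata.dat"` only names the action.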
but I am having problems in converting this to a list or anything more manageable (where every entry is an integer number i.e. something which can be summed, subtracted etc...). Ideally even a list where every entry is a row (a list in itself) would do.
Depending on what the result type should be, different solutions are required. The simplest solutions for such a file format are built from

    read -- to convert e.g. "135" to 135
    lines :: String -> [String]
    words :: String -> [String]
    map :: (a -> b) -> [a] -> [b]

If you want a flat list of Integers from that file,

    convert = map read . words

will do. First, `words' splits the String on whitespace (spaces and newlines), producing a list of digit-strings; those are then read as Integers.

If you want a list of lists, each line its own list inside the top level list,

    convert = map (map read . words) . lines

is what you want.

If you want to convert each line into a different data structure, say (Integer, Double, Int64), the general form would still be

    convert = map parseLine . lines

and parseLine would depend on the structure you want. For the above,

    parseLine str =
        case words str of
          (a : b : c : _) -> (read a, read b, read c)
          _ -> error "Bad line format"

would be a solution.

For any but the simplest formats, you should write a real parser to deal with possible bad formatting though (writing parsers is fun in Haskell).
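Short of a full parser, one way to cope with bad input is to replace `read` with a variant that reports failure instead of crashing. The following is only a sketch under assumptions: it relies on a base library recent enough to provide Text.Read.readMaybe, and the names `parseRow` and `parseRows` are made up for illustration.

```haskell
import Text.Read (readMaybe)

-- Parse one line into a list of Integers.
-- Yields Nothing if any whitespace-separated token is not a valid integer.
parseRow :: String -> Maybe [Integer]
parseRow = traverse readMaybe . words

-- Parse the whole file contents into a list of rows.
-- On failure, Left carries the 1-based number of the first bad line.
parseRows :: String -> Either Int [[Integer]]
parseRows = traverse check . zip [1 ..] . lines
  where
    check (n, l) = maybe (Left n) Right (parseRow l)
```

With the file contents bound to `txt`, `parseRows txt` gives either `Right rows` or `Left n` for the offending line number. Note that an empty line parses as an empty row here; whether that should count as an error depends on the format.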
I found this suggestion online, http://bit.ly/9jv1WG, but I am not sure if it really applies to this case. Many thanks
Lorenzo