How to improve laziness of a foldl (and memory footprint)

Hi,

I'm trying to improve a small Haskell program of mine. A more extended description with full source code is here:
http://codereview.stackexchange.com/questions/26107/how-to-improve-readabili...

The script transforms CSV files into other CSV files, but it looks like it reads the whole input file before writing any output file. I guess the script can be improved in many ways, both in readability and in efficiency, so any suggestion is welcome as an occasion to learn.

But what I can't understand is why this design doesn't work:

    transformFile :: FilePath -> ([String] -> a) -> (a -> IO r) -> IO r
    transformFile file operation continuation =
        withFile file ReadMode (\h -> hGetContents h >>= (continuation.operation.lines))

This function receives a path, a function that left-folds the lines into a new list of objects, and a function that persists the fold output to files.

Here are the relevant parts:

    importTrades :: FilePath -> FilePath -> IO ()
    importTrades outDir csvFile =
        transformFile csvFile (foldTradingSample.getTickWriteTrades) (saveTradingSamples outDir)
      where
        getTickWriteTrades = filter (isBetween (9, 0) (18, 0)).(catMaybes.(map fromCSVLine))
        foldTradingSample = foldl toTradingSample []

This is the folding function:

    toTradingSample :: [TradingSample] -> Tw.Trade -> [TradingSample]
    toTradingSample (current:others) twTrade
        | newEqt == equity current && newDay == day current = (current { trades = newTrades }):others
        | otherwise = current : toTradingSample others twTrade
      where
        newEqt = Tw.tSimbol twTrade
        newDay = Tw.tDate twTrade
        newTrade = fromTickWrite twTrade
        newTrades = trades current ++ [newTrade]
    toTradingSample [] twTrade =
        [TradingSample { equity = Tw.tSimbol twTrade
                       , day = Tw.tDate twTrade
                       , trades = [fromTickWrite twTrade] }]

And this is the function that saves the fold results to files:

    saveTradingSamples :: String -> [TradingSample] -> IO ()
    saveTradingSamples folder samples = mapM_ (saveTradingSample folder) samples

    saveTradingSample :: String -> TradingSample -> IO ()
    saveTradingSample folder sample = writeFile fileName contents
      where
        fileName = folder ++ "\\" ++ (equity sample) ++ "_" ++ (formatTime defaultTimeLocale "%F" $ day sample) ++ ".CSV"
        contents = tradingSampleToCSV sample

What's wrong here? My guess is that the problem is in the signature of transformFile, which requires the list of TradingSample to be computed completely before saveTradingSamples is called. Is this the problem? How can I fix it?

Giacomo

On Tue, 14 May 2013 11:22:27 +0200, Giacomo Tesio wrote:

> Hi, I'm trying to improve a small haskell program of mine.
> :

Some remarks:

0) Use hlint (available on Hackage) for improvement suggestions
1) You don't have to write the module heading in Main.hs; it is not a library (why export main?)
2) Change "print" to "putStrLn" if you want to display messages without quotes
3) switchArgs is only partially defined; add something like:
     switchArgs [x] = putStrLn $ "Unknown tool: " ++ x
4) Use shorter lines, for example change:
     importTrades outDir csvFile = transformFile csvFile (foldTradingSample.getTickWriteTrades) (saveTradingSamples outDir)
   to:
     importTrades outDir csvFile =
         transformFile csvFile
                       (foldTradingSample.getTickWriteTrades)
                       (saveTradingSamples outDir)
5) It is considered good practice to write the function composition operator between spaces (change f.g to f . g)

I have to analyze your software further to see how sufficient laziness can be reached.

Regards,
Henk-Jan van Tuyl

--
Folding@home
What if you could share your unused computer power to help find a cure? In just 5 minutes you can join the world's biggest networked computer and get us closer sooner. Watch the video.
http://folding.stanford.edu/

http://Van.Tuyl.eu/
http://members.chello.nl/hjgtuyl/tourdemonad.html
Haskell programming
--

Thanks a lot!
Yesterday on freenode's #haskell channel, Cane noted that my laziness problem
resides in the use of foldl in foldTradingSample.
I have to turn it into a foldr (but I'm still unsure how...)
Giacomo
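
The difference Cane pointed out can be seen on a toy example. The sketch below is not taken from Giacomo's program: doubleAllR and doubleAllL are hypothetical names, and the doubling step merely stands in for any per-element work. It only illustrates that a list built with foldr can be consumed before the input is exhausted, while a list built with foldl cannot.

```haskell
module Main where

-- Illustrative names only; none of this comes from the original program.

-- Built with foldr: each (:) cell is produced before the rest of the
-- input is examined, so a consumer can start working immediately.
doubleAllR :: [Int] -> [Int]
doubleAllR = foldr (\x acc -> 2 * x : acc) []

-- Built with foldl: the outermost constructor of the result is only
-- known after the last input element has been folded in.
doubleAllL :: [Int] -> [Int]
doubleAllL = foldl (\acc x -> acc ++ [2 * x]) []

main :: IO ()
main = do
  -- Works on an infinite input, because foldr is lazy in the tail.
  print (take 3 (doubleAllR [1 ..]))
  -- foldl needs a finite input; take 3 (doubleAllL [1 ..]) would never return.
  print (take 3 (doubleAllL [1 .. 10]))
```

With lazy I/O the same reasoning applies to the lines of a file: a foldl-based transformation has to see the last line before the first result can be written.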

It turned out that I didn't need a fold at all, just a proper groupBy.

As for these lines

    module Main (
        main
    ) where

they were generated by Leksah. Do you suggest removing them? And what
about Leksah as an IDE: do you use it?
Giacomo
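
The thread does not show the groupBy version, so the following is only a guess at its shape. Trade and Sample are simplified, hypothetical stand-ins for Tw.Trade and TradingSample, and the sketch assumes that trades for the same equity and day are adjacent in the input, since Data.List.groupBy only merges neighbouring elements (the original fold also merged non-adjacent ones).

```haskell
module Main where

import Data.Function (on)
import Data.List (groupBy)

-- Hypothetical, simplified stand-in for Tw.Trade.
data Trade = Trade
  { tSymbol :: String
  , tDay    :: String
  , tPrice  :: Double
  } deriving (Eq, Show)

-- Hypothetical, simplified stand-in for TradingSample.
data Sample = Sample
  { equity :: String
  , day    :: String
  , trades :: [Trade]
  } deriving (Eq, Show)

-- Two trades belong to the same sample when symbol and day agree.
sameSample :: Trade -> Trade -> Bool
sameSample = (==) `on` (\t -> (tSymbol t, tDay t))

-- groupBy is lazy: the first group is available as soon as its first
-- trade has been read, without traversing the rest of the input.
toSamples :: [Trade] -> [Sample]
toSamples = map mkSample . groupBy sameSample
  where
    mkSample ts@(t : _) = Sample (tSymbol t) (tDay t) ts
    mkSample []         = error "groupBy never yields an empty group"

main :: IO ()
main = mapM_ print (toSamples input)
  where
    input =
      [ Trade "ENI" "2013-05-14" 17.1
      , Trade "ENI" "2013-05-14" 17.2
      , Trade "FIA" "2013-05-14" 5.3
      ]
```

Because each sample is produced as soon as its group starts, a consumer such as mapM_ can write one file at a time instead of waiting for the whole list.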

On Wed, 15 May 2013 14:55:43 +0200, Giacomo Tesio wrote:

> Turned out that I didn't need fold at all, just a proper groupBy.
>
> As for these lines
>
>     module Main (
>         main
>     ) where
>
> they were generated by Leksah. Do you suggest to remove them? And what about Leksah as an IDE: do you use it?

I do not know whether or not these lines will give you problems; you simply do not need them.

I use Geany as IDE. My main problem with Leksah is that you cannot open files with drag and drop or by clicking on a .hs file in the file manager (Windows). Geany has serious problems too; I have to look for another IDE.

Regards,
Henk-Jan van Tuyl

For any associative binary operator (+) with an identity element z, foldl and foldr are equivalent, that is,

    foldl (+) z === foldr (+) z

The first yields something like

    ((z + a) + b) + c

whereas the second yields

    a + (b + (c + z))

but with associativity and the fact that z is an identity it is not hard to see that these are equal. However, as you found out, that does not necessarily mean they have the same performance!

Since atop is associative you could just replace foldl with foldr. In fact, atop is the binary operation for the Monoid instance of diagrams, so you can just write 'mconcat' in place of 'foldr atop mempty'.

-Brent

On Wed, May 15, 2013 at 09:35:34AM +0200, Giacomo Tesio wrote:
> Thanks a lot!
> Yesterday on freenode's #haskell channel Cane noted how my laziness problem reside in the foldl use in foldTradingSample. I have to turn it into a foldr (but I'm still unsure how...)
> Giacomo
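
Brent's remark about folds over an associative operator with an identity can be tried out on a smaller monoid than diagrams. The sketch below uses String concatenation, with (++) as the operator and "" as the identity; the names viaFoldl, viaFoldr and viaMconcat are made up for illustration. On finite lists all three agree, but only the right fold (and mconcat for lists) keeps producing output on an infinite input.

```haskell
module Main where

-- (++) is associative and "" is its identity, so on finite lists both
-- folds give the same result; mconcat is the Monoid shorthand for it.
viaFoldl, viaFoldr, viaMconcat :: [String] -> String
viaFoldl   = foldl (++) ""
viaFoldr   = foldr (++) ""
viaMconcat = mconcat

main :: IO ()
main = do
  let ws = ["a", "b", "c"]
  -- Same answer three ways on a finite list.
  print (viaFoldl ws == viaFoldr ws && viaFoldr ws == viaMconcat ws)
  -- Only the right fold is productive on an infinite list of chunks;
  -- viaFoldl would loop forever here.
  putStrLn (take 6 (viaFoldr (cycle ["ab", "c"])))
```

This is the sense in which equal results do not imply equal performance: the nesting of the applications decides how early the first piece of output becomes available.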
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners
participants (3)
- Brent Yorgey
- Giacomo Tesio
- Henk-Jan van Tuyl