
You seem to ignore garbage collection. On Sat, Sep 24, 2011 at 6:40 AM, Arseniy Alekseyev < arseniy.alekseyev@gmail.com> wrote:
Apparently it doesn't, and it seems to be fixed now.
Does anyone know what exactly the bug was? Because this seems like a serious bug to me. I've run into it myself today and wasn't happy. Linear algorithms should work in linear time however much memory they allocate (modulo cache thrashing of course). Existence of people claiming otherwise surprises me!
On Wed, Sep 21, 2011 at 1:57 PM, Tim Docker
wrote: On 09/09/2011, at 8:19 PM, John Lato wrote:
Agreed. Whenever I'd like to use mapM (or any other function for which a *M_ is available), I've found the following rules helpful:
1. If I can guarantee the list is short (~ n<=20), go ahead and use
mapM
2. Otherwise use mapM_, foldM_, or foldM if a real reduction is possible (i.e. not "foldM snocM []").
Step 2 sometimes requires changing my design, but it's always been for the better. `mapM_` tends to require more pipeline composition, so it's leveraging the language's strengths.
This thread is really interesting - it relates directly to problems I am currently having with mapM over large lists (see the thread "stack overflow
On 22 September 2011 01:05, John Lato
wrote: pain"). Can you explain what you mean by "mapM_ tends to require more pipeline composition"? In what way is it leveraging the language strengths?
Hmm, that is suitably cryptic. One way to think of it is an inversion of control. Instead of operating on whole collections of things in a monad, you specify monadic actions (pipelines) which are applied sequentially to each input.
Here's a simple example. Suppose you have a bunch of data serialized to files, and you want to read each file into a data structure, apply some process based upon the last file's data, and write out the output to new files. One way to do that would look like:
do dats <- mapM readMyData files let pairs = zip (mempty:dats) dats zipWithM_ (\(last, this) fname -> writeMyData (update last this) fname) pairs newFiles
However, you could also put everything into a single monadic operation, like this
do foldM_ (\last (infile, outfile) -> do this <- readMyData infile writeMyData (update last this) outfile return this ) mempty (zip files newFiles)
The first interleaves control (mapM, zipWIthM_) with monadic actions (file IO), whereas the second only has one control function (foldM_) which completely processes one input. I say this is more pipeline composition because you have to create an entire pipeline from input to output, which is then sequentially fed inputs by the control function.
I say this leverages Haskell's strengths because it's quite easy to compose functions and monadic actions in Haskell. It also tends to be garbage-collector friendly. I also find it much easier to reason about space usage. You don't need to worry if part of a list is being retained, because the full list of data doesn't appear anywhere. If you need to access prior elements they're specified explicitly so you know exactly how much data you're holding on to.
My perspective might be warped by my work on iteratees, but I find this a very natural approach.
John L.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe