Example programs with ample use of deepseq?

Dear Haskellers, I’m wondering whether using deepseq to avoid unwanted laziness might be too large a hammer in some use cases. Therefore, I’m looking for real-world programs with ample use of deepseq, and ideally easy ways to test performance (so preferably no GUI applications). I’ll try to find out, by runtime observations, which of the calls to deepseq could be replaced by id, seq, or „shallow seqs“ that, for example, call seq on the elements of a tuple. Thanks, Joachim -- Joachim "nomeata" Breitner mail@joachim-breitner.de | nomeata@debian.org | GPG: 0x4743206C xmpp: nomeata@joachim-breitner.de | http://www.joachim-breitner.de/
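To make the distinction in the question concrete, here is a minimal sketch of a "shallow seq" on a pair next to seq and deepseq. The names seqPair, partial, shallowOk and deepOk are hypothetical and only serve the illustration; they are not from the original message.

```haskell
import Control.DeepSeq (deepseq)

-- Hypothetical "shallow seq": force both components of a pair to weak
-- head normal form, without looking at anything beneath them.
seqPair :: (a, b) -> c -> c
seqPair (x, y) k = x `seq` y `seq` k

-- A pair of lists whose tails are undefined.
partial :: ([Int], [Int])
partial = (1 : undefined, 2 : undefined)

-- seq stops at the pair constructor, and seqPair stops at the first (:)
-- of each component, so neither touches the undefined tails.
shallowOk :: Bool
shallowOk = (partial `seq` True) && (partial `seqPair` True)

-- deepseq traverses the whole structure, so it is applied here to a
-- fully defined value only.
deepOk :: Bool
deepOk = (([1, 2], [3]) :: ([Int], [Int])) `deepseq` True
```

Applying deepseq to partial instead would hit the undefined tails, which is exactly the extra evaluation the question is about.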

There are two senses in which deepseq can be overkill:
1. The structure was already strict, and deepseq just forces another no-op traversal of the entire structure. This hypothetically affects seq too, although seq is quite cheap so it's not a problem.
2. deepseq evaluates too much, when it was actually sufficient to force only parts of the structure, e.g. the spine of a list. This is less common for the usual use cases of deepseq: if I want to force pending exceptions, I am usually interested in all exceptions in a (finite) data structure; a space leak may be due to an errant closure, and if I don't know which one it is, deepseq will force all of them; ditto with work in parallel programs.
Certainly there will be cases where you will want to snip evaluation at some point, but that is somewhat difficult to encode as a type class, since the criterion varies from structure to structure. (Though, perhaps, this structure would be useful:
    data Indirection a = Indirection a
    instance NFData (Indirection a) where rnf _ = ()
)
Cheers, Edward
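The Indirection idea from the message above can be sketched as a compilable module, assuming the NFData class from the deepseq package is what was meant. The value snipped is a hypothetical demonstration and not part of the original suggestion.

```haskell
import Control.DeepSeq (NFData (..), deepseq)

-- Wrapper that marks a point where deep evaluation should stop.
data Indirection a = Indirection a

-- rnf ignores its argument entirely, so a deepseq traversal that
-- reaches an Indirection goes no further.
instance NFData (Indirection a) where
  rnf _ = ()

-- Hypothetical demonstration: the list spine is forced, but the
-- payloads behind Indirection are never evaluated.
snipped :: Bool
snipped = xs `deepseq` True
  where
    xs :: [Indirection Int]
    xs = [Indirection undefined, Indirection (error "never forced")]
```

Where to place such a wrapper remains a per-structure decision, which is the difficulty the message points out.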

Hi, on Monday, 07.01.2013, at 13:06 +0100, Joachim Breitner wrote:
I’m wondering whether using deepseq to avoid unwanted laziness might be too large a hammer in some use cases. Therefore, I’m looking for real-world programs with ample use of deepseq, and ideally easy ways to test performance (so preferably no GUI applications).
Surprisingly, deepseq is not used as much as I thought. http://packdeps.haskellers.com/reverse/deepseq lists a lot of packages, but (after grepping through some of the code) most just define NFData instances and/or use it in tests, and rarely in the „real“ code. For some reason I expected it to be in more widespread use. That makes me even more interested in applications not on Hackage that I might be allowed to study. In return, I might be able to tell you a way to speed up your application. Greetings, Joachim

http://article.gmane.org/gmane.comp.lang.haskell.parallel/340 (with follow-up message about rseq => rdeepseq) - J.W.

surprisingly, deepseq is not used as much as I thought. http://packdeps.haskellers.com/reverse/deepseq lists a lot of packages, but (after grepping through some of the code) most just define NFData instances and/or use it in tests, but rarely in the „real“ code. For some reason I expected it to be in more widespread use.
I've been using deepseq quite a bit lately, but for the purpose of debugging space leaks. If, when I deepseq a big structure, the space leak goes away, I can then apply it to some subset. After much trial-and-error I can find something which is insufficiently strict. Ideally I can then strictify that one thing and stop using the deepseq. I wish there was a more efficient way to do this. I also use it to explicitly force certain parts of a data structure so they evaluate in parallel, though that part is not actually done yet.

Hi, on Tuesday, 08.01.2013, at 13:01 -0800, Evan Laforge wrote:
I've been using deepseq quite a bit lately, but for the purpose of debugging space leaks. If, when I deepseq a big structure, the space leak goes away, I can then apply it to some subset. After much trial-and-error I can find something which is insufficiently strict. Ideally I can then strictify that one thing and stop using the deepseq. I wish there was a more efficient way to do this.
This is also a possible application of my approach: providing a tool that says „I want this data structure to be fully evaluated now, please tell me how it currently looks, i.e. where in the data structure thunks still lie hidden.“ Do you have a nice, striking example where your approach (using deepseq and comparing efficiency) made a difference, and where a tool as described above would make the analysis much easier? Thanks, Joachim -- Joachim Breitner e-Mail: mail@joachim-breitner.de Homepage: http://www.joachim-breitner.de Jabber-ID: nomeata@joachim-breitner.de

On Tue, Jan 8, 2013 at 10:54 PM, Joachim Breitner wrote:
This is also a possible application of my approach: providing a tool that says „I want this data structure to be fully evaluated now, please tell me how it currently looks, i.e. where in the data structure thunks still lie hidden.“
Do you have a nice, striking example where using your approach (using deepseq and comparing efficiency) you could make a difference, and where a tool as described above would make the analysis much easier?
We've also used this approach to debug space-leaks, and would have loved such a tool. We used deepseq, and compared the heap profiles. We finally found the leaks this way, and fixed them using strictness annotations. Finally, we verified by running deepseq again at top level and observing that it didn't change anything anymore. Erik
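As a small illustration of fixing a leak with strictness annotations, the following sketch contrasts a lazy and a strict record. The types LazyPoint and StrictPoint and the helper throwsOnWhnf are hypothetical and not taken from the application described above.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Lazy fields: the constructor can hold unevaluated thunks.
data LazyPoint = LazyPoint Int Int

-- Strict fields: building the value forces both fields to WHNF,
-- so no thunk can hide inside it.
data StrictPoint = StrictPoint !Int !Int

-- Hypothetical helper: report whether evaluating a value to weak head
-- normal form throws an exception.
throwsOnWhnf :: a -> IO Bool
throwsOnWhnf x = do
  r <- try (evaluate x)
  case r of
    Left e  -> const (return True) (e :: SomeException)
    Right _ -> return False
```

Evaluating `LazyPoint (error "boom") 0` to WHNF succeeds and leaves the failing thunk in place, whereas `StrictPoint (error "boom") 0` fails immediately. The same mechanism is why a top-level deepseq has nothing left to do once all fields are strict.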

Hi Erik, on Wednesday, 09.01.2013, at 14:23 +0100, Erik Hesselink wrote:
We've also used this approach to debug space-leaks, and would have loved such a tool. We used deepseq, and compared the heap profiles. We finally found the leaks this way, and fixed them using strictness annotations. Finally, we verified by running deepseq again at top level and observing that it didn't change anything anymore.
Same question to you: would you have a suitable test case that can be used to test and demonstrate the usefulness of such tools? Greetings, Joachim

On Wed, Jan 9, 2013 at 2:40 PM, Joachim Breitner wrote:
same question to you: Would you have a suitable test case that can be used to test and demonstrate the usefulness of such tools?
Sadly, no. This is a private, core part of our application that I cannot share. I can describe what we did, and also the structure of the data in broad terms: we have several Maps, where the keys are usually Text, and the values are custom data types. These contain keys into these maps again. The whole thing defines a graph with several indexes into it. We finally solved the problems by completely moving to strict map operations, strict MVar/TVar operations, and strict data types. If you have more questions, or tools you want to test, I'd be happy to help, though. Regards, Erik

Hi, on Wednesday, 09.01.2013, at 15:11 +0100, Erik Hesselink wrote:
We finally solved the problems by completely moving to strict map operations, strict MVar/TVar operations, and strict data types.
Do you mean strict by policy (i.e. before storing something in a [MT]Var, you ensure it is evaluated) or by construction (by `seq` or `deepseq`’ing everything before it is stored)? In the latter case: seq or deepseq? Again in the latter case: do you worry about the performance of repeatedly deepseq’ing an already deepseq’ed and possibly large value? You are not the first user of Haskell who ends up with that approach to lazy evaluation. I’m not sure what that means for Haskell, though. Greetings, Joachim

On Wed, Jan 9, 2013 at 11:38 PM, Joachim Breitner wrote:
On Wednesday, 09.01.2013, at 15:11 +0100, Erik Hesselink wrote:
We finally solved the problems by completely moving to strict map operations, strict MVar/TVar operations, and strict data types.
Do you mean strict by policy (i.e. before storing something in a [MT]Var, you ensure it is evaluated) or by construction (by `seq` or `deepseq`’ing everything before it is stored)? In the latter case: seq or deepseq? Again in the latter case: do you worry about the performance of repeatedly deepseq’ing an already deepseq’ed and possibly large value?
Strict by construction: we `seq` when storing in a var. This is similar to the primed functions in some places, and the new Data.Map.Strict. So repeatedly deepseq'ing isn't a problem: deepseq in our case is only used for debugging.
You are not the first user of Haskell who ends up with that approach to lazy evaluation. I’m not sure what that means for Haskell, though.
Well, we still use the 'normal' lazy approach in most places. Only where we have persistent shared state do we use the above approach of making everything strict. This means updaters pay the computational price, not readers, which seems like a sane idea. Regards, Erik
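The "strict by construction" approach for shared state might look roughly like the sketch below. The helper modifyMVarStrict, the function bump and the counter map are hypothetical, and String keys stand in for the Text keys mentioned earlier so that only base and containers are needed.

```haskell
import Control.Concurrent.MVar (MVar, modifyMVar_, newMVar, readMVar)
import qualified Data.Map.Strict as Map

-- Hypothetical helper: apply an update and force the new state to WHNF
-- before it is put back, so the writer pays for the evaluation.
modifyMVarStrict :: MVar a -> (a -> a) -> IO ()
modifyMVarStrict var f = modifyMVar_ var (\x -> return $! f x)

-- Data.Map.Strict forces each value to WHNF on insertion; together with
-- the map's strict spine, no chain of (+) thunks builds up.
bump :: String -> Map.Map String Int -> Map.Map String Int
bump key = Map.insertWith (+) key 1

-- Count occurrences of each key through a shared MVar.
countAll :: [String] -> IO (Map.Map String Int)
countAll keys = do
  var <- newMVar Map.empty
  mapM_ (modifyMVarStrict var . bump) keys
  readMVar var
```

Note that `$!` only forces to weak head normal form; it is the combination with strict map operations and strict data types that keeps the stored value free of thunks without a repeated deepseq.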

On Mon, Jan 7, 2013 at 4:06 AM, Joachim Breitner wrote:
I’m wondering whether using deepseq to avoid unwanted laziness might be too large a hammer in some use cases. Therefore, I’m looking for real-world programs with ample use of deepseq, and ideally easy ways to test performance (so preferably no GUI applications).
I never use deepseq, except when setting up benchmark data where it's a convenient way to make sure that the data is evaluated before the benchmark is run. When removing space leaks you want to avoid creating the thunks in the first place, not remove them after the fact. Consider a leak caused by a list of N thunks. Even if you deepseq that list to eventually remove those thunks, you won't lower your peak memory usage if the list was materialized at some point. In addition, by not creating the thunks in the first place you avoid some allocation costs. -- Johan
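Both points can be sketched in a few lines. The strict versus lazy left fold is used here as an assumed, classic stand-in for "not creating the thunks in the first place"; the names sumStrict, sumLazy and prepareInput are hypothetical.

```haskell
import Control.DeepSeq (force)
import Control.Exception (evaluate)
import Data.List (foldl')

-- The accumulator is forced at every step, so no thunks are created.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

-- Without optimisation this builds one (+) thunk per element and only
-- collapses them at the end: same result, higher peak memory.
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- Benchmark setup: fully evaluate the input before any timing starts,
-- so the measurement does not include building the list.
prepareInput :: Int -> IO [Int]
prepareInput n = evaluate (force [1 .. n])
```

Forcing the result of sumLazy afterwards removes the thunks, but only after they have all been allocated, which is the peak-memory point made above.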

Joachim Breitner wrote:
I’m wondering whether using deepseq to avoid unwanted laziness might be too large a hammer in some use cases. Therefore, I’m looking for real-world programs with ample use of deepseq, and ideally easy ways to test performance (so preferably no GUI applications).
I’ll try to find out, by runtime observations, which of the calls to deepseq could be replaced by id, seq, or „shallow seqs“ that, for example, call seq on the elements of a tuple.
Now that you know when /not/ to use deepseq, let me tell you when it's appropriate: parallelization via parallel strategies. It's not exactly necessary to use deepseq (or rdeepseq in this case), but it's often very easy to express your algorithms in the usual way and then just change some of the 'map' applications to 'parMap rdeepseq'. When your algorithm is written with parallelization in mind this often gives you an amazingly parallel program by changing only a few words in your source code. Greets, Ertugrul -- Not to be or to be and (not to be or to be and (not to be or to be and (not to be or to be and ... that is the list monad.
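A minimal sketch of the map to parMap rdeepseq change described above, assuming the parallel package (Control.Parallel.Strategies) is installed. The function fib is a hypothetical stand-in for real per-element work.

```haskell
import Control.Parallel.Strategies (parMap, rdeepseq)

-- Hypothetical stand-in for an expensive, independent computation.
fib :: Int -> Integer
fib n = if n < 2 then toInteger n else fib (n - 1) + fib (n - 2)

-- The usual sequential formulation.
sequentialFibs :: [Int] -> [Integer]
sequentialFibs = map fib

-- The same algorithm with one word changed: each element is evaluated
-- to normal form in its own spark.
parallelFibs :: [Int] -> [Integer]
parallelFibs = parMap rdeepseq fib
```

To actually run the sparks on several cores, the program has to be built with -threaded and started with +RTS -N. With rseq instead of rdeepseq each result would only be evaluated to weak head normal form, which for a structured result can leave most of the work to the consumer.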
participants (7)
- Edward Z. Yang
- Erik Hesselink
- Ertugrul Söylemez
- Evan Laforge
- Joachim Breitner
- Johan Tibell
- Johannes Waldmann