
Hi Cafe, In the last couple of days I completed my quest of making my graphing utility timeplot ( http://jkff.info/software/timeplotters ) not load the whole input dataset into memory and consequently be able to deal with datasets of any size, provided however that the amount of data to *draw* is not so large. On the go it also got a huge speedup - previously visualizing a cluster activity dataset with a million events took around 15 minutes and a gig of memory, now it takes 20 seconds and 6 Mb max residence. (I haven't yet uploaded to hackage as I have to give it a bit more testing) The refactoring involved a number of interesting programming patterns that I'd like to share with you and ask for feedback - perhaps something can be simplified. The source is at http://github.com/jkff/timeplot The datatype of incremental computations is at https://github.com/jkff/timeplot/blob/master/Tools/TimePlot/Incremental.hs . Strictness is extremely important here - the last memory leak I eliminated was lack of bang patterns in teeSummary. There's an interesting function statefulSummary that shows how closures allow you to achieve encapsulation over an unknown piece of state - curiously enough, you can't define StreamSummary a r as StreamSummary { init :: s, insert :: a->s->s, finalize :: s->r } (existentially qualified over s), as then you can't define summaryByKey - you don't know what type to store in the states map. It is used to incrementally build all plots simultaneously, shown by the main loop in makeChart at https://github.com/jkff/timeplot/blob/master/Tools/TimePlot.hs Incremental building of plots of different types is at https://github.com/jkff/timeplot/blob/master/Tools/TimePlot/Plots.hs There are also a few interesting functions in that file - e.g. edges2eventsSummary, which applies a summary over a stream of "long" events to a stream of rise/fall edges. This means that you can define a "stream transformer" (Stream a -> Stream b) as a function (StreamSummary b -> StreamSummary a), which can be much easier. I have to think more about this idea. -- Eugene Kirpichov Principal Engineer, Mirantis Inc. http://www.mirantis.com/ Editor, http://fprog.ru/