
On Wed, Nov 12 2014, Christopher Allen wrote:
[Snip] csv-conduit isn't in the test results because I couldn't figure out how to use it. pipes-csv does proper streaming, but it uses cassava's parsing machinery and data types. That could possibly be a problem if you have really wide rows, but I never saw anything problematic in that realm even when I did a lot of HDFS/Hadoop ecosystem work. AFAICT, with pipes-csv you're streaming rows, but not columns. With csv-conduit you might be able to process the columns incrementally too, based on my guess from glancing at the rather scary code.
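To make the rows-but-not-columns point concrete, here is a base-only toy sketch (not pipes-csv itself, and with none of cassava's quoting/escaping handling): rows are consumed lazily one at a time, but each row is split into all of its fields before the consumer sees it, which is exactly why a very wide row could hurt while a long file doesn't.

```haskell
-- Toy illustration (base only) of row-wise streaming over CSV text.
-- Real code would use pipes-csv/cassava; this just shows the shape
-- of the trade-off: one row's fields are fully materialized at a
-- time, while the file as a whole is never held in memory.
splitFields :: String -> [String]
splitFields s = case break (== ',') s of
  (field, [])     -> [field]
  (field, _:rest) -> field : splitFields rest

-- Streams rows lazily: only the current row's fields are live.
processRows :: String -> [Int]
processRows = map (length . splitFields) . lines

main :: IO ()
main = mapM_ print (processRows "a,b,c\n1,2\nx")
-- prints 3, 2, 1 (field count per row)
```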
Any problems in particular? I've had pretty good luck with csv-conduit. However, I have noticed that it's rather picky about type signatures, and integrating custom data types isn't straightforward at first. csv-conduit also seems to have drawn inspiration from cassava: http://hackage.haskell.org/package/csv-conduit-0.6.3/docs/Data-CSV-Conduit-C...
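For what it's worth, the type-signature pickiness usually shows up because intoCSV is polymorphic in its row type (Row Text, MapRow Text, etc.), so GHC needs a concrete annotation somewhere to pick an instance. A rough sketch against csv-conduit's Data.CSV.Conduit API from around that era (0.6.x); the file names and the "name" column are made up for illustration:

```haskell
{-# LANGUAGE OverloadedStrings #-}
-- Assumes csv-conduit ~0.6.x and the old conduit ($=)/($$) operators.
import qualified Data.Conduit.Binary as CB
import qualified Data.Conduit.List as CL
import Data.Conduit (($=), ($$))
import Control.Monad.Trans.Resource (runResourceT)
import Data.CSV.Conduit (MapRow, defCSVSettings, intoCSV, fromCSV)
import qualified Data.Map as M
import Data.Text (Text)
import qualified Data.Text as T

-- The signature here is the important part: it fixes intoCSV's
-- output to MapRow Text, which is what the library is "picky" about.
processRow :: MapRow Text -> MapRow Text
processRow = M.adjust T.toUpper "name"  -- hypothetical column

main :: IO ()
main = runResourceT $
     CB.sourceFile "input.csv"
  $= intoCSV defCSVSettings
  $= CL.map processRow
  $= fromCSV defCSVSettings
  $$ CB.sinkFile "output.csv"
```

The per-row step stays a pure MapRow-to-MapRow function, so it's easy to test in isolation even though the pipeline itself runs in ResourceT IO.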
[Snip] To that end, take a look at my rather messy workspace here: https://github.com/bitemyapp/csvtest
I've made a PR for the conduit version: https://github.com/bitemyapp/csvtest/pull/1 It could certainly be made more performant, but it seems to hold up well in comparison. I would be interested in reading the How I Start article and hearing more about your conclusions. Is this focused primarily on the memory profile, or also on speed? Regards, -Christopher
Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe