
Daniel Peebles wrote:
I have added UAProd-based Binary instances to my copy of the uvector repo at http://patch-tag.com/publicrepos/pumpkin-uvector .
I can confirm that it works for me. However, I now have a memory problem with data decoding.

I need to serialize the Netflix Prize training dataset. When I parse the data from the original dataset, memory usage is about 640 MB [1]. But when I load the serialized and compressed data (as [UArr (Word32 :*: Word8)]), memory usage is about 840 MB... The culprit is probably the decoding of the list (17770 elements).

[1] I have written a Python script that parses the data, and it takes only 491 MB (using a list of tuples holding two compact numpy arrays). So GHC has memory problems here.

Thanks
Manlio Perillo
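For reference, one thing worth trying is a stricter list decoder. binary's stock list instance writes a length (as Int) followed by the elements, and the default decoder builds the list lazily, which can leave thunks behind. Below is a minimal sketch of the idea using plain Data.Binary, with no uvector dependency; getListStrict is a hypothetical name, and for UArr elements the `seq` may need to be replaced by deeper forcing depending on how the instances in the pumpkin-uvector repo are written:

```haskell
-- Sketch: decode a Binary list while forcing each element as it is read,
-- instead of relying on the default lazy list decoder. binary's stock list
-- format is a length (Int) followed by the elements, so this reads back
-- exactly what `encode` produced for a list.
import Data.Binary (Binary, Get, encode, get)
import Data.Binary.Get (runGet)
import Data.Word (Word32)
import Control.Monad (replicateM)

getListStrict :: Binary a => Get [a]
getListStrict = do
  n <- get :: Get Int
  replicateM n $ do
    x <- get
    x `seq` return x   -- force the element (to WHNF) before consing it

main :: IO ()
main = do
  let bs = encode ([1 .. 10] :: [Word32])
      xs = runGet getListStrict bs :: [Word32]
  print (sum xs)
```

Whether this actually reduces the 840 MB figure depends on where the retention happens (the list spine, the element thunks, or inside the UArr decoding itself); heap profiling with +RTS -hy would tell.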