
braver wrote:
I dump the results of a computation as a Data.Trie of [(Int,Float)]. It contains about 5 million entries, each with a list of 35 or fewer pairs. It takes 8 minutes to load with Data.Binary and look up a single key. What can take so long? If I switch from compressed to uncompressed (and then decode), it takes the same time... It's not IO; the CPU is loaded 100%.
The Binary instance for Trie is based on the old Binary instance for Data.IntMap. There were some corner-case performance issues with the latter which were recently fixed[1], but I haven't had a chance to look at the new instance or to figure out if the changes would also be relevant for Trie. So this might be a potential source of your problems.

[1] Alas, I can't find the thread discussing the new instance ATM.
I'm now thinking of using cereal. Given I have Data.Binary in place, what needs to be changed to work with cereal? Is it binary-compatible? How can one construct a cereal instance for Data.Trie?
If you send me an instance, I can apply the patch.

--
Live well,
~wren
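
For reference, a minimal sketch of what such an instance might look like, assuming the cereal and bytestring-trie packages. It simply serializes the trie as its association list via Data.Trie's toList and fromList. That encoding is an assumption made for illustration: it is not the layout used by the existing Binary instance, so it should not be expected to read files written with Data.Binary, and nothing here shows that it would load any faster. The instance is an orphan, hence the pragma.

```haskell
{-# OPTIONS_GHC -fno-warn-orphans #-}
module Main (main) where

import qualified Data.ByteString.Char8 as BC
import           Data.Serialize (Serialize (..), decode, encode)
import qualified Data.Trie as T

-- Orphan instance (illustrative only): write the trie out as a list of
-- (key, value) pairs and rebuild it with fromList when reading back.
-- cereal already provides Serialize for strict ByteString, lists,
-- tuples, Int and Float, so no further instances are needed.
instance Serialize a => Serialize (T.Trie a) where
  put = put . T.toList
  get = T.fromList <$> get

main :: IO ()
main = do
  -- A tiny trie with the same shape as the data in question.
  let t :: T.Trie [(Int, Float)]
      t = T.fromList
            [ (BC.pack "foo", [(1, 0.5)])
            , (BC.pack "bar", [(2, 1.5), (3, 2.5)])
            ]
  -- Round-trip through a strict ByteString and look up one key.
  case decode (encode t) :: Either String (T.Trie [(Int, Float)]) of
    Left err -> putStrLn ("decode failed: " ++ err)
    Right t' -> print (T.lookup (BC.pack "bar") t')
```

Note that encode and decode in Data.Serialize work on strict ByteStrings, so the whole serialized trie has to be in memory before decoding starts.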