From: Brandon Moore <brandon_m_moore@yahoo.com>
> I was worried data sharing might mean your keys
> retain entire 64K chunks of the input. However, it
> seems enumLines depends on the StringLike ByteString
> instance, which just converts to and from String.
> That can't be efficient, but I suppose it avoids excessive sharing.
That's true for 'enumLines'; however, the OP is using 'enumLinesBS', which operates on ByteStrings directly.
Data sharing certainly could be an issue here. I tried performing Data.ByteString.copy before inserting the key into the map, but that used more memory. I don't have an explanation for this; it's not what I would expect.
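To make the sharing concern concrete, here is a minimal sketch of the two ways of storing keys. It is not the OP's code: 'countLines' and its 'copyKeys' flag are hypothetical names chosen for illustration, and a plain fold over one in-memory chunk stands in for the iteratee pipeline.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.Map.Strict as M
import Data.List (foldl')

-- Count occurrences of each line in one chunk of input.
-- B.lines returns slices that point into the chunk's buffer, so a key
-- stored as-is can keep the whole chunk alive. B.copy allocates a fresh
-- buffer that holds only the key's own bytes.
-- (countLines and copyKeys are hypothetical names for this sketch.)
countLines :: Bool -> B.ByteString -> M.Map B.ByteString Int
countLines copyKeys chunk = foldl' step M.empty (B.lines chunk)
  where
    step m l = M.insertWith (+) (key l) 1 m
    key l
      | copyKeys  = B.copy l
      | otherwise = l

main :: IO ()
main = do
  let chunk = B.pack "foo\nbar\nfoo\n"
  -- Both maps hold the same keys and counts; they differ only in
  -- which buffers the keys reference.
  print (M.toList (countLines True chunk))
  print (M.toList (countLines False chunk))
```

The two variants are observably identical, so any difference would only show up in a heap profile, not in the output.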
The other parameter that affects sharing is the chunk size. I got a much better memory profile when using a chunk size of 1024 instead of 65536.
Oddly enough, with the large chunk size I saw lower memory usage from Data.Map, but with the small chunk size Data.HashMap has a significant advantage.
John Lato