Executing conduit streams in parallel leads to memory leaks

When I run my conduit without any additions, it works as expected, with low constant memory usage, as advertised. It's a bit slow, so I tried to speed it up with worker pools (via parallel-io) and staged folding (via stm-conduit). With those changes, however, the memory usage indicates that all the ByteStrings read from the files are fully allocated and kept in memory, even though they are no longer used after one step of the conduit. [1]

I thought that, because of the closing IO, the release of the file handle might somehow keep the read string in memory, so I wanted to make absolutely sure that's not the problem. [2] To undo that specific part, switch out `Lib.readFile` for `B.readFile`.

I was not using a worker pool in the beginning, so maybe `mapConcurrently_` somehow allocated all the threads at once, but the pooled solution should have solved that as well.

What else could cause all the ByteStrings to be kept in memory in the parallel version?

The example is available at: https://github.com/reactormonk/non-constant-memory

[1] https://github.com/reactormonk/non-constant-memory/blob/master/src/Lib.hs#L5...
[2] https://github.com/reactormonk/non-constant-memory/blob/master/src/Lib.hs#L6...
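For comparison, here is a minimal sketch of a bounded parallel read in which each worker reduces its file to a small, fully evaluated summary before returning, so that no result can still reference the ByteString it was computed from. It is not the code from the repository and does not use parallel-io or stm-conduit; it assumes only `base` and `bytestring`. The names `summarize`, `boundedMapIO`, and `tryAny` are hypothetical, and the per-file step (taking the length) is a stand-in for the real work.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.QSem (newQSem, signalQSem, waitQSem)
import Control.Exception (SomeException, bracket_, evaluate, throwIO, try)
import qualified Data.ByteString as B
import System.Environment (getArgs)

-- Hypothetical per-file step: read the file strictly and reduce it to an Int.
-- 'evaluate' forces the Int before the action returns, so the result does not
-- hold a thunk that still points at the file contents.
summarize :: FilePath -> IO Int
summarize path = do
  bytes <- B.readFile path
  evaluate (B.length bytes)

-- 'try' specialised to SomeException, to avoid an inline type annotation.
tryAny :: IO a -> IO (Either SomeException a)
tryAny = try

-- Run the action over every input with at most n actions in flight.
-- One lightweight thread is forked per input, but a semaphore limits how many
-- are actually running the action (and so holding a file in memory) at once.
-- Results come back in input order; a worker's exception is rethrown here.
boundedMapIO :: Int -> (a -> IO b) -> [a] -> IO [b]
boundedMapIO n act xs = do
  sem <- newQSem n
  slots <- mapM (spawn sem) xs
  mapM collect slots
  where
    spawn sem x = do
      slot <- newEmptyMVar
      _ <- forkIO $ do
        r <- tryAny (bracket_ (waitQSem sem) (signalQSem sem) (act x))
        putMVar slot r
      pure slot
    collect slot = takeMVar slot >>= either throwIO pure

main :: IO ()
main = do
  paths <- getArgs            -- file paths given on the command line
  sizes <- boundedMapIO 4 summarize paths
  print (sum sizes)
```

If a variant like this stays at constant memory while the pooled conduit version does not, that would suggest the retention comes from how results are passed between stages (for example, unevaluated values sitting in a queue) rather than from `readFile` itself. That is only a way to narrow things down, not a diagnosis.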
Simon Hafner