[GHC] #13630: ByteString pinned memory can be leaky

#13630: ByteString pinned memory can be leaky
-------------------------------------+-------------------------------------
           Reporter:  nh2            |             Owner:  (none)
               Type:  bug            |            Status:  new
           Priority:  normal         |         Milestone:
          Component:  Runtime System |           Version:  8.0.1
           Keywords:                 |  Operating System:  Unknown/Multiple
       Architecture:                 |   Type of failure:  None/Unknown
  Unknown/Multiple                   |
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 My question on IRC:

 {{{
 How does memory allocation for pinned blocks work? Let's say pinned blocks are 4KB in size, and I allocate first a 3 KB ByteString A and then an 8-byte ByteString B. Now I GC A, no longer need it. Then according to https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/GC/Pinned "a single pinned object keeps alive the whole block in which it resides", my small ByteString B keeps the entire block alive. But what happens with the 3KB in the front of that block? Will it be re-used by the next ByteString allocation (say 1KB)? In other words, how smart is allocatePinned as an allocator?
 }}}

 The answer:

 {{{
 slyfox: nh2: allocatePinned it quite dump. it only allocated from free tail space
 }}}
 {{{
 nh2: slyfox: that sounds like a huge potential for memory leak then
 slyfox: yes, fragmentation is quite bad for bytestrings
 }}}

 So it seems that I can get into the unfortunate situation where a very short `ByteString` of only a few bytes can waste an entire 4 KB block of memory; some might call this a leak.

 One idea to solve it seems to be to make standard `ByteString`s unpinned, and to allocate pinned ones explicitly when needed. This seems to be an often-discussed topic, and it is not trivial because many `ByteString` functions are implemented using libc FFI functions.

 However, it seems there will always be _some_ need for pinned memory, so we had better have an efficient way to manage it in any case.
 Efficient here means, for example, re-using freed memory inside a block instead of only using the free tail space.

 It seems that `jemalloc` has a feature to use given blocks of memory and provide `malloc()` functionality inside them: https://stackoverflow.com/questions/30836359/jemalloc-mmap-and-shared-memory

 Perhaps this could be used to give GHC a simple way to use pinned memory more efficiently?

-- 
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13630
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
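The fragmentation scenario from the ticket can be illustrated with a toy model. The sketch below is not GHC's `allocatePinned`; it is a hypothetical pure-Haskell model of a tail-only allocator, assuming the 4 KB block size used in the question. The names `Block`, `allocTail`, `release` and `reclaimable` are invented for illustration.

```haskell
module Main where

-- Block size assumed in the ticket's example (4 KB).
blockSize :: Int
blockSize = 4096

-- Hypothetical model of one pinned block: it records only where the free
-- tail starts and how many objects are still alive. It keeps no record
-- of holes left by dead objects.
data Block = Block
  { tailStart :: Int
  , liveCount :: Int
  } deriving (Show, Eq)

emptyBlock :: Block
emptyBlock = Block 0 0

-- Allocate n bytes from the free tail, or fail if the tail is too small.
allocTail :: Int -> Block -> Maybe Block
allocTail n (Block t l)
  | n > 0 && t + n <= blockSize = Just (Block (t + n) (l + 1))
  | otherwise                   = Nothing

-- An object dying lowers the live count but does not move tailStart,
-- so the bytes it occupied are never handed out again.
release :: Block -> Block
release b = b { liveCount = liveCount b - 1 }

-- The block as a whole can be freed only when nothing in it is alive.
reclaimable :: Block -> Bool
reclaimable b = liveCount b == 0

-- The ticket's scenario: allocate 3 KB (A), then 8 bytes (B), then A dies.
scenario :: Maybe Block
scenario = do
  afterA <- allocTail 3072 emptyBlock
  afterB <- allocTail 8 afterA
  pure (release afterB)

main :: IO ()
main = do
  print (fmap reclaimable scenario)     -- Just False: B still pins the block
  print (scenario >>= allocTail 1024)   -- Nothing: only 1016 tail bytes remain
  print (scenario >>= allocTail 1000)   -- succeeds, since it fits in the tail
```

In this model the 1 KB request fails even though 3072 bytes of the block are dead, which is the waste the ticket describes. An allocator that tracked holes could satisfy that request from the front of the block.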

#13630: ByteString pinned memory can be leaky

Changes (by Artyom.Kazak):

 * cc: Artyom.Kazak (added)

-- 
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13630#comment:1

#13630: ByteString pinned memory can be leaky

Comment (by duncan):

 I've been wanting to switch ByteString to unpinned memory for years, but it's a lot of work. It would mean fixing up lots of other libraries that use ByteString internals, and it also requires a solution for the mmap'd ByteString use case. That use case does have solutions, but it's still more work and may need some RTS extensions to allow an mmap'd ByteArray# that is unmapped when it's collected.

 As a workaround I often recommend that people use ShortByteString, which is part of the bytestring package and uses unpinned memory.

-- 
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13630#comment:2
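The ShortByteString workaround mentioned in the comment above might look like the following minimal sketch. It uses `toShort` and `fromShort` from `Data.ByteString.Short` in the bytestring package; the helper name `compactKey` and the sample data are assumptions made for illustration.

```haskell
module Main where

import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Short as SBS

-- Hypothetical helper: copy a (typically small, long-lived) ByteString
-- into a ShortByteString, which is stored in unpinned memory.
compactKey :: BC.ByteString -> SBS.ShortByteString
compactKey = SBS.toShort

main :: IO ()
main = do
  -- A large buffer with a small value of interest at its end.
  let big = BC.pack (replicate 3072 'x' ++ "key12345")
      key = compactKey (BC.drop 3072 big)  -- 8 bytes, copied out of the buffer
  print (SBS.length key)
  -- Convert back only at the boundary where an API needs a ByteString.
  BC.putStrLn (SBS.fromShort key)
```

Both conversions copy the bytes, so this fits best for small values that are kept for a long time, such as keys in a map. Data that is passed frequently to ByteString-based APIs would pay for a copy on every conversion.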

#13630: ByteString pinned memory can be leaky

Changes (by snoyberg):

 * cc: snoyberg (added)

-- 
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13630#comment:3

#13630: ByteString pinned memory can be leaky

Changes (by adamse):

 * cc: adamse (added)

-- 
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13630#comment:4