
#10652: Better cache performance in Array#
-------------------------------------+-------------------------------------
        Reporter:  MikeIzbicki       |                Owner:
            Type:  feature request   |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.10.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Revisions:
-------------------------------------+-------------------------------------

Comment (by rwbarton):

 If the values pointed to by your array don't themselves contain pointers
 (and assuming nothing else on the heap points to them), then either a BFS
 or a DFS traversal will visit them all sequentially. But in many cases, if
 your values don't contain pointers, you could find a way to store them in
 an unboxed array with higher memory density anyway. (Using a boxed array
 costs you two words per element up front: one for the pointer in the array
 and one for the info pointer of the value's own heap object.)

 If your values do themselves contain pointers, then it does matter whether
 the traversal is BFS or DFS, but which one you want depends on how you
 plan to use the values, which the GC can't know without further hints. For
 example, a large Integer contains a pointer to a ByteArray# holding the
 actual integer data, which you will need for most (but not all) operations
 on the Integer. If you have an array of many large Integers, you may or
 may not want to put each ByteArray# right after its corresponding Integer
 (even ignoring the possibility of the ByteArray#s being shared).

 My potential concerns about the original feature request are:

 - It sounds like a lot of extra complexity in the GC.
 - Existing programs gain nothing from this added complexity unless they
   are modified to use the new primops.
 - Consequently, the implementation of the new primops must have zero cost
   for programs that do not use them.
 - A fairly narrow range of programs stands to benefit from the new
   primops.
 - It's unclear how much there is to be gained in the best case, especially
   compared to user-space alternatives like explicitly prefetching the next
   value of the array before processing the current one.

 You might be better off just implementing your own memory management
 (e.g. allocate variable-length representations of your data sequentially
 from a mutable ByteArray, then store offsets into this storage in your
 array and access the data through short-lived Haskell values,
 Storable-style).

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/10652#comment:7>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
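
A minimal sketch of the unboxed-array alternative mentioned in the first
paragraph of the comment, assuming the vector package; the element type,
module names, and sizes here are only illustrative. The point is that for
pointer-free element types, an unboxed vector already stores the payload
contiguously, so there is nothing for the GC to reorder.

{{{#!haskell
import qualified Data.Vector         as V   -- boxed: one pointer per element
import qualified Data.Vector.Unboxed as U   -- unboxed: elements stored inline

-- Boxed: each Double is a separate heap object reached via a pointer, so a
-- scan chases one pointer per element and pays a header word per object.
boxed :: V.Vector Double
boxed = V.generate 1000000 fromIntegral

-- Unboxed: the Doubles sit back to back in one flat array, so a linear
-- scan is already cache-friendly regardless of GC traversal order.
unboxed :: U.Vector Double
unboxed = U.generate 1000000 fromIntegral

main :: IO ()
main = print (V.sum boxed, U.sum unboxed)
}}}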
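
For the user-space prefetching alternative raised in the last bullet, a
rough sketch is possible with the prefetch primops that GHC 7.10 exposes
through GHC.Prim (re-exported by GHC.Exts). The wrapper name prefetchValue
and the one-element lookahead are assumptions for illustration; whether
this actually hides any cache-miss latency is entirely workload-dependent.

{{{#!haskell
{-# LANGUAGE BangPatterns, MagicHash, UnboxedTuples #-}

import Control.Monad (when)
import qualified Data.Vector as V
import GHC.Exts (prefetchValue0#)
import GHC.IO (IO (..))

-- Hint the CPU to pull the closure for x into cache; semantically a no-op.
prefetchValue :: a -> IO ()
prefetchValue x = IO (\s -> (# prefetchValue0# x s, () #))

-- Sum a boxed vector, prefetching the element one step ahead of the one
-- currently being forced.
sumAhead :: V.Vector Integer -> IO Integer
sumAhead v = go 0 0
  where
    n = V.length v
    go !acc !i
      | i >= n    = return acc
      | otherwise = do
          when (i + 1 < n) $ prefetchValue (V.unsafeIndex v (i + 1))
          go (acc + V.unsafeIndex v i) (i + 1)
}}}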
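
And a sketch of the "implement your own memory management" suggestion in
the closing paragraph, assuming the primitive and vector packages. The
Packed type, its field names, and the [Word8] result are hypothetical
stand-ins: real code would fill the ByteArray sequentially from a
MutableByteArray and decode each slice into whatever short-lived value it
needs, Storable-style.

{{{#!haskell
import Data.Primitive.ByteArray (ByteArray, indexByteArray)
import qualified Data.Vector.Unboxed as U
import Data.Word (Word8)

-- Packed store: the payloads of all n values sit back to back in one
-- ByteArray; bytes [offsets!i .. offsets!(i+1) - 1] belong to value i.
data Packed = Packed
  { payload :: !ByteArray        -- all values, contiguous
  , offsets :: !(U.Vector Int)   -- n+1 delimiting byte offsets
  }

-- Decode the i-th value on demand; the [Word8] is just a placeholder for
-- the real decoding step.
indexPacked :: Packed -> Int -> [Word8]
indexPacked (Packed bytes offs) i =
  [ indexByteArray bytes j | j <- [offs U.! i .. offs U.! (i + 1) - 1] ]
}}}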