
wurmli, what's the matter with it?
"800,100,272 bytes allocated in the heap" means that the total size of all the allocations done over the course of the program is 800,100,272 bytes. That's the expected size of 20 million (Int, Int) pairs which share their second field (`n`), plus a small amount of other stuff. It doesn't have anything to do with the size of the heap at any given time. The maximum heap size is shown separately: "50,520 bytes maximum residency" which is quite reasonable.
Similarly, your original program never occupies 10 GB of heap at any one time. If you look at the process in top, you will see memory usage close to "47,184 bytes maximum residency" (well, probably more like a couple of MB, to hold the program image, but nothing near 10 GB).
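To make the distinction between total allocation and residency concrete, here is a minimal sketch. It assumes a 64-bit, non-profiled GHC build, where a pair cell is three words (header plus two fields) and a boxed `Int` is two words (header plus payload). The names `expectedBytes` and `sumPairs` are made up for illustration, and `sumPairs` is only a rough stand-in for the programs discussed in this ticket, not either of them.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import Control.Exception (evaluate)
import Data.List (foldl')
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- Assumption: 64-bit, non-profiled GHC build, so one heap word is 8 bytes.
wordBytes :: Int
wordBytes = 8

-- A (,) cell: one header word plus two pointer fields.
pairCellBytes :: Int
pairCellBytes = 3 * wordBytes

-- A boxed Int: one header word plus one payload word.
boxedIntBytes :: Int
boxedIntBytes = 2 * wordBytes

-- Each pair needs a fresh cell and a fresh first field. The second
-- field is shared between all pairs, so it is not counted per pair.
expectedBytes :: Int -> Int
expectedBytes pairs = pairs * (pairCellBytes + boxedIntBytes)

-- Hypothetical workload: builds one short-lived pair per step and
-- consumes it immediately, so very little is live at any moment.
sumPairs :: Int -> Int
sumPairs n = foldl' step 0 [1 .. n]
  where
    step !acc i = let p = (i, n) in acc + fst p + snd p

main :: IO ()
main = do
  let n = 20000000
  putStrLn ("expected pair bytes:   " ++ show (expectedBytes n))
  _ <- evaluate (sumPairs n)
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS statistics are off; run with +RTS -T to enable them."
    else do
      stats <- getRTSStats
      putStrLn ("total allocated bytes: " ++ show (allocated_bytes stats))
      putStrLn ("max live bytes:        " ++ show (max_live_bytes stats))
```

Under those layout assumptions, 20 million pairs come to 800,000,000 bytes, which leaves 100,272 bytes of the reported total for everything else. To see the runtime figures, compile with `-rtsopts` and run with `+RTS -T`. The measured total will not match the estimate exactly, because the list being folded allocates too, and with `-O` GHC may remove the pairs altogether. The point is only that total allocation and maximum live data are separate counters.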
I have no idea why the original program timed out on the language benchmark machines, but it wasn't due to it allocating 10 GB sequentially. Allocation of short-lived objects is very cheap. But it is not free, and
#7367: Optimiser / Linker Problem on amd64
--------------------------------------------+------------------------------
        Reporter:  wurmli                   |            Owner:
            Type:  bug                      |           Status:  new
        Priority:  normal                   |        Milestone:  7.8.1
       Component:  Build System             |          Version:  7.6.1
      Resolution:                           |         Keywords:
Operating System:  Linux                    |     Architecture:  x86_64 (amd64)
 Type of failure:  Runtime performance bug  |       Difficulty:  Unknown
       Test Case:                           |       Blocked By:
        Blocking:                           |  Related Tickets:
--------------------------------------------+------------------------------

Comment (by wurmli):

 Replying to [comment:12 rwbarton]:
 > this discussion has been about why current GHC produces a program that allocates a lot when GHC 7.4 did not. Eliminating the large amount of allocation might reduce the runtime by a few percent or so.

 Would you agree that it is reasonable to expect the optimiser to optimise these allocations away?

 My simple assumption about the fannkuch program is that speed is enhanced if memory use stays local. The more the program uses only registers and cache, the faster it runs. With the repeated allocation of an intermediary variable, the cache might be exhausted and the processor might have to copy data in and out of cache, which could slow down the program.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/7367#comment:13
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler