bug in mallocForeignPtrBytes (both 6.4 and 6.6)

Hello glasgow-haskell-users,
the attached program shows up a bug in the mallocForeignPtrBytes (and newPinnedByteArray#) implementation: it allocates twice as much memory as requested. The bug is seen on both the 6.6rc and the June 6.4 Windows builds, namely:
http://www.haskell.org/ghc/dist/current/dist/ghc-6.5.20060901-i386-unknown-m... http://www.haskell.org/ghc/dist/stable/dist/ghc-6.4.2.20060609-i386-unknown-...
After the program prints "40 mb allocated", look at the Task Manager indication: it shows that twice as much memory is actually in use. It seems that the problem occurs only when allocating memory buffers whose sizes are powers of 2 or very close; the bug shows up for bufsize = 4kb, 1mb or 4095, but not for 4000 or 10^6.
-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
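[Since the attachment itself is not reproduced in this archive, here is a minimal sketch of the kind of repro program being described, assuming 1 MB buffers and a getLine pause for inspecting memory use; the sizes and names are illustrative, not the original attachment.]

-- A minimal sketch (not the original attachment): allocate many pinned
-- 1 MB buffers via mallocForeignPtrBytes, report the total requested,
-- then pause so the process's memory use can be checked in Task Manager.
import Control.Monad (replicateM)
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes)

main :: IO ()
main = do
  let bufSize = 1024 * 1024          -- 1 MB per buffer (a "bad" size)
      count   = 40                   -- 40 buffers = 40 MB requested
  bufs <- replicateM count (mallocForeignPtrBytes bufSize)
            :: IO [ForeignPtr Word8]
  putStrLn "40 mb allocated"
  _ <- getLine                       -- pause; inspect memory use here
  print (length bufs)                -- keep the buffers reachable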

On Sat, 2006-09-23 at 13:58 +0400, Bulat Ziganshin wrote:
Hello glasgow-haskell-users,
the attached program shows up a bug in the mallocForeignPtrBytes (and newPinnedByteArray#) implementation: it allocates twice as much memory as requested. The bug is seen on both the 6.6rc and the June 6.4 Windows builds, namely:
http://www.haskell.org/ghc/dist/current/dist/ghc-6.5.20060901-i386-unknown-m... http://www.haskell.org/ghc/dist/stable/dist/ghc-6.4.2.20060609-i386-unknown-...
After the program prints "40 mb allocated", look at the Task Manager indication: it shows that twice as much memory is actually in use. It seems that the problem occurs only when allocating memory buffers whose sizes are powers of 2 or very close; the bug shows up for bufsize = 4kb, 1mb or 4095, but not for 4000 or 10^6.
The reason for this is that for larger objects GHC has an allocation granularity of 4k; that is, it always uses a multiple of 4k bytes. However, a byte array has some overhead: it needs one word for the heap cell header and another for the length. So if you allocate a 4k byte array, it actually uses 8k. The trick is therefore to allocate 4k minus the overhead, which is what the Data.ByteString library does. Duncan
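[As a rough illustration of that trick - a sketch only, not Data.ByteString's actual code; the 2-word overhead figure is the one given above, and the names arrayOverhead/mallocBlockBuffer are made up for this example:]

-- Allocate a payload that, together with the byte array's 2 words of
-- overhead (heap-cell header + length field), still fits within 4k.
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes)
import Foreign.Ptr (Ptr, nullPtr)
import Foreign.Storable (sizeOf)

arrayOverhead :: Int
arrayOverhead = 2 * sizeOf (nullPtr :: Ptr Word8)   -- two machine words

mallocBlockBuffer :: IO (ForeignPtr Word8)
mallocBlockBuffer = mallocForeignPtrBytes (4096 - arrayOverhead)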

Hello Duncan, Sunday, September 24, 2006, 12:22:38 AM, you wrote:
After the program prints "40 mb allocated", look at the Task Manager indication: it shows that twice as much memory is actually in use. It seems that the problem occurs only when allocating memory buffers whose sizes are powers of 2 or very close; the bug shows up for bufsize = 4kb, 1mb or 4095, but not for 4000 or 10^6.
The reason for this is that for larger objects GHC has an allocation granularity of 4k; that is, it always uses a multiple of 4k bytes.
With 512k blocks it also allocates twice as much data as requested.
However, a byte array has some overhead: it needs one word for the heap cell header and another for the length. So if you allocate a 4k byte array, it actually uses 8k. The trick is therefore to allocate 4k minus the overhead, which is what the Data.ByteString library does.
Thank you - that will be OK for my lib. -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Sun, 2006-09-24 at 01:28 +0400, Bulat Ziganshin wrote:
Hello Duncan,
Sunday, September 24, 2006, 12:22:38 AM, you wrote:
After the program prints "40 mb allocated", look at the Task Manager indication: it shows that twice as much memory is actually in use. It seems that the problem occurs only when allocating memory buffers whose sizes are powers of 2 or very close; the bug shows up for bufsize = 4kb, 1mb or 4095, but not for 4000 or 10^6.
The reason for this is that for larger objects GHC has an allocation granularity of 4k; that is, it always uses a multiple of 4k bytes.
With 512k blocks it also allocates twice as much data as requested.
Oh, sorry, I missed that point. Something fishy is going on then. Duncan

Duncan Coutts wrote:
On Sun, 2006-09-24 at 01:28 +0400, Bulat Ziganshin wrote:
Hello Duncan,
Sunday, September 24, 2006, 12:22:38 AM, you wrote:
After the program prints "40 mb allocated", look at the Task Manager indication: it shows that twice as much memory is actually in use. It seems that the problem occurs only when allocating memory buffers whose sizes are powers of 2 or very close; the bug shows up for bufsize = 4kb, 1mb or 4095, but not for 4000 or 10^6.
The reason for this is that for larger objects GHC has an allocation granularity of 4k; that is, it always uses a multiple of 4k bytes.
With 512k blocks it also allocates twice as much data as requested.
Oh, sorry, I missed that point. Something fishy is going on then.
Duncan's explanation works for allocating 4k, but there's another explanation for larger blocks. GHC allocates memory from the OS in units of a "megablock" (see http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage), currently 1Mb. So if you allocate a 1Mb array, the storage manager has to allocate 1Mb + overhead, which causes it to allocate a 2Mb megablock. The surplus is returned to the system in the form of free blocks, but if all you do is allocate lots of 1Mb arrays, you'll waste about half the space because there's never enough contiguous free space to contain another 1Mb array. There is a similar problem for 512k arrays: the storage manager allocates a 1Mb block and returns slightly less than half of it as free blocks, so each 512k allocation takes a whole new 1Mb block. Cheers, Simon
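[A back-of-the-envelope sketch of that accounting - a simplification of the real block/megablock bookkeeping: the 1Mb megablock size comes from the text above, the 2-word overhead from Duncan's earlier message, and a 64-bit word size is assumed:]

-- Rough estimate of how much memory the storage manager ends up holding
-- for one pinned array of the given payload size: payload + 2 words of
-- overhead, rounded up to whole megablocks. A simplification of the
-- real accounting described above.
megablock :: Int
megablock = 1024 * 1024

reservedFor :: Int -> Int
reservedFor payload =
  let overhead = 2 * 8                            -- assumes 64-bit words
      needed   = payload + overhead
      nMega    = (needed + megablock - 1) `div` megablock
  in nMega * megablock

main :: IO ()
main = do
  print (reservedFor (1024 * 1024))       -- 2097152: a 1Mb array costs 2Mb
  print (reservedFor (512 * 1024))        -- 1048576: each 512k array takes a fresh 1Mb
  print (reservedFor (1024 * 1024 - 16))  -- 1048576: just under 1Mb fits in one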

I added notes about this to http://haskell.org/haskellwiki/Performance/GHC
S
participants (4)
- Bulat Ziganshin
- Duncan Coutts
- Simon Marlow
- Simon Peyton-Jones