Memory usage outside of the Haskell heap

Folks, How can I profile memory usage outside of the Haskell heap? I'm using FastStrings for everything and some folks on #haskell suspect it might be causing my problems. hGet :: Handle -> Int -> IO FastString hGet _ 0 = return empty hGet h i = do fp <- mallocForeignPtr i l <- withForeignPtr fp $ \p-> hGetBuf h p i return $ PS fp 0 l I do this a lot and the finalizers might not be running or something like that. Thanks, Joel -- http://wagerlabs.com/

Is there a way to tell ghc to use a debug version of the malloc library when building? Can I just pass in -lmalloc_debug or something like that? On Nov 5, 2005, at 12:18 PM, Joel Reymont wrote:
Folks,
How can I profile memory usage outside of the Haskell heap? I'm using FastStrings for everything and some folks on #haskell suspect it might be causing my problems.
hGet :: Handle -> Int -> IO FastString hGet _ 0 = return empty hGet h i = do fp <- mallocForeignPtr i l <- withForeignPtr fp $ \p-> hGetBuf h p i return $ PS fp 0 l

On Sat, Nov 05, 2005 at 12:18:04PM +0000, Joel Reymont wrote:
How can I profile memory usage outside of the Haskell heap? I'm using FastStrings for everything and some folks on #haskell suspect it might be causing my problems.
hGet :: Handle -> Int -> IO FastString hGet _ 0 = return empty hGet h i = do fp <- mallocForeignPtr i l <- withForeignPtr fp $ \p-> hGetBuf h p i return $ PS fp 0 l
I just checked that the memory allocated with Foreign.ForeignPtr.mallocForeignPtrBytes is included in the GC stats. Does the FastString library use its own implementation of mallocForeignPtr? Best regards Tomasz

On Nov 5, 2005, at 3:13 PM, Tomasz Zielonka wrote:
I just checked that the memory allocated with Foreign.ForeignPtr.mallocForeignPtrBytes is included in the GC stats. Does the FastString library use its own implementation of mallocForeignPtr?
Yes, -- (internal) GC wrapper of mallocForeignPtrArray mallocForeignPtr :: Int -> IO (ForeignPtr Word8) mallocForeignPtr l = when (l > 1000000) performGC >> mallocForeignPtrArray l -- http://wagerlabs.com/

On Sat, Nov 05, 2005 at 12:18:04PM +0000, Joel Reymont wrote:
How can I profile memory usage outside of the Haskell heap? I'm using FastStrings for everything and some folks on #haskell suspect it might be causing my problems.
hGet :: Handle -> Int -> IO FastString hGet _ 0 = return empty hGet h i = do fp <- mallocForeignPtr i l <- withForeignPtr fp $ \p-> hGetBuf h p i return $ PS fp 0 l
I do this a lot and the finalizers might not be running or something like that.
You can attach a print message to the finalizer and also print a message out when creating the string. I've found this to be helpful at times. -- David Roundy http://www.darcs.net

How do I do that?
On 11/5/05, David Roundy
out when creating the string. I've found this to be helpful at times.

David, I followed your suggestion and I think the finalizers for FPS are not running. Please take a look at http://wagerlabs.com/Bar.hs to see what I mean. I had to modify the FPS export list to expose the constructor but that's the only change I made. The other useful tidbits are: -- (internal) GC wrapper of mallocForeignPtrArray mallocForeignPtr :: Int -> IO (ForeignPtr Word8) mallocForeignPtr l = when (l > 1000000) performGC >> mallocForeignPtrArray l and createPS :: Int -> (Ptr Word8 -> IO ()) -> FastString createPS l write_ptr = unsafePerformIO $ do fp <- mallocForeignPtr l withForeignPtr fp $ \p -> write_ptr p return $ PS fp 0 l I replaced mallocForeignPtr with a call to mallocForeignBytes but that did not make a difference, the finalizers still don't run. Joel On Nov 5, 2005, at 4:57 PM, David Roundy wrote:
You can attach a print message to the finalizer and also print a message out when creating the string. I've found this to be helpful at times.

The finalizers do run if I insert unsafeFinalize fps right before recursing into foo, like this: main = do foo data_ performGC foo [] = return () foo (x:xs) = do let fps = P.packWords x installFinalizer fps putStrLn $ "fps: " ++ show fps unsafeFinalize fps foo xs installFinalizer fps@(P.PS fp _ _) = addForeignPtrConcFinalizer fp $ putStrLn $ "--- Finalized: " ++ show (P.unpackWords fps) Thanks, Joel On Nov 6, 2005, at 2:29 PM, Joel Reymont wrote:
David,
I followed your suggestion and I think the finalizers for FPS are not running.
Please take a look at http://wagerlabs.com/Bar.hs to see what I mean. I had to modify the FPS export list to expose the constructor but that's the only change I made.

Joel Reymont wrote:
foo [] = return () foo (x:xs) = do let fps = P.packWords x installFinalizer fps putStrLn $ "fps: " ++ show fps unsafeFinalize fps <---- ??? foo xs
Is it even possible for the compiler to finalize and gc `fps' at the marked line? Without the unsafeFinalize, `fps' is still in scope and might be used after `foo xs' returns. Is GHC supposed to see that `fps' is no longer in use? Does the following code work better?
foo (x:xs) = do do let fps = P.packWords x installFinalizer fps putStrLn $ "fps: " ++ show fps foo xs
BTW, Joel, it seems, your code is creating lots of very short FastPackedStrings. That's useless, a list of Word8s could replace a list of short FastPackedStrings at essentially no additional cost. But it would simplify the code a lot, overcome any problems with finalizers in passing, and probably enable weird and wonderful optimizations by GHC. Only long FastPackedStrings will be fast, and only if you're not growing them in pieces, and if you're storing them for an extended time, and if you're consuming them more than once. It looks as if you're using them four times wrong. Udo. -- The most happy marriage I can imagine to myself would be the union of a deaf man to a blind woman. -- Samuel Taylor Coleridge

On Sun, Nov 06, 2005 at 02:29:05PM +0000, Joel Reymont wrote:
David,
I followed your suggestion and I think the finalizers for FPS are not running.
This is definitely odd. I can't see a reason why the finalizers wouldn't be running. -- David Roundy http://www.darcs.net

You need to do something like this to force them to run, apparently
main = do foo $ Prelude.concat $ replicate 100 data_
Just triggering the GC with performGC doesn't do it.
The really odd thing is that if I setup a C finalizer (see below)
then it's triggered once and the whole thing crashes with a bus
error. Duncan reported that it segfaulted for him as well.
installFinalizer fps@(P.PS fp _ _) =
addForeignPtrFinalizer finalizer fp
foreign import ccall unsafe "finalizer" finalizer :: FinalizerPtr Word8
where the finalizer itself is
#include
This is definitely odd. I can't see a reason why the finalizers wouldn't be running.

joelr1:
David,
I followed your suggestion and I think the finalizers for FPS are not running.
Hmm. Is this on Windows only though? Or also on unix?
Please take a look at http://wagerlabs.com/Bar.hs to see what I mean. I had to modify the FPS export list to expose the constructor but that's the only change I made.
Should be exported already: module Data.FastPackedString ( -- * The @FastString@ type FastString(..), - Maybe you're running an old version? -- Don
The other useful tidbits are:
-- (internal) GC wrapper of mallocForeignPtrArray mallocForeignPtr :: Int -> IO (ForeignPtr Word8) mallocForeignPtr l = when (l > 1000000) performGC >> mallocForeignPtrArray l
and
createPS :: Int -> (Ptr Word8 -> IO ()) -> FastString createPS l write_ptr = unsafePerformIO $ do fp <- mallocForeignPtr l withForeignPtr fp $ \p -> write_ptr p return $ PS fp 0 l
I replaced mallocForeignPtr with a call to mallocForeignBytes but that did not make a difference, the finalizers still don't run.
Joel
On Nov 5, 2005, at 4:57 PM, David Roundy wrote:
You can attach a print message to the finalizer and also print a message out when creating the string. I've found this to be helpful at times.

Yes, you are right. On Nov 6, 2005, at 11:57 PM, Donald Bruce Stewart wrote:
Should be exported already:
module Data.FastPackedString (
-- * The @FastString@ type FastString(..), -
Maybe you're running an old version?

joelr1:
David,
I followed your suggestion and I think the finalizers for FPS are not running.
Here are some experiments. Sometimes I can get the finalizers to run as expected, but only when putting memory pressure on the Haskell heap with normal haskell values. Works with 6.4.1 and 6.5 on OpenBSD or Linux. So here we generate a bunch of lists inside Haskell, then pack them. Our finalisers get run nicely as expected: import qualified Data.FastPackedString as P import Foreign.Concurrent import Data.Word import Data.Char data_ = [ replicate 5000 i | i <- [1..100] ] :: [[Word8]] main = do foo data_ putStrLn "" foo [] = return () foo (x:xs) = do let fps = P.packWords x installFinalizer fps putStr "." foo xs installFinalizer fps@(P.PS fp _ _) = addForeignPtrFinalizer fp $ putStr (show.ord.P.head $ fps) pill00$ ./a.out 100 ..................................................1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950...........................................515253545556575859606162636465666768697071727374757677787980818283848586878889909192....... Now, if we instead allocate only outside the Haskell heap (with an mmap packed string), then we see something different: import qualified Data.FastPackedString as P import Foreign.Concurrent import Data.Word import Data.Char main = do foo 100 putStrLn "" foo 0 = return () foo n = do fps <- P.mmapFile "/home/dons/tmp/tests/128k" installFinalizer fps putStr "." foo (n-1) installFinalizer fps@(P.PS fp _ _) = addForeignPtrFinalizer fp $ putStr (show.ord.P.head $ fps) No finalisers (even though more data is allocated)! $ ./a.out ...................................................................................... The GHC rts doesn't know how big the packed string is. It just sees the few bytes of the packed string (as dcoutts pointed out on irc). Strangely, the same effect is seen with P.readFile, which (from darcs) allocates as so: mallocForeignPtr :: Int -> IO (ForeignPtr Word8) mallocForeignPtr l = when (l > 1000000) performGC >> mallocForeignPtrArray l where here mallocForeignPtrArray ends up calling GHC.Prim.newPinnedByteArray# (i.e. on the Haskell heap). Hmm. I don't understand that then, as the memory is in the Haskell heap, yet still no finalisers are run. If we do something expensive with the packed string, then finalizers get run: foo n = do fps <- P.readFile "/home/dons/tmp/tests/128k" installFinalizer fps -- unpack the string into a normal Haskell list: forkIO (Control.Exception.evaluate ((length . P.unpack) fps) >> return ()) putStr "." foo (n-1) Also interestingly, calling unsafeFinalize didn't help. Simon, any thoughts on this? Is there some way we can ensure memory outside the Haskell heap gets counted for its full weight? Also, why don't we see any finalizers run with the pinned arrays? -- Don
participants (6)
-
David Roundy
-
dons@cse.unsw.edu.au
-
joel reymont
-
Joel Reymont
-
Tomasz Zielonka
-
Udo Stenzel