
On 09/09/2010 00:28, Daniel Fischer wrote:
On Wednesday 08 September 2010 23:55:35, Don Stewart wrote:
simonpj:
|> ghc-6.12.3: |> 89,330,672 bytes allocated in the heap |> 15,092 bytes copied during GC |> 35,980 bytes maximum residency (1 sample(s)) |> 29,556 bytes maximum slop |> 2 MB total memory in use (0 MB lost due to |> fragmentation) |> |> ghc-6.13.20100831: |> 475,305,720 bytes allocated in the heap |> 89,272 bytes copied during GC |> 68,860 bytes maximum residency (1 sample(s)) |> 29,444 bytes maximum slop |> 2 MB total memory in use (0 MB lost due to |> fragmentation) | | Can you put your benchmark code somewhere? Likely a GHC | regression.
Indeed bad. If someone could characterise the regression more precisely (e.g. "fusion isn't happening here") that would be jolly helpful.
Shouldn't be fusion. Is this a straight IO function. Something to do with buffering/encoding?
Maybe the following observation helps:
ghc-6.13.20100831 reads lazy ByteStrings in chunks of 8192 bytes.
If I understand correctly, that means (since defaultChunkSize = 32760) - bytestring allocates a 32K buffer to be filled and asks ghc for 32760 bytes in that buffer - ghc asks the OS for 8192 bytes (and usually gets them) - upon receiving fewer bytes than requested, bytestring copies them to a new smaller buffer - since the number of bytes received is a multiple of ghc's allocation block size (which I believe is 4K), there's no space for the bookkeeping overhead, hence the new buffer takes up 12K instead of 8, resulting in 44K allocation for 8K bytes
That factor of 5.5 corresponds pretty well with the allocation figures above, and the extra copying explains the approximate doubling of I/O time.
Trying to find out why ghc asks the OS for only 8192 bytes instead of 32760 hasn't brought enlightenment yet.
I think I've found the problem, GHC.IO.Handle.Text: bufReadNBEmpty :: Handle__ -> Buffer Word8 -> Ptr Word8 -> Int -> Int -> IO Int bufReadNBEmpty h_@Handle__{..} buf@Buffer{ bufRaw=raw, bufR=w, bufL=r, bufSize=sz } ptr so_far count | count > sz, False, Just fd <- cast haDevice = do m <- RawIO.readNonBlocking (fd::FD) ptr count case m of Nothing -> return so_far Just n -> return (so_far + n) See if you can spot it. I must have made this change for debugging, and forgot to revert it. d'oh! Cheers, Simon
Cheers, Daniel
Excerpt of strace log:
read(3, "%!PS-Adobe-2.0\n%%Title: nbench\n%"..., 8192) = 8192 open("/usr/lib/gconv/UTF-32.so", O_RDONLY) = 4 read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0`\4\0\0004\0\0\0"..., 512) = 512 fstat64(4, {st_mode=S_IFREG|0755, st_size=9672, ...}) = 0 mmap2(NULL, 12328, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 4, 0) = 0xb7852000 fadvise64(4, 0, 12328, POSIX_FADV_WILLNEED) = 0 mmap2(0xb7854000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED| MAP_DENYWRITE, 4, 0x1) = 0xb7854000 close(4) = 0 mprotect(0xb7854000, 4096, PROT_READ) = 0 ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbff37cb0) = -1 ENOTTY (Inappropriate ioctl for device) select(2, [], [1], NULL, {0, 0}) = 1 (out [1], left {0, 0}) write(1, "", 0) = 0 select(2, [], [1], NULL, {0, 0}) = 1 (out [1], left {0, 0}) write(1, "%!PS-Adobe-2.0\n%%Title: nbench\n%"..., 8192) = 8192 read(3, " 20.000000 lineto\n121.153524 20."..., 8192) = 8192 select(2, [], [1], NULL, {0, 0}) = 1 (out [1], left {0, 0}) write(1, "", 0) = 0 select(2, [], [1], NULL, {0, 0}) = 1 (out [1], left {0, 0}) write(1, " 20.000000 lineto\n121.153524 20."..., 8192) = 8192 read(3, "30.542315 21.394403 lineto\n125.3"..., 8192) = 8192 select(2, [], [1], NULL, {0, 0}) = 1 (out [1], left {0, 0}) write(1, "", 0) = 0 select(2, [], [1], NULL, {0, 0}) = 1 (out [1], left {0, 0}) write(1, "30.542315 21.394403 lineto\n125.3"..., 8192) = 8192 read(3, "neto\n308.929337 21.969871 lineto"..., 8192) = 8192 _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users