
Thanks Tyson.  Not only for finding the problem, but for fixing it too!  We love that.

Simon

| -----Original Message-----
| From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users-bounces@haskell.org] On Behalf Of Tyson Whitehead
| Sent: Friday, January 30, 2009 5:44 AM
| To: GHC users
| Subject: UVector overallocating for (Word/Int)(8/16/32)
|
| I believe the arrays for (Word/Int)(8/16/32) are currently taking eight, four,
| and two times, respectively, as much memory as actually required.  That is,
|
|   newMBU n = ST $ \s1# ->
|     case sizeBU n (undefined::e) of {I# len# ->
|     case newByteArray# len# s1#   of {(# s2#, marr# #) ->
|     (# s2#, MBUArr n marr# #) }}
|
|   sizeBU (I# n#) _ = I# (wORD_SCALE n#)
|   wORD_SCALE n# = scale# *# n# where I# scale# = SIZEOF_HSWORD
|
| (sizeBU is a class member, but all the instances for (Word/Int)(8/16/32) are
| as given above, and SIZEOF_HSWORD is defined as 8 in MachDeps.h on my x86_64)
|
| which would seem to always allocate memory assuming an underlying alignment
| that is always eight bytes.  It seems like the readWord(8/16/32)Array#
| functions may have once operated that way, but, when I dumped the assembler
| associated with them under ghc 6.8.2 (both native and C), I get
|
|   readWord8Array
|     leaq 16(%rsi),%rax
|     movzbl (%rax,%rdi,1),%ebx
|     jmp *(%rbp)
|
|   readWord16Array
|     leaq 16(%rsi),%rax
|     movzwl (%rax,%rdi,2),%ebx
|     jmp *(%rbp)
|
|   readWord32Array
|     leaq 16(%rsi),%rax
|     movl (%rax,%rdi,4),%ebx
|     jmp *(%rbp)
|
|   readWord64Array
|     leaq 16(%rsi),%rax
|     movq (%rax,%rdi,8),%rbx
|     jmp *(%rbp)
|
| which is using alignments of one, two, four, and eight bytes respectively.
|
| I'll attach a patch (which I haven't tested beyond compiling and looking at
| the generated assembler).
|
| Cheers!  -Tyson
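
For anyone following the arithmetic, here is a rough sketch of where the eight/four/two figures in the report come from.  It is not the UVector code and not Tyson's patch.  The names wordBytes, wordScaledBytes, elementScaledBytes and overallocation are made up for illustration, and the 8-byte word is an assumption taken from the x86_64 machine in the report.

```haskell
import Data.Word (Word8, Word16, Word32, Word64)
import Foreign.Storable (Storable, sizeOf)

-- Assumption: 8-byte machine word, matching SIZEOF_HSWORD = 8 in the report.
wordBytes :: Int
wordBytes = 8

-- Bytes requested when every element is scaled by the machine word size,
-- which is what the report describes sizeBU doing via wORD_SCALE.
wordScaledBytes :: Int -> Int
wordScaledBytes n = wordBytes * n

-- Bytes needed when each element occupies only its own size.
-- The second argument is used for its type; its value is ignored.
elementScaledBytes :: Storable a => Int -> a -> Int
elementScaledBytes n x = sizeOf x * n

-- How many times too large the word-scaled request is for one element type.
overallocation :: Storable a => a -> Int
overallocation x = wordScaledBytes 1 `div` elementScaledBytes 1 x

main :: IO ()
main = do
  putStrLn ("Word8:  " ++ show (overallocation (0 :: Word8)))
  putStrLn ("Word16: " ++ show (overallocation (0 :: Word16)))
  putStrLn ("Word32: " ++ show (overallocation (0 :: Word32)))
  putStrLn ("Word64: " ++ show (overallocation (0 :: Word64)))
```

With the 8-byte word assumed above this prints factors of 8, 4, 2 and 1, which lines up with the scale factors of 1, 2, 4 and 8 in the dumped read instructions.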