
john:
On Wed, Oct 19, 2005 at 10:07:37AM +1000, Donald Bruce Stewart wrote:
kr.angelov:
Hello Guys,
I tried my own version of PackedStrings and the results are very nice. It is entirely based on ByteArray# and Int#. I have made two tests:
Elapsed time
|     | FastPackedString | PackedString |
+-----+------------------+--------------+
|test1| 99.26s           | 3.81s        |
|test2| 175.88s          | 5.28s        |

Maximum Memory Residency
|     | FastPackedString | PackedString |
+-----+------------------+--------------+
|test1| 40.60Mb          | 36.25Mb      |
|test2| 91.58Mb          | 33.94Mb      |
Wow. Now this is really surprising.
Firstly, I would point out that only testing pack and concat may be slightly unrepresentative :)
However, on my machine:
OpenBSD/Pentium-M 1.6G/ghc-6.5 -O
Elapsed time:
        FPS             Simon's PackedString    Krasimir's
test1   1.966s (40M)    2.151s (36M)            2.235s (36M)
test2   6.048s (24M)    3.160s (73M)            2.318s (39M)
Which is basically what I expected. Though perhaps I need to improve concat (we currently do things a little strangely in concat, due to the darcs legacy), but pack itself is nice and fast.
Linux/Pentium 4 3.6G/ghc-6.4.1 -O
test1   35.37s          30.97s                  2.180s
test2   90.93s          60.55s                  1.916s
Ah!! So what's going on on Linux, I wonder. Could it be something about 6.4.1? Are we seeing the difference between ForeignPtrs from 6.4 to 6.5? I will investigate.
I'd be very wary of switching entirely to non-portable ghc primop-based code, as FPS already runs on hugs and, I think, nhc.
can we add Data.PackedString and my PackedString (in the jhc repo) to the testing lineup?
actually, is the test code available somewhere?
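The test code itself is not reproduced in this thread. As a rough sketch only, a pack-and-concat timing test of the kind being discussed might look like the following. It assumes the Data.ByteString.Char8 module from the bytestring package as a stand-in for the packed string libraries under comparison; the names packAndConcat and timed and the input data are hypothetical, not taken from anyone's posted code.

```haskell
import qualified Data.ByteString.Char8 as B
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- Hypothetical workload: pack every String, then concatenate the results.
packAndConcat :: [String] -> B.ByteString
packAndConcat = B.concat . map B.pack

-- Run an IO action and return its result together with the CPU time
-- it consumed, in seconds (getCPUTime reports picoseconds).
timed :: IO a -> IO (a, Double)
timed act = do
  start <- getCPUTime
  x     <- act
  end   <- getCPUTime
  return (x, fromIntegral (end - start) / 1e12)

main :: IO ()
main = do
  -- Made-up input; a real test would use a representative data set.
  let ws = replicate 100000 "hello world"
  -- evaluate forces the strict result so the work happens inside the timer.
  (bs, secs) <- timed (evaluate (packAndConcat ws))
  putStrLn ("bytes: " ++ show (B.length bs))
  putStrLn ("cpu seconds: " ++ show secs)
```

Swapping the import for another library with the same pack and concat interface would let the same harness be pointed at each candidate. Timings from a single run like this are noisy and, as noted above, cover only two operations.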
Ok, so we have:
    FPS is at http://www.cse.unsw.edu.au/~dons/fps.html
    SimonM's code I've posted at: http://www.cse.unsw.edu.au/~dons/packedstring.tar.gz
    Data.PackedString in the base hier libs
    Krasimir's primop code posted online.
    John's code in the jhc repo (where?)

And also, potentially, is the FastString.hs code in ghc's utils/ dir.

I'm not sure if just testing pack and concat are very useful though. Pack, at least, is rarely used in the way we're testing it -- generally you avoid having Strings in the first place. There's some other tests in Simon's code, and a full regress suite in FPS, using some large data sets.

What are we trying to establish here?

-- Don