
On 18 October 2005 10:08, Krasimir Angelov wrote:
I am curious why FPS is implemented on top of ForeignPtr. ByteArray# based implementation should be faster and more space efficient.
Actually there's not much difference between ForeignPtr and ByteArray#, at least in GHC 6.5+ where we optimised the ForeignPtr implementation. Furthermore ForeignPtr is much more useful, because you can create a ForeignPtr from a Ptr, and you can pass a ForeignPtr to C code portably (i.e. not using GHC extensions). ForeignPtr is essential if you want to make packed strings from eg. mmap()'d memory. Also, a ForeignPtr allocated using mallocForeignPtr *is* a ByteArray#. My PackedString library (which DonS is currently merging with FPS) uses ForeignPtr and gets pretty good performance. withForeignPtr compiles to almost nothing at all: the touch# becomes visible to GHC and compiles to no code. I needed to inline unsafePerformIO to get the best results, though (not always safe, but in this case it's ok). Cheers, Simon

simonmar:
On 18 October 2005 10:08, Krasimir Angelov wrote:
I am curious why FPS is implemented on top of ForeignPtr. ByteArray# based implementation should be faster and more space efficient.
Actually there's not much difference between ForeignPtr and ByteArray#, at least in GHC 6.5+ where we optimised the ForeignPtr implementation.
Furthermore ForeignPtr is much more useful, because you can create a ForeignPtr from a Ptr, and you can pass a ForeignPtr to C code portably (i.e. not using GHC extensions). ForeignPtr is essential if you want to make packed strings from eg. mmap()'d memory.
Right. And with a ForeignPtr, we can attach a finalizer, to e.g. unmap mmapped regions.
Also, a ForeignPtr allocated using mallocForeignPtr *is* a ByteArray#.
-- Don

ForeignPtr requires more memory. Each ForeignPtr has ForeignPtrContents structure attached. In the ForeignPtrContents there is one IORef. Each finalizer adds one Weak# object in the heap. The IORef is a reference to a list of finalizers, so each new finalizer adds a new CONS cell. The finalizers require additional steps in GC time. I haven't tried to measure the difference between ForeignPtr and ByteArray# based implementations but it seems to me that the latter might be better. Of course it will be GHC specific. mmap is valuable only for large string. Typically only when someone wants to read the entire file in the memory. For this case a special data structure might be provided. Cheers, Krasimir

Hello Guys, I tried my own version of PackedStrings and the results are very nice. It is entirely based on ByteArray# and Int#. I have made two tests:
--import StringUtils.PackedString as PS import Data.FastPackedString as PS
main = test1 [] 1000000 --main = test2 [] 1000000
test1 xs 0 = return () test1 xs n = let x = PS.pack ("*****" ++ show n ++ "*****") in x `seq` test1 (x : xs) (n-1)
test2 xs 0 = return () test2 xs n = let x = PS.concat [PS.pack "*****", PS.pack (show n), PS.pack "*****"] in x `seq` test2 (x : xs) (n-1)
The detailed statistics are attached. Basically the results are: Elapsed time | | FastPackedString | PackedString | +-----+------------------+--------------+ |test1| 99.26s | 3.81s | |test2| 175.88s | 5.28s | Maximum Memory Residency | | FastPackedString | PackedString | +-----+------------------+--------------+ |test1| 40.60Mb | 36.25Mb | |test2| 91.58Mb | 33.94Mb | My PackedString implementation is attached too. Cheers, Krasimir

kr.angelov:
Hello Guys,
I tried my own version of PackedStrings and the results are very nice. It is entirely based on ByteArray# and Int#. I have made two tests:
Elapsed time | | FastPackedString | PackedString | +-----+------------------+--------------+ |test1| 99.26s | 3.81s | |test2| 175.88s | 5.28s |
Maximum Memory Residency | | FastPackedString | PackedString | +-----+------------------+--------------+ |test1| 40.60Mb | 36.25Mb | |test2| 91.58Mb | 33.94Mb |
Wow. Now this is really surprising. Firstly, I would point out that only testing pack and concat may be slightly unrepresentative :) However, on my machine: OpenBSD/Pentium-M 1.6G/ghc-6.5 -O Elapsed time: FPS Simon's PackedString Krasimir's test1 1.966s (40M) 2.151s (36M) 2.235s (36M) test2 6.048s (24M) 3.160s (73M) 2.318s (39M) Which is basically what I expected. Though perhaps I need to improve concat (we currently do things a little strangely in concat, due to the darcs legacy), but pack itself is nice and fast. Linux/Pentium 4 3.6G/ghc-6.4.1 -O test1 35.37s 30.97s 2.180s test2 90.93s 60.55s 1.916s Ah!! So what's going on on Linux, I wonder. Could it be something about 6.4.1? Are we seeing the difference between ForeignPtrs from 6.4 to 6.5? I will investigate. I'd be very wary of switching entirely to non-portable ghc primop-based code, as FPS already run ons hugs and I think nhc. -- Don

On Wed, Oct 19, 2005 at 10:07:37AM +1000, Donald Bruce Stewart wrote:
kr.angelov:
Hello Guys,
I tried my own version of PackedStrings and the results are very nice. It is entirely based on ByteArray# and Int#. I have made two tests:
Elapsed time | | FastPackedString | PackedString | +-----+------------------+--------------+ |test1| 99.26s | 3.81s | |test2| 175.88s | 5.28s |
Maximum Memory Residency | | FastPackedString | PackedString | +-----+------------------+--------------+ |test1| 40.60Mb | 36.25Mb | |test2| 91.58Mb | 33.94Mb |
Wow. Now this is really surprising.
Firstly, I would point out that only testing pack and concat may be slightly unrepresentative :)
However, on my machine:
OpenBSD/Pentium-M 1.6G/ghc-6.5 -O Elapsed time: FPS Simon's PackedString Krasimir's test1 1.966s (40M) 2.151s (36M) 2.235s (36M) test2 6.048s (24M) 3.160s (73M) 2.318s (39M)
Which is basically what I expected. Though perhaps I need to improve concat (we currently do things a little strangely in concat, due to the darcs legacy), but pack itself is nice and fast.
Linux/Pentium 4 3.6G/ghc-6.4.1 -O test1 35.37s 30.97s 2.180s test2 90.93s 60.55s 1.916s
Ah!! So what's going on on Linux, I wonder. Could it be something about 6.4.1? Are we seeing the difference between ForeignPtrs from 6.4 to 6.5? I will investigate.
I'd be very wary of switching entirely to non-portable ghc primop-based code, as FPS already run ons hugs and I think nhc.
can we add Data.PackedString and my PackedString (in the jhc repo) to the testing lineup? actually, is the test code available somewhere? John -- John Meacham - ⑆repetae.net⑆john⑈

john:
On Wed, Oct 19, 2005 at 10:07:37AM +1000, Donald Bruce Stewart wrote:
kr.angelov:
Hello Guys,
I tried my own version of PackedStrings and the results are very nice. It is entirely based on ByteArray# and Int#. I have made two tests:
Elapsed time | | FastPackedString | PackedString | +-----+------------------+--------------+ |test1| 99.26s | 3.81s | |test2| 175.88s | 5.28s |
Maximum Memory Residency | | FastPackedString | PackedString | +-----+------------------+--------------+ |test1| 40.60Mb | 36.25Mb | |test2| 91.58Mb | 33.94Mb |
Wow. Now this is really surprising.
Firstly, I would point out that only testing pack and concat may be slightly unrepresentative :)
However, on my machine:
OpenBSD/Pentium-M 1.6G/ghc-6.5 -O Elapsed time: FPS Simon's PackedString Krasimir's test1 1.966s (40M) 2.151s (36M) 2.235s (36M) test2 6.048s (24M) 3.160s (73M) 2.318s (39M)
Which is basically what I expected. Though perhaps I need to improve concat (we currently do things a little strangely in concat, due to the darcs legacy), but pack itself is nice and fast.
Linux/Pentium 4 3.6G/ghc-6.4.1 -O test1 35.37s 30.97s 2.180s test2 90.93s 60.55s 1.916s
Ah!! So what's going on on Linux, I wonder. Could it be something about 6.4.1? Are we seeing the difference between ForeignPtrs from 6.4 to 6.5? I will investigate.
I'd be very wary of switching entirely to non-portable ghc primop-based code, as FPS already run ons hugs and I think nhc.
can we add Data.PackedString and my PackedString (in the jhc repo) to the testing lineup?
actually, is the test code available somewhere?
Ok, so we have: FPS is at http://www.cse.unsw.edu.au/~dons/fps.html SimonM's code I've posted at: http://www.cse.unsw.edu.au/~dons/packedstring.tar.gz Data.PackedString in the base hier libs Krasimir's primop code posted online. John's code in the jhc repo (where?) And also, potentially, is the FastString.hs code in ghc's utils/ dir. I'm not sure if just testing pack and concat are very useful though. Pack, at least, is rarely used in the way we're testing it -- generally you avoid having Strings in the first place. There's some other tests in Simon's code, and a full regress suite in FPS, using some large data sets. What are we trying to establish here? -- Don

dons:
john:
On Wed, Oct 19, 2005 at 10:07:37AM +1000, Donald Bruce Stewart wrote:
kr.angelov:
Hello Guys,
I tried my own version of PackedStrings and the results are very nice. It is entirely based on ByteArray# and Int#. I have made two tests:
Elapsed time | | FastPackedString | PackedString | +-----+------------------+--------------+ |test1| 99.26s | 3.81s | |test2| 175.88s | 5.28s |
Maximum Memory Residency | | FastPackedString | PackedString | +-----+------------------+--------------+ |test1| 40.60Mb | 36.25Mb | |test2| 91.58Mb | 33.94Mb |
Wow. Now this is really surprising.
Firstly, I would point out that only testing pack and concat may be slightly unrepresentative :)
However, on my machine:
OpenBSD/Pentium-M 1.6G/ghc-6.5 -O Elapsed time: FPS Simon's PackedString Krasimir's test1 1.966s (40M) 2.151s (36M) 2.235s (36M) test2 6.048s (24M) 3.160s (73M) 2.318s (39M)
Which is basically what I expected. Though perhaps I need to improve concat (we currently do things a little strangely in concat, due to the darcs legacy), but pack itself is nice and fast.
Linux/Pentium 4 3.6G/ghc-6.4.1 -O test1 35.37s 30.97s 2.180s test2 90.93s 60.55s 1.916s
Ah!! So what's going on on Linux, I wonder. Could it be something about 6.4.1? Are we seeing the difference between ForeignPtrs from 6.4 to 6.5? I will investigate.
I'd be very wary of switching entirely to non-portable ghc primop-based code, as FPS already run ons hugs and I think nhc.
can we add Data.PackedString and my PackedString (in the jhc repo) to the testing lineup?
actually, is the test code available somewhere?
Ok, so we have: FPS is at http://www.cse.unsw.edu.au/~dons/fps.html SimonM's code I've posted at: http://www.cse.unsw.edu.au/~dons/packedstring.tar.gz Data.PackedString in the base hier libs Krasimir's primop code posted online. John's code in the jhc repo (where?)
And also, potentially, is the FastString.hs code in ghc's utils/ dir.
I'm not sure if just testing pack and concat are very useful though. Pack, at least, is rarely used in the way we're testing it -- generally you avoid having Strings in the first place.
There's some other tests in Simon's code, and a full regress suite in FPS, using some large data sets.
What are we trying to establish here?
(As I'd be willing to set up a nice benchmark script for all this code) -- Don

Hello John, Wednesday, October 19, 2005, 4:23:00 AM, you wrote: JM> can we add Data.PackedString and my PackedString (in the jhc repo) to JM> the testing lineup? JM> actually, is the test code available somewhere? i think, that larger testsuite for string-implementation libraries need to be established. this testsuite must, of course, include testing of individual operations and more complex scenarios which can be seen as typical for string usage in real programs it will be also interesting to add [Char] as possible string implementation in these tests - to see what improvements can we get over standard strings implementation i also need to say that lack of large string-processing library and very non-efficient default representation of strings sufficiently beats, imho, haskell/ghc as "real-world", "imperative", and "scripting" language. for example, my own program, which scans hard drive and saves all found filenames in memory, need 3-4 times more memory than analogous programs written in C++. another time, when i tried to write bencmarking script, i found that half of my code was just realization of string-processing functions, such as trim_spaces and replace inlcuding large and efficient string-processing library in GHC (and then Haskell) will be very helpful for such areas of programming -- Best regards, Bulat mailto:bulatz@HotPOP.com

bulatz:
Hello John,
Wednesday, October 19, 2005, 4:23:00 AM, you wrote:
JM> can we add Data.PackedString and my PackedString (in the jhc repo) to JM> the testing lineup?
JM> actually, is the test code available somewhere?
i think, that larger testsuite for string-implementation libraries need to be established. this testsuite must, of course, include testing of individual operations and more complex scenarios which can be seen as typical for string usage in real programs
it will be also interesting to add [Char] as possible string implementation in these tests - to see what improvements can we get over standard strings implementation
I'll start on this -- it's a good idea! :) -- Don

Hello Donald, Thursday, October 20, 2005, 5:00:15 AM, you wrote:
i think, that larger testsuite for string-implementation libraries need to be established. this testsuite must, of course, include testing of individual operations and more complex scenarios which can be seen as typical for string usage in real programs
DBS> I'll start on this -- it's a good idea! :) spellcheck.hs from http://www.cse.unsw.edu.au/~dons/packedstring.tar.gz is a very good candidate -- Best regards, Bulat mailto:bulatz@HotPOP.com

bulatz:
Hello Donald,
Thursday, October 20, 2005, 5:00:15 AM, you wrote:
i think, that larger testsuite for string-implementation libraries need to be established. this testsuite must, of course, include testing of individual operations and more complex scenarios which can be seen as typical for string usage in real programs
DBS> I'll start on this -- it's a good idea! :)
spellcheck.hs from http://www.cse.unsw.edu.au/~dons/packedstring.tar.gz is a very good candidate
Yep, I'll start with that, and the coverage HUnit coverage tests in FPS.

2005/10/19, Donald Bruce Stewart
kr.angelov:
Hello Guys,
I tried my own version of PackedStrings and the results are very nice. It is entirely based on ByteArray# and Int#. I have made two tests:
Elapsed time | | FastPackedString | PackedString | +-----+------------------+--------------+ |test1| 99.26s | 3.81s | |test2| 175.88s | 5.28s |
Maximum Memory Residency | | FastPackedString | PackedString | +-----+------------------+--------------+ |test1| 40.60Mb | 36.25Mb | |test2| 91.58Mb | 33.94Mb |
Wow. Now this is really surprising.
Firstly, I would point out that only testing pack and concat may be slightly unrepresentative :)
Of course the representative benchmark needs more test cases.
However, on my machine:
OpenBSD/Pentium-M 1.6G/ghc-6.5 -O Elapsed time: FPS Simon's PackedString Krasimir's test1 1.966s (40M) 2.151s (36M) 2.235s (36M) test2 6.048s (24M) 3.160s (73M) 2.318s (39M)
Which is basically what I expected. Though perhaps I need to improve concat (we currently do things a little strangely in concat, due to the darcs legacy), but pack itself is nice and fast.
Linux/Pentium 4 3.6G/ghc-6.4.1 -O test1 35.37s 30.97s 2.180s test2 90.93s 60.55s 1.916s
Ah!! So what's going on on Linux, I wonder. Could it be something about 6.4.1? Are we seeing the difference between ForeignPtrs from 6.4 to 6.5? I will investigate.
I wonder why the test results are so different. I made my benchmark on WinXP on Celeron 3GHz with GHC from CVS HEAD. Cheers, Krasimir

kr.angelov:
Hello Guys,
I tried my own version of PackedStrings and the results are very nice. It is entirely based on ByteArray# and Int#. I have made two tests:
As an aside, Yi used to use ByteArray#'s for buffers, I then switched to ForeignPtrs (a la FastPackedString) -- and the code got faster. -- Don
participants (5)
-
Bulat Ziganshin
-
dons@cse.unsw.edu.au
-
John Meacham
-
Krasimir Angelov
-
Simon Marlow