
Somebody asked me, so now I'm asking you... In Haskell, you can make "unboxed" arrays of certain value types. These are typically more efficient in space, and probably time too, and also make the array strict in its values. However, you can only do this magic trick for certain types - not for *all* types. Why is that? Is it because nobody has done anything about it yet? Is it because it's thought to be "not necessary" in some way? Is there some theoretical problem? Has somebody got a better idea? I did think it was along the lines of "oh, well, if you want to unbox a type of your own, you just need to write your own instance". The thing that makes me suspicious of this logic is the absense of an instance for tuples. Surely this would be trivial to write, and yet it's not present. If we had instances for a couple of sizes of tuples, it would surely be quite easy to write your own custom instances that just sit on top of this and tuple/untuple your custom values... Any insights here?

Hello Andrew, Wednesday, March 26, 2008, 3:37:53 PM, you wrote:
type of your own, you just need to write your own instance". The thing that makes me suspicious of this logic is the absense of an instance for tuples.
Any insights here?
and even insiders :) i've rewrote arrays library to be more uniform using ideas of Simon Marlow and Oleg Kiselyov unboxed arrays implementation uses GHC primitives that fetches/stores array element by its index. these primitives implemented for simple types - from Word8 to Double but nothing more. they use *indexes*, not *byte offsets* which means that you can use them for addressing array of {Word8,Double}, for example i believe that it should be possible to rewrite array machinery using storable class (now it should not have any overheads compared to these primitives). meanwhile i recommend you to use storable array together with Storable instances for tuples -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

* Andrew Coppin
Somebody asked me, so now I'm asking you...
In Haskell, you can make "unboxed" arrays of certain value types. These are typically more efficient in space, and probably time too, and also make the array strict in its values. However, you can only do this magic trick for certain types - not for *all* types.
Why is that? Is it because nobody has done anything about it yet? Is it because it's thought to be "not necessary" in some way? Is there some theoretical problem? Has somebody got a better idea?
I did think it was along the lines of "oh, well, if you want to unbox a type of your own, you just need to write your own instance". The thing that makes me suspicious of this logic is the absense of an instance for tuples. Surely this would be trivial to write, and yet it's not present. If we had instances for a couple of sizes of tuples, it would surely be quite easy to write your own custom instances that just sit on top of this and tuple/untuple your custom values...
Any insights here?
Could Data Parallel Haskell[1] be useful for you? It was designed for parallel computation, but it includes unboxed arrays, nice list-like syntax and array comprehensions. 1. http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell -- Roman I. Cheplyaka :: http://ro-che.info/ ...being in love is totally punk rock...

On Wed, 26 Mar 2008, Roman Cheplyaka wrote:
* Andrew Coppin
[2008-03-26 12:37:53+0000] Somebody asked me, so now I'm asking you...
In Haskell, you can make "unboxed" arrays of certain value types. These are typically more efficient in space, and probably time too, and also make the array strict in its values. However, you can only do this magic trick for certain types - not for *all* types.
Why is that? Is it because nobody has done anything about it yet? Is it because it's thought to be "not necessary" in some way? Is there some theoretical problem? Has somebody got a better idea?
I did think it was along the lines of "oh, well, if you want to unbox a type of your own, you just need to write your own instance". The thing that makes me suspicious of this logic is the absense of an instance for tuples. Surely this would be trivial to write, and yet it's not present. If we had instances for a couple of sizes of tuples, it would surely be quite easy to write your own custom instances that just sit on top of this and tuple/untuple your custom values...
Any insights here?
Could Data Parallel Haskell[1] be useful for you? It was designed for parallel computation, but it includes unboxed arrays, nice list-like syntax and array comprehensions.
1. http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell
A light-weight unboxed array variant is: http://code.haskell.org/~sjanssen/storablevector/ I thought it might be more efficient sometimes to split, say Word8 and Double data into two arrays, instead of padding data in order to align a (Word8,Double) record, but this wouldn't fit into the array data structure.

* Henning Thielemann
On Wed, 26 Mar 2008, Roman Cheplyaka wrote:
* Andrew Coppin
[2008-03-26 12:37:53+0000] Somebody asked me, so now I'm asking you...
In Haskell, you can make "unboxed" arrays of certain value types. These are typically more efficient in space, and probably time too, and also make the array strict in its values. However, you can only do this magic trick for certain types - not for *all* types.
Why is that? Is it because nobody has done anything about it yet? Is it because it's thought to be "not necessary" in some way? Is there some theoretical problem? Has somebody got a better idea?
I did think it was along the lines of "oh, well, if you want to unbox a type of your own, you just need to write your own instance". The thing that makes me suspicious of this logic is the absense of an instance for tuples. Surely this would be trivial to write, and yet it's not present. If we had instances for a couple of sizes of tuples, it would surely be quite easy to write your own custom instances that just sit on top of this and tuple/untuple your custom values...
Any insights here?
Could Data Parallel Haskell[1] be useful for you? It was designed for parallel computation, but it includes unboxed arrays, nice list-like syntax and array comprehensions.
1. http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell
A light-weight unboxed array variant is: http://code.haskell.org/~sjanssen/storablevector/
I thought it might be more efficient sometimes to split, say Word8 and Double data into two arrays, instead of padding data in order to align a (Word8,Double) record, but this wouldn't fit into the array data structure.
As far as I know, ndp (which I linked above) takes exactly this approach. -- Roman I. Cheplyaka :: http://ro-che.info/ ...being in love is totally punk rock...

On Wed 2008-03-26 14:22, Henning Thielemann wrote:
A light-weight unboxed array variant is: http://code.haskell.org/~sjanssen/storablevector/
There is also CArray which offers an immutable interface for any Storable. http://hackage.haskell.org/cgi-bin/hackage-scripts/package/carray You can also do unsafe (no-copy) freeze/thaw to IOCArray which is equivalent to StorableArray. Unfortunately there is a performance hit to using Storable versus the built in unboxed types. On the other hand, CArray is binary compatible with C which is handy if you are making foreign calls.
I thought it might be more efficient sometimes to split, say Word8 and Double data into two arrays, instead of padding data in order to align a (Word8,Double) record, but this wouldn't fit into the array data structure.
This depends on the access pattern, but if it makes you miss the cache frequently, then it is certainly not the way to go. For split storage to win, you must frequently traverse while only referencing one entry. It is not clear (to me) how to make an interface where the split storage is hidden, but the compiler can still remove all references to the unused part (without boxing which defeats the purpose). Jed

Hello Jed, Wednesday, March 26, 2008, 7:02:28 PM, you wrote:
StorableArray. Unfortunately there is a performance hit to using Storable versus the built in unboxed types.
are you sure? it was in ghc 6.4, now afair they should be the same. look in http://haskell.org/haskellwiki/Modern_array_libraries -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Wed 2008-03-26 19:50, Bulat Ziganshin wrote:
Hello Jed,
Wednesday, March 26, 2008, 7:02:28 PM, you wrote:
StorableArray. Unfortunately there is a performance hit to using Storable versus the built in unboxed types.
are you sure? it was in ghc 6.4, now afair they should be the same. look in http://haskell.org/haskellwiki/Modern_array_libraries
The comparison I'm referring to is a micro-benchmark taken from the shootout. I modified the nsieve-bits code here: http://shootout.alioth.debian.org/gp4/benchmark.php?test=nsievebits&lang=ghc&id=0 The shootout code uses STUArray, one modification uses IOCArray, another uses StorableArray (which should be equivalent to IOCArray). Here are some timing results with ghc-6.8.2 on x86_64. $ for e in nsU nsC nsS; do time ./$e 12 > /dev/null; done 1.087 real 1.087 user 0.000 sys 99.96 cpu 3.454 real 3.326 user 0.127 sys 99.98 cpu 3.448 real 3.343 user 0.103 sys 99.94 cpu With ghc-6.8.2 on x86, I get $ for e in nsU nsC nsS; do time ./$e 12 > /dev/null; done ./nsU 12 4.57 real 4.45 user 0.01 sys ./nsC 12 10.46 real 9.88 user 0.27 sys ./nsS 12 10.59 real 9.86 user 0.24 sys That is, the STUArray version (nsU) is much faster than IOCArray (nsC) and StorableArray (nsS) for this test. To see for yourself, checkout CArray darcs get http://code.haskell.org/carray build it, then see the script in tests/runtests.sh. It will build an run these benchmarks, confirming that the output is the same and providing timing information. I would love to see this discrepancy go away, but sadly, it is not there yet. Jed

Roman Cheplyaka wrote:
* Andrew Coppin
[2008-03-26 12:37:53+0000] Any insights here?
Could Data Parallel Haskell[1] be useful for you? It was designed for parallel computation, but it includes unboxed arrays, nice list-like syntax and array comprehensions.
1. http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell
Mmm, maybe. Does it implement unboxed arrays for arbitrary types? For that matter, it seems like this NDP stuff has been "in development" for a seemingly long time now. Does anybody know what the current status is? (I'm guessing the people working on this are more interested in working on it than documenting every step of their progress... which leaves outsiders with the impression that nothing is happening.)

Andrew Coppin wrote:
Roman Cheplyaka wrote:
* Andrew Coppin
[2008-03-26 12:37:53+0000] Any insights here?
Could Data Parallel Haskell[1] be useful for you? It was designed for parallel computation, but it includes unboxed arrays, nice list-like syntax and array comprehensions.
1. http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell
Mmm, maybe. Does it implement unboxed arrays for arbitrary types?
For that matter, it seems like this NDP stuff has been "in development" for a seemingly long time now. Does anybody know what the current status is?
Google -> http://research.microsoft.com/~simonpj/papers/ndp/
(I'm guessing the people working on this are more interested in working on it than documenting every step of their progress... which leaves outsiders with the impression that nothing is happening.)
I don't think the above suggests that "nothing is happening" ... -- Dr. Janis Voigtlaender http://wwwtcs.inf.tu-dresden.de/~voigt/ mailto:voigt@tcs.inf.tu-dresden.de

Janis Voigtlaender wrote:
Google -> http://research.microsoft.com/~simonpj/papers/ndp/
I don't think the above suggests that "nothing is happening" ...
The latet thing on that page is dated over a year ago.

Andrew Coppin wrote:
Janis Voigtlaender wrote:
Google -> http://research.microsoft.com/~simonpj/papers/ndp/
I don't think the above suggests that "nothing is happening" ...
The latet thing on that page is dated over a year ago.
Well, if you expect monthly updates... -- Dr. Janis Voigtlaender http://wwwtcs.inf.tu-dresden.de/~voigt/ mailto:voigt@tcs.inf.tu-dresden.de

andrewcoppin:
Janis Voigtlaender wrote:
Google -> http://research.microsoft.com/~simonpj/papers/ndp/
I don't think the above suggests that "nothing is happening" ...
The latet thing on that page is dated over a year ago.
It's very active. See: http://haskell.org/haskellwiki/GHC/Data_Parallel_Haskell and watch the commits coming in from Roman. -- Don

Don Stewart wrote:
It's very active. See:
http://haskell.org/haskellwiki/GHC/Data_Parallel_Haskell
and watch the commits coming in from Roman.
*digs around* Hmm. So in summary, stuff is happening behind the scenes, there's just not a lot of visible activity at the surface.
participants (7)
-
Andrew Coppin
-
Bulat Ziganshin
-
Don Stewart
-
Henning Thielemann
-
Janis Voigtlaender
-
Jed Brown
-
Roman Cheplyaka