fast Array operations: foldl, drop

newer
Peg solitaire in Haskell - anyone...

older
ANN: Teach Yourself Gtk2Hs in 21...

Henning Thielemann

29 Nov 2007 29 Nov '07

9:46 a.m.

I thought operations like "foldl'" and "drop" must be very fast on arrays (especially UArray) with appropriate pointer tricks, I mean pointer incrementing instead of indexing for "foldl'" and a pointer into the array for "drop". Is it planned to add such functions? Ok, if "foldl f x . elems" and "listArray (i,sufficientlyBig) . drop n . elems" are fused to high speed code, then these functions do not need to materialize in the API.

Show replies by date

Jules Bean

29 Nov 29 Nov

11:29 a.m.

Henning Thielemann wrote:

...

I thought operations like "foldl'" and "drop" must be very fast on arrays (especially UArray) with appropriate pointer tricks, I mean pointer incrementing instead of indexing for "foldl'" and a pointer into the array for "drop". Is it planned to add such functions? Ok, if "foldl f x . elems" and "listArray (i,sufficientlyBig) . drop n . elems" are fused to high speed code, then these functions do not need to materialize in the API. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

As far as I'm aware, our arrays don't support any kind of zero-copy slicing, which is what your 'drop' trick amounts to. Zero-copy slicing would seem like an obviously nice thing to have for IArrays, though, I agree. For various different kinds of slices including projections. ByteString supports zero-copy substring, but it's based on ForeignPtr not UArray. On the fusing point I believe that the stream fusers believe they can make things like foldl . elems fuse but I'm not sure. Jules

Bulat Ziganshin

3:51 p.m.

New subject: Re[2]: fast Array operations: foldl, drop

Hello Jules, Thursday, November 29, 2007, 2:29:08 PM, you wrote:

...

...
I thought operations like "foldl'" and "drop" must be very fast on arrays (especially UArray) with appropriate pointer tricks, I mean pointer

...

As far as I'm aware, our arrays don't support any kind of zero-copy slicing, which is what your 'drop' trick amounts to.

of course - they don't contain "first index" element, so entire ByteArray# should be used for UArray -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Andrew Coppin

8:54 p.m.

Jules Bean wrote:

...

As far as I'm aware, our arrays don't support any kind of zero-copy slicing, which is what your 'drop' trick amounts to.

Zero-copy slicing would seem like an obviously nice thing to have for IArrays, though, I agree. For various different kinds of slices including projections.

ByteString supports zero-copy substring, but it's based on ForeignPtr not UArray.

This is one of the many things I am often tempted to try to implement myself... In general, the IArray and MArray interfaces seem to be lacking a huge number of functions which logically *should* be quite possible on an array, they just aren't implemented. Maybe one day I'll sit down and make a serious attempt to catelogue and implement them all...

Bryan O'Sullivan

6:27 p.m.

Henning Thielemann wrote:

...

I thought operations like "foldl'" and "drop" must be very fast on arrays (especially UArray) with appropriate pointer tricks,

These kinds of functions are only much use on one-dimensional arrays, which look sufficiently list-like that the ideas translate fairly cleanly. For higher dimensions, there are enough options in terms of traversal direction and what exactly e.g. a fold should fold over (single elements? lower-dimensional slices?) that a sensible API doesn't exactly leap out.

Henning Thielemann

9:06 p.m.

On Thu, 29 Nov 2007, Bryan O'Sullivan wrote:

...

Henning Thielemann wrote:

...
I thought operations like "foldl'" and "drop" must be very fast on arrays (especially UArray) with appropriate pointer tricks,

These kinds of functions are only much use on one-dimensional arrays, which look sufficiently list-like that the ideas translate fairly cleanly. For higher dimensions, there are enough options in terms of traversal direction and what exactly e.g. a fold should fold over (single elements? lower-dimensional slices?) that a sensible API doesn't exactly leap out.

That's unfortunately true. However, for my application (signal processing) I need mostly one-dimensional arrays and they should support fast "foldl'", slice, "build".

Ketil Malde

30 Nov 30 Nov

7:47 a.m.

Bryan O'Sullivan writes:

...

For higher dimensions, there are enough options in terms of traversal direction and what exactly e.g. a fold should fold over (single elements? lower-dimensional slices?) that a sensible API doesn't exactly leap out.

How about a 'reduce' instead of 'foldl1'? I think that if you require a commutative operator, the order doesn't matter (except for efficiency and possible rounding issues, I guess). -k -- If I haven't seen further, it is by standing in the footprints of giants

Henning Thielemann

11:24 a.m.

On Fri, 30 Nov 2007, Ketil Malde wrote:

...

Bryan O'Sullivan writes:

...
For higher dimensions, there are enough options in terms of traversal direction and what exactly e.g. a fold should fold over (single elements? lower-dimensional slices?) that a sensible API doesn't exactly leap out.

How about a 'reduce' instead of 'foldl1'? I think that if you require a commutative operator, the order doesn't matter (except for efficiency and possible rounding issues, I guess).

For what I have in mind the order of execution matters. I also think now that slices for higher dimensional arrays are useful, anyway. If you choose a subrange of indices in the most significant dimension this would be possible without copying. It would be also possible to 'reshape' (in MatLab terms) an array without copying, as long as the number elements remain the same. So you could first transform an array of arbitrary dimension to a two-dimensional one, say reshape :: (j,j) -> Array i a -> Array j a specialised to reshape :: ((i,(j,k)), (i,(j,k))) -> Array (i,j,k) a -> Array (i,(j,k)) a then you could slice with respect to the first (most significant) dimension. slice :: (i,i) -> Array (i,j) a -> Array (i,j) a {- | slice with respect to the first dimension and start indices of the slice with number of type 'k' -} sliceRemap :: (i,i) -> k -> Array (i,j) a -> Array (k,j) a

Don Stewart

1 Dec 1 Dec

3:56 a.m.

lemming:

...

On Fri, 30 Nov 2007, Ketil Malde wrote:

...
Bryan O'Sullivan writes:

...
For higher dimensions, there are enough options in terms of traversal direction and what exactly e.g. a fold should fold over (single elements? lower-dimensional slices?) that a sensible API doesn't exactly leap out.

How about a 'reduce' instead of 'foldl1'? I think that if you require a commutative operator, the order doesn't matter (except for efficiency and possible rounding issues, I guess).

For what I have in mind the order of execution matters.

I also think now that slices for higher dimensional arrays are useful, anyway. If you choose a subrange of indices in the most significant dimension this would be possible without copying. It would be also possible to 'reshape' (in MatLab terms) an array without copying, as long as the number elements remain the same. So you could first transform an array of arbitrary dimension to a two-dimensional one, say

I forgot to mention this early, but possibly you could use the ndp array library. There are some people using its UArr type for (non parallel) strict arrays, that support map/fold/zip et al. http://darcs.haskell.org/packages/ndp/ This blog post recently, http://sequence.complete.org/node/371 shows at least one non-developer is using it :) Roman, what do you think -- are the unlifted, non-parallel arrays usably `beta'? -- Don

Roman Leshchinskiy

2 Dec 2 Dec

9:33 p.m.

Don Stewart wrote:

...

I forgot to mention this early, but possibly you could use the ndp array library. There are some people using its UArr type for (non parallel) strict arrays, that support map/fold/zip et al.

http://darcs.haskell.org/packages/ndp/

This blog post recently,

http://sequence.complete.org/node/371

shows at least one non-developer is using it :)

Roman, what do you think -- are the unlifted, non-parallel arrays usably `beta'?

Yes, they definitely are. UArr uses stream fusion, BTW, so things should fuse sometimes. On a more general note, we plan to put a generally usable UArr-like layer underneath the current standard array types eventually. This would provide all interesting operations as well as fusion. It might take a while (i.e., months) until I get around to doing it, though. Roman

Don Stewart

9:37 p.m.

rl:

...

Don Stewart wrote:

...
I forgot to mention this early, but possibly you could use the ndp array library. There are some people using its UArr type for (non parallel) strict arrays, that support map/fold/zip et al.

http://darcs.haskell.org/packages/ndp/

This blog post recently,

http://sequence.complete.org/node/371

shows at least one non-developer is using it :)

Roman, what do you think -- are the unlifted, non-parallel arrays usably `beta'?

Yes, they definitely are. UArr uses stream fusion, BTW, so things should fuse sometimes.

On a more general note, we plan to put a generally usable UArr-like layer underneath the current standard array types eventually. This would provide all interesting operations as well as fusion. It might take a while (i.e., months) until I get around to doing it, though.

Would it be possible to separate out the UArr modules from the larger ndp project (which needs ghc head and type families). I.e. just one smaller package for fast, fusable arrays we can work on? -- Don

Roman Leshchinskiy

9:46 p.m.

Don Stewart wrote:

...

rl:

...
Don Stewart wrote:

...
I forgot to mention this early, but possibly you could use the ndp array library. There are some people using its UArr type for (non parallel) strict arrays, that support map/fold/zip et al.

http://darcs.haskell.org/packages/ndp/

This blog post recently,

http://sequence.complete.org/node/371

shows at least one non-developer is using it :)

Roman, what do you think -- are the unlifted, non-parallel arrays usably `beta'? Yes, they definitely are. UArr uses stream fusion, BTW, so things should fuse sometimes.

On a more general note, we plan to put a generally usable UArr-like layer underneath the current standard array types eventually. This would provide all interesting operations as well as fusion. It might take a while (i.e., months) until I get around to doing it, though.

Would it be possible to separate out the UArr modules from the larger ndp project (which needs ghc head and type families). I.e. just one smaller package for fast, fusable arrays we can work on?

UArr is a type family, too! That said, it would definitely be possible to put it into a separate package. The interface needs some work, though, since it is very "backendish" at the moment. The problem is that I don't really have time for this at the moment. If anyone volunteers to do this I'll help, of course. Roman

6422

Age (days ago)

6425

Last active (days ago)

List overview

Download

11 comments

8 participants

participants (8)

Andrew Coppin
Bryan O'Sullivan
Bulat Ziganshin
Don Stewart
Henning Thielemann
Jules Bean
Ketil Malde
Roman Leshchinskiy