
So I wanted to find the first index in a vector whose running sum is greater than a given number. The straightforward way is to create the running sum and then search: Vector.findIndex (>=target) (Vector.scanl' (+) 0 vector) But vectors are strict so it could do extra work, and what if I don't want to generate garbage? I could do it with a fold, but it would have to have the ability to abort early. Of course I could write such a fold myself using indexing: import qualified Data.Vector.Generic as Vector fold_abort :: (Vector.Vector v a) => (accum -> a -> Maybe accum) -> accum -> v a -> accum fold_abort f accum vec = go 0 accum where go i accum = maybe accum (go (i+1)) $ f accum =<< vec Vector.!? i find_before :: (Vector.Vector v a, Num a, Ord a) => a -> v a -> Int find_before n = fst . fold_abort go (0, 0) where go (i, total) a | total + a >= n = Nothing | otherwise = Just (i+1, total+a) So it's bigger and clunkier, but I would think it would be much more efficient (provided using Data.Vector.Generic won't inhibit inlining and unboxing). But I'm a bit surprised there isn't already something like fold_abort... or is there?

Have you already verified that stream fusion won't just do this for you?
On May 23, 2012 12:35 AM, "Evan Laforge"
So I wanted to find the first index in a vector whose running sum is greater than a given number.
The straightforward way is to create the running sum and then search:
Vector.findIndex (>=target) (Vector.scanl' (+) 0 vector)
But vectors are strict so it could do extra work, and what if I don't want to generate garbage? I could do it with a fold, but it would have to have the ability to abort early. Of course I could write such a fold myself using indexing:
import qualified Data.Vector.Generic as Vector
fold_abort :: (Vector.Vector v a) => (accum -> a -> Maybe accum) -> accum -> v a -> accum fold_abort f accum vec = go 0 accum where go i accum = maybe accum (go (i+1)) $ f accum =<< vec Vector.!? i
find_before :: (Vector.Vector v a, Num a, Ord a) => a -> v a -> Int find_before n = fst . fold_abort go (0, 0) where go (i, total) a | total + a >= n = Nothing | otherwise = Just (i+1, total+a)
So it's bigger and clunkier, but I would think it would be much more efficient (provided using Data.Vector.Generic won't inhibit inlining and unboxing). But I'm a bit surprised there isn't already something like fold_abort... or is there?
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Good question.. I copied both to a file and tried ghc-core, but it
inlines big chunks of Data.Vector and I can't read it very well, but
it looks like the answer is no, it still builds the the list of sums.
I guess the next step is to benchmark and see how busy the gc is on
each version.
But my impression was that stream fusion can't handle early aborts,
which was why I was wondering why Vector lacks a foldAbort type
function.
On Wed, May 23, 2012 at 5:13 AM, Jake McArthur
Have you already verified that stream fusion won't just do this for you?
On May 23, 2012 12:35 AM, "Evan Laforge"
wrote: So I wanted to find the first index in a vector whose running sum is greater than a given number.
The straightforward way is to create the running sum and then search:
Vector.findIndex (>=target) (Vector.scanl' (+) 0 vector)
But vectors are strict so it could do extra work, and what if I don't want to generate garbage? I could do it with a fold, but it would have to have the ability to abort early. Of course I could write such a fold myself using indexing:
import qualified Data.Vector.Generic as Vector
fold_abort :: (Vector.Vector v a) => (accum -> a -> Maybe accum) -> accum -> v a -> accum fold_abort f accum vec = go 0 accum where go i accum = maybe accum (go (i+1)) $ f accum =<< vec Vector.!? i
find_before :: (Vector.Vector v a, Num a, Ord a) => a -> v a -> Int find_before n = fst . fold_abort go (0, 0) where go (i, total) a | total + a >= n = Nothing | otherwise = Just (i+1, total+a)
So it's bigger and clunkier, but I would think it would be much more efficient (provided using Data.Vector.Generic won't inhibit inlining and unboxing). But I'm a bit surprised there isn't already something like fold_abort... or is there?
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On 29/05/2012, at 19:49, Evan Laforge wrote:
Good question.. I copied both to a file and tried ghc-core, but it inlines big chunks of Data.Vector and I can't read it very well, but it looks like the answer is no, it still builds the the list of sums. I guess the next step is to benchmark and see how busy the gc is on each version.
Vector should definitely fuse this, if it doesn't it's a bug. Please report if it doesn't for you. To verify, just count the number of letrecs in the optimised Core. You'll see one letrec if it has been fused and two if it hasn't.
But my impression was that stream fusion can't handle early aborts, which was why I was wondering why Vector lacks a foldAbort type function.
Stream fusion easily handles early aborts. There isn't anything like foldAbort precisely because it can be built out of existing operations at no extra cost. Roman
On Wed, May 23, 2012 at 5:13 AM, Jake McArthur
wrote: Have you already verified that stream fusion won't just do this for you?
On May 23, 2012 12:35 AM, "Evan Laforge"
wrote: So I wanted to find the first index in a vector whose running sum is greater than a given number.
The straightforward way is to create the running sum and then search:
Vector.findIndex (>=target) (Vector.scanl' (+) 0 vector)
But vectors are strict so it could do extra work, and what if I don't want to generate garbage? I could do it with a fold, but it would have to have the ability to abort early. Of course I could write such a fold myself using indexing:
import qualified Data.Vector.Generic as Vector
fold_abort :: (Vector.Vector v a) => (accum -> a -> Maybe accum) -> accum -> v a -> accum fold_abort f accum vec = go 0 accum where go i accum = maybe accum (go (i+1)) $ f accum =<< vec Vector.!? i
find_before :: (Vector.Vector v a, Num a, Ord a) => a -> v a -> Int find_before n = fst . fold_abort go (0, 0) where go (i, total) a | total + a >= n = Nothing | otherwise = Just (i+1, total+a)
So it's bigger and clunkier, but I would think it would be much more efficient (provided using Data.Vector.Generic won't inhibit inlining and unboxing). But I'm a bit surprised there isn't already something like fold_abort... or is there?
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Tue, May 29, 2012 at 12:52 PM, Roman Leshchinskiy
On 29/05/2012, at 19:49, Evan Laforge wrote:
Good question.. I copied both to a file and tried ghc-core, but it inlines big chunks of Data.Vector and I can't read it very well, but it looks like the answer is no, it still builds the the list of sums. I guess the next step is to benchmark and see how busy the gc is on each version.
Vector should definitely fuse this, if it doesn't it's a bug. Please report if it doesn't for you. To verify, just count the number of letrecs in the optimised Core. You'll see one letrec if it has been fused and two if it hasn't.
I see two letrecs in find_before2, but both of them are on findIndex.
I only have one findIndex so I'm not sure what's going on. The first
one calls the second, but there's an boxed Either argument in there,
which must be coming out of vector internals.
I had to stick NOINLINE on the functions so I could find them in the
core. I don't think this should affect the optimization of the
contents, though.
The fold_abort version is shorter and simpler, only has one letrec
that takes all unboxed arguments, and I think I can more or less
follow what it's doing.
Of course that doesn't mean it's any faster, I could be just
misreading the core. I could do a bug report, but maybe someone else
should look at the core first to make sure I'm not just confused? I
appended the file below, just run ghc-core and search for find_before.
On Tue, May 29, 2012 at 12:54 PM, Duncan Coutts
Note that foldr allows early abort so that's fine. Also, there's no fundamental restriction due to stream fusion. Stream fusion can be used for lazy lists afterall and can implement Data.List.foldr just fine.
But can foldr do a sum running from left to right? I thought you had to be left-biased for that. And as for early abort with foldr, I can think of how to do so if I'm generating lazy data with a right-biased constructor like (:), but how could you do that for, say, a sum? The obvious version, 'foldr (\x v -> if v > 10 then v else v + x) 0' will still run the function on every element. I suppose if fusion works its magic then early abort with foldl or scanl should happen. If the generating loop gets fused with the consuming loop, and the consuming loop only consumes part of the input, as it would with findIndex. import qualified Data.Vector.Unboxed as Unboxed -- | Find the index of the last value whose running sum is still below the -- given number. {-# NOINLINE find_before #-} find_before :: Int -> Unboxed.Vector Int -> Int find_before n = fst . fold_abort go (0, 0) where go (i, total) a | total + a <= n = Just (i+1, total+a) | otherwise = Nothing fold_abort :: (Unboxed.Unbox a) => (accum -> a -> Maybe accum) -> accum -> Unboxed.Vector a -> accum fold_abort f accum vec = go 0 accum where go i accum = maybe accum (go (i+1)) $ f accum =<< vec Unboxed.!? i {-# NOINLINE find_before2 #-} find_before2 :: Int -> Unboxed.Vector Int -> Int find_before2 n vec = case Unboxed.findIndex (>n) sums of Just i -> max 0 (i-1) Nothing -> 0 where sums = Unboxed.scanl' (+) 0 vec main :: IO () main = do print (t0 find_before) print (t0 find_before2) t0 :: (Int -> Unboxed.Vector Int -> Int) -> [Int] t0 f = [f n (Unboxed.fromList [2, 2, 2, 2]) | n <- [0..6]]

On 11/06/2012, at 18:52, Evan Laforge wrote:
On Tue, May 29, 2012 at 12:52 PM, Roman Leshchinskiy
wrote: Vector should definitely fuse this, if it doesn't it's a bug. Please report if it doesn't for you. To verify, just count the number of letrecs in the optimised Core. You'll see one letrec if it has been fused and two if it hasn't.
I see two letrecs in find_before2, but both of them are on findIndex. I only have one findIndex so I'm not sure what's going on. The first one calls the second, but there's an boxed Either argument in there, which must be coming out of vector internals.
Hmm, which version of GHC and what compiler flags are you using? I'm not familiar with ghc-core, maybe that's doing something wrong. Just run ghc -O2 -ddump-simpl and look at the output. Below is the code I'm getting for find_before2 with 7.4.2. As you can see, everything has been fused (although I notice that GHC isn't pushing x_a11p and y1_a124 into the branches for some reason, looks like a new regression but not a particularly bad one and nothing to do with fusion). find_before2_rkk :: Int -> Vector Int -> Int [GblId, Arity=2, Caf=NoCafRefs, Str=DmdType LU(LLL)m] find_before2_rkk = \ (n_arE :: Int) (vec_arF :: Vector Int) -> case vec_arF `cast` ... of _ { Vector ipv_s2Jf ipv1_s2Jg ipv2_s2Jh -> case n_arE of _ { I# y_a11t -> case ># 0 y_a11t of _ { False -> letrec { $sfindIndex_loop_s2Qz [Occ=LoopBreaker] :: Int# -> Int# -> Int# -> Id (Maybe Int) [LclId, Arity=3, Str=DmdType LLL] $sfindIndex_loop_s2Qz = \ (sc_s2Q8 :: Int#) (sc1_s2Q9 :: Int#) (sc2_s2Qa :: Int#) -> case >=# sc_s2Q8 ipv1_s2Jg of _ { False -> case indexIntArray# ipv2_s2Jh (+# ipv_s2Jf sc_s2Q8) of wild_a2JM { __DEFAULT -> let { x_a11p [Dmd=Just L] :: Int# [LclId, Str=DmdType] x_a11p = +# sc1_s2Q9 wild_a2JM } in case ># x_a11p y_a11t of _ { False -> $sfindIndex_loop_s2Qz (+# sc_s2Q8 1) x_a11p (+# sc2_s2Qa 1); True -> (Just @ Int (I# sc2_s2Qa)) `cast` ... } }; True -> (Nothing @ Int) `cast` ... }; } in case ($sfindIndex_loop_s2Qz 0 0 1) `cast` ... of _ { Nothing -> lvl_r2QO; Just i_arH -> case i_arH of _ { I# x_a11Q -> let { y1_a124 [Dmd=Just L] :: Int# [LclId, Str=DmdType] y1_a124 = -# x_a11Q 1 } in case <=# 0 y1_a124 of _ { False -> lvl_r2QO; True -> I# y1_a124 } } }; True -> lvl_r2QO } } } Roman

On Mon, Jun 11, 2012 at 1:29 PM, Roman Leshchinskiy
Hmm, which version of GHC and what compiler flags are you using? I'm not familiar with ghc-core, maybe that's doing something wrong. Just run ghc -O2 -ddump-simpl and look at the output. Below is the code I'm getting for find_before2 with 7.4.2. As you can see, everything has been fused (although I notice that GHC isn't pushing x_a11p and y1_a124 into the branches for some reason, looks like a new regression but not a particularly bad one and nothing to do with fusion).
I'm using 7.0.3, but I tried with 7.4.2 and it looks like what you got, with only one letrec. So it probably has to do with new optimizations introduced since 7.0. Thanks for the help, and I will use this "count the letrecs" technique in the future if I have questions.

On 29 May 2012 11:49, Evan Laforge
Good question.. I copied both to a file and tried ghc-core, but it inlines big chunks of Data.Vector and I can't read it very well, but it looks like the answer is no, it still builds the the list of sums. I guess the next step is to benchmark and see how busy the gc is on each version.
But my impression was that stream fusion can't handle early aborts, which was why I was wondering why Vector lacks a foldAbort type function.
Note that foldr allows early abort so that's fine. Also, there's no fundamental restriction due to stream fusion. Stream fusion can be used for lazy lists afterall and can implement Data.List.foldr just fine. Duncan
participants (4)
-
Duncan Coutts
-
Evan Laforge
-
Jake McArthur
-
Roman Leshchinskiy