Basic list exercise

Todd Wilson

17 Mar 2023 17 Mar '23

1:33 a.m.

Dear Cafe, Here's a basic exercise in list processing: define a function runs :: Ord a => [a] -> [[a]] that breaks up its input list into a list of increasing "runs"; e.g., runs [3,4,5,6,2,3,4,1,2,1] ---> [[3,4,5,6],[2,3,4],[1,2],[1]] A natural solution is the following: runs [] = [] runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs where run x [] = ([], []) run x (y:ys) = if x <= y then let (us, vs) = run y ys in (y:us, vs) else ([], y:ys) My question: can we do better than this? It seems that this solution is constantly building and breaking apart pairs. (Or is it, when optimized?) Todd Wilson

Attachments:

attachment.html (text/html — 1.5 KB)

Show replies by date

Zemyla

17 Mar 17 Mar

1:38 a.m.

If you use a case statement instead of a let statement in the recursive case, then GHC will know the pairs are being made and immediately taken apart, and will optimize it to use unboxed pairs internally. On Thu, Mar 16, 2023, 20:33 Todd Wilson wrote:

...

Dear Cafe,

Here's a basic exercise in list processing: define a function

runs :: Ord a => [a] -> [[a]]

that breaks up its input list into a list of increasing "runs"; e.g.,

runs [3,4,5,6,2,3,4,1,2,1] ---> [[3,4,5,6],[2,3,4],[1,2],[1]]

A natural solution is the following:

runs [] = [] runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs where run x [] = ([], []) run x (y:ys) = if x <= y then let (us, vs) = run y ys in (y:us, vs) else ([], y:ys)

My question: can we do better than this? It seems that this solution is constantly building and breaking apart pairs. (Or is it, when optimized?)

Todd Wilson _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

Todd Wilson

2:58 a.m.

By "using unboxed pairs", do you mean that Haskell optimizes this so that it is equivalent somehow to the following Prolog version of my program? runs([], []). runs([X|Xs], [[X|Ys]|Rs]) :- run(X, Xs, Ys, Zs), runs(Zs, Rs). run(_, [], [], []) :- !. run(X, [Y|Ys], [Y|Us], Vs) :- X =< Y, !, run(Y, Ys, Us, Vs). run(_, Ys, [], Ys). Here, it is clear that, in the second clause for `runs`, computation is happening on two fronts -- `Ys` and `Rs` -- and we can build the two conses in the return value before we do the calls that fill in the missing parts, so this ends up being a tail recursion. When the computation that produces `Ys` finishes, the computation that produces `Rs` can resume. Maybe this can best be explained functionally using continuations? Todd Wilson On Thu, Mar 16, 2023 at 6:38 PM Zemyla wrote:

...

If you use a case statement instead of a let statement in the recursive case, then GHC will know the pairs are being made and immediately taken apart, and will optimize it to use unboxed pairs internally.

On Thu, Mar 16, 2023, 20:33 Todd Wilson wrote:

...
Dear Cafe,

Here's a basic exercise in list processing: define a function

runs :: Ord a => [a] -> [[a]]

that breaks up its input list into a list of increasing "runs"; e.g.,

runs [3,4,5,6,2,3,4,1,2,1] ---> [[3,4,5,6],[2,3,4],[1,2],[1]]

A natural solution is the following:

runs [] = [] runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs where run x [] = ([], []) run x (y:ys) = if x <= y then let (us, vs) = run y ys in (y:us, vs) else ([], y:ys)

My question: can we do better than this? It seems that this solution is constantly building and breaking apart pairs. (Or is it, when optimized?)

Todd Wilson _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

Jeff Clites

3:58 a.m.

...

It seems that this solution is constantly building and breaking apart pairs.

At first glance it seems fine. In order to pull a sublist off the front of a list you’ll need to build a new list for that part, so the disassemble/reassemble is necessary there. You can use an “as pattern” to avoid re-creating `y:ys` in your else clause, but that’s somewhat minor. I don’t see anywhere else where you are pulling something apart and then recreating the same thing. Regarding the other response, an unboxed pair is just an optimization whereby a pair of values can be returned from a function without actually allocating heap storage, but it’s just reducing memory allocation, nothing conceptually more fancy. Jeff

...

On Mar 16, 2023, at 6:34 PM, Todd Wilson wrote:

Dear Cafe,

Here's a basic exercise in list processing: define a function runs :: Ord a => [a] -> [[a]] that breaks up its input list into a list of increasing "runs"; e.g., runs [3,4,5,6,2,3,4,1,2,1] ---> [[3,4,5,6],[2,3,4],[1,2],[1]] A natural solution is the following: runs [] = [] runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs where run x [] = ([], []) run x (y:ys) = if x <= y then let (us, vs) = run y ys in (y:us, vs) else ([], y:ys) My question: can we do better than this? It seems that this solution is constantly building and breaking apart pairs. (Or is it, when optimized?)

Todd Wilson _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

Viktor Dukhovni

4:09 a.m.

On Thu, Mar 16, 2023 at 06:33:27PM -0700, Todd Wilson wrote:

...

Here's a basic exercise in list processing: define a function

runs :: Ord a => [a] -> [[a]]

that breaks up its input list into a list of increasing "runs"; e.g.,

runs [3,4,5,6,2,3,4,1,2,1] ---> [[3,4,5,6],[2,3,4],[1,2],[1]]

A natural solution is the following:

runs [] = [] runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs where run x [] = ([], []) run x (y:ys) = if x <= y then let (us, vs) = run y ys in (y:us, vs) else ([], y:ys)

My question: can we do better than this? It seems that this solution is constantly building and breaking apart pairs. (Or is it, when optimized?)

The key feature of this solution is that it is lazy in the tail of the list of runs. For example, the below completes quickly despite ostensibly working with an infinite list of runs. It is able to "emit" the first run as soon as a successort is smaller than its predecessor. {-# LANGUAGE ScopedTypeVariables #-} module Main (main) where runs :: forall a. Ord a => [a] -> [[a]] runs [] = [] runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs where run :: a -> [a] -> ([a], [a]) run x [] = ([], []) run x l@(y:ys) = if x <= y then let (us, vs) = run y ys in (y:us, vs) else ([], l) main :: IO () main = print $ sum $ map sum $ take 100 $ runs $ concat $ map (\i -> [0..i]) [0..] It is also able to generate the leading elements of an infinite first run: main :: IO () main = print $ sum $ take 100 $ head $ runs $ [0..] Any constant factors are less important. -- Viktor.

Todd Wilson

26 Mar 26 Mar

5:59 p.m.

I want to thank all of those who contributed to this thread. If I can summarize the contributions (feel free to clarify any misrepresentations): - The code I provided for this problem is probably optimal, except perhaps for the small point about using an @-pattern in `run` to avoid reconstructing `y:ys` in the last line -- but isn't this also something the compiler could easily discover and take advantage of without explicit direction? - The compiler will avoid breaking down and reconstructing pairs on each recursive call, even though that is how it is coded. - Although nobody commented on my Prolog version (perhaps because this is a Haskell list!), the point I made there about compiling this as a tail call that doesn't grow the stack should also apply to Haskell (does it?). One thing that's clear from the Prolog code is that it goes down the input list and distributes the elements to one of two places -- the end of the current run or the beginning of a new run -- interspersed with some cons-cell creation. Is that what's happening in Haskell also? - Reference was made to Haskell's `sort` implementation, which essentially finds runs (increasing and decreasing) in the input before repeatedly merging them in pairs until one run remains. Decreasing runs can be turned into increasing ones by using a tail call with a list accumulator, the way iterative reverse works. But increasing runs are handled in the same way that my Prolog code does, essentially by using what appears to be the Haskell equivalent of Prolog's difference-lists: functions whose arguments are the extendible ends of lists. I'm intrigued by the idea mentioned in this last bullet point. Is this functional equivalent of Prolog's difference lists -- essentially partial data structures that are grown top-down rather than bottom-up, actually efficient? Is there an overhead penalty to be paid for using functions here, or is it compiled away into Prolog-like tail calls that fill in the missing parts of the data structure? Todd Wilson On Thu, Mar 16, 2023 at 6:33 PM Todd Wilson wrote:

...

Dear Cafe,

Here's a basic exercise in list processing: define a function

runs :: Ord a => [a] -> [[a]]

that breaks up its input list into a list of increasing "runs"; e.g.,

runs [3,4,5,6,2,3,4,1,2,1] ---> [[3,4,5,6],[2,3,4],[1,2],[1]]

A natural solution is the following:

runs [] = [] runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs where run x [] = ([], []) run x (y:ys) = if x <= y then let (us, vs) = run y ys in (y:us, vs) else ([], y:ys)

My question: can we do better than this? It seems that this solution is constantly building and breaking apart pairs. (Or is it, when optimized?)

Todd Wilson

Viktor Dukhovni

7:07 p.m.

On Sun, Mar 26, 2023 at 10:59:13AM -0700, Todd Wilson wrote:

...

Is there an overhead penalty to be paid for using functions here, or is it compiled away into Prolog-like tail calls that fill in the missing parts of the data structure?

Efficient implementations of functional langauges make extensive use of tail call optimisation. I don't know whether looking at the GHC-generated "Core" for your implementation will answer any of your questions, but here it is: Haskell: runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs where run x [] = ([], []) run x (y:ys) = if x <= y then let (us, vs) = run y ys in (y:us, vs) else ([], y:ys) Core (with some added comments): -- Unpack the "Ord" dictionary to extract just the required "<=" -- function and call the "$wruns" worker ("ww3" is "<=", and "ds" -- is the list to be transformed: -- runs :: forall a. Ord a => [a] -> [[a]] runs = \ (@a) ($dOrd :: Ord a) (ds :: [a]) -> case $dOrd of { C:Ord ww ww1 ww2 ww3 ww4 ww5 ww6 ww7 -> $wruns ww3 ds } Rec { $wruns :: forall {a}. (a -> a -> Bool) -> [a] -> [[a]] $wruns = \ (@a) (ww :: a -> a -> Bool) (ds :: [a]) -> case ds of { [] -> []; -- runs [] = [] : x xs -> -- runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs let { ds1 :: ([a], [a]) -- A lazily evaluated thunk for (run x xs) ds1 = letrec { -- Internal recursion in "run" returns strict unboxed pairs -- (on the stack) avoiding heap or thunk allocation for the tuple. $wrun :: a -> [a] -> (# [a], [a] #) $wrun = \ (x1 :: a) (ds2 :: [a]) -> case ds2 of wild1 { -- (y:ys) is "wild1" [] -> (# [], [] #); -- run x [] = ([], []) : y ys -> case ww x1 y of { -- x <= y ? False -> (# [], wild1 #); -- else ([], y:ys) True -> -- then let (us, vs) = run y ys in (y:us, vs) let { ds3 :: ([a], [a]) -- A "thunk" for (run y ys) evaluated lazily ds3 = case $wrun y ys of { (# ww1, ww2 #) -> (ww1, ww2) } } in (# : y (case ds3 of { (us, vs) -> us }), case ds3 of { (us, vs) -> vs } #) } }; } in case $wrun x xs of { (# ww1, ww2 #) -> (ww1, ww2) } } in : (: x (case ds1 of { (ys, zs) -> ys })) (case ds1 of { (ys, zs) -> $wruns ww zs }) } end Rec } So for a non-empty cons-cell (: x ...) the result is a new cons cell: : (x : (fst ds1)) (runs (snd ds1)) --------- -------------- in which both underline parts are computed lazily (on demand) from the thunk "ds1": λ> head $ head $ runs $ 42 : undefined 42 When we do want the successor of the first element, we look no futher than necessary: λ> head $ runs $ 42 : 0 : undefined [42] λ> take 2 $ head $ runs $ 42 : 43 : undefined [42,43] Does this help? FWIW, the (simplified) Core output was generated via: hscore() { ghc -ddump-simpl -dsuppress-idinfo -dsuppress-coercions -dsuppress-type-applications -dsuppress-uniques -dsuppress-module-prefixes "$@" } hscore -O2 Runs.hs > Runs.core -- Viktor.

Keith

9:33 p.m.

YMMV, but based on some benchmarking I've done, the default implementation using `compareBy` seems to optimize better than trying to force dictionary unpacking. The only way I've been able to generically sort faster than base's `compareBy` is to use nonempty list structures for runs. — Sent from my phone with K-9 Mail. On 26 March 2023 19:07:03 UTC, Viktor Dukhovni wrote:

...

On Sun, Mar 26, 2023 at 10:59:13AM -0700, Todd Wilson wrote:

...
Is there an overhead penalty to be paid for using functions here, or is it compiled away into Prolog-like tail calls that fill in the missing parts of the data structure?

Efficient implementations of functional langauges make extensive use of tail call optimisation.

I don't know whether looking at the GHC-generated "Core" for your implementation will answer any of your questions, but here it is:

Haskell: runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs where run x [] = ([], []) run x (y:ys) = if x <= y then let (us, vs) = run y ys in (y:us, vs) else ([], y:ys)

Core (with some added comments): -- Unpack the "Ord" dictionary to extract just the required "<=" -- function and call the "$wruns" worker ("ww3" is "<=", and "ds" -- is the list to be transformed: -- runs :: forall a. Ord a => [a] -> [[a]] runs = \ (@a) ($dOrd :: Ord a) (ds :: [a]) -> case $dOrd of { C:Ord ww ww1 ww2 ww3 ww4 ww5 ww6 ww7 -> $wruns ww3 ds }

Rec { $wruns :: forall {a}. (a -> a -> Bool) -> [a] -> [[a]] $wruns = \ (@a) (ww :: a -> a -> Bool) (ds :: [a]) -> case ds of { [] -> []; -- runs [] = [] : x xs -> -- runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs let { ds1 :: ([a], [a]) -- A lazily evaluated thunk for (run x xs) ds1 = letrec { -- Internal recursion in "run" returns strict unboxed pairs -- (on the stack) avoiding heap or thunk allocation for the tuple. $wrun :: a -> [a] -> (# [a], [a] #) $wrun = \ (x1 :: a) (ds2 :: [a]) -> case ds2 of wild1 { -- (y:ys) is "wild1" [] -> (# [], [] #); -- run x [] = ([], []) : y ys -> case ww x1 y of { -- x <= y ? False -> (# [], wild1 #); -- else ([], y:ys) True -> -- then let (us, vs) = run y ys in (y:us, vs) let { ds3 :: ([a], [a]) -- A "thunk" for (run y ys) evaluated lazily ds3 = case $wrun y ys of { (# ww1, ww2 #) -> (ww1, ww2) } } in (# : y (case ds3 of { (us, vs) -> us }), case ds3 of { (us, vs) -> vs } #) } }; } in case $wrun x xs of { (# ww1, ww2 #) -> (ww1, ww2) } } in : (: x (case ds1 of { (ys, zs) -> ys })) (case ds1 of { (ys, zs) -> $wruns ww zs }) } end Rec }

So for a non-empty cons-cell (: x ...) the result is a new cons cell:

: (x : (fst ds1)) (runs (snd ds1)) --------- --------------

in which both underline parts are computed lazily (on demand) from the thunk "ds1":

λ> head $ head $ runs $ 42 : undefined 42

When we do want the successor of the first element, we look no futher than necessary:

λ> head $ runs $ 42 : 0 : undefined [42]

λ> take 2 $ head $ runs $ 42 : 43 : undefined [42,43]

Does this help? FWIW, the (simplified) Core output was generated via:

hscore() { ghc -ddump-simpl -dsuppress-idinfo -dsuppress-coercions -dsuppress-type-applications -dsuppress-uniques -dsuppress-module-prefixes "$@" } hscore -O2 Runs.hs > Runs.core

-- Viktor. _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

Todd Wilson

11:24 p.m.

Thanks, Viktor, for that information, especially the macro for getting Haskel core output at the end, which I will try to use in the future and perhaps avoid having to query this list to get such answers! I have a few follow-up questions on this code, however: On Sun, Mar 26, 2023 at 12:07 PM Viktor Dukhovni wrote:

...

I don't know whether looking at the GHC-generated "Core" for your implementation will answer any of your questions, but here it is:

Haskell: runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs where run x [] = ([], []) run x (y:ys) = if x <= y then let (us, vs) = run y ys in (y:us, vs) else ([], y:ys)

Core (with some added comments): -- Unpack the "Ord" dictionary to extract just the required "<=" -- function and call the "$wruns" worker ("ww3" is "<=", and "ds" -- is the list to be transformed: -- runs :: forall a. Ord a => [a] -> [[a]] runs = \ (@a) ($dOrd :: Ord a) (ds :: [a]) -> case $dOrd of { C:Ord ww ww1 ww2 ww3 ww4 ww5 ww6 ww7 -> $wruns ww3 ds }

Rec { $wruns :: forall {a}. (a -> a -> Bool) -> [a] -> [[a]] $wruns = \ (@a) (ww :: a -> a -> Bool) (ds :: [a]) -> case ds of { [] -> []; -- runs [] = [] : x xs -> -- runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs let { ds1 :: ([a], [a]) -- A lazily evaluated thunk for (run x xs) ds1 = letrec { -- Internal recursion in "run" returns strict unboxed pairs -- (on the stack) avoiding heap or thunk allocation for the tuple. $wrun :: a -> [a] -> (# [a], [a] #) $wrun = \ (x1 :: a) (ds2 :: [a]) -> case ds2 of wild1 { -- (y:ys) is "wild1" [] -> (# [], [] #); -- run x [] = ([], []) : y ys -> case ww x1 y of { -- x <= y ? False -> (# [], wild1 #); -- else ([], y:ys) True -> -- then let (us, vs) = run y ys in (y:us, vs) let { ds3 :: ([a], [a]) -- A "thunk" for (run y ys) evaluated lazily

Why doesn't ds3 have an explicitly unboxed pair type, and does that have any performance implications? For example, ...

...

ds3 = case $wrun y ys of { (# ww1, ww2 #) -> (ww1, ww2) } } in (# : y (case ds3 of { (us, vs) -> us }), case ds3 of { (us, vs) -> vs } #)

Granted I'm not that familiar with core, but It sure looks like this code breaks apart pairs (with the equivalent of fst and snd) and rebuilds them

...

} }; } in case $wrun x xs of { (# ww1, ww2 #) -> (ww1, ww2) } } in : (: x (case ds1 of { (ys, zs) -> ys })) (case ds1 of { (ys, zs) -> $wruns ww zs }) } end Rec }

So for a non-empty cons-cell (: x ...) the result is a new cons cell:

: (x : (fst ds1)) (runs (snd ds1)) --------- --------------

in which both underline parts are computed lazily (on demand) from the thunk "ds1":

λ> head $ head $ runs $ 42 : undefined 42

When we do want the successor of the first element, we look no futher than necessary:

λ> head $ runs $ 42 : 0 : undefined [42]

λ> take 2 $ head $ runs $ 42 : 43 : undefined [42,43]

Does this help?

Yes, it does, thanks, although I was aware of this aspect of the laziness of my code from the beginning and was concerned more with how the output lists were built.

...

FWIW, the (simplified) Core output was generated via: hscore() { ghc -ddump-simpl -dsuppress-idinfo -dsuppress-coercions -dsuppress-type-applications -dsuppress-uniques -dsuppress-module-prefixes "$@" } hscore -O2 Runs.hs > Runs.core

Thanks again for this, it will be helpful going forward. --Todd

Viktor Dukhovni

27 Mar 27 Mar

midnight

On Sun, Mar 26, 2023 at 04:24:09PM -0700, Todd Wilson wrote:

...

...
Core (with some added comments): -- Unpack the "Ord" dictionary to extract just the required "<=" -- function and call the "$wruns" worker ("ww3" is "<=", and "ds" -- is the list to be transformed: -- runs :: forall a. Ord a => [a] -> [[a]] runs = \ (@a) ($dOrd :: Ord a) (ds :: [a]) -> case $dOrd of { C:Ord ww ww1 ww2 ww3 ww4 ww5 ww6 ww7 -> $wruns ww3 ds }

Rec { $wruns :: forall {a}. (a -> a -> Bool) -> [a] -> [[a]] $wruns = \ (@a) (ww :: a -> a -> Bool) (ds :: [a]) -> case ds of { [] -> []; -- runs [] = [] : x xs -> -- runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs let { ds1 :: ([a], [a]) -- A lazily evaluated thunk for (run x xs) ds1 = letrec { -- Internal recursion in "run" returns strict unboxed pairs -- (on the stack) avoiding heap or thunk allocation for the tuple. $wrun :: a -> [a] -> (# [a], [a] #) $wrun = \ (x1 :: a) (ds2 :: [a]) -> case ds2 of wild1 { -- (y:ys) is "wild1" [] -> (# [], [] #); -- run x [] = ([], []) : y ys -> case ww x1 y of { -- x <= y ? False -> (# [], wild1 #); -- else ([], y:ys) True -> -- then let (us, vs) = run y ys in (y:us, vs) let { ds3 :: ([a], [a]) -- A "thunk" for (run y ys) evaluated lazily

Why doesn't ds3 have an explicitly unboxed pair type, and does that have any performance implications? For example, ...

Precisely because "ds3" must be evaluated lazily, it can't be an unboxed pair (which are always strictly evaluated).

...

...
ds3 = case $wrun y ys of { (# ww1, ww2 #) -> (ww1, ww2) } } in (# : y (case ds3 of { (us, vs) -> us }), case ds3 of { (us, vs) -> vs } #)

...

Granted I'm not that familiar with Core, but It sure looks like this code breaks apart pairs (with the equivalent of fst and snd) and rebuilds them

The inner (recursive) invocation of "run" must also be lazily evaluated, so yes, its output needs to be boxed as a pair.

...

...
When we do want the successor of the first element, we look no futher than necessary:

λ> head $ runs $ 42 : 0 : undefined [42]

λ> take 2 $ head $ runs $ 42 : 43 : undefined [42,43]

Does this help?

Yes, it does, thanks, although I was aware of this aspect of the laziness of my code from the beginning and was concerned more with how the output lists were built.

As with all lists, they are built via the (:) constructor from a head element and a tail. GHC reuses any full list or tail it can reuse, and constructs new cons cells that are not already in hand. Because these are pure functions working with immutable data, to construct an initial segment of a list we must build a new list, we can't truncate the original original in place, it is immutable. Therefore, the original list will be picked apart and reassembled. With "linear Haskell" there are in some cases opportunities to mutate certain objects in place, because they are sure to not have any other references. But that isn't the case here. So even `runs [0..10]` has to pick apart and reassemble the list. I hope I understood correctly what you're getting at with the concern about building and rebuilding. It looks to me like the Core code does exactly as much building and re-building as required by laziness, and the result can be consumed in a single pass in constant space (multi-pass use naturally memoises the result). -- Viktor.

David Feuer

12:19 a.m.

Linear Haskell helps control in-place mutation, but that's still only in mutable references and arrays. As I understand it, data built from regular constructors is more fundamentally immutable because of the way GHC garbage collection works. On Sun, Mar 26, 2023, 8:00 PM Viktor Dukhovni wrote:

...

On Sun, Mar 26, 2023 at 04:24:09PM -0700, Todd Wilson wrote:

...
...
Core (with some added comments): -- Unpack the "Ord" dictionary to extract just the required "<=" -- function and call the "$wruns" worker ("ww3" is "<=", and "ds" -- is the list to be transformed: -- runs :: forall a. Ord a => [a] -> [[a]] runs = \ (@a) ($dOrd :: Ord a) (ds :: [a]) -> case $dOrd of { C:Ord ww ww1 ww2 ww3 ww4 ww5 ww6 ww7 -> $wruns ww3 ds }

Rec { $wruns :: forall {a}. (a -> a -> Bool) -> [a] -> [[a]] $wruns = \ (@a) (ww :: a -> a -> Bool) (ds :: [a]) -> case ds of { [] -> []; -- runs [] = [] : x xs -> -- runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs let { ds1 :: ([a], [a]) -- A lazily evaluated thunk for (run x xs) ds1 = letrec { -- Internal recursion in "run" returns strict unboxed pairs -- (on the stack) avoiding heap or thunk allocation for the tuple. $wrun :: a -> [a] -> (# [a], [a] #) $wrun = \ (x1 :: a) (ds2 :: [a]) -> case ds2 of wild1 { -- (y:ys) is "wild1" [] -> (# [], [] #); -- run x [] = ([], []) : y ys -> case ww x1 y of { -- x <= y ? False -> (# [], wild1 #); -- else ([], y:ys) True -> -- then let (us, vs) = run y ys in (y:us, vs) let { ds3 :: ([a], [a]) -- A "thunk" for (run y ys) evaluated lazily

Why doesn't ds3 have an explicitly unboxed pair type, and does that have any performance implications? For example, ...

Precisely because "ds3" must be evaluated lazily, it can't be an unboxed pair (which are always strictly evaluated).

...
...
ds3 = case $wrun y ys of { (#

ww1, ww2 #) -> (ww1, ww2) } } in

...
(# : y (case ds3 of { (us, vs) ->

us }),

...
case ds3 of { (us, vs) -> vs }

#)

...
Granted I'm not that familiar with Core, but It sure looks like this code breaks apart pairs (with the equivalent of fst and snd) and rebuilds them

The inner (recursive) invocation of "run" must also be lazily evaluated, so yes, its output needs to be boxed as a pair.

...
...
When we do want the successor of the first element, we look no futher than necessary:

λ> head $ runs $ 42 : 0 : undefined [42]

λ> take 2 $ head $ runs $ 42 : 43 : undefined [42,43]

Does this help?

Yes, it does, thanks, although I was aware of this aspect of the laziness of my code from the beginning and was concerned more with how the output lists were built.

As with all lists, they are built via the (:) constructor from a head element and a tail. GHC reuses any full list or tail it can reuse, and constructs new cons cells that are not already in hand.

Because these are pure functions working with immutable data, to construct an initial segment of a list we must build a new list, we can't truncate the original original in place, it is immutable.

Therefore, the original list will be picked apart and reassembled.

With "linear Haskell" there are in some cases opportunities to mutate certain objects in place, because they are sure to not have any other references. But that isn't the case here.

So even `runs [0..10]` has to pick apart and reassemble the list. I hope I understood correctly what you're getting at with the concern about building and rebuilding.

It looks to me like the Core code does exactly as much building and re-building as required by laziness, and the result can be consumed in a single pass in constant space (multi-pass use naturally memoises the result).

-- Viktor. _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

David Feuer

26 Mar 26 Mar

7:53 p.m.

I don't know prolog, but I believe the answer to your last question is no. The functions will actually be function closures in this case. There's no way to leave a (native) Haskell list open to being extended at the end, except to the extent that lazy evaluation does so. Somewhat separately, while it would be possible for a Haskell implementation to construct strict-spined list-like structures with constant stack space using tail recursion modulo cons, GHC doesn't do that. On Sun, Mar 26, 2023, 1:59 PM Todd Wilson wrote:

...

I want to thank all of those who contributed to this thread. If I can summarize the contributions (feel free to clarify any misrepresentations):

- The code I provided for this problem is probably optimal, except perhaps for the small point about using an @-pattern in `run` to avoid reconstructing `y:ys` in the last line -- but isn't this also something the compiler could easily discover and take advantage of without explicit direction? - The compiler will avoid breaking down and reconstructing pairs on each recursive call, even though that is how it is coded. - Although nobody commented on my Prolog version (perhaps because this is a Haskell list!), the point I made there about compiling this as a tail call that doesn't grow the stack should also apply to Haskell (does it?). One thing that's clear from the Prolog code is that it goes down the input list and distributes the elements to one of two places -- the end of the current run or the beginning of a new run -- interspersed with some cons-cell creation. Is that what's happening in Haskell also? - Reference was made to Haskell's `sort` implementation, which essentially finds runs (increasing and decreasing) in the input before repeatedly merging them in pairs until one run remains. Decreasing runs can be turned into increasing ones by using a tail call with a list accumulator, the way iterative reverse works. But increasing runs are handled in the same way that my Prolog code does, essentially by using what appears to be the Haskell equivalent of Prolog's difference-lists: functions whose arguments are the extendible ends of lists.

I'm intrigued by the idea mentioned in this last bullet point. Is this functional equivalent of Prolog's difference lists -- essentially partial data structures that are grown top-down rather than bottom-up, actually efficient? Is there an overhead penalty to be paid for using functions here, or is it compiled away into Prolog-like tail calls that fill in the missing parts of the data structure?

Todd Wilson

On Thu, Mar 16, 2023 at 6:33 PM Todd Wilson wrote:

...
Dear Cafe,

Here's a basic exercise in list processing: define a function

runs :: Ord a => [a] -> [[a]]

that breaks up its input list into a list of increasing "runs"; e.g.,

runs [3,4,5,6,2,3,4,1,2,1] ---> [[3,4,5,6],[2,3,4],[1,2],[1]]

A natural solution is the following:

runs [] = [] runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs where run x [] = ([], []) run x (y:ys) = if x <= y then let (us, vs) = run y ys in (y:us, vs) else ([], y:ys)

My question: can we do better than this? It seems that this solution is constantly building and breaking apart pairs. (Or is it, when optimized?)

Todd Wilson

_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

Todd Wilson

11:52 p.m.

Interesting. So one consequence of this, if I understand correctly, is that the aforementioned library implementation https://hackage.haskell.org/package/base-4.18.0.0/docs/src/Data.OldList.html... of `sort` could actually be improved by rewriting the `ascending` function there so that it returned a pair, as in my code, rather than had a "difference-list" functional accumulator argument that is finally applied in the base case. --Todd On Sun, Mar 26, 2023 at 12:53 PM David Feuer wrote:

...

I don't know prolog, but I believe the answer to your last question is no. The functions will actually be function closures in this case. There's no way to leave a (native) Haskell list open to being extended at the end, except to the extent that lazy evaluation does so. Somewhat separately, while it would be possible for a Haskell implementation to construct strict-spined list-like structures with constant stack space using tail recursion modulo cons, GHC doesn't do that.

On Sun, Mar 26, 2023, 1:59 PM Todd Wilson wrote:

...
I want to thank all of those who contributed to this thread. If I can summarize the contributions (feel free to clarify any misrepresentations):

- The code I provided for this problem is probably optimal, except perhaps for the small point about using an @-pattern in `run` to avoid reconstructing `y:ys` in the last line -- but isn't this also something the compiler could easily discover and take advantage of without explicit direction? - The compiler will avoid breaking down and reconstructing pairs on each recursive call, even though that is how it is coded. - Although nobody commented on my Prolog version (perhaps because this is a Haskell list!), the point I made there about compiling this as a tail call that doesn't grow the stack should also apply to Haskell (does it?). One thing that's clear from the Prolog code is that it goes down the input list and distributes the elements to one of two places -- the end of the current run or the beginning of a new run -- interspersed with some cons-cell creation. Is that what's happening in Haskell also? - Reference was made to Haskell's `sort` implementation, which essentially finds runs (increasing and decreasing) in the input before repeatedly merging them in pairs until one run remains. Decreasing runs can be turned into increasing ones by using a tail call with a list accumulator, the way iterative reverse works. But increasing runs are handled in the same way that my Prolog code does, essentially by using what appears to be the Haskell equivalent of Prolog's difference-lists: functions whose arguments are the extendible ends of lists.

I'm intrigued by the idea mentioned in this last bullet point. Is this functional equivalent of Prolog's difference lists -- essentially partial data structures that are grown top-down rather than bottom-up, actually efficient? Is there an overhead penalty to be paid for using functions here, or is it compiled away into Prolog-like tail calls that fill in the missing parts of the data structure?

Todd Wilson

On Thu, Mar 16, 2023 at 6:33 PM Todd Wilson wrote:

...
Dear Cafe,

Here's a basic exercise in list processing: define a function

runs :: Ord a => [a] -> [[a]]

that breaks up its input list into a list of increasing "runs"; e.g.,

runs [3,4,5,6,2,3,4,1,2,1] ---> [[3,4,5,6],[2,3,4],[1,2],[1]]

A natural solution is the following:

runs [] = [] runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs where run x [] = ([], []) run x (y:ys) = if x <= y then let (us, vs) = run y ys in (y:us, vs) else ([], y:ys)

My question: can we do better than this? It seems that this solution is constantly building and breaking apart pairs. (Or is it, when optimized?)

Todd Wilson

_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

831

Age (days ago)

841

Last active (days ago)

List overview

Download

12 comments

6 participants

participants (6)

David Feuer
Jeff Clites
Keith
Todd Wilson
Viktor Dukhovni
Zemyla

Basic list exercise

tags

participants (6)