[GHC] #14387: listToMaybe doesn't participate in foldr/build fusion

#14387: listToMaybe doesn't participate in foldr/build fusion -------------------------------------+------------------------------------- Reporter: duog | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Core | Version: 8.2.1 Libraries | Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- I noticed that `Data.OldList.findIndex` seems to use more memory than necessary, and that changing the definition of `listToMaybe` to be in terms of `foldr` fixed the situation. Consider the following module: {{{ {-# LANGUAGE MagicHash #-} {-# OPTIONS_GHC -ddump-to-file -ddump-prep -O #-} module FindIndex where import GHC.Base (Int(I#), build) import GHC.Prim -- | The definitions of listToMaybe, findIndices and findIndex are taken from base listToMaybe :: [a] -> Maybe a listToMaybe [] = Nothing listToMaybe (a:_) = Just a findIndices :: (a -> Bool) -> [a] -> [Int] findIndices p ls = build $ \c n -> let go x r k | p x = I# k `c` r (k +# 1#) | otherwise = r (k +# 1#) in foldr go (\_ -> n) ls 0# {-# inline findIndices #-} findIndex :: (a -> Bool) -> [a] -> Maybe Int findIndex p = listToMaybe . findIndices p -- This is the definition of findIndices when USE_REPORT_PRELUDE is defined findIndices' :: (a -> Bool) -> [a] -> [Int] findIndices' p xs = [ i | (x,i) <- zip xs [0..], p x] {-# inline findIndices' #-} listToMaybe' :: [a] -> Maybe a listToMaybe' = foldr (const . Just) Nothing -- | using listToMaybe', we get a join point findIndex2 :: (a -> Bool) -> [a] -> Maybe Int findIndex2 p = listToMaybe' . findIndices p -- | a "manual" implementaiton, we get a join point findIndex3 :: (a -> Bool) -> [a] -> Maybe Int findIndex3 p = go . zip [0..] where go [] = Nothing go ((i, x) : xs) | p x = Just i | otherwise = go xs -- | alternate version of findIndices, stock listToMaybe, no join point findIndex4 :: (a -> Bool) -> [a] -> Maybe Int findIndex4 p = listToMaybe . findIndices' p -- | alternate version of findIndices, foldr listToMaybe, we get a join point findIndex5 :: (a -> Bool) -> [a] -> Maybe Int findIndex5 p = listToMaybe' . findIndices' p }}} Find attached .dump-prep files with ghc-8.2.1 and ghc-head at commit 5c178012f47420b5dfa417be21146ca82959d273. My interpretation of this is: with both ghc-8.2.1 and ghc-head, findIndex{2,4,5} get join points and findIndex{"",3} don't. Having a join point means constant stack space, not having a join point means linear stack space. I don't understand the simplifier well enough to know whether ghc could do better here, but it seems that changing the definition of `listToMaybe` to {{{ listToMaybe :: [a] -> Maybe a listToMaybe = foldr (const . Just) Nothing }}} would be a win. Are there any downsides? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14387 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14387: listToMaybe doesn't participate in foldr/build fusion -------------------------------------+------------------------------------- Reporter: duog | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.2.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by duog): * Attachment "FindIndex.hs" added. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14387 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14387: listToMaybe doesn't participate in foldr/build fusion -------------------------------------+------------------------------------- Reporter: duog | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.2.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by duog): * Attachment "FindIndex.ghc-head.dump-prep" added. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14387 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14387: listToMaybe doesn't participate in foldr/build fusion -------------------------------------+------------------------------------- Reporter: duog | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.2.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by duog): * Attachment "FindIndex.ghc-8.2.1.dump-prep" added. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14387 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14387: listToMaybe doesn't participate in foldr/build fusion -------------------------------------+------------------------------------- Reporter: duog | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.2.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Description changed by duog: Old description:
I noticed that `Data.OldList.findIndex` seems to use more memory than necessary, and that changing the definition of `listToMaybe` to be in terms of `foldr` fixed the situation.
Consider the following module: {{{ {-# LANGUAGE MagicHash #-} {-# OPTIONS_GHC -ddump-to-file -ddump-prep -O #-}
module FindIndex where
import GHC.Base (Int(I#), build) import GHC.Prim
-- | The definitions of listToMaybe, findIndices and findIndex are taken from base listToMaybe :: [a] -> Maybe a listToMaybe [] = Nothing listToMaybe (a:_) = Just a
findIndices :: (a -> Bool) -> [a] -> [Int] findIndices p ls = build $ \c n -> let go x r k | p x = I# k `c` r (k +# 1#) | otherwise = r (k +# 1#) in foldr go (\_ -> n) ls 0# {-# inline findIndices #-}
findIndex :: (a -> Bool) -> [a] -> Maybe Int findIndex p = listToMaybe . findIndices p
-- This is the definition of findIndices when USE_REPORT_PRELUDE is defined findIndices' :: (a -> Bool) -> [a] -> [Int] findIndices' p xs = [ i | (x,i) <- zip xs [0..], p x] {-# inline findIndices' #-}
listToMaybe' :: [a] -> Maybe a listToMaybe' = foldr (const . Just) Nothing
-- | using listToMaybe', we get a join point findIndex2 :: (a -> Bool) -> [a] -> Maybe Int findIndex2 p = listToMaybe' . findIndices p
-- | a "manual" implementaiton, we get a join point findIndex3 :: (a -> Bool) -> [a] -> Maybe Int findIndex3 p = go . zip [0..] where go [] = Nothing go ((i, x) : xs) | p x = Just i | otherwise = go xs
-- | alternate version of findIndices, stock listToMaybe, no join point findIndex4 :: (a -> Bool) -> [a] -> Maybe Int findIndex4 p = listToMaybe . findIndices' p
-- | alternate version of findIndices, foldr listToMaybe, we get a join point findIndex5 :: (a -> Bool) -> [a] -> Maybe Int findIndex5 p = listToMaybe' . findIndices' p }}}
Find attached .dump-prep files with ghc-8.2.1 and ghc-head at commit 5c178012f47420b5dfa417be21146ca82959d273.
My interpretation of this is: with both ghc-8.2.1 and ghc-head, findIndex{2,4,5} get join points and findIndex{"",3} don't. Having a join point means constant stack space, not having a join point means linear stack space.
I don't understand the simplifier well enough to know whether ghc could do better here, but it seems that changing the definition of `listToMaybe` to {{{ listToMaybe :: [a] -> Maybe a listToMaybe = foldr (const . Just) Nothing }}} would be a win. Are there any downsides?
New description: I noticed that `Data.OldList.findIndex` seems to use more memory than necessary, and that changing the definition of `listToMaybe` to be in terms of `foldr` fixed the situation. Consider the following module: {{{ {-# LANGUAGE MagicHash #-} {-# OPTIONS_GHC -ddump-to-file -ddump-prep -O #-} module FindIndex where import GHC.Base (Int(I#), build) import GHC.Prim -- | The definitions of listToMaybe, findIndices and findIndex are taken from base listToMaybe :: [a] -> Maybe a listToMaybe [] = Nothing listToMaybe (a:_) = Just a findIndices :: (a -> Bool) -> [a] -> [Int] findIndices p ls = build $ \c n -> let go x r k | p x = I# k `c` r (k +# 1#) | otherwise = r (k +# 1#) in foldr go (\_ -> n) ls 0# {-# inline findIndices #-} findIndex :: (a -> Bool) -> [a] -> Maybe Int findIndex p = listToMaybe . findIndices p -- This is the definition of findIndices when USE_REPORT_PRELUDE is defined findIndices' :: (a -> Bool) -> [a] -> [Int] findIndices' p xs = [ i | (x,i) <- zip xs [0..], p x] {-# inline findIndices' #-} listToMaybe' :: [a] -> Maybe a listToMaybe' = foldr (const . Just) Nothing -- | using listToMaybe', we get a join point findIndex2 :: (a -> Bool) -> [a] -> Maybe Int findIndex2 p = listToMaybe' . findIndices p -- | a "manual" implementaiton, we get a join point findIndex3 :: (a -> Bool) -> [a] -> Maybe Int findIndex3 p = go . zip [0..] where go [] = Nothing go ((i, x) : xs) | p x = Just i | otherwise = go xs -- | alternate version of findIndices, stock listToMaybe, no join point findIndex4 :: (a -> Bool) -> [a] -> Maybe Int findIndex4 p = listToMaybe . findIndices' p -- | alternate version of findIndices, foldr listToMaybe, we get a join point findIndex5 :: (a -> Bool) -> [a] -> Maybe Int findIndex5 p = listToMaybe' . findIndices' p }}} Find attached .dump-prep files with ghc-8.2.1 and ghc-head at commit 8843a39b3c941b1908a8d839f52bc323f3b45081. My interpretation of this is: with both ghc-8.2.1 and ghc-head, findIndex{2,4,5} get join points and findIndex{"",3} don't. Having a join point means constant stack space, not having a join point means linear stack space. I don't understand the simplifier well enough to know whether ghc could do better here, but it seems that changing the definition of `listToMaybe` to {{{ listToMaybe :: [a] -> Maybe a listToMaybe = foldr (const . Just) Nothing }}} would be a win. Are there any downsides? -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14387#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14387: listToMaybe doesn't participate in foldr/build fusion -------------------------------------+------------------------------------- Reporter: duog | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.2.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj):
findIndex{2,4,5} get join points and findIndex{"",3} don't.
Actually `findIndex3` does get a join point:
{{{
FindIndex.findIndex3
= \ (@ a_a1kV)
(p_s2Z2 [Occ=OnceL!] :: a_a1kV -> GHC.Types.Bool)
(eta_s2Z3 [Occ=Once] :: [a_a1kV]) ->
case GHC.List.zip
@ GHC.Types.Int @ a_a1kV FindIndex.findIndex1 eta_s2Z3
of sat_s2Zd
{ __DEFAULT ->
joinrec {
go_s2Z4 [Occ=LoopBreakerT[1]]
:: [(GHC.Types.Int, a_a1kV)] -> GHC.Base.Maybe GHC.Types.Int
[LclId[JoinId(1)], Arity=1, Str=, Unf=OtherCon []]
go_s2Z4 (ds_s2Z5 [Occ=Once!] :: [(GHC.Types.Int, a_a1kV)])
= case ds_s2Z5 of {
[] -> GHC.Base.Nothing @ GHC.Types.Int;
: ds1_s2Z7 [Occ=Once!] xs_s2Z8 [Occ=Once] ->
case ds1_s2Z7 of { (i_s2Za [Occ=Once], x_s2Zb [Occ=Once])
->
case p_s2Z2 x_s2Zb of {
GHC.Types.False -> jump go_s2Z4 xs_s2Z8;
GHC.Types.True -> GHC.Base.Just @ GHC.Types.Int i_s2Za
}
}
}; } in
jump go_s2Z4 sat_s2Zd
}
}}}
Note the join point. But `findIndex4` and `findIndex` do not. Maybe that
was a typo.
Having a join point means constant stack space, not having a join point means linear stack space.
This definitely isn't true. Here's `findIndex` (the one that does not get
a join point):
{{{
findIndex
= \ (@ a_a1IQ) (p_a1eN :: a_a1IQ -> Bool) (x_a1Ld :: [a_a1IQ]) ->
letrec {
go_a1KZ [Occ=LoopBreaker] :: [a_a1IQ] -> Int# -> [Int]
[LclId, Arity=2, Str=
it seems that changing the definition of listToMaybe to use `foldr` would be a win
Yes I agree. It's a bit like making `map` use `foldr` rather than being written directly. But you'd need an INLINE pragma on it. I can't see a downside. If it doesn't fuse you get the original `listToMaybe` back instead. If someone makes that change (in `base`) do add a Note to explain. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14387#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

Note the join point. But `findIndex4` and `findIndex` do not. Maybe
#14387: listToMaybe doesn't participate in foldr/build fusion -------------------------------------+------------------------------------- Reporter: duog | Owner: duog Type: bug | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.2.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by duog): * owner: (none) => duog Comment: Replying to [comment:2 simonpj]: Thank you very much for your explanations. that was a typo. Yes it was, sorry about that.
Having a join point means constant stack space, not having a join
point means linear stack space.
This definitely isn't true. Here's `findIndex` (the one that does not
In the `False` branch inside `go`, there's a tail-call to `go` (no stack growth). In the `True` branch, `go` returns a cons-cell, and stops. Then
get a join point): the `case go x o# of ...` scrutinises that cons cell and returns a `Just`. All done. No stack growth in either case. Ah I see, I didn't understand that tail-calls to let bindings worked like that. I will examine the Notes on join points to try to understand the difference between calling a join point and tail-calling a let binding; I guess that the tail-call is a bit more expensive because the let binding has additional code for the case when it is not tail-called.
it seems that changing the definition of listToMaybe to use `foldr`
would be a win
Yes I agree. It's a bit like making `map` use `foldr` rather than being
written directly. But you'd need an INLINE pragma on it.
I can't see a downside. If it doesn't fuse you get the original
`listToMaybe` back instead. If someone makes that change (in `base`) do add a Note to explain. I will prepare a patch. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14387#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14387: listToMaybe doesn't participate in foldr/build fusion -------------------------------------+------------------------------------- Reporter: duog | Owner: duog Type: bug | Status: patch Priority: normal | Milestone: Component: Core Libraries | Version: 8.2.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4126 Wiki Page: | -------------------------------------+------------------------------------- Changes (by duog): * status: new => patch * differential: => Phab:D4126 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14387#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14387: listToMaybe doesn't participate in foldr/build fusion
-------------------------------------+-------------------------------------
Reporter: duog | Owner: duog
Type: bug | Status: patch
Priority: normal | Milestone:
Component: Core Libraries | Version: 8.2.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): Phab:D4126
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#14387: listToMaybe doesn't participate in foldr/build fusion -------------------------------------+------------------------------------- Reporter: duog | Owner: duog Type: bug | Status: closed Priority: normal | Milestone: 8.4.1 Component: Core Libraries | Version: 8.2.1 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4126 Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * status: patch => closed * resolution: => fixed * milestone: => 8.4.1 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14387#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC