Ah, never mind that last remark. If it allocates less, then even that tiny slowdown applies only to benchmarks.

On Jul 24, 2014 10:41 AM, "David Feuer" <david.feuer@gmail.com> wrote:

Your approach points the way very clearly toward the proper inductive argument as well, so I will look into that when I get home from the dentist. Assuming all goes as planned, you seem to have a winner. One thing that caught my eye in your benchmark report: it looked to me like initsQ2 was very slightly slower than initsQ on the NF tests. Do you think you could do some bigger tests there and see if it's real? If so, my *guess* is that the scanl' is wasting time forcing things that have already been forced, which we probably can't do anything about but should bear in mind.

David Feuer

On Jul 24, 2014 10:02 AM, "Joachim Breitner" <mail@joachim-breitner.de> wrote:
Hi David,

Am Mittwoch, den 23.07.2014, 16:17 -0400 schrieb David Feuer:
> In a comment on trac #9345, nomeata proposed the implementation of
> Data.List.inits below. He indicates that the performance is very good,
> and wrote it in this fashion so it could fuse. My concern is that
> since seq is used in the argument to build, we need to make sure that
> fusion on the left will be safe. If I did the calculations right, this
> specifically requires that for all f, cons, nil, and bs, and all a≠_|
> _, the following "correct" expression

> e1 = a `cons`
>      foldr cons nil $
>        foldr (\b g x -> (let b' = f x b in b' `seq` (b' : g b')))
>           (\b -> b `seq` [])
>           bs
>           a
>
> gives the same result as the fused expression
>
> e2 = a `cons`
>      foldr (\b g x -> (let b' = f x b in b' `seq` (b' `cons` g b')))
>            (\b -> b `seq` nil)
>            bs
>            a
>
> I haven't been able to figure out how to prove this or give a
> counterexample, so far, but I have very little experience with such.
>
> myScanl' :: (b -> a -> b) -> b -> [a] -> [b]
> myScanl' f a bs = build $ \c n ->
>      a `seq` a `c`
>      foldr (\b g x -> (let b' = f x b in b' `seq` (b' `c` g b')))
>            (\b -> b `seq` n)
>            bs
>            a

I don’t have much experience either, but let’s try.

First of all, let’s simplify it a bit. It is unlikely that the problem
will appear for lists of varying length. Since all finite lists are
one-element-lists¹, let us substitute bs = [b]:

myScanl' f a [b]
==
build $ \c n ->
     a `seq` a `c`
     foldr (\b g x -> (let b' = f x b in b' `seq` (b' `c` g b')))
           (\b -> b `seq` n)
           (b : [])
           a
== simplifying foldr
build $ \c n ->
     a `seq` a `c`
     (\ x -> let b' = f x b
             in b' `seq` (b' `c` (b' `seq` n)))
     a
== more β-reduction, redundant seq
build $ \c n ->
     a `seq` a `c`
     let b' = f a b
     in b' `seq` (b' `c` n)


Now let’s calculate the expression
        foldr c n $ myScanl' f a [b]
without and with fusion.

before_fusion
==
foldr c n $ myScanl' f a [b]
==
foldr c n $ a `seq` (a : let b' = f a b in b' `seq` (b' : []))
== foldr is strict in its third argument, so it commutes with seq
a `seq` foldr c n $ (a : let b' = f a b in b' `seq` (b' : []))
== foldr on cons
a `seq` a `c` (foldr c n $ let b' = f a b in b' `seq` (b' : []))
== foldr commutes with let and seq
a `seq` a `c` (let b' = f a b in b' `seq` (foldr c n $  b' : []))
== foldr on cons and []
a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` n))


after_fusion
==
(\c n -> a `seq` a `c` let b' = f a b in b' `seq` (b' `c` n)) c n
==
a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` n))

So they yield the same results, independent of c and n. This includes
the counter example from “Free Theorems in the Presence of seq”, (c =
seq, n = []).

Judging from this I conclude that nothing can go wrong for finite lists.
What about partial lists, i.e. bs = b : ⊥?

myScanl' f a (b : ⊥)
==
build $ \c n ->
     a `seq` a `c`
     foldr (\b g x -> (let b' = f x b in b' `seq` (b' `c` g b')))
           (\b -> b `seq` n)
           (b : ⊥)
           a
== simplifying foldr
build $ \c n ->
     a `seq` a `c`
     (\ x -> let b' = f x b
             in b' `seq` (b' `c` (b' `seq` ⊥)))
     a
== more β-reduction, redundant seq
build $ \c n ->
     a `seq` a `c`
     let b' = f a b
     in b' `seq` (b' `c` ⊥)


before_fusion
==
foldr c n $ myScanl' f a (b:⊥)
==
foldr c n $ a `seq` (a : let b' = f a b in b' `seq` (b' : ⊥))
== foldr is strict in its third argument, so it commutes with seq
a `seq` foldr c n $ (a : let b' = f a b in b' `seq` (b' : ⊥))
== foldr on cons
a `seq` a `c` (foldr c n $ let b' = f a b in b' `seq` (b' : ⊥))
== foldr commutes with let and seq
a `seq` a `c` (let b' = f a b in b' `seq` (foldr c n $  b' : ⊥))
== foldr on cons and []
a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` ⊥))


after_fusion
==
(\c n -> a `seq` a `c` let b' = f a b in b' `seq` (b' `c` ⊥)) c n
==
a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` ⊥))


So nothing fancy happening here either. I conclude that we are safe, and
my gut feeling tell me the reason is likely that we don’t `seq` anything
built from the unknown c and n.

Greetings,
Joachim


¹ citation needed

--
Joachim “nomeata” Breitner
  mail@joachim-breitner.dehttp://www.joachim-breitner.de/
  Jabber: nomeata@joachim-breitner.de  • GPG-Key: 0xF0FBF51F
  Debian Developer: nomeata@debian.org


_______________________________________________
Libraries mailing list
Libraries@haskell.org
http://www.haskell.org/mailman/listinfo/libraries