
The specialisations are indeed caused (indirectly) by the presence of print_lines. If print_lines is dead code (as it is when print_lines is not exported), then there are no calls to the overloaded functions at these specialised types, and so you don't get the specialised versions. You can still get the specialised versions explicitly, by using a SPECIALISE pragma on an overloaded function or a SPECIALISE INSTANCE pragma on an instance declaration.
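As a rough illustration of the two pragmas, here is a minimal sketch. CounterT, tick and countTo are hypothetical stand-ins invented for this example; they are not taken from Sound.IterateeM, and the Functor and Applicative instances are only there so that the sketch compiles with a recent GHC. The point is only where the pragmas go: SPECIALISE sits next to the type signature of the overloaded function, and SPECIALISE INSTANCE sits inside the instance body.

```haskell
module Main (main) where

-- Hypothetical monad transformer, overloaded in the underlying monad m
-- (a stand-in for a type like IterateeGM el m).
newtype CounterT m a = CounterT { runCounterT :: Int -> m (a, Int) }

instance Monad m => Functor (CounterT m) where
  fmap f (CounterT g) = CounterT $ \n -> do
    (a, n') <- g n
    return (f a, n')

instance Monad m => Applicative (CounterT m) where
  pure a = CounterT $ \n -> return (a, n)
  CounterT f <*> CounterT g = CounterT $ \n -> do
    (h, n')  <- f n
    (a, n'') <- g n'
    return (h a, n'')

instance Monad m => Monad (CounterT m) where
  -- Ask for a copy of this instance specialised to m = IO,
  -- whether or not this module calls it at that type.
  {-# SPECIALISE instance Monad (CounterT IO) #-}
  CounterT g >>= k = CounterT $ \n -> do
    (a, n') <- g n
    runCounterT (k a) n'

-- Increment the counter by one; works in any underlying monad.
tick :: Monad m => CounterT m ()
tick = CounterT $ \n -> return ((), n + 1)

-- Overloaded function: run tick k times.
countTo :: Monad m => Int -> CounterT m ()
-- Ask for a copy of countTo specialised to m = IO.
{-# SPECIALISE countTo :: Int -> CounterT IO () #-}
countTo k = mapM_ (const tick) [1 .. k]

main :: IO ()
main = do
  (_, n) <- runCounterT (countTo 5) 0
  print n  -- prints 5
```

With the pragmas in place, the specialised versions and their "SPEC" rules should be generated even when nothing in the module calls the overloaded code at IO, so they no longer depend on something like print_lines being kept alive.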
Does that make sense?
Simon
| -----Original Message-----
| From: Neil Mitchell [mailto:ndmitchell@gmail.com]
| Sent: 28 November 2008 09:48
| To: Simon Peyton-Jones
| Cc: John Lato; glasgow-haskell-users@haskell.org; Don Stewart
| Subject: Re: cross module optimization issues
|
| Hi
|
| I've talked to John a bit, and discussed test cases etc. I've tracked
| this down a little way.
|
| Given the attached file, compiling with SHORT_EXPORT_LIST makes the
| code go _slower_. By exporting the "print_lines" function the code
| doubles in speed. This runs against everything I was expecting, and
| that Simon has described.
|
| Taking a look at the .hi files for the two alternatives, there are two
| differences:
|
| 1) In the faster .hi file, the body of print_lines is exported. This
| is reasonable and expected.
|
| 2) In the faster .hi file, there are additional specialisations, which
| seemingly have little/nothing to do with print_lines, but are omitted
| if it is not exported:
|
| "SPEC >>= [GHC.IOBase.IO]" ALWAYS forall @ el
|     $dMonad :: GHC.Base.Monad GHC.IOBase.IO
|   Sound.IterateeM.>>= @ GHC.IOBase.IO @ el $dMonad
|   = Sound.IterateeM.a
|     `cast`
|     (forall el1 a b.
|      Sound.IterateeM.IterateeGM el1 GHC.IOBase.IO a
|      -> (a -> Sound.IterateeM.IterateeGM el1 GHC.IOBase.IO b)
|      -> trans
|           (sym ((GHC.IOBase.:CoIO)
|                 (Sound.IterateeM.IterateeG el1 GHC.IOBase.IO b)))
|           (sym ((Sound.IterateeM.:CoIterateeGM) el1 GHC.IOBase.IO b)))
|     @ el
| "SPEC Sound.IterateeM.$f2 [GHC.IOBase.IO]" ALWAYS forall @ el
|     $dMonad ::
|       GHC.Base.Monad GHC.IOBase.IO
|   Sound.IterateeM.$f2 @ GHC.IOBase.IO @ el $dMonad
|   = Sound.IterateeM.$s$f2 @ el
| "SPEC Sound.IterateeM.$f2 [GHC.IOBase.IO]" ALWAYS forall @ el
|     $dMonad ::
|       GHC.Base.Monad GHC.IOBase.IO
|   Sound.IterateeM.$f2 @ GHC.IOBase.IO @ el $dMonad
|   = Sound.IterateeM.$s$f21 @ el
| "SPEC Sound.IterateeM.liftI [GHC.IOBase.IO]" ALWAYS forall @ el
|     @ a
|     $dMonad ::
|       GHC.Base.Monad GHC.IOBase.IO
|   Sound.IterateeM.liftI @ GHC.IOBase.IO @ el @ a $dMonad
|   = Sound.IterateeM.$sliftI @ el @ a
| "SPEC return [GHC.IOBase.IO]" ALWAYS forall @ el
|     $dMonad :: GHC.Base.Monad
|       GHC.IOBase.IO
|   Sound.IterateeM.return @ GHC.IOBase.IO @ el $dMonad
|   = Sound.IterateeM.a7
|     `cast`
|     (forall el1 a.
|      a
|      -> trans
|           (sym ((GHC.IOBase.:CoIO)
|                 (Sound.IterateeM.IterateeG el1 GHC.IOBase.IO a)))
|           (sym ((Sound.IterateeM.:CoIterateeGM) el1 GHC.IOBase.IO a)))
|     @ el
|
| My guess is that these cause the slowdown - but is there any reason
| that print_lines not being exported should cause them to be omitted?
|
| All these tests were run on GHC 6.10.1 with -O2.
|
| Thanks
|
| Neil
|
|
| On Fri, Nov 21, 2008 at 10:33 AM, Simon Peyton-Jones
|