
I've also had this experience, though on a smaller scale. The things
I thought were big were not that big, and the slow things were
different from what I predicted. So far it hasn't turned out to be
lists (except via Strings, I'm happy to use Text). Also I rely on
persistence and lists are about the simplest persistent data
structure.
On Mon, Mar 13, 2017 at 11:00 AM, Tikhon Jelvis
Personally, I disagree with the whole premise. Lists are simple and elegant; you *should* use them most of them time. I'm not saying you should use lists as the data structure for an in-memory database or whatever, but that's the point—most applications only have a handful of data structures that "matter", and lists are great everywhere else.
I actually went through and replaced [] with Vector in all of the types we parse from JSON at work, some of which get relatively large. It made the code uglier and didn't meaningfully affect the performance. I undid that change, even though it's exactly the sort of thing this article recommends. In this day and age, simple things *scale*. That's enough most of the time; if you can get away with it you *should*.
The real advantage of lists comes at an intersection of two points: lists are effective in place of iterators in Haskell and, even misused as data structures, they're *not that bad* most of the time. This means that a good 80% of the time, the advantage of using a type that's compatible with the rest of my code and APIs that use lists "correctly" as iterators easily outweighs any small performance penalty. A list has to get pretty large—or my usage pattern pretty convoluted—before another type is worth the complexity.
On Mon, Mar 13, 2017 at 3:43 AM, Johannes Waldmann
wrote: Hi Olaf -
I'd be interested whether there is a way to check which of my lists in the source code the compiler managed to "deforest" away. Which intermediate files should I look at? What are the tools to inspect?
Good question, and I don't know an easy answer.
The general advice is running ghc with "-ddump-simpl" but I find it quite challenging to scan the output.
Here is a simple case where it works:
$ cat Fuse.hs
main = print $ sum $ map (^2) [1 .. 1000 :: Int]
$ ghc -fforce-recomp -ddump-simpl -O0 Fuse.hs
no optimisation - shows that "main" calls "map" etc.
$ ghc -fforce-recomp -ddump-simpl -O2 Fuse.hs
fusion works - nice non-allocating inner loop (lists gone, and Int replaced by Int#)
Rec { -- RHS size: {terms: 18, types: 3, coercions: 0} Main.$wgo [InlPrag=[0], Occ=LoopBreaker] :: GHC.Prim.Int# -> GHC.Prim.Int# -> GHC.Prim.Int# [GblId, Arity=2, Caf=NoCafRefs, Str=DmdType
] Main.$wgo = \ (w_s5kV :: GHC.Prim.Int#) (ww_s5kZ :: GHC.Prim.Int#) -> case w_s5kV of wild_Xn { __DEFAULT -> Main.$wgo (GHC.Prim.+# wild_Xn 1#) (GHC.Prim.+# ww_s5kZ (GHC.Prim.*# wild_Xn wild_Xn)); 1000# -> GHC.Prim.+# ww_s5kZ 1000000# } end Rec }but the challenge is to find the path from "main" to that, wading through several other functions that may or may not be related.
I can imagine a source annotation like "in the code compiled from this function f, that constructor C should never be called" but this is certainly not easy. Do we really mean "never", or do we mean "only a bounded number of times" (that is, not in the inner loop). Perhaps there is no code for f itself, because it gets inlined.
But yes, *some* automated analysis (and human-readable print-out) of the code after simplification would be nice.
This could be done as a compiler plug-in?
https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/extending_gh...
- J.
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.