Re: [Haskell-cafe] To [] Or Not To []

Johannes, Thanks for writing this. Actually, your negative abstract turns out to be for a quite positive article. List are like iterators, that's the essence, and as long as you are iterating, it's fine to use []. The prominence of singly linked lists in Haskell has tought me to write my data processing programs in a streaming style. To think about what data must be held in memory (use Maps or whatever for this) and read the other data in as a list. I'd be interested whether there is a way to check which of my lists in the source code the compiler managed to "deforest" away. Which intermediate files should I look at? What are the tools to inspect? Olaf

Actually, your negative abstract turns out to be for a quite positive article. List are like iterators, that's the essence, and as long as you are iterating, it's fine to use []. The prominence of singly linked lists in Haskell has tought me to write my data processing programs in a streaming style. *Yes.* I've been teaching not just "data processing" - after all almost everything we program is "data processing", no?... but such concrete stuff as physics simulation (diff. eqs.), some other numerics (asymptotic expansions, etc.) signal processing (including sound generation), and I liked to present several examples in a *dataflow
Hi. Le 11/03/2017 à 00:28, Olaf Klinke a écrit : style* with plenty of co-recursive contraptions. Haskell lazy lists were natural, concise, and easy to manipulate. We enjoyed it, wrong or not. Nothing is perfect, not only mister Nobody; calling ANY approach to programming "wrong" is sectarian. A professional coder working on a concrete project may say bad words about anything he wishes, but for a teacher this is a pedagogical sin, and inefficient programs can be more inspiring than some "correct" doctrines. Jerzy Karczmarczuk --- L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel antivirus Avast. https://www.avast.com/antivirus

I think you are missing the point of the article, as that is pretty much
exactly the type of situation where it recommends lists be used.
On Mar 10, 2017 7:45 PM, "JK"
Hi. Le 11/03/2017 à 00:28, Olaf Klinke a écrit :
Actually, your negative abstract turns out to be for a quite positive article. List are like iterators, that's the essence, and as long as you are iterating, it's fine to use []. The prominence of singly linked lists in Haskell has tought me to write my data processing programs in a streaming style.
*Yes.* I've been teaching not just "data processing" - after all almost everything we program is "data processing", no?... but such concrete stuff as physics simulation (diff. eqs.), some other numerics (asymptotic expansions, etc.) signal processing (including sound generation), and I liked to present several examples in a *dataflow style* with plenty of co-recursive contraptions. Haskell lazy lists were natural, concise, and easy to manipulate. We enjoyed it, wrong or not.
Nothing is perfect, not only mister Nobody; calling ANY approach to programming "wrong" is sectarian. A professional coder working on a concrete project may say bad words about anything he wishes, but for a teacher this is a pedagogical sin, and inefficient programs can be more inspiring than some "correct" doctrines.
Jerzy Karczmarczuk
https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=emailclient Garanti sans virus. www.avast.com https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=emailclient
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

Am 11.03.2017 um 01:42 schrieb JK:
Actually, your negative abstract turns out to be for a quite positive article. List are like iterators, that's the essence, and as long as you are iterating, it's fine to use []. The prominence of singly linked lists in Haskell has tought me to write my data processing programs in a streaming style. *Yes.* I've been teaching not just "data processing" - after all almost everything we program is "data processing", no?... but such concrete stuff as physics simulation (diff. eqs.), some other numerics (asymptotic expansions, etc.) signal processing (including sound generation), and I liked to present several examples in a *dataflow
Le 11/03/2017 à 00:28, Olaf Klinke a écrit : style* with plenty of co-recursive contraptions. Haskell lazy lists were natural, concise, and easy to manipulate. We enjoyed it, wrong or not.
Nothing is perfect, not only mister Nobody; calling ANY approach to programming "wrong" is sectarian. A professional coder working on a concrete project may say bad words about anything he wishes, but for a teacher this is a pedagogical sin, and inefficient programs can be more inspiring than some "correct" doctrines.
Jerzy, i have always liked your style and i am very glad you wrote this response, the attitude of which i find refreshingly unconventional. (Note i am not free of the sin of, sometimes hastily, condemning what i find distasteful.) Cheers Ben

On Thu, Mar 16, 2017 at 6:22 PM, Ben Franksen
Am 11.03.2017 um 01:42 schrieb JK:
*Yes.* I've been teaching not just "data processing" - after all almost everything we program is "data processing", no?... but such concrete stuff as physics simulation (diff. eqs.), some other numerics (asymptotic expansions, etc.) signal processing (including sound generation), and I liked to present several examples in a *dataflow style* with plenty of co-recursive contraptions. Haskell lazy lists were natural, concise, and easy to manipulate. We enjoyed it, wrong or not.
Jerzy, i have always liked your style and i am very glad you wrote this response, the attitude of which i find refreshingly unconventional. (Note i am not free of the sin of, sometimes hastily, condemning what i find distasteful.)
I'd just like to add: - if it gets the job done, it's not wrong. - it's my opinion that programs are best written for clarity; the *compiler* should be optimizing, not the programmer, whenever possible. Yes, there are exceptions. But how many programs really *need* something like Duff's Device? Save the cleverness for those. This doesn't necessarily justify using an inherently wrong data structure (e.g. alist for a large keyed map), but if the flow is best understood via lists, that counts for more. (Consider that it may be *your* sanity that will be saved a year down the road when you have to revise it.) -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

/Sorry, a looong message./ Le 16/03/2017 à 23:38, Brandon Allbery a écrit :
programs are best written for clarity; the *compiler* should be optimizing, not the programmer, whenever possible.
A historical anecdote... When something called 'cybernetics' ceased to be in the Soviet Union a bourgeois pseudo-science whose aim was to enslave the the Class of Workers, etc., and Russians and their satellites began to manufacture computers, they managed quite fast to master the main idea of interesting algorithms, exquisite data structures and their processing. They COULD have invented some nice programming languages (we are at the end of '50-ties...), but among many other calamities, their forerunners said that they had some wonderful teams of very competent mathematicians, who, once instructed how to program computers, would do Wonders, in the name of the True Proletarian Science. It was partly true (mathematicians, Andrey Markov Jr., Andrey Ershov, etc., not necessarily the Proletarian Science...) . So, when the ideal personage of THE Programmer became in US a cliché in some science-fiction books (e.g., Asimov's), in the world of the True Proletarian Science, very decent humans wasted a horrible amont of time producing low-level codes, and neglecting completely the domain of compilation... They managed to put Sputnik and Gagarin above our heads, but programming languages did not evolve... [[Although the Snobol language invented by Ralph Griswold, was partly based on the Markov algorithm concept]]. Now, the morale of this story?... Wait a bit. Second round. There is a pedagogical initiative, called the International Olympiad in Informatics. ( http://www.ioinformatics.org/index.shtml ). The evolution of this contest, participating countries, etc., is a very interesting story, but here I want just to tell you something different. In Wikipedia you will read that/*it is an annual competitive programming competition for secondary school students. It is the second largest olympiad, after International Mathematical Olympiad *//* *//*The contest consists of two days of computer programming and problem-solving of algorithmic nature. To deal with problems involving very large amounts of data, it is necessary to have not only programmers, "but also creative coders, who can dream up what it is that the programmers need to tell the computer to do. The hard part isn't the programming, but the mathematics underneath it.... " */Nice. And now: the TRUTH.*The only languages which are permitted are C, C++, and Java*. Sorry, recently also Pascal, the Eastern Europe insisted upon it. I looked through the proposed tasks. A good percentage of them were puzzles of logical kind. But no logical languages were allowed. Something which can be coded in 12 lines of CLP, has to be ceeplusjavaised on 8 pages, and the Jury acknowledges the speed and efficiency of such programs... Laugh or weep?... Functional languages? Anybody heard of them?... Don't blame the Soviets, please... Look up the *British Informatics Olympiad*. The rules I found randomly for the 2000 contest stipulated: /*"The languages available will be Turbo Pascal and Turbo C/C++." */Yes, not just C++, but compulsory Borland dialect. More? Let's see the *ACM International Collegiate Programming Contest*. Rules: /*"... They must submit solutions as programs in C, C++, Java or Python (although it is not guaranteed every problem is solvable in Python)." */(Most probably the author doesn't know what Python is...) This is the way we teach our youth to be creative ! In such a way we inspire them to became "creative coders". You may think whatever you wish, but I am convinced that the best part of the responsibility for such a calamitous picture of the CS pedagogy, falls upon those feeble-minded "professionals" who know better what is good, what is the "main stream" which should be promoted, and what is "wrong", which should be severely punished. The totalitarian (or fundamentalist) doctrines are everywhere. Let's build some more huge screening walls, and forbid the presence of people who think otherwise, and we shall be Great Again. Jerzy Karczmarczuk /Caen, France/ /* */ --- L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel antivirus Avast. https://www.avast.com/antivirus

This department has participated in regional and international programming contests at tertiary levels about as long as it has been possible to do so. One of the major problems for the organisers of such events is ensuring that working environments for all permitted programming languages are installed (and compatible) on the machines the contestants will use. These days with things like Docker it should be straightforward to set this up once and just push it out to all the machines, but it was not always so. With respect to schools, the sad fact of the matter is that school IT syllabi are often set without the input of any really clued-up programmers. The NZ curriculum was revised not that long ago, and we had some input into it, as a result of which I *think* Python is used in some classrooms. I remember one programming contest where I was supposed to be a local judge, but was ever so glad that nobody needed me, as I spent the entire period *trying* to write and run "Hello World" with Visual Age for Java, which I had never previously seen, and never wish to see again. Pity any contestant who hadn't seen it... So, if we want Haskell (or F#, or whatever) allowed, - it has to be taught by enough schools, - we have to make it easy for the organisers to install the same environment that the pupils will have used.

Hi Olaf -
I'd be interested whether there is a way to check which of my lists in the source code the compiler managed to "deforest" away. Which intermediate files should I look at? What are the tools to inspect?
Good question, and I don't know an easy answer.
The general advice is running ghc with "-ddump-simpl"
but I find it quite challenging to scan the output.
Here is a simple case where it works:
$ cat Fuse.hs
main = print $ sum $ map (^2) [1 .. 1000 :: Int]
$ ghc -fforce-recomp -ddump-simpl -O0 Fuse.hs
no optimisation - shows that "main" calls "map" etc.
$ ghc -fforce-recomp -ddump-simpl -O2 Fuse.hs
fusion works - nice non-allocating inner loop
(lists gone, and Int replaced by Int#)
Rec {
-- RHS size: {terms: 18, types: 3, coercions: 0}
Main.$wgo [InlPrag=[0], Occ=LoopBreaker]
:: GHC.Prim.Int# -> GHC.Prim.Int# -> GHC.Prim.Int#
[GblId, Arity=2, Caf=NoCafRefs, Str=DmdType ]
Main.$wgo =
\ (w_s5kV :: GHC.Prim.Int#) (ww_s5kZ :: GHC.Prim.Int#) ->
case w_s5kV of wild_Xn {
__DEFAULT ->
Main.$wgo
(GHC.Prim.+# wild_Xn 1#)
(GHC.Prim.+# ww_s5kZ (GHC.Prim.*# wild_Xn wild_Xn));
1000# -> GHC.Prim.+# ww_s5kZ 1000000#
}
end Rec }
but the challenge is to find the path from "main" to that,
wading through several other functions that may or may not be related.
I can imagine a source annotation like "in the code compiled
from this function f, that constructor C should never be called"
but this is certainly not easy. Do we really mean "never", or do we mean
"only a bounded number of times" (that is, not in the inner loop).
Perhaps there is no code for f itself, because it gets inlined.
But yes, *some* automated analysis (and human-readable print-out)
of the code after simplification would be nice.
This could be done as a compiler plug-in?
https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/extending_gh...
- J.

Personally, I disagree with the whole premise. Lists are simple and elegant; you *should* use them most of them time. I'm not saying you should use lists as the data structure for an in-memory database or whatever, but that's the point—most applications only have a handful of data structures that "matter", and lists are great everywhere else. I actually went through and replaced [] with Vector in all of the types we parse from JSON at work, some of which get relatively large. It made the code uglier and didn't meaningfully affect the performance. I undid that change, even though it's exactly the sort of thing this article recommends. In this day and age, simple things *scale*. That's enough most of the time; if you can get away with it you *should*. The real advantage of lists comes at an intersection of two points: lists are effective in place of iterators in Haskell and, even misused as data structures, they're *not that bad* most of the time. This means that a good 80% of the time, the advantage of using a type that's compatible with the rest of my code and APIs that use lists "correctly" as iterators easily outweighs any small performance penalty. A list has to get pretty large—or my usage pattern pretty convoluted—before another type is worth the complexity. On Mon, Mar 13, 2017 at 3:43 AM, Johannes Waldmann < johannes.waldmann@htwk-leipzig.de> wrote:
Hi Olaf -
I'd be interested whether there is a way to check which of my lists in the source code the compiler managed to "deforest" away. Which intermediate files should I look at? What are the tools to inspect?
Good question, and I don't know an easy answer.
The general advice is running ghc with "-ddump-simpl" but I find it quite challenging to scan the output.
Here is a simple case where it works:
$ cat Fuse.hs
main = print $ sum $ map (^2) [1 .. 1000 :: Int]
$ ghc -fforce-recomp -ddump-simpl -O0 Fuse.hs
no optimisation - shows that "main" calls "map" etc.
$ ghc -fforce-recomp -ddump-simpl -O2 Fuse.hs
fusion works - nice non-allocating inner loop (lists gone, and Int replaced by Int#)
Rec { -- RHS size: {terms: 18, types: 3, coercions: 0} Main.$wgo [InlPrag=[0], Occ=LoopBreaker] :: GHC.Prim.Int# -> GHC.Prim.Int# -> GHC.Prim.Int# [GblId, Arity=2, Caf=NoCafRefs, Str=DmdType
] Main.$wgo = \ (w_s5kV :: GHC.Prim.Int#) (ww_s5kZ :: GHC.Prim.Int#) -> case w_s5kV of wild_Xn { __DEFAULT -> Main.$wgo (GHC.Prim.+# wild_Xn 1#) (GHC.Prim.+# ww_s5kZ (GHC.Prim.*# wild_Xn wild_Xn)); 1000# -> GHC.Prim.+# ww_s5kZ 1000000# } end Rec }but the challenge is to find the path from "main" to that, wading through several other functions that may or may not be related.
I can imagine a source annotation like "in the code compiled from this function f, that constructor C should never be called" but this is certainly not easy. Do we really mean "never", or do we mean "only a bounded number of times" (that is, not in the inner loop). Perhaps there is no code for f itself, because it gets inlined.
But yes, *some* automated analysis (and human-readable print-out) of the code after simplification would be nice.
This could be done as a compiler plug-in? https://downloads.haskell.org/~ghc/latest/docs/html/users_ guide/extending_ghc.html#compiler-plugins
- J.
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

I've also had this experience, though on a smaller scale. The things
I thought were big were not that big, and the slow things were
different from what I predicted. So far it hasn't turned out to be
lists (except via Strings, I'm happy to use Text). Also I rely on
persistence and lists are about the simplest persistent data
structure.
On Mon, Mar 13, 2017 at 11:00 AM, Tikhon Jelvis
Personally, I disagree with the whole premise. Lists are simple and elegant; you *should* use them most of them time. I'm not saying you should use lists as the data structure for an in-memory database or whatever, but that's the point—most applications only have a handful of data structures that "matter", and lists are great everywhere else.
I actually went through and replaced [] with Vector in all of the types we parse from JSON at work, some of which get relatively large. It made the code uglier and didn't meaningfully affect the performance. I undid that change, even though it's exactly the sort of thing this article recommends. In this day and age, simple things *scale*. That's enough most of the time; if you can get away with it you *should*.
The real advantage of lists comes at an intersection of two points: lists are effective in place of iterators in Haskell and, even misused as data structures, they're *not that bad* most of the time. This means that a good 80% of the time, the advantage of using a type that's compatible with the rest of my code and APIs that use lists "correctly" as iterators easily outweighs any small performance penalty. A list has to get pretty large—or my usage pattern pretty convoluted—before another type is worth the complexity.
On Mon, Mar 13, 2017 at 3:43 AM, Johannes Waldmann
wrote: Hi Olaf -
I'd be interested whether there is a way to check which of my lists in the source code the compiler managed to "deforest" away. Which intermediate files should I look at? What are the tools to inspect?
Good question, and I don't know an easy answer.
The general advice is running ghc with "-ddump-simpl" but I find it quite challenging to scan the output.
Here is a simple case where it works:
$ cat Fuse.hs
main = print $ sum $ map (^2) [1 .. 1000 :: Int]
$ ghc -fforce-recomp -ddump-simpl -O0 Fuse.hs
no optimisation - shows that "main" calls "map" etc.
$ ghc -fforce-recomp -ddump-simpl -O2 Fuse.hs
fusion works - nice non-allocating inner loop (lists gone, and Int replaced by Int#)
Rec { -- RHS size: {terms: 18, types: 3, coercions: 0} Main.$wgo [InlPrag=[0], Occ=LoopBreaker] :: GHC.Prim.Int# -> GHC.Prim.Int# -> GHC.Prim.Int# [GblId, Arity=2, Caf=NoCafRefs, Str=DmdType
] Main.$wgo = \ (w_s5kV :: GHC.Prim.Int#) (ww_s5kZ :: GHC.Prim.Int#) -> case w_s5kV of wild_Xn { __DEFAULT -> Main.$wgo (GHC.Prim.+# wild_Xn 1#) (GHC.Prim.+# ww_s5kZ (GHC.Prim.*# wild_Xn wild_Xn)); 1000# -> GHC.Prim.+# ww_s5kZ 1000000# } end Rec }but the challenge is to find the path from "main" to that, wading through several other functions that may or may not be related.
I can imagine a source annotation like "in the code compiled from this function f, that constructor C should never be called" but this is certainly not easy. Do we really mean "never", or do we mean "only a bounded number of times" (that is, not in the inner loop). Perhaps there is no code for f itself, because it gets inlined.
But yes, *some* automated analysis (and human-readable print-out) of the code after simplification would be nice.
This could be done as a compiler plug-in?
https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/extending_gh...
- J.
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

On 14.03.2017 04:16, Evan Laforge wrote:
.. The things I thought were big were not that big, and the slow things were different from what I predicted.
recent project (by Chris Done) to collect performance information for data structures from standard libraries: https://github.com/haskell-perf https://reddit.com/r/haskell/comments/5ym276/haskell_performance_benchmarks/ - J.

Hi, Am Samstag, den 11.03.2017, 00:28 +0100 schrieb Olaf Klinke:
I'd be interested whether there is a way to check which of my lists in the source code the compiler managed to "deforest" away. Which intermediate files should I look at? What are the tools to inspect?
you might find my package http://hackage.haskell.org/package/list-fusion-probe useful. Greetings, Joachim -- Joachim “nomeata” Breitner mail@joachim-breitner.de • https://www.joachim-breitner.de/ XMPP: nomeata@joachim-breitner.de • OpenPGP-Key: 0xF0FBF51F Debian Developer: nomeata@debian.org
participants (10)
-
Ben Franksen
-
Brandon Allbery
-
Deven Lahoti
-
Evan Laforge
-
JK
-
Joachim Breitner
-
Johannes Waldmann
-
Olaf Klinke
-
Richard A. O'Keefe
-
Tikhon Jelvis