
I simply don't have the stamina to follow up on all the objections to my messages. I'm posting this here in the thread because it's a convenient point, not because Robert's message troubles me particularly.

Evidently I'm not getting my point across, so I'll give it one more try and then call it a day (for ten years or so, then I can look again and see if the wind has changed).

1. I don't want to remove any extant library functions -- these have already been subsumed into the collective consciousness.

2. I'm not opposed to intercalate in particular, it's just an instance of what I perceive to be a growing problem.

3. I'm not going to be strongly opposed to any particular suggested small function -- each usually has the merit of reducing the number of tokens in the code, which with all else being equal would be a good thing.

All else is not equal, though. I was going to post something more in the thread on adding “on”, in that ultimately “<comparison> `on` fst” is more readable than “equating”, “comparing” and friends, simply because of the quantity of names. But even of those I cannot get really excited in my denouncement.

I don't know for certain about anyone else, but I for one have a limited capacity for learning arbitrary names. I'm pretty sure there's a limit for most people, but for some the limit is so large that learning all the names in all the Haskell libraries that there will be in the future won't be a problem. Such folk aren't going to be inconvenienced in the long run, but for feeble-minded people such as myself, the application of a brake to the proliferation of names would be an important gain.

To draw a parallel, for most of my life I've been intermittently trying to learn Chinese characters. There are so many, and although there's a degree of compositionality from radicals to whole characters, a great deal of the relationship between any complex character and its meaning is arbitrary. Consequently I've yet to learn enough that I can read a single sentence in Chinese.

Six years ago, I had a Russian lodger and thought it would be fun to learn a bit of Russian. There are only a few more characters in Russian than in English, so I learnt them all in a couple of days (not the order, though: that's arbitrary). After that, I found that Russian words are often composed of prefixes that modify the meaning of smaller words [1]. Because of this, even without trying at all hard I can already read quite a bit of Russian.

The difference is that with Chinese characters there's a whole lot of arbitrary symbols to learn before you can get anywhere, while with Russian there seem to be fewer arbitrary symbols at each level. (This isn't just me, by the way: Chinese characters are a significant obstacle to literacy in China -- it's even possible that was originally deliberate -- so they use pinyin.)

What I'm advocating is that when deciding whether to put something into a library we make sure that it's worth the extra effort to the reader, so that reading Haskell doesn't become as hard as reading Chinese -- only as hard as Russian ;-).

To twist Gauss's snippy remark to Wilson into a rule, what I want is no new notations without new notions. If you can name something without controversy so that the ordinary English (or mathematical) reading both tells the reader immediately what the function does and is obvious to the writer looking for it (the latter is the lesser), then it can go in a library. Otherwise we have to consider:

  * is there a more powerful function that does the job?
  * is it possible to define the same function at a higher type?
  * is the name going to be more useful at a different meaning?
  * is the named version /really/ more readable than the expression it stands for?

And it's not always going to be possible to answer those questions without considerable work. Particularly the last case: it's quite hard, especially for a beginning (or intermediate) programmer, to escape the idea that by naming a short bit of code one has made it simpler. Part of the trouble is that, having had to think about the thing long enough to want to name it, the name becomes (for that programmer) a shorthand for the entire process of discovering the combination. Ten years later, the discovery doesn't seem so radical, reading the short bit of code is straightforward and remembering what the name means has become difficult.

On 2006-11-10 at 12:51EST Robert Dockins wrote:
> The Haskell standard libraries are not, and have never been, a minimal basis lacking redundancy.
I'm not sure how you got the idea that that's what I want. There are several reasons that can make naming something short worthwhile. Take “sum”. Its code is short, but there are decisions to be made (foldl or foldr &c) that mean that the name abstracts something useful away from the code. The things I've been objecting to are those where the definition is pretty much the only way of writing something in terms of predefined functions.
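To make the “sum” point concrete, here is a minimal sketch of the decisions that the name hides. The names sumR, sumL and sumL' are hypothetical, chosen only for illustration; none of them is claimed to be the definition any particular library actually uses.

```haskell
import Data.List (foldl')

-- Three candidate definitions of a list sum (hypothetical names).
-- They agree on finite lists but differ in nesting and strictness.
sumR, sumL, sumL' :: [Integer] -> Integer
sumR  = foldr  (+) 0   -- right-nested: 1 + (2 + (3 + 0))
sumL  = foldl  (+) 0   -- left-nested, lazily built accumulator
sumL' = foldl' (+) 0   -- left-nested, accumulator forced at each step

main :: IO ()
main = print (map ($ [1 .. 100]) [sumR, sumL, sumL'])
-- prints [5050,5050,5050]
```

All three give the same answer here; the differences lie in space behaviour on long lists and in how each interacts with laziness, which is the kind of choice a reader is glad to have settled once behind a name.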
> Are you also opposed to the existence of
>
>   mapM f = sequence . map f
That's an interesting case. I'm not, because of point (1) above, but I reserve the right to whine about it. Having a naming convention is better than having completely arbitrary names, but the fact that a convention was needed should raise suspicion. Given that we had concatMap already, sequenceMap would have been a more easily interpreted name -- and then some people would wonder, why bother with the name? It took me a while to realise that mapM isn't some form of [f]map, and longer still to notice that fmap for Monads is inexplicably called liftM.
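As a small illustration of the naming point, the sketch below spells the composition out under the hypothetical name sequenceMap (the name suggested above, not a real library function) and compares it with mapM. The helper halve is likewise invented for the example.

```haskell
import Control.Monad (liftM)

-- Hypothetical name, following the concatMap convention.
sequenceMap :: Monad m => (a -> m b) -> [a] -> m [b]
sequenceMap f = sequence . map f

-- Invented helper: succeeds only on even numbers.
halve :: Int -> Maybe Int
halve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

main :: IO ()
main = do
  -- The written-out composition and mapM agree.
  print (sequenceMap halve [2, 4, 6] == mapM halve [2, 4, 6])
  print (sequenceMap halve [2, 3]    == mapM halve [2, 3])
  -- liftM is fmap under another name.
  print (liftM (+ 1) (Just 1) == fmap (+ 1) (Just 1))
```

Each line prints True, which is all the sketch is meant to show: the two spellings are interchangeable on lists, so the choice between them is purely one of naming.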
> and other such useful goodies? I use mapM because it eliminates parens
You save one pair.
> and reduces line length
by a small constant
> for a programming pattern I use a lot. This increases readability of my code: win.
Yes, but there's a reduction in readability too, though it's harder to notice once you've learned the name mapM, and so long as the number of names like that in libraries is small the loss of readability in this way is negligible. But if we go on adding such names, it will become a real problem.
> Codifying this pattern by placing it in the standard libraries means that most people use the same name for the same concept, increasing overall readability: win.
On the other hand, if there were no names for it and people wrote the same thing in each case, it would be just as readable. (Do I need to repeat that point (1) makes this specific case moot?). We need names for medium to large concepts, not for really small ones.
> Programmers find this concept useful and they name it. This has been done independently by multiple people pursuing diverse ends. The fact that this function isn't in the standard lib means that programmers name it different things and, in the long run, this harms readability across Haskell code.
Well, for the specific case of intercalate, the data Joseph presented doesn't support that -- there were lots of instances of the code written out, but only a few where this was the body of a definition, and some of the definitions were for a specific separator.

Finding a concept useful and naming it is something that programmers do, but it's far from clear to me that it's always what they should do. I can't present a real life case, but it seems to me that there will be times when it obscures what's going on. For example, suppose “weeble . sort” occurs often and is given the name “foo” and that “reverse . whiffle” has similarly been named “bar”. Now, a programmer who writes “foo . bar” might be quite happy that this does the right thing and fail to notice that “weeble . sort . reverse . whiffle” simplifies to “weeble . sort . whiffle” (given appropriate conditions). This is particularly likely when a library function has been found without looking at the code.

Nor do I think that programmers naming things differently is /necessarily/ a loss, either. Sometimes the name says something about the intentional meaning that use of a name from a library would not.

Finally, we have to question how far one has to search to find the definition of a name. There are

  (a) names that one knows already (but there is a limit to how many of these there can be),
  (b) names defined in the module one is currently reading, and
  (c) unfamiliar names defined elsewhere that one has to look up.

That is, I think, a list in order of difficulty, and the more names there are in libraries, the more times what was hoped to be an (a) becomes a (c).

  Jón

[1] e.g. опасность means danger, безопасность is without+danger = safety, небезопасность is not-without-danger = insecurity. For some reason I find this amusing.

-- 
Jón Fairbairn                                 Jon.Fairbairn at cl.cam.ac.uk