Haskell's "historical futurism" needs better writing, not better tools

I haven't been able to view Andrew Boardman's video so far, but I gather from the summary that he feels the solution to the most pressing problem -- growth through fostering newbies -- is simply . . . better tools. I beg to differ, as a newbie who is now struggling with the revival of some Haskell code, left only with a feeling of claustrophobia. Where is this incredible feeling of freedom I was promised? At the top of the learning curve. And it feels like a Sisyphean slope -- the boulder stops when I lay off learning, and then slowly starts to roll back down under the gravitational force of memory decay.

The real problem is that the writing sucks. Not all of it -- some contributors to the community are stellar writers, even if, in the snarkish commentary they write about Haskell and the community, I don't quite get all the jokes. But speaking as a contributor to the Haskell.org wiki -- to which I contribute at times out of hope that clarifying points I understand will also lead to more clarity for myself -- I have to say it: the writing sucks.

Why does it suck? Well, there are several syndromes. One is to think that Haskell is so incredibly good and pure and elementally profound that one must start conceptually from scratch. (I've actually been curtly informed on the beginners' list -- yes, the beginners' list! -- that my problems of comprehension can be solved simply: "Learn lambda calculus.") I've had a book recommended to me whose author had such pretensions to writing foundationally that he actually claimed his book could teach you Haskell even if you have no programming experience at all. Now, in practical marketing terms, that's just silly: such an approach addresses a minuscule fraction of the book's potential audience. If you've heard of Haskell as a programming language, you're probably software-multilingual already. But in practical pedagogical terms, it's also a little ridiculous. Your average reader (already a programmer) would be better served by a comparative approach: here's how to say something in a couple of other programming languages; here's how to say something roughly equivalent in Haskell -- BUT, here's how it's subtly different in Haskell. Here's how Haskell makes something easier (if only in being more concise), and here's how that same machinery confers useful leverage where Haskell might otherwise seem unnecessarily verbose.

The problem of better writing is ultimately social, I believe. I have a friend who's a tenured professor at Columbia University, who has used Haskell in courses, and who praises Haskell for how it makes students think a little more clearly about how to structure software. However, his final verdict is this: functional programming is a cult. Now, he's an unabashedly abrasive type. (Just ask him, he'll tell you.) And that verdict IS a little over the top. But I have to say, my impression of the community is that there are Founder demi-gods who are praised even though they leave something to be desired in how articulate and clear they are for newbies, even when claiming to be addressing newbies, while the language itself seems to be somewhat an object of worship.

How can the writing be better? Well, here's a clue: Simon Peyton-Jones, somewhere in a talk where he was his usual vibrant self (but where the audience was clearly stunned and intimidated into silence, even though he'd invited people to interrupt with questions), said that F# had settled on the term "workflow" instead of "monad", and he felt this was wise. Workflow?

Workflow!? My still-feeble understanding of monads took a leap forward, because suddenly I had a motivation for them: abstraction of the idea of a workflow. Could it be that something as simple as Maybe was a kind of "degenerate case" of a "workflow abstraction"? I felt massively encouraged that I might finally figure out how great monads were. Which is not to say that I have.
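To make that reading concrete: a minimal sketch, assuming only the Prelude and Text.Read, of Maybe as a degenerate "workflow". Each step can fail, and the monad is just the glue that threads one step's result into the next, aborting at the first Nothing:

    import Text.Read (readMaybe)

    -- Stage one of the workflow: parse an age; this can fail.
    parseAge :: String -> Maybe Int
    parseAge = readMaybe

    -- Stage two: validate; this can also fail.
    checkAdult :: Int -> Maybe Int
    checkAdult n = if n >= 18 then Just n else Nothing

    -- The Maybe monad sequences the stages; the first Nothing
    -- short-circuits everything after it.
    admission :: String -> Maybe String
    admission s = do
      age <- parseAge s
      _   <- checkAdult age
      return ("admitted at age " ++ show age)

    -- admission "42"  == Just "admitted at age 42"
    -- admission "12"  == Nothing
    -- admission "huh" == Nothing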
Writers: you need to stop speaking your own obscure little dialect of "programmer English". If you're not aware that you are, become more self-aware. If you ARE aware of it, become aware of how snobbish or pointlessly mysterious you sound. And if you do become aware of how you sound, but don't care, ask yourself: if the goal now is to spread Haskell and help it flourish, isn't your attitude more of a hindrance than a help?

Tools? That's not the front I'd pick. Because I hit a glitch getting VS Code to work under Windows 10 (yes, cue the snobbery), I backed off to Git Bash and vim. And you know what? It's fine. Sorry if my knuckles seem to be dragging on the floor, but I'm from the Stone Age of Unix v7 command line and Bill Joy's original vi. I'd like nicer tools. But they aren't likely to help me with Haskell.

What's not fine: half the time, I can't figure out the error messages from GHC, and a dismaying portion of the time, I can't figure out the language documentation. And since the concepts are seldom described in concrete enough and time-honored programming-language terms (by comparison to other programming languages), they often don't really stick. It's like my Sisyphean boulder has been pre-greased -- if I stop pushing hard enough, with all the abstraction grease on it, it starts to slide even before it rolls, whenever I leave off coding in Haskell for any length of time.

Sorry for not writing better email here -- my sentences get rather long and tortuous when I rant. But not sorry for ranting. You say the solution to a tool adoption problem (getting more people to use Haskell) is yet more tools written in Haskell, for your favorite language, Haskell? No. The failure here is a failure to communicate.

Regards,
Michael Turner
Executive Director
Project Persephone
1-25-33 Takadanobaba
Shinjuku-ku Tokyo 169-0075
Mobile: +81 (90) 5203-8682
turner@projectpersephone.org
Understand - http://www.projectpersephone.org/
Join - http://www.facebook.com/groups/ProjectPersephone/
Donate - http://www.patreon.com/ProjectPersephone
Volunteer - https://github.com/ProjectPersephone
"Love does not consist in gazing at each other, but in looking outward together in the same direction." -- Antoine de Saint-Exupéry

On Wed, Sep 15, 2021 at 04:07:38PM +0900, Michael Turner wrote:
The real problem is that the writing sucks. [snip]
Can you be a bit more specific about which sort of writing you find sufficiently unsatisfactory to say "the writing sucks"?

* Books about Haskell
  - Introductory (e.g. http://learnyouahaskell.com/)
  - Comprehensive (e.g. the classic Real World Haskell)
  - Topic focused (e.g. the IMHO rather excellent Parallel and Concurrent Haskell)
  - Theory focused (e.g. https://bartoszmilewski.com/category/category-theory/)
  - ...
* The library reference documentation?
* The GHC User's Guide?
* The Haskell report?
* Blog posts?
* The Haskell Wiki?
* r/haskell?
* Haskell mailing lists?
* ...
* All of the above???

I am also curious whether I'm part of the solution or part of the precipitate. I've recently contributed new documentation for Data.Foldable and Data.Traversable:

https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Fo...
https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Tr...

Are these a step in the right direction, or examples of more writing that sucks? These are reference documentation, not beginner tutorials, so they offer a more detailed write-up of the concepts, pitfalls, and things to keep in mind when using the library. More of that sort of thing would help me to more quickly learn to use some of the libraries that lack this sort of overview prose, but perhaps what you're looking for is something else?

-- Viktor.

I am not a fan of how the new Traversable documentation buries the actual
laws.
On Thu, Sep 16, 2021, 4:55 PM Viktor Dukhovni wrote:
Can you be a bit more specific about which sort of writing you find sufficiently unsatisfactory to say "the writing sucks"? [snip]

On Thu, Sep 16, 2021 at 04:57:28PM -0400, David Feuer wrote:
I am not a fan of how the new Traversable documentation buries the actual laws.
The laws are one click away from the table of contents, and IMHO not particularly illuminating other than for advanced readers. For example, in Data.Foldable they are:

    foldr f z t = appEndo (foldMap (Endo . f) t) z
    foldl f z t = appEndo (getDual (foldMap (Dual . Endo . flip f) t)) z
    fold = foldMap id
    length = getSum . foldMap (Sum . const 1)

Is someone new to Data.Foldable really going to learn something from these before they've deeply understood the background concepts?

My take is that the laws should almost always be "buried" (one click away) at the end of the module documentation. Those who care and need them can find them, but I think they just intimidate the less experienced readers. Putting the laws first likely only discourages beginners.

-- Viktor.

The last time I went to look at the laws it took me a couple minutes to
find them. I use them to write instances. Pretty important, IMO.
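To make the instance-writing use concrete, here is a minimal sketch (a made-up Tree type, not anything from the documentation under discussion): defining foldMap is the whole job, and the laws quoted above are what guarantee that foldr, fold, and length then all behave consistently with it.

    -- A hypothetical container type, for illustration only.
    data Tree a = Leaf | Node (Tree a) a (Tree a)

    -- Defining foldMap suffices; the laws quoted above then pin
    -- down foldr, fold, length, etc. for this instance.
    instance Foldable Tree where
      foldMap _ Leaf         = mempty
      foldMap f (Node l x r) = foldMap f l <> f x <> foldMap f r

    -- e.g. length (Node Leaf 'a' (Node Leaf 'b' Leaf)) == 2,
    -- agreeing with the law  length = getSum . foldMap (Sum . const 1).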
On Thu, Sep 16, 2021, 6:46 PM Viktor Dukhovni wrote:
The laws are one click away from the table of contents, and IMHO not particularly illuminating other than for advanced readers. [snip]

On Fri, Sep 17, 2021 at 5:52 AM Viktor Dukhovni wrote:
On Wed, Sep 15, 2021 at 04:07:38PM +0900, Michael Turner wrote:
The real problem is that the writing sucks. [snip]
Viktor:
Can you be a bit more specific about which sort of writing you find sufficiently unsatisfactory to say "the writing sucks"?
It's an axiom of sales: quality is what the customer says it is. You'll evaluate writing differently depending on what sort of audience you're in.

I'm in this audience:

-- I have a lot of programming experience with a variety of languages. I have limited patience with writing that feels like it's talking down to me.

-- I have limited patience for a language whose main appeal seems to be to a niche audience of programmers who want to learn a language in part because they've been told it's a challenge -- I'm not seeking status among other programmers. ("Ooh, he knows Haskell. You have to be really smart to write Haskell.")

-- I prefer a presentation that's clear and concise, with a significant reliance on diagrams that show conceptual relations. A picture of how something works is worth a thousand words that tell me how great something was, or how great it's going to be once I'm proficient.
* Books about Haskell - Introductory (e.g. http://learnyouahaskell.com/)
I started here. I was initially drawn in by the welcoming chattiness. It got old, fast. Fonzie? Really? How many young programmers today even know the TV series Happy Days? Waste of page-space. (But see above -- that's MY reaction. Audiences will differ.)
- Comprehensive (e.g. the classic Real World Haskell)
I hadn't looked at this before. So I looked today.

Uh-oh: the introductory part reminds me of that old joke about the IBM salesman, back in the mainframe days, when getting a computer up and running the customer's application could take months, from hardware order to production data processing: the salesman sits on the edge of the bed on his honeymoon night, telling his bride how great it's going to be.

The first bit of code shows how you'd write a very short function (courtesy of "take", not of any inherent feature of Haskell's programming model) to get the k least elements in a list. A claim is made: it's faster because of laziness.

Um, really? Sorting is O(n log n) upper bound no matter what. There's a vaguely game-theoretic proof of this I found delightful back in college, and it came to mind when I read that. How would you arrange numbers in a list so that, no matter what sort algorithm you used, there would be no way in general to get the k smallest elements without a full sort? Well, maybe the authors mean "faster on average, counting using the number of comparisons"? Oh, but wait: isn't there some overhead for implementing laziness? Hm, why aren't they saying something here?

My eyes glaze over all the IBM salesman honeymoon bridal suite talk, so I go to chapter one. I skip through all the stuff I already know, and at the end of the chapter, there's a bit of code: count the number of lines in input. After months of looking at Haskell code (but a few weeks of not looking at it) I can't quite get it, the way I'd almost instantly get it if it were written in a more conventional programming language. It's just an exercise in putting some code in a file and running it. Supposedly it will build confidence. It has the opposite effect on me. But maybe it's explicated in the next chapter?

I turn to chapter two, hoping that the line-counter would be explicated. Oh no: long IBM honeymoon-night salesman talk about Why Types Are Good. Oh, fuck you! I already know why types are good! I even know what's bad about the otherwise-good type systems of several programming languages! WHY ARE YOU WASTING MY TIME?!
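For concreteness, the two snippets being complained about look roughly like this (a reconstruction from memory, not the book's exact code):

    import Data.List (sort)

    -- The k least elements of a list. The book's claim is that laziness
    -- helps: 'take k' forces only as much of the sorted list as it needs,
    -- so with a lazy merge sort this costs roughly O(n + k log n) rather
    -- than a guaranteed full O(n log n) sort.
    kLeast :: Ord a => Int -> [a] -> [a]
    kLeast k xs = take k (sort xs)

    -- The end-of-chapter exercise: count the lines on standard input.
    -- 'interact' reads all of stdin lazily and prints the function's result.
    main :: IO ()
    main = interact (\input -> show (length (lines input)) ++ "\n")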
- Topic focused (e.g. the IMHO rather excellent Parallel and Concurrent Haskell)
It may be excellent if you're already up to speed on Haskell. I'm a newbie drowning in writing that's not for me.
- Theory focused (e.g. https://bartoszmilewski.com/category/category-theory/)
I bailed halfway through a video from Philip Wadler, a talk about Categories for the Working Hacker (or something) because he still hadn't said anything about how knowing category theory was useful to, uh, you know, a working hacker? I ran across a blog that said: forget category theory. It won't help you. At least not starting out.
* The library reference documentation?
Pretty impenetrable for a newbie, though that's hardly unusual for a newbie to any language, with reference docs.
* The GHC User's Guide?
Been a while since I looked at it, but not much easier, as I remember.
* The Haskell report?
Haven't read it.
* Blog posts?
So far, only helpful when they are honest and tell me that I'm having trouble for some very good reasons. I feel less alone. I feel less stupid.
* The Haskell Wiki?
Very sloppy, much neglected. I put in some improvements to articles, but a certain syndrome is very much in evidence: it seems mostly written by Haskell experts who can't seem to get back into a newbie mindset (and who maybe never had a mindset like mine), and who often jump from IBM honeymoon salesman talk straight into unnecessarily complex examples.
* r/haskell?
Sometimes helpful.
* Haskell mailing lists?
When you're being told curtly, "Learn lambda calculus", or someone can't answer your question about something but instead of saying "I don't know where you'd find that", gives you an answer to a question you didn't ask, you're not on a truly beginner-friendly newbies list. Other mailing lists take up topics that feel way over my head at this point.
* All of the above???
So far, pretty much.
I am also curious whether I'm part of the solution or part of the precipitate. I've recently contributed new documentation for Data.Foldable and Data.Traversable:
https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Fo... https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Tr...
OK: "Merging the contribution of the current element with an accumulator value from a partial result is performed by an operator function, either explicitly provided by the caller as in foldr, implicit count as in length, or partly implicit as in foldMap (where each element is mapped into a Monoid, and the monoid's mappend operator performs the merge)." Let me tell you how I tried to read that lo-o-ong sentence. "Merging the contribution of the current element--" Um, you didn't introduce "current element", so "the current element" is confusing. It feels like you're starting in the middle. And wait, is the current element being merged, or is some kind of contribution /from/ the current element being merged? I guess since Haskell is lazy, it could be the latter . . . "--with an accumulator value--" Coming from Erlang, I just call them "accumulators". So is an accumulator value different from an accumulator or . . . what? "--from a partial result--" So, it's a partial result in the form of the accumulator? Or "the contribution of the current element"? I kinda-sorta feels like it must be the accumulator, so, plunging onward . . . "--is performed by an operator function" Uh-oh: passive voice sentence construction. And "operator function" is . . . well, all operators are functions in Haskell, so it's performed by an operator, but-- I could go on for a while, but let it suffice to say, long before I'm halfway through this sentence, the cognitive load is starting to crush me. How many times will I have to read it to understand it? Well, I probably won't understand it with multiple readings. Because there I see "monoid" toward the end (no hyperlink), and again, that feeling: Haskell may be very powerful indeed, but a lot of the writing makes me want to kill somebody. (Don't worry. I don't kill people.) Now, remember what I said about target audiences? I can imagine a reader who would actually admire that sentence. But that reader is someone who has used foldr and foldr a lot, who has a pretty adequate mental model already, and thinks: "Good. It summarizes all my knowledge, while refreshing me on a point that hasn't mattered much in my coding so far." The sentence may contain a few unnecessary words, but for such a reader, these are like drops of water rolling off a duck's back. It might read more smoothly with a few added words, but this reader is mentally inserting such smoothing linkages in his internal conceptual schema, probably without even subvocalizing those words. Still, there might be a way to break it up and with not many added words, satisfy more readers. Do you want to do that? Do you need to do that? Let me tell you something, either way: it'll probably be harder for you than writing the sentence you wrote. This is an interesting point illustrated in a different field, by Paul Krugman, a few years after he started writing about economics for layman-reader outlets like Salon: It was actually harder for him to write the same number of words for a layman, on a given topic, than it was for him to write a journal article to pass peer review. He had to think: "What might my readers NOT quite understand yet? How can I get them there quickly?" As an economist, he could think in graphs and equations. And he could easily say things like "there could be labor-activism-related variations in how downward-sticky wages become in a liquidity trap." If he said that to your average economist, he could count on being understood. 
It was actually harder for him to write something more generally accessible, like, "When consumers have snapped their purses shut in a recession and during the slow recovery, most bosses will prefer to lay workers off to cut costs, rather than cut wages to keep workers on payroll. However, others may not. It could depend on how much that boss fears that the workers would go out on strike in response to a pay cut." It was harder for him than for me (I'm not an economist) because first he had to realize: if I say it in the way that's most obvious to me, most people won't get it.
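Circling back to the sentence being dissected: for what it's worth, here is a minimal sketch (not the documentation's own example) of the three cases it packs together: an operator supplied explicitly (foldr), an implicit count (length), and a monoid doing the merging (foldMap).

    import Data.Monoid (Sum(..))

    xs :: [Int]
    xs = [1, 2, 3]

    -- Explicit operator: the caller passes (+) and a starting accumulator.
    total :: Int
    total = foldr (+) 0 xs            -- 6

    -- Implicit count: each element contributes 1, whatever its value.
    count :: Int
    count = length xs                 -- 3

    -- Partly implicit: each element is mapped into a Monoid (Sum here),
    -- and the monoid's mappend/(<>) performs the merge.
    total' :: Int
    total' = getSum (foldMap Sum xs)  -- 6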
are these a step in the right direction, or examples of more writing that sucks? These are reference documentation, not beginner tutorials, so a more detailed write up of the concepts, pitfalls, ... things to keep in mind when using library, ...
As I say, for a certain kind of reader, that sentence that I only stumble around in may be fine for your audience. But Haskell has apparently plateaued, after surviving a phase where, as Simon Peyton-Jones points out, almost all new languages die. So it's a question of what you want to be a part of: a relatively insular community? Or a drive to expand the community, by making learning easier?
More of that sort of thing would help me to more quickly learn to use some of the libraries that lack this sort of overview prose, but perhaps what you're looking for is something else?
I've written at very low levels (e.g., very tight assembly code to fit into a 512-byte disk boot block) on up to very high levels (attempts at cognitive-linguistic modeling of spreading activation in conceptual networks, in Erlang). My big problem has been finding a treatment of Haskell that roots its model of computation in something more computationally concrete than, say, lambda calculus or category theory. Like I could find for Lisp. Like I could find for Prolog. Like I could find for Erlang.

I once tried to get Erlang up on a platform with an old C compiler, and got a little obsessive about it: there was a nasty stack crash before I even got to the Erlang shell prompt, and I basically had to resort to laborious breakpoint debugging to find the problematic line of code. On the way down to that line, I passed data structures and C code that had me thinking, "So that's how they do that. Well, but, it was already pretty obvious. It had to be something like that. But guys: clean up your code, OK?"

I have yet to gain any such feeling for Haskell. I once thought I could -- I found a very small 'compiler' for Haskell on GitHub, written in C, that actually worked for a lot of things. (It could swallow Prelude easily.) It was maybe 2000 lines of code. I was encouraged that it generated some kind of VM assembly code. Alas, the VM code emitted turned out to be for some graph-combinator thingie I didn't understand. And I laid the code-study project aside.

I thought, certainly, somebody somewhere has drawn a not-too-abstract data structure diagram that shows how thunks relate to data, and how type variables are represented, and how they directly or indirectly link to thunks. With that, I could think about the Haskell execution model with my eyes wandering around the diagram: this happens here, the next thing happens over there, opportunities for parallelism happen on this branch, dispatching of tasks happens down inside this thing, and it all comes back here. And that's why my output looks the way it does. Mystery solved!

Now, if somebody harrumphs and remonstrates, "That's not how you should try to understand Haskell!" I'm just going to reply, "Maybe it's not how YOU came to YOUR understanding. But look: I'm not you, OK? I'm me. I need to understand how a language works from the nuts and bolts on up. I won't feel confident enough in it otherwise."

But I'd only say it that way if I was in a mood to be relatively polite.

Regards,
Michael Turner
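Short of such a diagram, a tiny experiment with Debug.Trace (from base) at least makes the thunk behavior in question observable: the trace message fires when the thunk is forced, not when it is built, and a shared thunk is forced only once. (A sketch; the behavior is as described under GHCi or unoptimized GHC.)

    import Debug.Trace (trace)

    main :: IO ()
    main = do
      -- The binding builds a thunk; nothing is printed yet.
      let x = trace "forcing x" (2 + 2 :: Int)
      putStrLn "before any demand"
      -- First demand forces the thunk: "forcing x" prints once.
      print (x + 1)
      -- Second demand reuses the cached result: no second trace.
      print (x + 2)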

And I forgot to add: selling power is important, but clarity is too.
Selling ergonomics is important, but clarity is too.
I once learned a very, very powerful language: APL. Now, to be sure,
there was a market obstacle -- it had a very extended character set.
But IBM probably thought, "A very powerful language will sell! And
that means we'll sell a lot more of the specialized IBM Selectric
type-balls! This could be our most profitable programming language
ever, especially considering that type-balls wear out eventually and
have to be replaced! What's not to like?" I suppose you could say APL
was ergonomic, because you spent a lot of your keyboard time just
looking for the right keytop symbol, and it slowed you way down.
The thing is, most code that actually gets used, and even makes money,
is read more than it's written. Software expense is mostly in the
maintenance life-cycle, and I think that's still true, even in these
days of Agile and CI -- it's just that the effort of the former
"maintenance phase" bleeds more into the early phases. I'd love to be
so fluent in Haskell that I can appreciate its superior ergonomics,
and I speak as a former RSI casualty. But frustration over feeling
powerless every time GHC barfs out another error message you don't
understand only leads to muscle tension which can inflame chronic RSI,
and Haskell is pretty frustrating, starting out. Of course, I'm sure
people will tell me to just relax and take my time. But I have a lot
of other things to do.
I've noticed an interesting metric for Haskell: github activity is
more sharply distributed over weekend days than just about any other
language in the top 40. Is it basically an intellectual hobby
language, outside of academia? I'm evaluating Haskell for a project,
one for which I'd like to report some serious progress by the time a
certain conference rolls around in March. Months ago, I started with
some confidence that I'd have settled the question. Maybe I'd even
have a demo all written in Haskell by March, if Haskell seemed right.
I'm far less confident now.
I'd been warned that the learning curve is steep and long. If my
complaints about documentation sound anguished, it's for a reason: a
lot of the problems of getting traction on the slope trace back (for
me, at least) to writing by people so steeped in Haskell that they no
longer remember how they themselves absorbed it, the better to explain
it. And, as Simon Peyton-Jones says in one talk, in the early years,
the language designers had the incredible luxury of academia -- as he
admits, they could spend literally years being confused about the
right path forward, at various points, and it didn't matter much for
their careers or their goals. That's years of steeping too. And it
shows. I don't have years. I have months. And lots of other things I
have to get done.
Regards,
Michael Turner
On Fri, Sep 17, 2021 at 11:27 AM Michael Turner wrote:
[snip]

Michael, I have an offer for you (in fact two):

1. I will collaborate with you to produce the guide to Haskell's evaluation that *you* would want to read.

2. I will collaborate with you to write the NLP tool that you want to write in Haskell.

I can't do these without collaborating with someone like you. I simply don't know what someone else wants to read. Like you, I find exhortations to "learn lambda calculus" and explanations that "Haskell is lazy, which means it doesn't evaluate expressions until needed" to be thoroughly unhelpful. I understand the evaluation of Haskell (rather, GHC) through analogies to C and Python (in part).

Please let me know what you think about my offers.

Tom
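One possible shape for that kind of guide: a crude operational model of a thunk, written in Haskell itself (a sketch of the C/Python analogy, not anything taken from GHC's actual runtime). In those terms, a thunk is a mutable cell holding either code to run or a cached result, and forcing it runs the code at most once:

    import Data.IORef (IORef, newIORef, readIORef, writeIORef)

    -- A thunk modeled as a cell holding either the suspended
    -- computation (Left) or its already-computed value (Right).
    newtype Thunk a = Thunk (IORef (Either (IO a) a))

    mkThunk :: IO a -> IO (Thunk a)
    mkThunk act = Thunk <$> newIORef (Left act)

    force :: Thunk a -> IO a
    force (Thunk ref) = do
      st <- readIORef ref
      case st of
        Right v  -> return v            -- already evaluated: reuse it
        Left act -> do
          v <- act                      -- run the suspended computation
          writeIORef ref (Right v)      -- overwrite the code with the value
          return v

    main :: IO ()
    main = do
      t <- mkThunk (putStrLn "evaluating!" >> return (42 :: Int))
      force t >>= print   -- prints "evaluating!" then 42
      force t >>= print   -- prints only 42: the result was cached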
And I forgot to add: selling power is important, but clarity is too. Selling ergonomics is important, but clarity is too.
I once learned a very, very powerful language: APL. Now, to be sure, there was a market obstacle -- it had a very extended character set. But IBM probably thought, "A very powerful language will sell! And that means we'll sell a lot more of the specialized IBM Selectric type-balls! This could be our most profitable programming language ever, especially considering that type-balls wear out eventually and have to be replaced! What's not to like?" I suppose you could say APL was ergonomic, because you spent a lot of your keyboard time just looking for the right keytop symbol, and it slowed you way down.
The thing is, most code that actually gets used, and even makes money, is read more than it's written. Software expense is mostly in the maintenance life-cycle, and I think that's still true, even in these days of Agile and CI -- it's just that the effort of the former "maintenance phase" bleeds more into the early phases. I'd love to be so fluent in Haskell that I can appreciate its superior ergonomics, and I speak as a former RSI casualty. But frustration over feeling powerless every time GHC barfs out another error message you don't understand only leads to muscle tension which can inflame chronic RSI, and Haskell is pretty frustrating, starting out. Of course, I'm sure people will tell me to just relax and take my time. But I have a lot of other things to do.
I've noticed an interesting metric for Haskell: github activity is more sharply distributed over weekend days than just about any other language in the top 40. Is it basically an intellectual hobby language, outside of academia? I'm evaluating Haskell for a project, one for which I'd like to report some serious progress by the time a certain conference rolls around in March. Months ago, I started with some confidence that I'd have settled the question. Maybe I'd even have a demo all written in Haskell by March, if Haskell seemed right. I'm far less confident now.
I'd been warned that the learning curve is steep and long. If my complaints about documentation sound anguished, it's for a reason: a lot of the problems of getting traction on the slope trace back (for me, at least) to writing by people so steeped in Haskell that they no longer remember how they themselves absorbed it, the better to explain it. And, as Simon Peyton-Jones says in one talk, in the early years, the language designers had the incredible luxury of academia -- as he admits, they could spend literally years being confused about the right path forward, at various points, and it didn't matter much for their careers or their goals. That's years of steeping too. And it shows. I don't have years. I have months. And lots of other things I have to get done.
On Fri, Sep 17, 2021 at 11:27 AM Michael Turner
wrote: On Fri, Sep 17, 2021 at 5:52 AM Viktor Dukhovni
wrote: On Wed, Sep 15, 2021 at 04:07:38PM +0900, Michael Turner wrote:
The real problem is that the writing sucks. [snip]
Vikor:
Can you be a bit more specific about which sort of writing you find sufficiently unsatisfactory to say "the writing sucks"?
It's an axiom of sales: quality is what the customer says it is. You'll evaluate writing differently depending on what sort of audience you're in.
I'm in this audience:
-- I have a lot of programming experience with a variety of languages. I have limited patience with writing that feels like it's talking down to me.
-- I have limited patience for a language whose main appeal seems to be to a niche audience of programmers who want to learn a language in part because they've been told it's a challenge -- I'm not seeking status among other programmers. ("Ooh, he knows Haskell. You have to be really smart to write Haskell.")
-- I prefer a presentation that's clear and concise, with a significant reliance on diagrams that show conceptual relations. A picture of how something works is worth a thousand words that tell me how great something was, or how great it's going to be once I'm proficient.
* Books about Haskell - Introductory (e.g. http://learnyouahaskell.com/)
I started here. I was initially drawn in by the welcoming chattiness. It got old, fast. Fonzie? Really? How many young programmers today even know the TV series Happy Days? Waste of page-space. (But see above -- that's MY reaction. Audiences will differ.)
- Comprehensive (e.g. the classic Real World Haskell)
I hadn't looked at this before. So I looked today.
Uh-oh: the introductory part reminds me of that old joke about the IBM salesman, back in the mainframe days, when getting a computer up and running the customer's application could take months, from hardware order to production data processing: The salesman sits on the edge of the bed on his honeymoon night, telling his bride out great it's going to be.
The first bit of code shows how you'd write a very short function (courtesy of "take", not of any inherent feature of Haskell's programming model) to get the k least elements in a list. A claim is made: it's faster because of laziness.
Um, really? Sorting is O(n log n) upper bound no matter what. There's a vaguely game-theoretic proof of this I found delightful back in college, and it came to mind when I read that. How would you arrange numbers in a list so that, no matter what sort algorithm you used, there would be no way in general to get the k smallest elements without a full sort? Well, maybe the authors mean "faster on average, counting using the number of comparisons"? Oh, but wait: isn't there some overhead for implementing laziness? Hm, why aren't they saying something here?
My eyes glaze over all the IBM salesman honeymoon bridal suite talk, so I go to chapter one. I skip through all the stuff I already know, and at the end of the chapter, there's bit of code: count the number of lines in input. After months of looking at Haskell code (but a few weeks of not looking at it) I can't quite get it, the way I'd almost instantly get it if it was written in a more conventional programming language. It's just an exercise in putting some code in a file and running it. Supposedly it will build confidence. It has the opposite effect on me. But maybe it's explicated in the next chapter?
I turn to chapter two, hoping that the line-counter would be explicated. Oh no: long IBM honeymoon-night salesman talk about Why Types Are Good. Oh, fuck you! I already know why types are good! I even know what's bad about the otherwise-good type systems of several programming languages! WHY ARE YOU WASTING MY TIME?!
- Topic focused (e.g. the IMHO rather excellent Parallel and Concurrent Haskell)
It may be excellent if you're already up to speed on Haskell. I'm a newbie drowning in writing that's not for me.
- Theory focused (e.g. https://bartoszmilewski.com/category/category-theory/)
I bailed halfway through a video from Philip Wadler, a talk about Categories for the Working Hacker (or something) because he still hadn't said anything about how knowing category theory was useful to, uh, you know, a working hacker? I ran across a blog that said: forget category theory. It won't help you. At least not starting out.
* The library reference documentation?
Pretty impenetrable for a newbie, though that's hardly unusual for a newbie to any language, with reference docs.
* The GHC User's Guide?
Been a while since I looked at it, but not much easier, as I remember.
* The Haskell report?
Haven't read it.
* Blog posts?
So far, only helpful when they are honest and tell me that I'm having trouble for some very good reasons. I feel less alone. I feel less stupid.
* The Haskell Wiki?
Very sloppy, much neglected. I put in some improvements to articles, but a certain syndrome is very much in evidence: it seems mostly written by Haskell experts who can't seem to get back into a newbie mindset (and who maybe never had a mindset like mine), and who often jump from IBM honeymoon salesman talk straight into unnecessarily complex examples.
* r/haskell?
Sometimes helpful.
* Haskell mailing lists?
When you're being told curtly, "Learn lambda calculus", or someone can't answer your question about something but instead of saying "I don't know where you'd find that", gives you an answer to a question you didn't ask, you're not on a truly beginner-friendly newbies list. Other mailing lists take up topics that feel way over my head at this point.
* All of the above???
So far, pretty much.
I am also curious whether I'm part of the solution or part of the precipitate. I've recently contributed new documentation for Data.Foldable and Data.Traversable:
https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Fo... https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Tr...
OK:
"Merging the contribution of the current element with an accumulator value from a partial result is performed by an operator function, either explicitly provided by the caller as in foldr, implicit count as in length, or partly implicit as in foldMap (where each element is mapped into a Monoid, and the monoid's mappend operator performs the merge)."
Let me tell you how I tried to read that lo-o-ong sentence.
"Merging the contribution of the current element--"
Um, you didn't introduce "current element", so "the current element" is confusing. It feels like you're starting in the middle. And wait, is the current element being merged, or is some kind of contribution /from/ the current element being merged? I guess since Haskell is lazy, it could be the latter . . .
"--with an accumulator value--"
Coming from Erlang, I just call them "accumulators". So is an accumulator value different from an accumulator or . . . what?
"--from a partial result--"
So, it's a partial result in the form of the accumulator? Or "the contribution of the current element"? I kinda-sorta feels like it must be the accumulator, so, plunging onward . . .
"--is performed by an operator function"
Uh-oh: passive voice sentence construction. And "operator function" is . . . well, all operators are functions in Haskell, so it's performed by an operator, but--
I could go on for a while, but let it suffice to say, long before I'm halfway through this sentence, the cognitive load is starting to crush me. How many times will I have to read it to understand it? Well, I probably won't understand it with multiple readings. Because there I see "monoid" toward the end (no hyperlink), and again, that feeling: Haskell may be very powerful indeed, but a lot of the writing makes me want to kill somebody. (Don't worry. I don't kill people.)
Now, remember what I said about target audiences? I can imagine a reader who would actually admire that sentence. But that reader is someone who has used foldr and foldr a lot, who has a pretty adequate mental model already, and thinks: "Good. It summarizes all my knowledge, while refreshing me on a point that hasn't mattered much in my coding so far." The sentence may contain a few unnecessary words, but for such a reader, these are like drops of water rolling off a duck's back. It might read more smoothly with a few added words, but this reader is mentally inserting such smoothing linkages in his internal conceptual schema, probably without even subvocalizing those words.
Still, there might be a way to break it up and with not many added words, satisfy more readers. Do you want to do that? Do you need to do that? Let me tell you something, either way: it'll probably be harder for you than writing the sentence you wrote.
This is an interesting point illustrated in a different field, by Paul Krugman, a few years after he started writing about economics for layman-reader outlets like Salon: It was actually harder for him to write the same number of words for a layman, on a given topic, than it was for him to write a journal article to pass peer review. He had to think: "What might my readers NOT quite understand yet? How can I get them there quickly?"
As an economist, he could think in graphs and equations. And he could easily say things like "there could be labor-activism-related variations in how downward-sticky wages become in a liquidity trap." If he said that to your average economist, he could count on being understood. It was actually harder for him to write something more generally accessible, like, "When consumers have snapped their purses shut in a recession and during the slow recovery, most bosses will prefer to lay workers off to cut costs, rather than cut wages to keep workers on payroll. However, others may not. It could depend on how much that boss fears that the workers would go out on strike in response to a pay cut." It was harder for him than for me (I'm not an economist) because first he had to realize: if I say it in the way that's most obvious to me, most people won't get it.
are these a step in the right direction, or examples of more writing that sucks? These are reference documentation, not beginner tutorials, so a more detailed write up of the concepts, pitfalls, ... things to keep in mind when using library, ...
As I say, for a certain kind of reader, that sentence that I only stumble around in may be fine for your audience. But Haskell has apparently plateaued, after surviving a phase where, as Simon Peyton-Jones points out, almost all new languages die. So it's a question of what you want to be a part of: a relatively insular community? Or a drive to expand the community, by making learning easier?
More of that sort of thing would help me to more quickly learn to use some of the libraries that lack this sort of overview prose, but perhaps what you're looking for is something else?
I've written at very low levels (e.g., very tight assembly code to fit into a 512-byte disk boot block) on up to very high levels (attempts at cognitive-linguistic modeling of spreading activation in conceptual networks, in Erlang). My big problem has been finding a treatment of Haskell that roots its model of computation in something more computationally concrete than, say, lambda calculus or category theory. Like I could find for Lisp. Like I could find for Prolog. Like I could find for Erlang.
I once tried to get Erlang up on a platform with an old C compiler, and got a little obsessive about it: there was a nasty stack crash before I even got to the Erlang shell prompt, and I basically had to resort to laborious breakpoint debugging to find the problematic line of code. On the way down to that line, I passed data structures and C code that had me thinking, "So that's how they do that. Well, but, it was already pretty obvious. It had to be something like that. But guys: clean up your code, OK?"
I have yet to gain any such feeling for Haskell. I once thought I could-- I found a very small 'compiler' for Haskell on github, written in C, that actually worked for a lot of things. (It could swallow Prelude easily.) It was maybe 2000 lines of code. I was encouraged that it generated some kind of VM assembly code. Alas, the VM code emitted turned out to be for some graph-combinator thingie I didn't understand. And I laid the code-study project aside. I thought, certainly, somebody somewhere has drawn a not-too-abstract data structure diagram that shows how thunks relate to data, and how type variables are represented, and how they directly or indirectly link to thunks. With that, I could think about the Haskell execution model with my eyes wandering around the diagram: this happens here, the next thing happens over there, opportunities for parallelism happen on this branch, dispatching of tasks happens down inside this thing, and it all comes back here. And that's why my output looks the way it does. Mystery solved!
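(Just to pin down the level of concreteness I'm asking for -- a deliberately toy sketch, nothing like GHC's actual heap representation, but the kind of thing I mean:

    -- Toy model only: a heap node is either an evaluated value or a thunk,
    -- i.e. a suspended computation that yields a node when forced.
    data Node a = Value a | Thunk (() -> Node a)

    force :: Node a -> a
    force (Value x) = x
    force (Thunk k) = force (k ())

A diagram of nodes like these, with arrows to whatever they reference, is the sort of thing I keep hoping somebody has drawn.)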
Now, if somebody harumphs and remonstrates, "That's not how you should try to understand Haskell!" I'm just going to reply, "Maybe it's not how YOU came to YOUR understanding. But look: I'm not you, OK? I'm me. I need to understand how a language works from the nuts and bolts on up. I won't feel confident enough in it otherwise."
But I'd only say it that way if I was in a mood to be relatively polite.

On Fri, Sep 17, 2021 at 4:58 PM Tom Ellis wrote:
Michael, I have an offer for you (in fact two):
1. I will collaborate with you to produce the guide to Haskell's evaluation that *you* would want to read.
2. I will collaborate with you to write the NLP tool that you want to write in Haskell.
I think if I write anything long, it would first have to be my "Haskell in Plain English: A Guide for Lexical Semanticists." It's for linguists who have a little programming experience, and who would like to learn more, both for their research and as a resume item in case linguistics (or rather, the linguistics they specialized in) doesn't work out as a career. This happens.
Another bit of code I was reviving (NSM-DALIA, in Prolog, now on github) uses a grammar for Tok Pisin, a Papuan creole, developed by a former NSM researcher who is now reduced to teaching high school. This is sad. Her papers remain unpublished. I'd like to get her work out there, and maybe an NLP treatment will do the trick.
I plan to steer clear of the NLP approaches in vogue. I think they are due to run out of steam. This is only my opinion, of course, but I believe that, unless something dramatic happens, Deep Learning is not going to cut it for real natural language understanding. For my target audience (lexical semanticists), they couldn't be less interested anyway. And for graduate students of linguistics who don't make the academic cut (or find they don't want to, or fall out of the academic race out of life necessity), I think they'd learn enough about software concepts to impress hiring managers in interviews. ("Don't know Deep Learning? No problem. We'll train you up on it! When can you start?")
If my code (and the primer) works out for this audience, the audience will be pretty small. At this point, maybe it's just a few dozen people in the world, though there is potential for wider audiences. I've chosen Natural Semantic Metalanguage (NSM) not just out of an enduring affection for it. It's also because I feel it's small enough, yet general enough, for its NLP issues to be comprehensively treated at book length, while the code for it (exclusive of GUI frills) would almost certainly be small enough to develop and explain in the book.
I'd like to make it clear at the outset that this is for lexical semanticists pursuing the NSM approach. (And perhaps even opposing it! Linguistics is very factional!) Still, I plan to leave explanatory notes for other readers, including people who are not even linguists -- maybe even people who have felt defeated about Haskell in the past, who like my style and how I develop tutorial points, and who have a general interest in natural language.
I can't do these without collaborating with someone like you. I simply don't know what someone else wants to read.
So much of good writing is just figuring out an audience. As I wrote to Viktor above, the section he presented to me in case I had trouble with it could be absolutely ideal for someone with more grounding. They could admire the sentence I struggled with, for how it encapsulates their understanding while refreshing their memory. I'm not sure that gearing a piece toward what /I/ would like to read is a much bigger audience, however, than my "Plain English" chapter. It's a good question.
If I had to suggest an approach you could try on your own, it would be this:
(1) Smack the reader on page 1 with Core Haskell's almost-C-level entity-relationship diagram, but say something like, "Wow, that's way too much, right? But by the end of this piece, you'll know what it all means, and you'll even be able to go back to it when Haskell behavior is confusing."
(2) The next diagram might be just the data structural diagram for some kind of Tiny Lisp. Lists, the Haskell equivalent of atoms, a hash table, how binding happens, etc.
(3) Then add lazy evaluation, which is actually where the whole original Haskell group started, so fair enough, right?
(4...) Keep adding layers and examples of how to figure out execution paths by looking at each incarnation ... finally ....
(n, for small n) The same diagram -- now seen very differently by the reader.
Like you I find exhortations to "learn lambda calculus" and explanations that "Haskell is lazy which means it doesn't evaluate expressions until needed" to be thoroughly unhelpful. I understand the evaluation of Haskell (rather, GHC) through analogies to C and Python (in part).
I think some readers on this thread have interpreted me as saying that we should show equivalent code, when exact equivalents are impossible, and in some cases, there is no equivalent anyway except in some even more obscure programming language. Well, I think I qualified my comparative approach as being a series of starting points for explaining important /differences/. But what I'm getting is knee-jerk reactions to the Heresy of Comparing Anything to the Incomparable Haskell.
Please let me know what you think about my offers.
I think the diagram sequence strategy I outline above could
/potentially/ fit into the Plain English idea. And there may be /some/
diagrams out there that could be adapted from the Tiny Lisp level, as
a starting point. It could be better to think "drawing" first, rather
than "writing." Followed by very natural-sounding pseudocode that
refers to the drawings, to give a sense of how a Haskell interpreter
would interpret.
And I'd suggest the power of Silly Examples. Some tutorials get too
abstract too soon. Others reflect some onus to get the reader started
on something that feels practical, like organizing bookshelves
primarily by topic, then by alphabetical order of author last name.
But Simple and Silly can be quite disarming AND quite illuminating.
For example: Monads are workflows. Workflows can be illustrated as
little workers at little desks, each with their own form-processing
task, and directional inboxes and outboxes that can be connected by
arrows. The Maybe monad seems ridiculously small to be diagrammed that
way. But you can still stay disarmingly silly with a little
complexity, THEN show the Maybe monad, saying, "See? The concept of
workflow applies even to extraordinarily simple workflows, though
exactly what kind of dunce they'd have manning THAT kind of desk is a
real exercise in imagining bureaucratic inefficiency."
For example: a society with a flat 10% tax, but most people are so
dumb they can't calculate their taxes. So they send in a form with
their year's incomes. But the Internal Revenue Service is almost as
stupid, so they outsource it to a company, Gimme Ten Percent, Inc.
where people are slightly smarter -- though they still need to use a
desk calculator to figure 10% of a number. Then the IRS gets the
numbers back, and forwards the results from the outsourced firm to the
taxpayers: a tax bill. But some people are so dumb they don't even
fill in their income. They just send in the form, with that field
blank. Ah: there's the Maybe type! (Sort of.) So that's a different
response supplied by Gimme Ten Percent, Inc.: "Please fill in your
income. You left it blank on the form."
But one day, the CEO for Gimme Ten Percent comes in and announces,
"There's this private company that needs us to figure out a recent 10%
surcharge on their bills. Except sometimes the number is way too
smudgy. Our workflow works for that too!" "There's just one problem,
boss." "What's that?" "The girl who knows how to work our single desk
calculator got an offer from a Wall Street boutique private-equity
firm. They offered to train her in Haskell."
OK, that's a stupid joke, but the point is made: Gimme Ten Percent
needs an abstraction for its workflows, because it has a government
client now, but also a private-sector client with almost exactly the
same requirements. I think it beats hearing from Fonzie, over and
over, that one should never lend out one's grease-laden comb. And it
makes for some pretty concise diagrams -- including Gimme Ten Percent,
the Internal Revenue Service of the Republic of Stone Idiots, the Flat
Tax Form, the hapless taxpayers, the child-prodigy desk-calculator
operator (these things are relative, right?), etc.
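(If you pushed the joke all the way into code, a minimal sketch -- every name here is made up -- might be:

    import Text.Read (readMaybe)

    -- The Gimme Ten Percent workflow: a blank or smudgy income field
    -- reads as Nothing, and the Maybe workflow short-circuits into the
    -- "please fill in your income" response instead of a tax bill.
    data Response = TaxBill Double | PleaseFillIn deriving Show

    processForm :: String -> Response
    processForm field =
        case fmap (* 0.10) (readMaybe field) of
            Just bill -> TaxBill bill
            Nothing   -> PleaseFillIn

The same processForm serves the Internal Revenue Service and the private-sector client alike -- which is exactly the abstraction the CEO needs.)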
Regards,
Michael Turner
Executive Director
Project Persephone
1-25-33 Takadanobaba
Shinjuku-ku Tokyo 169-0075
Mobile: +81 (90) 5203-8682
turner@projectpersephone.org
Understand - http://www.projectpersephone.org/
Join - http://www.facebook.com/groups/ProjectPersephone/
Donate - http://www.patreon.com/ProjectPersephone
Volunteer - https://github.com/ProjectPersephone
"Love does not consist in gazing at each other, but in looking outward
together in the same direction." -- Antoine de Saint-Exupéry
On Fri, Sep 17, 2021 at 12:05:09PM +0900, Michael Turner wrote:
And I forgot to add: selling power is important, but clarity is too. Selling ergonomics is important, but clarity is too.
I once learned a very, very powerful language: APL. Now, to be sure, there was a market obstacle -- it had a very extended character set. But IBM probably thought, "A very powerful language will sell! And that means we'll sell a lot more of the specialized IBM Selectric type-balls! This could be our most profitable programming language ever, especially considering that type-balls wear out eventually and have to be replaced! What's not to like?" I suppose you could say APL was ergonomic, because you spent a lot of your keyboard time just looking for the right keytop symbol, and it slowed you way down.
The thing is, most code that actually gets used, and even makes money, is read more than it's written. Software expense is mostly in the maintenance life-cycle, and I think that's still true, even in these days of Agile and CI -- it's just that the effort of the former "maintenance phase" bleeds more into the early phases. I'd love to be so fluent in Haskell that I can appreciate its superior ergonomics, and I speak as a former RSI casualty. But frustration over feeling powerless every time GHC barfs out another error message you don't understand only leads to muscle tension which can inflame chronic RSI, and Haskell is pretty frustrating, starting out. Of course, I'm sure people will tell me to just relax and take my time. But I have a lot of other things to do.
I've noticed an interesting metric for Haskell: github activity is more sharply distributed over weekend days than just about any other language in the top 40. Is it basically an intellectual hobby language, outside of academia? I'm evaluating Haskell for a project, one for which I'd like to report some serious progress by the time a certain conference rolls around in March. Months ago, I started with some confidence that I'd have settled the question. Maybe I'd even have a demo all written in Haskell by March, if Haskell seemed right. I'm far less confident now.
I'd been warned that the learning curve is steep and long. If my complaints about documentation sound anguished, it's for a reason: a lot of the problems of getting traction on the slope trace back (for me, at least) to writing by people so steeped in Haskell that they no longer remember how they themselves absorbed it, the better to explain it. And, as Simon Peyton-Jones says in one talk, in the early years, the language designers had the incredible luxury of academia -- as he admits, they could spend literally years being confused about the right path forward, at various points, and it didn't matter much for their careers or their goals. That's years of steeping too. And it shows. I don't have years. I have months. And lots of other things I have to get done.
On Fri, Sep 17, 2021 at 11:27 AM Michael Turner wrote:
On Fri, Sep 17, 2021 at 5:52 AM Viktor Dukhovni wrote:
On Wed, Sep 15, 2021 at 04:07:38PM +0900, Michael Turner wrote:
The real problem is that the writing sucks. [snip]
Viktor:
Can you be a bit more specific about which sort of writing you find sufficiently unsatisfactory to say "the writing sucks"?
It's an axiom of sales: quality is what the customer says it is. You'll evaluate writing differently depending on what sort of audience you're in.
I'm in this audience:
-- I have a lot of programming experience with a variety of languages. I have limited patience with writing that feels like it's talking down to me.
-- I have limited patience for a language whose main appeal seems to be to a niche audience of programmers who want to learn a language in part because they've been told it's a challenge -- I'm not seeking status among other programmers. ("Ooh, he knows Haskell. You have to be really smart to write Haskell.")
-- I prefer a presentation that's clear and concise, with a significant reliance on diagrams that show conceptual relations. A picture of how something works is worth a thousand words that tell me how great something was, or how great it's going to be once I'm proficient.
* Books about Haskell - Introductory (e.g. http://learnyouahaskell.com/)
I started here. I was initially drawn in by the welcoming chattiness. It got old, fast. Fonzie? Really? How many young programmers today even know the TV series Happy Days? Waste of page-space. (But see above -- that's MY reaction. Audiences will differ.)
- Comprehensive (e.g. the classic Real World Haskell)
I hadn't looked at this before. So I looked today.
Uh-oh: the introductory part reminds me of that old joke about the IBM salesman, back in the mainframe days, when getting a computer up and running the customer's application could take months, from hardware order to production data processing: the salesman sits on the edge of the bed on his honeymoon night, telling his bride how great it's going to be.
The first bit of code shows how you'd write a very short function (courtesy of "take", not of any inherent feature of Haskell's programming model) to get the k least elements in a list. A claim is made: it's faster because of laziness.
Um, really? Sorting has an O(n log n) lower bound no matter what. There's a vaguely game-theoretic proof of this I found delightful back in college, and it came to mind when I read that. How would you arrange numbers in a list so that, no matter what sort algorithm you used, there would be no way in general to get the k smallest elements without a full sort? Well, maybe the authors mean "faster on average, counting the number of comparisons"? Oh, but wait: isn't there some overhead for implementing laziness? Hm, why aren't they saying something here?
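(The example, as best I can reconstruct it -- this is my paraphrase, not the book's code:

    import Data.List (sort)

    -- The claim under discussion: with a lazy sort, "take k . sort" need
    -- not do all the work of a full sort, though the worst case remains
    -- O(n log n).
    kLeast :: Ord a => Int -> [a] -> [a]
    kLeast k = take k . sort

)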
My eyes glaze over all the IBM salesman honeymoon bridal suite talk, so I go to chapter one. I skip through all the stuff I already know, and at the end of the chapter, there's a bit of code: count the number of lines in input. After months of looking at Haskell code (but a few weeks of not looking at it) I can't quite get it, the way I'd almost instantly get it if it was written in a more conventional programming language. It's just an exercise in putting some code in a file and running it. Supposedly it will build confidence. It has the opposite effect on me. But maybe it's explicated in the next chapter?
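(The program, roughly as I remember it -- again my paraphrase, not the book's exact code:

    -- Read all input lazily, split it into lines, count them, print the
    -- count.
    main :: IO ()
    main = interact (\input -> show (length (lines input)) ++ "\n")

Obvious enough once you know "interact", maybe. Nothing on the page got me there.)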
I turn to chapter two, hoping that the line-counter would be explicated. Oh no: long IBM honeymoon-night salesman talk about Why Types Are Good. Oh, fuck you! I already know why types are good! I even know what's bad about the otherwise-good type systems of several programming languages! WHY ARE YOU WASTING MY TIME?!
- Topic focused (e.g. the IMHO rather excellent Parallel and Concurrent Haskell)
It may be excellent if you're already up to speed on Haskell. I'm a newbie drowning in writing that's not for me.
- Theory focused (e.g. https://bartoszmilewski.com/category/category-theory/)
I bailed halfway through a video from Philip Wadler, a talk about Categories for the Working Hacker (or something) because he still hadn't said anything about how knowing category theory was useful to, uh, you know, a working hacker? I ran across a blog that said: forget category theory. It won't help you. At least not starting out.
* The library reference documentation?
Pretty impenetrable for a newbie, though that's hardly unusual for a newbie to any language, with reference docs.
* The GHC User's Guide?
Been a while since I looked at it, but not much easier, as I remember.
* The Haskell report?
Haven't read it.
* Blog posts?
So far, only helpful when they are honest and tell me that I'm having trouble for some very good reasons. I feel less alone. I feel less stupid.
* The Haskell Wiki?
Very sloppy, much neglected. I put in some improvements to articles, but a certain syndrome is very much in evidence: it seems mostly written by Haskell experts who can't seem to get back into a newbie mindset (and who maybe never had a mindset like mine), and who often jump from IBM honeymoon salesman talk straight into unnecessarily complex examples.
* r/haskell?
Sometimes helpful.
* Haskell mailing lists?
When you're being told curtly, "Learn lambda calculus", or someone can't answer your question about something but instead of saying "I don't know where you'd find that", gives you an answer to a question you didn't ask, you're not on a truly beginner-friendly newbies list. Other mailing lists take up topics that feel way over my head at this point.
* All of the above???
So far, pretty much.
I am also curious whether I'm part of the solution or part of the precipitate. I've recently contributed new documentation for Data.Foldable and Data.Traversable:
https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Fo... https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Tr...

On Fri, Sep 17, 2021 at 06:54:50PM +0900, Michael Turner wrote:
On Fri, Sep 17, 2021 at 4:58 PM Tom Ellis wrote:
Michael, I have an offer for you (in fact two):
1. I will collaborate with you to produce the guide to Haskell's evaluation that *you* would want to read.
2. I will collaborate with you to write the NLP tool that you want to write in Haskell.
I think if I write anything long, it would first have to be my "Haskell in Plain English: A Guide for Lexical Semanticists."
Fair enough.
I can't do these without collaborating with someone like you. I simply don't know what someone else wants to read.
So much of good writing is just figuring out an audience. As I wrote to Viktor above, the section he presented to me in case I had trouble with it could be absolutely ideal for someone with more grounding. They could admire the sentence I struggled with, for how it encapsulates their understanding while refreshing their memory.
Yes indeed. The audience that I write for is myself since that's the audience I know. I am interested in writing for a more general audience but I don't have motivation at the moment to do so without a member of that general audience on the team.
I'm not sure that gearing a piece toward what /I/ would like to read is a much bigger audience
I suspect the group of people who are interested in yet frustrated by Haskell is *far* bigger than the group of people who are already familiar with Haskell!
If I had to suggest an approach you could try on your own, it would be this:
Thank you for the suggestion. I will bear it in mind if I decide to tackle this in the future.
Tom

On Fri, Sep 17, 2021 at 11:27:12AM +0900, Michael Turner wrote:
I am also curious whether I'm part of the solution or part of the precipitate. I've recently contributed new documentation for Data.Foldable and Data.Traversable:
https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Fo... https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Tr...
OK:
"Merging the contribution of the current element with an accumulator value from a partial result is performed by an operator function, either explicitly provided by the caller as in foldr, implicit count as in length, or partly implicit as in foldMap (where each element is mapped into a Monoid, and the monoid's mappend operator performs the merge)."
Let me tell you how I tried to read that lo-o-ong sentence.
If I may trouble you just a bit longer, I'd like to ask whether the problem is the phrasing, or the selection of subject matter. If the latter, then any defects in the phrasing are perhaps moot.
Supposing I rephrase the same thing as:
"The contribution of each element to the final result is combined with an accumulator via an /operator/ function. The operator may be explicitly provided by the caller as in `foldr` or may be implicit as in `length`. In the case of `foldMap`, the caller provides a function mapping each element into a suitable 'Monoid', which makes it possible to merge the per-element contributions via that monoid's `mappend` function."
Does that help? At all? Enough? The word "Monoid" hyperlinks to the documentation of that class, and I thought it fair to reference Monoids in the reference documentation for a module that defines `foldMap`.
If the above is helpful, I am not too proud to rewrite any paragraphs that are poorly phrased, if there is some merit in the subject matter.
I don't think the module overview can be a beginner tutorial, that's more the domain of books, ... I think that the descriptive prose in the reference docs that follows the function synopses just needs to cover any concepts and API behaviours that don't immediately follow from those synopses.
A rough guide for me was Unix manpages, where e.g. setsockopt(2) is not a networking tutorial, and yet still has a bunch of useful prose, and not just function signatures.
-- Viktor.

I might have missed your link to "Monoid" because my attention fell on
"monoid mappend". Sorry for that.
"The contribution of each element to the final result is combined with an
accumulator via an /operator/ function. The operator may be explicitly
provided by the caller as in `foldr` or may be implicit as in `length`. In
the case of `foldMap`, the caller provides a function mapping each element
into a suitable 'Monoid', which makes it possible to merge the per-element
contributions via that monoid's `mappend` function."
This is a little better, but I'd write it this way, I think.
"Folds take operators as arguments. In some cases, it's implicit, as
in the function "length". These operators are applied to elements when
lazy evaluation requires it, with a fold 'accumulator' as one of the
operands. 'foldMap' uses a function (implicit? explicit?) that maps
elements into . . . ."
And this is where I bog down, because I don't know what makes a monoid
'suitable' or even really what a monoid is -- it's been a long time
(25 years or so) since my last abstract algebra course.
I don't doubt that you're doing something good here. It's just that
I'm frustrated about what's in the way of benefitting from it. I
gather that there's something about the algebraic properties of folds
that makes some code easier to write and understand, once you have the
underlying concepts fully understood. My biggest problem is probably
out of scope for what you're working on: I need motivating examples.
I've used folds in other programming languages. But how they can be
algebraically leveraged is still beyond me. I still remember
installing a Haskell command-line parser for my project code and
marveling that "semigroup" was one of its dependencies.
My main concern is whether I can use Haskell for a natural-language
processing project. Here, I started on a long Haskell primer for
linguists, in case I leave any useful code behind that linguists might
need to maintain. But also, I sometimes find that understanding comes
from writing. Did I write it clearly? If not, is it because I don't
really understand it? If I don't, writing tells me where I need to dig
deeper.
"Haskell in Plain English: A Primer for the Lexical Semanticist"
https://docs.google.com/document/d/1GIDWMbFBAaOZc-jxOu1Majt6UNzCpg4vxMGeydb4...
In that piece (no, don't read it if you're not interested in
linguistics), I remark that the algebraic facet of Haskell practice
may not be very useful for natural language parsing and understanding.
The truth is: I'd prefer to be proven wrong about that. This area is
so hard, we need all the good tools we can get. IF abstract algebra
could be useful for this application, I want it.
Regards,
Michael Turner

On Fri, Sep 17, 2021 at 02:15:28PM +0900, Michael Turner wrote:
"The contribution of each element to the final result is combined with an accumulator via an /operator/ function. The operator may be explicitly provided by the caller as in `foldr` or may be implicit as in `length`. In the case of `foldMap`, the caller provides a function mapping each element into a suitable 'Monoid', which makes it possible to merge the per-element contributions via that monoid's `mappend` function."
This is a little better,
I'll open an MR for that then; it is likely to be seen as an overall improvement by others too, I think. This is perhaps another opportunity for any other wordsmithing suggestions that anyone wants to make while the patient is on the operating table...
but I'd write it this way, I think.
"Folds take operators as arguments. In some cases, it's implicit, as in the function "length".
OK, but perhaps s/it's/the operator is/
These operators are applied to elements when lazy evaluation requires it,
Laziness has little to do with this. Whether the result of the fold contains lazy thunks or is strict, the conceptual value is still a result of applying the operator to the next element and the accumulated intermediate result built from the initial value and any previous elements whose contributions have already been folded into the accumulator (whether lazily or strictly).
with a fold 'accumulator' as one of the operands. 'foldMap' uses a function (implicit? explicit?) that maps elements into . . . ."
The `foldMap` method takes an explicit `a -> m` function that maps each element into a Monoid (a structure with an associative binary operation and an identity element for that operation). Combining elements in a monoid is often called "concatenation", and thus `foldMap f` just applies (f :: a -> m) to each element `a` and then concatenates all the `m` values using the Monoid's binary operator (mappend or infix <>).
I was indeed assuming that the reader has already seen Monoids, or, if not, that the reader would just move on and read the parts that don't require knowledge of Monoids, since such a reader wouldn't be using foldMap until Monoids "click".
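To make that concrete with the stock Sum wrapper (a newtype from Data.Monoid that makes addition the monoid operation) -- a minimal sketch:

    import Data.Monoid (Sum(..))

    -- Each element is mapped into the Sum monoid, then the per-element
    -- contributions are concatenated with (<>), which for Sum is (+).
    total :: Int
    total = getSum (foldMap Sum [1, 2, 3])  -- Sum 1 <> Sum 2 <> Sum 3 == Sum 6

Semantically, for lists, `foldMap f` is just `mconcat . map f`.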
And this is where I bog down, because I don't know what makes a monoid 'suitable' or even really what a monoid is -- it's been a long time (25 years or so) since my last abstract algebra course.
Well it is "suitable" when the fold you want to perform is in fact concatenation of monoid elements computed from the elements of the foldable structure.
I don't doubt that you're doing something good here. It's just that I'm frustrated about what's in the way of benefitting from it. I gather that there's something about the algebraic properties of folds that makes some code easier to write and understand, once you have the underlying concepts fully understood.
Yes, that's the idea. Folds are either strict reductions (iterative strict computation of an accumulator value), or essentially coroutines (corecursion) that lazily yield a stream of values (a lazy list or similar). Folds can be left-associative or right-associative, and typically the left-associative ones are best used strictly, and the right-associative ones are best used corecursively, but as with many rules there are exceptions, and the pages of prose try to lay the groundwork for reasoning about how to use the library properly.
My biggest problem is probably out of scope for what you're working on: I need motivating examples. I've used folds in other programming languages.
Laziness makes it possible to use folds as coroutines that lazily yield a sequence of values. This is not possible in strict languages, where you'd need explicit support for coroutines (generators) via something like a "yield" primitive. Haskell's lazy lists, used exactly once, are coroutines when the elements are computed as needed. When the fold operator is building a recursively defined structure with a lazy tail, you use `foldr` corecursively to yield the head and then, when demanded, the tail. When the operator is strict ((+), (*), ...) you'd use `foldl'`.
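A minimal sketch of the two idioms:

    import Data.List (foldl')

    -- Corecursive right fold: yields each element on demand, so it even
    -- works on an infinite list.
    squares :: [Integer] -> [Integer]
    squares = foldr (\x r -> x * x : r) []

    -- Strict left fold: the accumulator is forced at each step, so no
    -- chain of lazy thunks builds up.
    sumStrict :: [Integer] -> Integer
    sumStrict = foldl' (+) 0

    main :: IO ()
    main = do
        print (take 3 (squares [1 ..]))   -- [1,4,9]
        print (sumStrict [1 .. 1000000])  -- 500000500000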
My main concern is whether I can use Haskell for a natural-language processing project.
Can't help you there, I do DNS, TLS, cryptography, ... don't know anything about NLP.
In that piece (no, don't read it if you're not interested in linguistics), I remark that the algebraic facet of Haskell practice may not be very useful for natural language parsing and understanding. The truth is: I'd prefer to be proven wrong about that. This area is so hard, we need all the good tools we can get. IF abstract algebra could be useful for this application, I want it.
Haskell is quite well suited to writing parsers, DSLs, compilers, largely painless multi-threaded concurrency (which is my primary use-case), nice database APIs (e.g. Hasql), nice Web APIs (e.g. Servant and Wai)... Perhaps also NLP, but I wouldn't know about that. My impression was that NLP is substantially neural-net based these days...
-- Viktor.

On Fri, Sep 17, 2021 at 01:57:58AM -0400, Viktor Dukhovni wrote:
Laziness makes it possible to use folds as coroutines that lazily yield a sequence of values. This is not possible in strict languages, where you'd need explicit support for coroutines (generators) via something like a "yield" primitive.
Wouldn't an explicit thunk datatype (that takes a lambda as a "constructor") be sufficient? I can't see why going all the way to coroutines would be required.
Tom

On 17 Sep 2021, at 3:19 am, Tom Ellis wrote:
On Fri, Sep 17, 2021 at 01:57:58AM -0400, Viktor Dukhovni wrote:
Laziness makes it possible to use folds as coroutines that lazily yield a sequence of values. This is not possible in strict languages, where you'd need explicit support for coroutines (generators) via something like a "yield" primitive.
Wouldn't an explicit thunk datatype (that takes a lambda as a "constructor") be sufficient? I can't see why going all the way to coroutines would be required.
Yes, sure, coroutines are but one model. Indeed explicit thunks can simulate laziness in a strict language.
But then there's the mind-bending recent challenge on r/haskell to implement (in Haskell) a general `foldr` using nothing from the underlying Foldable except its `foldl'` (otherwise, any and all Haskell tools are fine). The implementation needs to be no less lazy than the real `foldr`, forcing no more of the structure's spine or elements than `foldr` would.
It turns out that pretty much the only solutions reported all use coroutines (unsafePerformIO and forkIO) in order to synchronise demand-driven yields of the structure elements by a strict left fold. This tells me that `foldr` as coroutine is one version of the truth, even if there are alternative valid mental models.

    {-# LANGUAGE LambdaCase #-}
    import Control.Concurrent
    import qualified Data.Foldable as F
    import System.IO.Unsafe

    foldr :: Foldable f => (a -> b -> b) -> b -> f a -> b
    foldr f z xs = unsafeDupablePerformIO $ do
        next <- newEmptyMVar
        lock <- newEmptyMVar
        let yield k a = seq (unsafeDupablePerformIO $ putMVar next (Just a) >> takeMVar lock) k
            loop = takeMVar next >>= \case
                Nothing -> return z
                Just a  -> unsafeInterleaveIO (putMVar lock () >> loop) >>= pure . f a
        forkIO $ F.foldl' yield (pure ()) xs >> putMVar next Nothing
        loop

-- Viktor.

On 17.09.21 at 07:15, Michael Turner wrote:
I might have missed your link to "Monoid" because my attention fell on "monoid mappend". Sorry for that.
"The contribution of each element to the final result is combined with an accumulator via an /operator/ function. The operator may be explicitly provided by the caller as in `foldr` or may be implicit as in `length`. In the case of `foldMap`, the caller provides a function mapping each element into a suitable 'Monoid', which makes it possible to merge the per-element contributions via that monoid's `mappend` function."
This is a little better, but I'd write it this way, I think.
"Folds take operators as arguments. In some cases, it's implicit, as in the function "length". These operators are applied to elements when lazy evaluation requires it, with a fold 'accumulator' as one of the operands. 'foldMap' uses a function (implicit? explicit?) that maps elements into . . . ."
The problem you two are both facing is this: you want to describe, abstractly, generally, the common principle behind an ad-hoc lumped-together set of functions. This is very likely to result in contortions and provides you with no insight. You need to realize that these methods are all simple specializations of two or three basic functions. The only reason they are methods of the class is to allow specialized implementations for optimization.
Foldable is a very bad example to learn from. IMO its documentation should only describe 'toList', define foldMap and fold in terms of foldr or foldl, and otherwise refer you to the functions in Data.List. These are documented in simple terms.
The best description for foldr I have ever come across is this: "foldr f z l replaces in the list l the data constructors [] and (:) with z and f, respectively." Once you understand the list data type and remember that (:) is right associative, this makes foldr totally obvious. Left folds are a bit less immediately intuitive as they go against the "natural" associativity, but you can think of foldl as being the analogue of foldr for "snoc" lists (lists with cheap addition of elements at the right end), but implemented for normal lists.
Once you understand that, everything else is just a trivial specialization, such as

    sum = foldl' (+) 0

The only real subtlety is foldl vs. foldl', i.e. lazy vs. strict. This is quite difficult to understand, as it requires going pretty deep into the evaluation model of Haskell, but for a beginner it suffices to know that you probably want to use foldl'.
Once you understand how folds and their various specialisations work for lists, generalizing that knowledge to Foldable structures is trivial: just think

    Data.Foldable.foldr f z l = Data.List.foldr f z (toList l)

etc.
Cheers
Ben
--
I would rather have questions that cannot be answered, than answers that cannot be questioned. -- Richard Feynman

I really enjoyed reading through the documentation Viktor has contributed, thank
you! No doubt, as Michael writes, things can be improved, but in my opinion it
is much better than before.
Actually, I was astonished to see such a nice overview on Hackage! And this
directly relates to the core of the problem Michael is addressing: I am used to
having "adequate" documentation on Hackage, and these paragraphs stood out.

On 16.09.21 at 22:52, Viktor Dukhovni wrote:
I am also curious whether I'm part of the solution or part of the precipitate. I've recently contributed new documentation for Data.Foldable and Data.Traversable:
https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Fo... https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Tr...
are these a step in the right direction, or examples of more writing that sucks?
Sorry, but IMO it sucks.
"Foldable structures are reduced to a summary value by accumulating contributions to the result one element at a time." The first sentence of the class docs says "The Foldable class represents data structures that can be reduced to a summary value one element at a time.", which is more correct and (slightly) more concise with identical meaning. No value added here.
Next: "The contribution of each element to the final result is combined with an accumulator via an operator function. [...]" First of all, use of the term "operator function" is confusing. In Haskell an operator is a function, period. Infix versus prefix notation is completely irrelevant here. The whole sentence is formulated in a cryptic and complicated way. It seems to me that what you want to say is "The result is calculated by successively applying the argument function to each element and an accumulator, producing a new accumulator." (*)
Your text tries to explain things in a precise manner, but is hampered by insisting that the words "sequence" or "list" must be avoided, even though words like "precede", "follow", "left", and "right" are used all over the place. This understandably leads to awkward formulations. It is much easier to understand the concepts via sequences.
The documentation of foldr and foldl has these two lines (**):

    foldl f z [x1, x2, ..., xn] == (...((z `f` x1) `f` x2) `f`...) `f` xn
    foldr f z [x1, x2, ..., xn] == x1 `f` (x2 `f` ... (xn `f` z)...)

They define semantics in a semi-formal notation, which I find succinct and very intuitive. This can be easily generalized to Foldable via 'toList'. Indeed, there is almost nothing about Foldable that cannot be understood once you understand Data.List and the toList method. Even foldMap is (semantically) just (***):

    foldMap f = foldr mappend mempty . map f . toList
              = foldl mappend mempty . map f . toList

In fact all class methods have simple semantic definitions in terms of the same-named functions from Data.List via 'toList'. This makes any further documentation of them redundant. This would also obviate the need to introduce ambiguous terminology like "explicit operator".
The next sentence after the example: "The first argument of both is an explicit operator that merges the contribution of an element of the structure with a partial fold over, respectively, either the preceding or following elements of the structure." "merges the contribution of" is a complicated way to say "combines". Again, the semi-formal notation (**) says it all. As does my alternative proposal for the first sentence above (*), which incidentally demonstrates that you are repeating yourself here.
The next sentence I could understand on first reading. Nevertheless, the content of everything up to (and including) this sentence can be abbreviated by adding a single sentence to (*): "The result is calculated by successively applying the argument function to each element and an accumulator, producing a new accumulator. The accumulator is initialized with the second argument."
Generally, the text is too long, too verbose, and the wording is too complicated. Some parts, e.g. the discussion about "Chirality" (an extremely obscure term that I have never read anywhere before), again serve to illustrate that, semantically, Foldable is not much more than a glorified 'toList'; everything else is optimization. Users would be served better by plainly stating it this way, rather than obscuring it, as if there were some kind of "deep" abstraction going on.
Expectations of runtime cost should be either explicitly stated in the docs for each method or left out.
The "Notes" section adds useful information.
The most important documentation for a class, its laws, is (1) not contained in the class docs, (2) not even linked from there, and (3) too cryptic. I would *at least* expect the Monoid newtype wrappers (Endo, Dual, Sum, Product) and their inverses to be hyperlinked to their definition. The way the first two laws are stated in terms of these newtypes makes them harder to understand than necessary, especially for newcomers. My (equivalent) semantic definition of foldMap above (***) in terms of foldr/foldl does not require knowledge about these types and the semantics of their Monoid instances.
Yet another aside: apart from trivial reductions to the corresponding function in Data.List using 'toList' (except 'toList' itself as well as 'fold' and 'foldMap'), the only non-trivial law seems to be "If the type is also a Functor instance, it should satisfy foldMap f = fold . fmap f".
I may be wrong but it looks to me as if this could be derived by adding one more method 'fromList' that is required to be a left inverse of 'toList':

    fromList :: Foldable t => [a] -> t a
    fromList . toList = id

Of course, for some Foldables (e.g. those with a fixed finite number of elements), fromList would be partial. Is there a sensible (useful, lawful) Foldable instance which has no 'fromList'? I suspect no. In fact I suspect addition of fromList (with the left inverse law) would nicely serve to rule out useless/surprising instances.
Cheers
Ben
--
I would rather have questions that cannot be answered, than answers that cannot be questioned. -- Richard Feynman

On Fri, Oct 01, 2021 at 02:40:49AM +0200, Ben Franksen wrote:
I am also curious whether I'm part of the solution or part of the precipitate. I've recently contributed new documentation for Data.Foldable and Data.Traversable:
https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Fo... https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Tr...
are these a step in the right direction, or examples of more writing that sucks?
Sorry, but IMO it sucks.
The main goal of the overview was to help users reason about the behaviour of the class methods, and to choose the correct one of foldr, foldl, or foldl' without forcing an entire list into memory or needlessly generating a long list of lazy thunks. So I don't think that just saying "toList, done" quite does the job.

That said, given that there are surely sections of prose that could be better, would you care to submit diffs for inclusion in MR 6555?

  https://gitlab.haskell.org/ghc/ghc/-/merge_requests/6555

You can check out the branch and send me diffs. Or is it sufficient to change just the particularly egregious sentences you noted?
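For example, the choice already matters for plain lists; a quick sketch (the helper names are mine):

  import Data.Foldable (foldl')

  -- foldl' accumulates strictly: constant space, but it must consume
  -- every element, so it terminates only on finite structures
  sumStrict :: (Foldable t, Num a) => t a -> a
  sumStrict = foldl' (+) 0

  -- foldr with an operator lazy in its second argument can stop
  -- early; this returns Just 2 even on the infinite list [1..]
  firstEven :: Foldable t => t Int -> Maybe Int
  firstEven = foldr (\x r -> if even x then Just x else r) Nothing

  -- plain foldl would instead build a long chain of lazy thunks here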
They define semantics in a semi-formal notation, which I find succinct and very intuitive. This can be easily generalized to Foldable via 'toList'. Indeed, there is almost nothing about Foldable that cannot be understood once you understand Data.List and the toList method. Even foldMap is (semantically) just (***):
Well, a balanced Tree can have an efficient corecursive foldl, or a performant `foldr'`, and Sets can know their size statically, and `elem` runs in linear time even in structures that potentially support faster search. And it is perhaps worth asking whether you feel you still have anything you'd like to learn about Foldable, for if not, perhaps the documentation is not for you, and that's fine...
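Concretely, something like this (a sketch; this Tree is mine, not the example from the docs):

  data Tree a = Leaf | Node (Tree a) a (Tree a)

  instance Foldable Tree where
    foldMap _ Leaf         = mempty
    foldMap f (Node l x r) = foldMap f l <> f x <> foldMap f r
    -- in a balanced tree both directions are equally cheap
    foldr _ z Leaf         = z
    foldr f z (Node l x r) = foldr f (f x (foldr f z r)) l
    -- this foldl descends the right spine, deferring the left
    -- subtrees, so it can yield the rightmost elements corecursively
    foldl _ z Leaf         = z
    foldl f z (Node l x r) = foldl f (f (foldl f z l) x) r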
In fact all class methods have simple semantic definitions in terms of the same-named functions from Data.List via 'toList'.
But performance may differ radically, and `toList` may diverge for `snocList` when infinite on the left, though that's a rather pathological example.
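For the curious, a minimal snoc-list sketch (names mine) showing both behaviours:

  infixl 5 :>
  data SnocList a = Lin | SnocList a :> a

  instance Foldable SnocList where
    foldMap _ Lin       = mempty
    foldMap f (xs :> x) = foldMap f xs <> f x
    -- the cheap, lazy fold runs right-to-left
    foldl _ z Lin       = z
    foldl f z (xs :> x) = f (foldl f z xs) x

  -- infinite on the left: ... :> 1 :> 1 :> 1
  ones :: SnocList Int
  ones = ones :> 1

  -- 'foldl (\_ x -> x) 0 ones' promptly returns 1, because this foldl
  -- is lazy in the recursive call; 'toList ones' (which goes through
  -- foldr and foldMap) never produces even its first element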
Expectations of runtime cost should be either explicitly stated in the docs for each method or left out.
That's not enough if users can't reason about the available choices or don't know how to implement performant instances.

Function synopses rarely provide enough room for more than a cursory description and a few examples. That's not their role. This is why Unix manpages have both a SYNOPSIS and a DESCRIPTION section.

I am quite open to improved language in the overview, and less open to the idea that it is just baggage to throw overboard. In particular, I've had positive feedback on the material, despite perhaps overly turgid prose in some places. Please help to make it crisp.

I find the absence of overview (DESCRIPTION, if you like) sections in many a non-trivial Haskell library to be quite a barrier to working with the library; the synopses alone are rarely enough for my needs.
The most important documentation for a class, its laws, is (1) not contained in the class docs, (2) not even linked from there, and (3) too cryptic.
The links are fixed in MR 6555, and I've asked David Feuer to contribute prose to clarify the laws; I also find them rather opaque, and left them as-is when I wrote the overview. This is a good opportunity to address issues with the laws.
I would *at least* expect the Monoid newtype wrappers (Endo, Dual, Sum, Product) and their inverses to be hyperlinked to their definition.
Agreed, and some sort of explanatory text... These are inherited from earlier versions of the module which had only the laws and no overview.
I may be wrong, but it looks to me as if this could be derived by adding one more method 'fromList' that is required to be a left inverse of 'toList':
fromList :: Foldable t => [a] -> t a
fromList . toList = id
This is roughly the sort of thing one can do with Traversable (recover the structure from its spine and element list, but not its element list alone). The point is that various non-linear (e.g. tree-like) structures with the same element order have distinct "spines".
Of course, for some Foldables (e.g. those with a fixed finite number of elements), fromList would be partial.
And not well defined without a `t ()` spine into which the elements can be inserted.
Is there a sensible (useful, lawful) Foldable instance which has no 'fromList'?
Sure, any tree-like structure where shape is not implied by the element list alone.
I suspect not. In fact I suspect the addition of fromList (with the left-inverse law) would nicely serve to rule out useless/surprising instances.
While I don't love harsh critiques, I do recognise that a harsh criticism can be a source of energy that can be harnessed to good ends. Therefore, if you're willing to apply that energy to improving the text, I'd love to work with you. If you feel the document is beyond repair, I'll be disappointed, but I'm willing to accept that. Thanks for taking the time to look it over.

-- Viktor.

On Thu, Sep 30, 2021 at 09:14:36PM -0400, Viktor Dukhovni wrote:
Is there a sensible (useful, lawful) Foldable instance which has no 'fromList'?
Another salient counter-example is the Foldable instance of `Map k`, which sequences only the *values* `v` stored in a `Map k v`, forgetting the keys `k`. There is therefore no

  fromList :: [v] -> Map k v

that could undo

  toList :: Map k v -> [v]

and for that we'd need the Map spine (key set) and its `Traversable` instance:

  fromValueList :: Ord k => Map k () -> [v] -> Map k v
  fromValueList = evalState . traverse f
    where
      f :: () -> State [v] v
      f _ = get >>= \ !s -> head s <$ put (tail s)

Basically, containers can have a non-trivial "shape" that `toList` "flattens", so it is a one-way operation.

---

Switching subtopics to the "Chirality" section: I added it in response to a criticism that earlier text was inaccurate for structures that do not support efficient left-to-right iteration (if you like, that have an inefficient or possibly divergent `toList`, which might take a long time, or forever, to return the left-most element).

If there's a general feeling that accepting the suggestion to be more accurate was a mistake, the exposition could indeed be shorter, if it were fair to assume that all Foldable structures of interest are "left-handed" (can quickly return the left-most element), and that while one can define structures that violate this assumption, they're not a good fit for Foldable and not worthy of explication.

-- Viktor.
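Filled out with what it needs to compile (BangPatterns and mtl's Control.Monad.State are assumptions about the intended context; the Ord constraint turns out not to be needed for traverse), the sketch can be exercised like this:

  {-# LANGUAGE BangPatterns #-}
  import Control.Monad.State (State, evalState, get, put)
  import Data.Map (Map)
  import qualified Data.Map as Map

  fromValueList :: Map k () -> [v] -> Map k v
  fromValueList = evalState . traverse f
    where
      -- head/tail are partial: the value list must be at least as
      -- long as the spine
      f :: () -> State [v] v
      f _ = get >>= \ !s -> head s <$ put (tail s)

  main :: IO ()
  main = print (fromValueList spine [1, 2 :: Int])
    where
      spine = Map.fromList [('a', ()), ('b', ())]
      -- prints: fromList [('a',1),('b',2)]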

On Thu, Sep 30, 2021 at 09:14:36PM -0400, Viktor Dukhovni wrote:
In fact all class methods have simple semantic definitions in terms of the same-named functions from Data.List via 'toList'.
But performance may differ radically, and `toList` may diverge for `snocList` when infinite on the left, though that's a rather pathological example.
If one can't write Foldable-generic functionality in a way that provides some reasonable uniformity of performance over different instances, then one wonders what the point is of having Foldable as a typeclass at all. At that point it's just name overloading.

Tom

On Fri, Oct 01, 2021 at 10:05:30AM +0100, Tom Ellis wrote:
But performance may differ radically, and `toList` may diverge for `snocList` when infinite on the left, though that's a rather pathological example.
If one can't write Foldable-generic functionality in a way that provides some reasonable uniformity of performance over different instances, then one wonders what the point is of having Foldable as a typeclass at all. At that point it's just name overloading.
This is why I was reluctant originally to say anything about right-biased structures... They break established expectations.

I focused mostly on symmetric structures, for which left and right folds should perform identically (if instances properly take advantage of the symmetry); these are, I think, practical. About right-biased structures I said:

  https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Fo...

  Finally, in some less common structures (e.g. snoc lists) right to left iterations are cheaper than left to right. Such structures are poor candidates for a Foldable instance, and are perhaps best handled via their type-specific interfaces. If nevertheless a Foldable instance is provided, the material in the sections that follow applies to these also, by replacing each method with one with the opposite associativity (when available) and switching the order of arguments in the fold's operator.

Concrete suggestions to address any issues in this section (now that the title is no longer "Chirality") are welcome (MR 6555).

  https://gitlab.haskell.org/ghc/ghc/-/merge_requests/6555

-- Viktor.

Right-biased Foldable instances are perfectly reasonable. Just don't expect fromList, foldr, or foldl' to be good for them.

On 01.10.21 at 03:14, Viktor Dukhovni wrote:
On Fri, Oct 01, 2021 at 02:40:49AM +0200, Ben Franksen wrote:
They define semantics in a semi-formal notation, which I find succinct and very intuitive. This can be easily generalized to Foldable via 'toList'. Indeed, there is almost nothing about Foldable that cannot be understood once you understand Data.List and the toList method. Even foldMap is (semantically) just (***):
Well, a balanced Tree can have an efficient corecursive foldl, or a performant `foldr'`, and Sets can know their size statically, and `elem` runs in linear time even in structures that potentially support faster search.
All true, and I think it is important to document these things. The question is: where?

This is a general problem with all kinds of generic "container" classes/interfaces, and not limited to Haskell: performance characteristics of methods will vary widely depending on the implementation. In Haskell this includes semantics insofar as bottom / infinite structures are concerned.

Documenting the API /itself/ can go only so far before becoming a manual enumeration of all possible implementations one can think of or happens to know about. This clearly doesn't scale, so it would be better to leave it unspecified and attach such documentation to the instances instead. It would help if the rendering of instance docs in haddock were improved (as a sub-paragraph under the instance instead of to the right of it).
And it is perhaps worth asking whether you feel you still have anything you'd like to learn about Foldable, for if not, perhaps the documentation is not for you, and that's fine...
I may have earned this remark with the tone of my critique ;-) In case it came across as condescending, please accept my apologies.
In fact all class methods have simple semantic definitions in terms of the same-named functions from Data.List via 'toList'.
But performance may differ radically, and `toList` may diverge for `snocList` when infinite on the left, though that's a rather pathological example.
Expectations of runtime cost should be either explicitly stated in the docs for each method or left out.
That's not enough if users can't reason about the available choices or don't know how to implement performant instances.
The truth is (and that is what the docs imply but fail to make explicit, as that would be too embarrassing): you *cannot* reason about these things. If the implementation (instance) is free to choose whether foldr or foldl is the "natural" fold, then how can I make an informed choice between them for an arbitrary Foldable?

If we were to define the semantics in terms of 'toList', then we would acknowledge that Foldable is biased in the same direction as lists are, so behavior would no longer be implementation-defined and could be easily reasoned about.
Function synopses rarely provide enough room for more than a cursory description and a few examples. That's not their role. This is why Unix manpages have both a SYNOPSIS and a DESCRIPTION section.
I am quite open to improved language in the overview, and less open to the idea that it is just baggage to throw overboard. In particular, I've had positive feedback on the material, despite perhaps overly turgid prose in some places. Please help to make it crisp.
I find the absence of overview (DESCRIPTION, if you like) sections in many a non-trivial Haskell library to be quite a barrier to working with the library; the synopses alone are rarely enough for my needs.
I agree. I do like a (not too verbose) introduction of concepts when reading the docs of a new module. See below for a proposal.
I may be wrong, but it looks to me as if this could be derived by adding one more method 'fromList' that is required to be a left inverse of 'toList':
fromList :: Foldable t => [a] -> t a
fromList . toList = id
This is roughly the sort of thing one can do with Traversable (recover the structure from its spine and element list, but not its element list alone). The point is that various non-linear (e.g. tree-like) structures with the same element order have distinct "spines". [...]
Is there a sensible (useful, lawful) Foldable instance which has no 'fromList'?
Sure, any tree-like structure where shape is not implied by the element list alone.
Sorry, you are right, of course. I wasn't thinking clearly.

Regarding me helping to improve the docs: I have made concrete proposals to re-word the descriptions. But I really think that a large part of the documentation you added comes down to saying:

"""
As specified, this class is mostly useless, since it does not allow you to reason about the choice of methods (e.g. 'foldr' vs. 'foldl') to use when working with a generic Foldable container. To make this choice, you have to make assumptions about whether it is left-leaning (like lists), right-leaning (like snoc-lists), or neither (unbiased). The usual assumption is that it is left-leaning or unbiased. This means that all methods can be considered as being defined, semantically, using 'toList' and the corresponding function with the same name from Data.List. (For fold and foldMap, see their default definitions.)

If you can assume the element type is a Monoid, you can get away with using only the unbiased functions 'fold' and 'foldMap'. This leaves the choice of implementation strategy to the container (instance).
"""

Cheers
Ben

--
I would rather have questions that cannot be answered, than answers that
cannot be questioned.  -- Richard Feynman

When you're writing generically for Foldable, if you want consistent big-O performance, use `foldMap` or `foldMap'` to project to a Monoid that is ideal for how you want to consume the structure, and suck up the larger constant factors.
If you want the best performance for any structure, use the specialized methods (`sum`, `elem`, &c.).
And if you're consuming the structure in an intrinsically biased way, use an intrinsically biased fold.
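For example (a sketch; the helper names are mine, and `foldMap'` needs base >= 4.13):

  import Data.Foldable (foldMap')
  import Data.Monoid (Any (..), Sum (..))

  -- a generic 'any': the Any monoid matches the short-circuiting
  -- consumption we want, at the cost of some wrapping
  anyOf :: Foldable t => (a -> Bool) -> t a -> Bool
  anyOf p = getAny . foldMap (Any . p)

  -- a strict sum via foldMap': one pass, accumulator forced as it goes
  sumOf :: (Foldable t, Num a) => t a -> a
  sumOf = getSum . foldMap' Sum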
-- Keith

On Fri, Oct 01, 2021 at 11:52:53AM +0200, Ben Franksen wrote:
Well, a balanced Tree can have an efficient corecursive foldl, or a performant `foldr'`, and Sets can know their size statically, and `elem` runs in linear time even in structures that potentially support faster search.
All true, and I think it is important to document these things. The question is: where?
I disagree that everything one should know about Data.Foldable is adequately described in Data.List. At least not without a new overview for Data.List that would cover some of the same ground in that specialised context, and could then be imported by reference.

A reader who wants to better understand folds should learn the difference between strict reduction and corecursion, and certainly Data.List is not the best place to discuss tips for construction of Foldable instances.

Perhaps the overview could start with a concise version that explains thinking about folds in terms of lists, and notes quickly that one can typically get by with understanding "foldr", "foldl'" and foldMap. But ultimately one should understand why foldl', how to define instances, why `elem` is stuck doing linear lookup for `Set`, ... Would you like to contribute the "short version" introductory text for the impatient?

Different readers will come to the documentation for different needs; most will come for just the synopses and won't read the Overview, and that's fine. If there's a need for a shorter blurb, please contribute.

Perhaps the best path forward is to get MR 6555 done and dusted, and then additional MRs can be filed on top of that by those who'd like to see further improvements?

-- Viktor.

On 01.10.21 at 19:43, Viktor Dukhovni wrote:
On Fri, Oct 01, 2021 at 11:52:53AM +0200, Ben Franksen wrote:
Well a balanced Tree can have an efficient corecursive foldl, or a performant 'foldr`', and Sets can know their size statically, and `elem` runs in linear time even in structures that potentially support faster search.
All true, and I think it is important to document these things. The question is: where?
I disagree that everything one should know about Data.Foldable is adequately described in Data.List. At least not without a new overview for Data.List that would cover some of the same ground in that specialised context, and could then be imported by reference.
A reader who wants to better understand folds should learn the difference between strict reduction and corecursion, and certainly Data.List is not the best place to discuss tips for construction of Foldable instances.
I already admitted elsewhere that my initial position (reduce semantics to that of lists) was too idealistic. My remark above was about attaching the documentation of runtime behaviors for specific instances to those instances.
But ultimately one should understand why foldl',
For lists this is explained in Data.List (though it could perhaps use a bit of elaboration). For the general case, as mentioned before by me and others, the question cannot be answered unless you know which Foldable you are dealing with, or at least make certain assumptions about how instances are implemented.
how to define instances,
How do I define Foldable for snoc-lists? There are two choices:

- Isomorphic to that for lists, i.e. from right to left, to conform to common expectations about runtime/bottom behavior for the left/right folds?
- Or from left to right, such that foldr is problematic and foldr' the recommended one?
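In foldMap terms, the two choices might look like this (a sketch; the types and names are invented for illustration):

  data Snoc a = Nil | Snoc a :| a   -- Nil :| 1 :| 2 represents [1,2]

  -- choice 1: iterate right-to-left, so cost and laziness mirror
  -- ordinary lists (the outermost element plays the role of the head)
  newtype RightToLeft a = RightToLeft (Snoc a)
  instance Foldable RightToLeft where
    foldMap _ (RightToLeft Nil)       = mempty
    foldMap f (RightToLeft (xs :| x)) = f x <> foldMap f (RightToLeft xs)

  -- choice 2: respect left-to-right element order; now foldr has to
  -- walk the whole spine before yielding anything
  newtype LeftToRight a = LeftToRight (Snoc a)
  instance Foldable LeftToRight where
    foldMap _ (LeftToRight Nil)       = mempty
    foldMap f (LeftToRight (xs :| x)) = foldMap f (LeftToRight xs) <> f x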
why `elem` is stuck doing linear lookup for `Set`, ...
Agreed, the docs should definitely mention that neither `elem` nor in fact any other method can be better than linear.
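For example (sketch, using containers' Data.Set):

  import qualified Data.Set as Set

  -- both components are always equal, but the first is O(n):
  -- Foldable's 'elem' has only an Eq constraint, so the instance
  -- cannot exploit the set's ordering; Set.member is O(log n)
  lookupBoth :: Int -> Set.Set Int -> (Bool, Bool)
  lookupBoth x s = (x `elem` s, x `Set.member` s)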
Different readers will come to the documentation for different needs,
By all means, add advice for writing instances (under a heading that says so). However, I claim that the vast majority of readers will want to know how to use the class methods in their own code, or to understand why some code they are reading uses a specific method. This is what the bulk of the docs should be about.

Unfortunately there doesn't seem to be consensus in the community about the general semantics of the left/right associative folds. Contributing to the docs makes no sense for me until these questions are resolved.

Cheers
Ben

--
I would rather have questions that cannot be answered, than answers that
cannot be questioned.  -- Richard Feynman
participants (7):

- Ben Franksen
- David Feuer
- Dominik Schrempf
- Keith
- Michael Turner
- Tom Ellis
- Viktor Dukhovni