Better writing about Haskell through multi-metaphor learning

Well, this is certainly blowing up into something bigger than I thought it would. (At one point last night, 12 people had my "Haskell in Plain English" essay/chapter open on Google Docs, a very unintentional side effect.)

Me: "I finally got (most of?) what monads really mean for practical programming when none other than Simon Peyton-Jones said F# called a very similar construct "workflow," a word he likes."

A response: "Was it literally just a single sentence introducing a new word for a concept that made you "get it"? Could you elaborate? This is really quite remarkable."

Richard Eisenberg: "For me, coming from a mostly Java background (but with a healthy dollop of functional programming thrown in the mix -- but no Haskell), the phrase that unlocked monads was "programmable semicolon"."

I think one explanation for the Monad Tutorial Explosion (and how it still didn't work for a lot of people) is that there's no single right metaphor that will finally make monads "click". It depends on what combination of conceptions (and perhaps misconceptions to be dispelled) is floating around in a given mind. And minds vary.

Workflow is a metaphor at one useful level. "Programmable semicolon" is a metaphor at another useful level. The semicolon as imperative-language separator actually hides a lot of what's going on in a real CPU. Even if the imperative language is assembly language, the compiled instructions go to an instruction fetch/decode unit that -- these days especially -- can be engaged in a bewildering number of tasks: finishing up several previous instructions, translating the decoded instruction to what amounts to microcode that gets sent to an instruction cache, and fetching the next few instructions even as it's decoding the instruction just fetched into a little to-do list for the CPU. Perhaps we can think of instruction fetch/decode as "monadal" -- it's wildly (but not chaotically) mapping a previous state of the machine into a new one.
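To make "programmable semicolon" concrete, here is a minimal Haskell sketch (my own illustration, not from the thread): in do-notation, each line break or semicolon desugars to (>>=), and for Maybe that operator is what slips a Nothing-check between steps.

```haskell
-- halve succeeds only on even numbers; Maybe's (>>=) threads the
-- possibility of failure from one step to the next
halve :: Int -> Maybe Int
halve n = if even n then Just (n `div` 2) else Nothing

-- do-notation: the line breaks are the "programmable semicolons"
quarter :: Int -> Maybe Int
quarter n = do
  h <- halve n
  halve h

-- the same thing desugared: the "semicolon" is literally (>>=)
quarter' :: Int -> Maybe Int
quarter' n = halve n >>= halve
```

Here quarter 12 succeeds, but quarter 6 fails at the second step, because the "semicolon" itself checked for Nothing.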
After all, it's all still Turing machines down there somewhere (though even the Turing machine was mathematically defined as a function that maps one tape-state to the next while also mapping its internal state to a new state). So the semicolon can be thought of as lexically exposing the fetch-decode operator in the CPU's own little program.

The "workflow" metaphor is at a higher level. If you think of how incoming data can be sorted and parceled out as if to a number of desks in an office, with forms processed at a desk going from one outbox to another desk's inbox, you're ignoring exact sequencing. It's the view where you're labeling desks by their specialized tasks and drawing outbox-inbox connections. In this view, it matters little that, in the one-instruction-at-a-time CPU model, there's only one "clerk" who (ridiculously, I know) puts a form in an inbox, then takes it out again, puts it in the inbox of another desk, then sits down at that other desk, takes the same form out and does the very specific form-processing task assigned to that desk. You can instead imagine it as some sluggish bureaucracy where everyone but the single busy clerk is waiting for a form to hit their empty inboxes. The fact is, one-instruction-at-a-time CPUs hardly exist anymore except in the cheapest microcontrollers -- they're all multitasking like crazy. And this is because CPU designers work very hard to see how much of the little bureaucracy they can actually keep busy, ideally at all times, but at least at high throughput for average instruction streams, even if 100% utilization is seldom reached. (Lambda, the Ultimate Semicolon?)

Bryan: "Listing sequential actions one after the other is so intuitive you see it everywhere. Take baking recipes as one example. Or assembly code. Or poems."

Yes, sequencing just grows out of the fact of time. You'd think that poems aren't really sequences of tasks to do.
But in a way, they are, and in a sense that matters for my attempts at natural-language understanding. There IS an implicit imperative in the energy required to speak: the subtext of any statement addressed to another person (even when it's dead poet to living reader) is: think about what I'm saying. Each new word in a sentence being spoken maps the listener's mind-state to a new one. The purpose of the sequencing is to build up state in the listener's mind.

"When a Haskeller says, "Monads are great! They let you chain effectful actions together!" it can take a very long time to understand that they actually mean exactly what they're saying — the usual intuition is that sequencing actions can't possibly be a real problem, so you go round and round in circles trying to understand what "effectful" means, what "action" means, and how these oh-so-important "laws" have anything to do with it."

Yes, it's sort of like aliens from outer space with a very alien consciousness, and a different relation to the dimension of time, exulting (to each other, and to themselves, really, more than to you): "Periods at the ends of sentences are great! They enable these strange blobs of protoplasm who call themselves 'human' to more easily map the states of these internal things they call 'neurons' into new states. No wonder we've been having such trouble trying to tell them what to do, using our intuitively obvious gestaltic visual patterns that allow us to grasp, with our quantum-computer brains, a whole meaning at once, through our retinas." (Yes, I am riffing a little here on the SF movie Arrival.)

Math is this world where we can feel a little like that. Instead of thinking of a function as a machine that takes inputs, grinds away for a little while, and extrudes outputs, you can think of all inputs and outputs as already existing, and reason logically from the resulting properties.
Haskell's laziness helps you fake what you can't really have: infinite workspace and the ability to do everything in an instant. It's a fake-it-til-you-make-it language, and you don't have to be Buzz Lightyear going to infinity and beyond, perhaps all the way to the galaxy Aleph-omega. You just have to make a Haskell program do something that people like as much as their Buzz Lightyear doll.

The math view is a useful view at one level, but it can only help you so much. I greatly enjoyed the level where I could define a Turing machine as a mapper from one infinite tape to another, then imagine all possible Turing machines as a countably infinite set. I could then prove, with Cantor diagonalization (abstractly, at least, though not very intuitively), that there are things Turing machines can't do. I loved that. But I also listened to grad students mutter about how recursive function theory may be beautiful and all, but it's now classical, and therefore dead for their own practical purposes (getting a PhD, getting onto the tenure track). And by then I knew I wasn't good enough at math to make a go of computing theory in their world, as a career, no matter what. I think Simon Peyton-Jones had a similar experience of academia -- he wasn't a John Hughes -- but he found a way through anyway.

Bryan: "I can understand why that [puzzled newbie] viewpoint might be long-forgotten by those who were involved in the effort to solve it. :) And I appreciate that it was solved in such a general way that it can be applied to so many seemingly unrelated things! That's the beauty of mathematics!"

Yes, it's there. But in user experience design (and writings about Haskell ARE part of its user experience) there's a saying: the intuitive is the familiar. At some point in gaining sophistication, it's no longer intuitively obvious to you why something wasn't intuitively obvious to you before. You're so absorbed in where you are that you don't remember quite how you got there.
You might even assume that everyone who has gotten where you are has gotten there by the same route. And, if you can dimly remember how you got a flash of inspiration, you might end up writing yet another monad tutorial that few people understand.

"Workflow" helped me. And now "programmable semicolon" is helping a little, in another way. I presented a strategy at the NSMCon-2021 conference: think of NSM reductive paraphrasing as writing very complete and precise code for natural-language terms and grammar rules, using NSM primes and syntax as a kind of logic programming language -- one that might even be very simply translatable to Prolog. I did some coding experiments and saw a problem: viewed more procedurally -- i.e., with each line of an NSM explication adding to a conceptual schema of a natural-language meaning -- NSM wasn't single-assignment. A Prolog interpreter couldn't just try a match, looking at each statement of an explication in relative isolation, and bail out when a statement failed to be true in the given context of the attempted match. It had to drag some state along just to resolve what the NSM prime "this" referred to. (Think of "it" in GHCi.)

OK: effects. But wait, more: scoped effects. You want to strip some schema-stuff back out if a match failed midway through, then try another match somewhere else. OK: I need a context stack to store assertions that succeeded. Uh-oh, Prolog doesn't give me that paper trail. Dang. And I still want the one-for-one translation of sentences in an NSM reductive paraphrase to something like Prolog. OK: I need to grab control of sequencing, so that I can still use the intrinsic Horn-clause resolution without pain, but also have it hold onto the assertions that succeeded and knit them into the growing conceptual schema that's part of the state. AND I want it to roll those back out, transactionally, if a match didn't succeed.
It turns out that Prolog makes this not too much harder than, say, a Lisp with transactional memory. So: I think I might now return to my Prolog code and see if I was unconsciously reinventing some kind of monad, because I felt forced by Prolog's limits to write my own semicolon. It wasn't too hard, though the resulting code required for NSM explications looks clunkier now. (Prolog is very limited in syntax extensibility.) Maybe there's some way in which Haskell makes it easier still, and nicer-looking too.

None of which is to say that "programmable semicolon" will immediately help everyone pick the conceptual lock of monads. But for those who just bonk on "programmable semicolon", it might help to point out that the load-operate-store model of CPUs is a lie these days. Say a little about why it's not true. Then again pose the semicolon as a kind of operator. People like to think, "OK, this statement will finish, and the one after the semicolon will start." It's a convenient view, even though in fact, while the imperative statement is in progress, the CPU might have already fetched instructions beyond it, and scheduled them for speculative execution.

Or, you can tell them, "Just learn lambda calculus, then study monads algebraically, and here's a side dish of category theory while I'm at it. Bon appetit." How's that working for you, guys? It doesn't work for me. And I don't think it's because I can't do the math. It's that I often write code to see whether I'm thinking about a problem the right way. And is that so bad? I believe it was in Stewart Brand's "II Cybernetic Frontiers" where a researcher at Xerox PARC, Steve ("Slug") Russell, one of their star hackers, was described by his peers as "thinking with his fingertips." I'm no star hacker, but "workflow" feels like it will make my fingertips smarter. "Programmable semicolon" feels like it will make my fingertips smarter. You think top-down mathematically? Well, bon appetit. It's not to my taste.
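The transactional backtracking described above can in fact be sketched in Haskell with a standard monad-transformer stack. This is my own illustrative sketch, not the author's Prolog code; `assertOnce` is a hypothetical stand-in for asserting one NSM-style statement into a growing schema (here, just a list).

```haskell
import Control.Applicative (empty, (<|>))
import Control.Monad.Trans.State (StateT, get, put, runStateT)

-- Assert one statement: fail (empty) if it conflicts with the
-- schema built so far, otherwise record it in the threaded state.
assertOnce :: Eq a => a -> StateT [a] Maybe ()
assertOnce x = do
  schema <- get
  if x `elem` schema then empty else put (x : schema)

-- (<|>) on StateT [a] Maybe is transactional: if the left match
-- fails midway, the right alternative restarts from the *original*
-- state -- the partial assertions are rolled back automatically.
demo :: Maybe ((), [Int])
demo = runStateT ((assertOnce 1 >> assertOnce 1)     -- fails midway
                  <|> (assertOnce 1 >> assertOnce 2)) -- retried cleanly
                 []
```

The first branch asserts 1 and then fails on the duplicate; the second branch starts again from the empty schema, so the result is the schema [2, 1], with no leftover debris from the failed match.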
No single way is likely to work for all. Math is not some fundamental reality. It's not even reality -- in fact, it gains a lot of its power precisely from being imaginary. And it's just a bunch of metaphors too.

Regards,
Michael Turner
Executive Director
Project Persephone
1-25-33 Takadanobaba
Shinjuku-ku Tokyo 169-0075
Mobile: +81 (90) 5203-8682
turner@projectpersephone.org
Understand - http://www.projectpersephone.org/
Join - http://www.facebook.com/groups/ProjectPersephone/
Donate - http://www.patreon.com/ProjectPersephone
Volunteer - https://github.com/ProjectPersephone
"Love does not consist in gazing at each other, but in looking outward together in the same direction." -- Antoine de Saint-Exupéry

On Sat, Sep 18, 2021 at 11:56:37AM +0900, Michael Turner wrote:
Workflow is a metaphor at one useful level. "Programmable semicolon" is a metaphor at another useful level. [...] "Workflow" helped me. And now "programmable semicolon" is helping a little, in another way.
What troubles me about "workflow" and "programmable semicolon" is the question of whether they also apply to Arrow[1].

* If they do, then in what sense can "workflow" and "programmable semicolon" be said to help with understanding of *Monad*?

* If they don't, then why not? I don't see it.

One possible answer is that they do, but those descriptions are only intended to improve understanding about purpose, not about technical details (I think this is what Michael is suggesting).

Tom

[1] https://www.stackage.org/haddock/lts-17.7/base-4.14.1.0/Control-Arrow.html

Am 18.09.21 um 11:19 schrieb Branimir Maksimovic:
This is monad: f(g(h(…)))
To make sequence out of functions.
Ah, no. That's just function composition. A monad can do more -- e.g. Maybe interleaves the functions with a check for whether it was passed Nothing. Regards, Jo
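Jo's distinction can be sketched in a few lines of Haskell (my own example, not from the thread): composition always runs every stage, while Maybe's (>>=) inserts a failure check between stages.

```haskell
-- Plain composition f (g (h x)): every stage always runs.
compose3 :: Int -> Int
compose3 = (+ 1) . (* 2) . subtract 3

-- Maybe's (>>=) interleaves a Nothing-check between stages,
-- which bare composition cannot express.
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x

firstOfFirst :: [[a]] -> Maybe a
firstOfFirst xss = safeHead xss >>= safeHead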

I don't see what that has to do with Monad. It's just what you pass as a return value. Do { … } -- that's just syntactic sugar, to look nicer. Greetings, Branimir.
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

Am 18.09.21 um 14:11 schrieb Branimir Maksimovic:
I don't see what that has to do with Monad. It's just what you pass as a return value.
Nitpick: "return" and "is" are one and the same in Haskell; there's no software-detectable difference between an expression and its result in safe Haskell.
Do { … } -- that's just syntactic sugar, to look nicer.
The do syntax is syntactic sugar alright. However, it's not just the do block that's a monad, its top-level subexpressions must be monads as well. Regards, Jo

I wasn't thinking of the return function; I meant "return" as a word -- what you return from a function.
Once a Monad, always a Monad, am I right? A monad is also a way to keep state, and a monad passes state between functions? The IO monad is just a convenient way to describe side effects. I mean, when you see IO, you always think side effects.
Greetings, Branimir.
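Branimir's "a monad passes state between functions" can be made concrete with the State monad (a small sketch of my own, using the transformers library):

```haskell
import Control.Monad.Trans.State (State, evalState, get, put)

-- Each call reads the counter and updates it for the next call;
-- the monad threads the Int state between the two invocations.
label :: String -> State Int String
label s = do
  n <- get
  put (n + 1)
  return (show n ++ ": " ++ s)

labels :: [String]
labels = evalState (mapM label ["x", "y"]) 0
```

Neither call to label mentions the other, yet the second sees the counter the first left behind: that is the state-passing being done by (>>=).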

Other understandings of informal terms will differ, but here is how I see it:

* "Workflow" applies equally well to Monad as to Arrow. Both capture the idea of information flowing from one computation to the next.

* "Programmable semicolon" seems a better fit for Monad than for Arrow. In C, I can say:
    int result = do_some_thing();
    if (result > 10)
        big_result();
    else
        small_result();
That is, the future flow of my computation can depend on an earlier result. This is possible with Monad but not with Arrow. So, if I had thought that Arrow was a programmable semicolon, I would have felt a bit short-changed when actually learning what Arrow was.

Popping up a level: for me, the value in these ideas is not their precision, but rather their familiarity. Give me intuition first, correctness later.

Sidenote (please feel free to skip!): In later secondary school and throughout university, I kept learning things that refined earlier knowledge -- indeed suggesting that earlier knowledge was wrong. For example, radians are *much* more convenient to work with than degrees, which are kind of ridiculous. I resolved then that I would teach my kids how to measure angles in radians from the start: why bother with the inferior unit? Now I have a daughter, who is learning degrees. Why? Because degrees are simpler to start with: we can describe a circle or a right angle without any irrational numbers! So, we build intuition with degrees, and then she'll switch to radians when the time comes. Another example: we now know that objects do not accelerate parabolically under the influence of gravity, because of general relativity. But general relativity is hard, so we still teach about parabolic curves for objects in freefall. Once a student knows more physics, they can return and refine their knowledge.

So can it be with Haskell. We sometimes get so caught up in being "right" and precise that we forget that it is sometimes useful to paper over some details for the sake of intuition. This means we sometimes have to say "wrong" things. (For example, I have said that "IO Int" is just an Int tagged with a flag that says the function can do I/O and must be written with do-notation.) But if the goal is learning, sometimes the "wrong" thing is the right thing.

Richard
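Richard's C fragment transliterates directly into Haskell (all names here are hypothetical stand-ins): the *choice* of the next computation depends on an earlier result, which is exactly what Monad's (>>=) permits and what plain Arrow combinators cannot express.

```haskell
-- Hypothetical stand-ins for the C functions, using Either as the monad.
doSomeThing :: Either String Int
doSomeThing = Right 12   -- assumed earlier result

bigResult, smallResult :: Int -> Either String Int
bigResult r   = Right (r * 2)
smallResult r = Right (r + 1)

-- The lambda bound by (>>=) inspects the earlier result and picks
-- which computation runs next -- the monadic "programmable semicolon".
run :: Either String Int
run = doSomeThing >>= \result ->
        if result > 10 then bigResult result else smallResult result
```

With Arrow alone, the shape of the pipeline is fixed before any value flows through it; here the shape is decided mid-flight.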

Best of all is to avoid being too wrong in your teaching. No, don't teach
children radians, but don't teach them degrees either (except on the side).
Teach them *turns*, and later explain that a degree is 1/360th of a turn,
and still later that a radian is 1/(2π) turns.

Now I'm confused. How are degrees or radians "wrong"?

Degrees aren't wrong (though I agree I implied this in my description), but they're not as useful in engineering as radians, nor as useful as turns (as David rightly suggests) in everyday applications. Richard

Neither is really wrong. Degrees are strange, an artifact of the Babylonian
system with no real mathematical significance. Radians are perfectly fine,
and mathematically preferable, but measurements in radians typically have
to be given as multiples of some transcendental number. Traditionally, that
number is π. Didactically, it's probably better to use 𝜏 = 2π, which in
context can be pronounced "turns". You don't have to tell a child about the
"radians" part till they're ready, and don't have to confuse them with 360
anythings. Just teach important angles like 𝜏, ½𝜏, ¼𝜏, ⅛𝜏, ⅙𝜏, and
(1/12)𝜏.
Bringing this back to the Haskell context, it's nice to find a way to avoid
lying without needing to tell all of the truth to someone who's not ready.
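David's definitions are simple enough to state as code (a trivial sketch of my own): a degree is 1/360 of a turn, and a radian is 1/(2π) turns.

```haskell
-- Angle conversions taking "turns" as the primary unit.
turnsToDegrees :: Double -> Double
turnsToDegrees t = t * 360        -- a degree is 1/360 of a turn

turnsToRadians :: Double -> Double
turnsToRadians t = t * 2 * pi     -- a radian is 1/(2*pi) turns
```

So a quarter turn is 90 degrees, and a full turn is 𝜏 = 2π radians, with no 360 in sight until you want it.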
On Sun, Sep 19, 2021, 6:50 PM Mig Mit
Now I'm confused. How are degrees or radians "wrong"?
Sent from my iPad
On 2021. Sep 20., at 0:34, David Feuer
wrote: Best of all is to avoid being too wrong in your teaching. No, don't teach children radians, but don't teach them degrees either (except on the side). Teach them *turns*, and later explain that a degree is 1/360th of a turn, and still later that a radian is 1/(2π) turns.
On Sun, Sep 19, 2021, 5:07 PM Richard Eisenberg
wrote: Other understandings of informal terms will differ, but here is how I see it:
* "Workflow" applies equally well to Monad as to Arrow. Both capture the idea of information flowing from one computation to the next.
* "Programmable semicolon" seems a better fit for Monad than for Arrow. In C, I can say:
int result = do_some_thing(); if (result > 10) big_result(); else small_result();
That is, the future flow of my computation can depend on an earlier result. This is possible with Monad but not with Arrow. So, if I had thought that Arrow was a programmable semicolon, I would have felt short-changed a bit when actually learning what Arrow was.
Popping up a level: for me, the value in these ideas is not their precision, but rather their familiarity. Give me intuition first, correctness later.
Sidenote (please feel free to skip!): In later secondary school and throughout university, I kept learning things that refined earlier knowledge -- indeed suggesting that earlier knowledge was wrong. For example, radians are *much* more convenient to work with than degrees, which are kind of ridiculous. I resolved then that I would teach my kids how to measure angles in radians from the start: why bother with the inferior unit? Now I have a daughter, who is learning degrees. Why? Because degrees are simpler to start with: we can describe a circle or a right angle without any irrational numbers! So, we build intuition with degrees, and then she'll switch to radians when the time comes. Another example: we now know that object do not accelerate parabolically under the influence of gravity, because of general relativity. But general relativity is hard, so we still teach about parabolic curves for objects in freefall. Once a student knows more physics, they can return and refine their knowledge.
So can it be with Haskell. We sometimes get so caught up in being "right" and precise, that we forget that it is sometimes useful to paper over some details for the sake of intuition. This means we sometimes have to say "wrong" things. (For example, I have said that "IO Int" is just an Int tagged with a flag that says the function can do I/O and must be written with do-notation.) But if the goal is learning, sometimes the "wrong" thing is the right thing.
Richard
On Sep 18, 2021, at 4:00 AM, Tom Ellis < tom-lists-haskell-cafe-2017@jaguarpaw.co.uk> wrote:
On Sat, Sep 18, 2021 at 11:56:37AM +0900, Michael Turner wrote:
Workflow is a metaphor at one useful level. "Programmable semicolon" is a metaphor at another useful level. [...] "Workflow" helped me. And now "programmable semicolon" is helping a little, in another way.
What troubles me about "workflow" and "programmable semicolon" is the question of whether they also apply to Arrow[1].
* If they do, then in what sense can "workflow" and "programmable semicolon" be said to help with understanding of *Monad*?
* If they don't, then why not? I don't see it.
One possible answer is that they do, but those descriptions are only intended to improve understanding about purpose, not about technical details (I think this is what Michael is suggesting).
Tom
[1] https://www.stackage.org/haddock/lts-17.7/base-4.14.1.0/Control-Arrow.html

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.
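(Editorial aside: Tom's question can be made concrete, since every Monad gives rise to an Arrow via Kleisli. A minimal sketch with made-up helper names; the point is that any "semicolon" intuition about the monadic chain transfers directly to the Arrow version.)

```haskell
import Control.Arrow (Kleisli (..), (>>>))
import Control.Monad ((>=>))

-- Two monadic "statements" in Maybe: halving fails on odd input.
halve :: Int -> Maybe Int
halve n = if even n then Just (n `div` 2) else Nothing

-- Sequenced with monadic Kleisli composition (>=>) ...
viaMonad :: Int -> Maybe Int
viaMonad = halve >=> halve

-- ... and the same two steps packaged as an Arrow.
viaArrow :: Kleisli Maybe Int Int
viaArrow = Kleisli halve >>> Kleisli halve

main :: IO ()
main = do
  print (viaMonad 12)             -- Just 3
  print (runKleisli viaArrow 12)  -- Just 3
  print (viaMonad 6)              -- Nothing: 3 is odd, second stage fails
```

Both spellings describe the same workflow, which is perhaps evidence for Tom's first alternative: the metaphors do apply to Arrow too, at the level of purpose rather than technical detail.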

On 19.09.21 at 23:04, Richard Eisenberg wrote:
Popping up a level: for me, the value in these ideas is not their precision, but rather their familiarity. Give me intuition first, correctness later.
[...] Another example: we now know that objects do not accelerate parabolically under the influence of gravity, because of general relativity. But general relativity is hard, so we still teach about parabolic curves for objects in freefall. Once a student knows more physics, they can return and refine their knowledge.
So can it be with Haskell. We sometimes get so caught up in being "right" and precise, that we forget that it is sometimes useful to paper over some details for the sake of intuition.
I agree that you need to have a first working model initially, with all the details coming later.

In the case of monads, I think that the usual initial model (a monad is a pipeline) is insufficient even at the novice level. I'll argue for that from observation, and from reasoning.

Observation: Many novices come up with questions about monads. It's the number one stumbling block, and the thing that keeps people turning away as in "that's all too mathematical". Compare this with Java: public static void main(String... args), which floods the user with four advanced concepts (public, static, void results, varargs parameters), yet gives rise more to ridicule than frustration. More importantly, each of these concepts immediately points to something that you can explore and understand separately.

Reasoning: Most expositions of monads start with some kind of pipeline concept. The problem with that is that it's misleading without giving people a clue where they're going wrong. E.g. think "pipeline" and hit Maybe and Error. These concepts don't "look like" a pipeline, more like a variation of function composition - but why is it then a monad instead of function composition? List is even more difficult to fit into the pipeline metaphor. Using List's monad properties is mostly interesting for tracing all cases of nondeterministic behaviour, but you very rarely do that in code, so it's not a big deal - but it is still a cognitive dissonance for the student, a sore spot in the mental model.

Even before these two, you see IO. You can easily get away with the pipeline model at first when you just do output, but as soon as you have input and decision-making (i.e. the output becomes a decision tree, each run of the program following one path from the root to a leaf - actually it's a DAG, and nonterminating programs don't necessarily have leaves), the pipeline model is not a good fit anymore.

(I very much like David Feuer's idea of teaching angles in terms of "turns". It carefully avoids putting the student on a wrong track. Since radians are taught waaay after division by fractions, even the division by an irrational number that you need to "explain" radians isn't such a big deal - i.e. it also orders concepts in an order where each concept can be explained based on things the student already knows.)

Just my 2 cents, and based on more thinking than programming, so it may not be worth much.

Regards, Jo
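(Editorial aside: Jo's Maybe objection can be made concrete. A minimal sketch with made-up names: the Maybe "pipeline" behaves like function composition with early exit, since any stage can abort the whole chain, which ordinary plumbing imagery doesn't suggest.)

```haskell
-- Stage 1: may fail to find a value at all.
lookupAge :: String -> Maybe Int
lookupAge "alice" = Just 42
lookupAge _       = Nothing

-- Stage 2: may reject the value it is given.
checkAdult :: Int -> Maybe Int
checkAdult n = if n >= 18 then Just n else Nothing

process :: String -> Maybe Int
process name = lookupAge name >>= checkAdult

main :: IO ()
main = do
  print (process "alice")  -- Just 42
  print (process "bob")    -- Nothing: stage 2 is never reached
```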

FYI, it's not my idea; I just figured I could slot it in. I have no idea what pipelines might mean in this context. Something to do with Unix utility composition?

On 20.09.21 at 08:17, David Feuer wrote:
FYI, it's not my idea; I just figured I could slot it in. I have no idea what pipelines might mean in this context. Something to do with Unix utility composition?
The idea is that a monad is a tool to construct a pipeline inside a Haskell program. I think. It isn't my metaphor either; I have seen various people in this thread mention it in passing.

Regards, Jo

On 20 Sep 2021, at 2:30 am, Joachim Durchholz wrote:

The idea is that a monad is a tool to construct a pipeline inside a Haskell program. I think. It isn't my metaphor either; I have seen various people in this thread mention it in passing.
A "pipeline" is not necessarily a *static* pipeline. Even in "bash" we have conditional branching: $ seq 1 99 | while read i; do if (( i % 10 == 1 )); then seq $i | grep 0 | fmt; fi; done 10 10 20 10 20 30 10 20 30 40 10 20 30 40 50 10 20 30 40 50 60 10 20 30 40 50 60 70 10 20 30 40 50 60 70 80 10 20 30 40 50 60 70 80 90 So IO and State and Maybe and Either all fit. As noted a few times in this thread List takes multiple paths. The "non-determinism" (something that not always explained well, since taking all paths is still rather deterministic) extends the simple pipeline model. The computation is still sequenced, it is just that a given stage can invoke its "continuation" multiple times. An intersting minimal model for the list monads is "jq", which is a small functional language for manipulating JSON values. It is basically the List monad for JSON, but both evaluation and function composition are written as simple-looking pipelines: value | function1 | function2 ... But more advanced "jq" users know that a value or function can be a stream (generator) of values: (value1, value2, ...) | function1 | function2 ... $ jq -n '(1,2,3,4) | select(. % 2 == 0) | range(.) | .+100' 100 101 100 101 102 103 So this is isomorphic to Haskell's List monad. And yet, in part because typically there's only one value on the left, and only one output on the right, the naive "pipeline" model is a fairly good fit if, as needed, one also admits "non-determinism". Where pipelines really don't appear to me to be an adequate mental model is "Cont". Reifying continuations feels very different from even a non-deterministic pipeline. Even "programmable semicolon" likely does not yield much intuition about "Cont". But, to first approximation, and without suffering any harm thereby, pipelines have worked well for me (and I expect others). 
I know that's not the whole/real story, I don't need the model as a crutch, but it works well enough, often enough to remain a useful way of reasoning about the many situations in which it is applicable. -- Viktor.
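(Editorial aside: Viktor's jq example translates almost mechanically into the List monad. A sketch; the `jqLike` name is made up.)

```haskell
import Control.Monad (guard)

-- The List-monad analogue of:
--   jq -n '(1,2,3,4) | select(. % 2 == 0) | range(.) | .+100'
jqLike :: [Int]
jqLike = do
  x <- [1, 2, 3, 4]    -- the generator (1,2,3,4)
  guard (even x)       -- select(. % 2 == 0)
  y <- [0 .. x - 1]    -- range(.)
  pure (y + 100)       -- .+100

main :: IO ()
main = print jqLike    -- [100,101,100,101,102,103]
```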

On 20.09.21 at 19:50, Viktor Dukhovni wrote:
On 20 Sep 2021, at 2:30 am, Joachim Durchholz wrote:

The idea is that a monad is a tool to construct a pipeline inside a Haskell program. I think. It isn't my metaphor either; I have seen various people in this thread mention it in passing.
A "pipeline" is not necessarily a *static* pipeline. Even in "bash" we have conditional branching:
Sure, but we're talking about imagery in the heads of people. And few see things like conditional pipelines.
So IO and State and Maybe and Either all fit.
In a sense, yes. But it requires an additional step - the pipeline itself is active and can be involved in decision-making. BTW I don't think that IO is really a pipeline. If you have inputs and decision-making, you have a decision tree. I believe the program takes just one path through that tree, but I don't understand IO well enough to validate that assumption.
As noted a few times in this thread List takes multiple paths. The "non-determinism" (something that not always explained well, since taking all paths is still rather deterministic) extends the simple pipeline model.
Yeah, actually it's fully deterministic, my wording was a bit too sloppy.
Where pipelines really don't appear to me to be an adequate mental model is "Cont". Reifying continuations feels very different from even a non-deterministic pipeline. Even "programmable semicolon" likely does not yield much intuition about "Cont".
But, to first approximation, and without suffering any harm thereby, pipelines have worked well for me (and I expect others). I know that's not the whole/real story, I don't need the model as a crutch, but it works well enough, often enough to remain a useful way of reasoning about the many situations in which it is applicable.
Well, I believe that an image should clearly convey the limits of its usefulness, if only with a link to a "full explanation" article. Very few Monad tutorials do that, which I think is a shame.

Also, my model has been pretty easy - it's a variation of associativity. "Variation" because the operands are functions, and the result type of a left operand needs to match the parameter type of a right operand. (That's why it's "pseudo"-associative.) Now with these constraints, the most natural thing to do is function composition, possibly with default functions sandwiched in. If the operand type is a rich parameterized type, it can do more, to the point that the overall semantics is dominated by the operators instead of what the operands do. I also don't know if there's anything practically useful on that path.

And, of course, I may be totally misunderstanding what a monad actually is. All I know is that this variation-of-associativity model seems to fit the monad laws more closely than the pipeline model, so _if_ it is correct, I'd find it more useful than the pipeline model, because it tells me what things beyond a pipeline I could do with it.

Regards, Jo
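(Editorial aside: Jo's variation-of-associativity model is essentially the standard observation that the monad laws, stated via Kleisli composition (>=>), are category laws: identity plus associativity. Below is a spot-check in Maybe with made-up functions, evidence rather than a proof.)

```haskell
import Control.Monad ((>=>))

-- The monad laws, written with Kleisli composition:
--   pure >=> f        ==  f                  (left identity)
--   f    >=> pure     ==  f                  (right identity)
--   (f >=> g) >=> h   ==  f >=> (g >=> h)    (associativity)

f, g, h :: Int -> Maybe Int
f x = if x > 0 then Just (x + 1) else Nothing
g x = Just (x * 2)
h x = if even x then Just (x - 3) else Nothing

main :: IO ()
main = do
  let xs = [-2 .. 5]
  print (map ((f >=> g) >=> h) xs == map (f >=> (g >=> h)) xs)  -- True
  print (map (pure >=> f) xs == map f xs)                       -- True
  print (map (f >=> pure) xs == map f xs)                       -- True
```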

On 18/09/2021 04.56, Michael Turner wrote: [--snip--]

Just to add: I'm not in education per se, but I've read a bit of the research on it, and current research (as of 5-10 years ago) supports the idea that learning is helped most by multiple ways of explaining $THING_TO_BE_LEARNED. (NB: This is *not* Learning Styles, which is not a thing.)

I'm guessing the proliferation of monad tutorials comes from people thinking that the last explanation they read was "critical". Research suggests it's more that that explanation *happened* to be the one that finally made everything click into place.

TL;DR: There is no magic One True Monad Tutorial. People need to learn about Monads (etc.) from multiple angles.

Cheers,

Am 18.09.21 um 04:56 schrieb Michael Turner:
Or, you can tell them, "Just learn lambda calculus, then study monads algebraically, and here's a side dish of category theory while I'm at it. Bon appetit." How's that working for you, guys? It doesn't work for me. And I don't think it's because I can't do the math. It's that I often write code to see whether I'm thinking about a problem the right way.
Same here.
I'm no star hacker, but "workflow" feels like it will make my fingertips smarter. "Programmable semicolon" feels like it will make my fingertips smarter. You think top-down mathematically?
I'd venture that most mathematicians don't use such a purely formal top-down approach when thinking about a problem. Intuition about concepts is very important. Mathematical writing often gives the wrong impression that in order to understand a new concept, you just have to read the formal definition and then make logical deductions. This is not how it works in practice, and those who are honest will readily admit that. You need to study concrete examples, and you need to work through exercises to develop true understanding.

As for monads, even though I do like the "overloaded semicolon" metaphor, the crucial point is not the sequencing as such, but rather how to express capturing and referencing intermediate results *in a statically typed fashion*.

Cheers
Ben

--
I would rather have questions that cannot be answered, than answers that cannot be questioned. -- Richard Feynman
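(Editorial aside: Ben's point about capturing intermediate results in a statically typed fashion can be seen in a few lines. A sketch with made-up names, using readMaybe from base's Text.Read.)

```haskell
import Text.Read (readMaybe)

-- do-notation is not just sequencing: each <- names an intermediate
-- result, and every such binding is statically typed.
parseAndAdd :: String -> String -> Maybe Int
parseAndAdd a b = do
  x <- readMaybe a   -- x :: Int, captured from the first step
  y <- readMaybe b   -- y :: Int, captured from the second step
  pure (x + y)       -- both intermediates are in scope, fully typed

main :: IO ()
main = do
  print (parseAndAdd "2" "40")   -- Just 42
  print (parseAndAdd "2" "oops") -- Nothing
```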
participants (10)
- Bardur Arantsson
- Ben Franksen
- Branimir Maksimovic
- David Feuer
- Joachim Durchholz
- Michael Turner
- Mig Mit
- Richard Eisenberg
- Tom Ellis
- Viktor Dukhovni