
Well, this is certainly blowing up into something bigger than I thought it would. (At one point last night, 12 people had my "Haskell in Plain English" essay/chapter open on Google Docs, a very unintentional side effect.)

Me: "I finally got (most of?) what monads really mean for practical programming when none other than Simon Peyton-Jones said F# called a very similar construct "workflow," a word he likes."

A response: "Was it literally just a single sentence introducing a new word for a concept that made you "get it"? Could you elaborate? This is really quite remarkable."

Richard Eisenberg: "For me, coming from a mostly Java background (but with a healthy dollop of functional programming thrown in the mix -- but no Haskell), the phrase that unlocked monads was "programmable semicolon"."

I think one explanation for the Monad Tutorial Explosion (and for how it still didn't work for a lot of people) is that there's no single right metaphor that will finally make monads "click". It depends on what combination of conceptions (and perhaps misconceptions to be dispelled) is floating around in a given mind. And minds vary.

"Workflow" is a metaphor at one useful level. "Programmable semicolon" is a metaphor at another useful level.

The semicolon as imperative-language separator actually hides a lot of what's going on in a real CPU. Even if the imperative language is assembly language, the compiled instructions go to an instruction fetch/decode unit that -- these days especially -- can be engaged in a bewildering number of tasks at once: finishing up several previous instructions, translating the decoded instruction into what amounts to microcode that gets sent to an instruction cache, and fetching the next few instructions even as it decodes the current one into a little to-do list for the CPU. Perhaps we can think of instruction fetch/decode as "monadal" -- it's wildly (but not chaotically) mapping a previous state of the machine into a new one. After all, it's all still Turing machines down there somewhere (and even the Turing machine was mathematically defined as a function that maps one tape-state to the next while also mapping its internal state to a new state). So the semicolon can be thought of as lexically exposing the fetch-decode operator in the CPU's own little program.

The "workflow" metaphor is at a higher level. If you think of how incoming data can be sorted and parceled out as if to a number of desks in an office, with forms processed at one desk going from its outbox to another desk's inbox, you're ignoring exact sequencing. It's the view where you label desks by their specialized tasks and draw outbox-to-inbox connections. In this view, it matters little that, in the one-instruction-at-a-time CPU model, there's only one "clerk" who (ridiculously, I know) puts a form in an inbox, then takes it out again, puts it in the inbox of another desk, then sits down at that other desk, takes the same form out, and does the very specific form-processing task assigned to that desk. You can instead imagine it as some sluggish bureaucracy where everyone but the single busy clerk is waiting for a form to hit their empty inbox. The fact is, "one-instruction-at-a-time" CPUs hardly exist anymore except in the cheapest microcontrollers -- they are all multitasking like crazy. And this is because CPU designers work very hard to see how much of that little bureaucracy they can actually keep busy, ideally at all times, but at least at high throughput for average instruction streams, even if 100% utilization is seldom reached.

(Lambda, the Ultimate Semicolon?)
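To make that concrete, here is a minimal sketch (my own illustration, not anything from the thread) of a tiny state-threading monad. The names Machine, load, and add are invented for the example. The point is just that each monad defines its own (>>=), and (>>=) is the "semicolon" that do-notation desugars to -- here it plays the role of the fetch-decode step, mapping one machine state to the next:

-- A "machine step": a function from the current state to a result
-- plus the next state.
newtype Machine s a = Machine { step :: s -> (a, s) }

instance Functor (Machine s) where
  fmap f m = Machine $ \s -> let (a, s') = step m s in (f a, s')

instance Applicative (Machine s) where
  pure a = Machine $ \s -> (a, s)
  mf <*> ma = Machine $ \s ->
    let (f, s')  = step mf s
        (a, s'') = step ma s'
    in (f a, s'')

instance Monad (Machine s) where
  -- The "programmable semicolon": run one step, then feed the
  -- resulting machine state to whatever comes after the semicolon.
  m >>= k = Machine $ \s -> let (a, s') = step m s in step (k a) s'

-- Two toy "instructions" over a single Int register:
load :: Int -> Machine Int ()
load n = Machine $ \_ -> ((), n)

add :: Int -> Machine Int ()
add n = Machine $ \s -> ((), s + n)

-- Each line break below is a (>>=) in disguise.
program :: Machine Int ()
program = do
  load 1
  add 2
  add 3

main :: IO ()
main = print (snd (step program 0))   -- prints 6

Swap in Maybe for Machine and the same do-block gets a different semicolon, one that bails out at the first Nothing. As far as I can tell, that is all "programmable semicolon" is claiming.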
Bryan: "Listing sequential actions one after the other is so intuitive you see it everywhere. Take baking recipes as one example. Or assembly code. Or poems."

Yes, sequencing just grows out of the fact of time. You'd think that poems aren't really sequences of tasks to do. But in a way they are, and in a sense that matters for my attempts at natural-language understanding. There IS an implicit imperative in the energy required to speak: the subtext of any statement addressed to another person (even when it's dead poet to living reader) is: think about what I'm saying. Each new word in a sentence being spoken maps the listener's mind-state to a new one. The purpose of the sequencing is to build up state in the listener's mind.

"When a Haskeller says, "Monads are great! They let you chain effectful actions together!" it can take a very long time to understand they actually mean exactly what they're saying -- the usual intuition is that sequencing actions can't possibly be a real problem, so you go round and round in circles trying to understand what "effectful" means, what "action" means, and how these oh-so-important "laws" have anything to do with it."

Yes, it's sort of like aliens from outer space with a very alien consciousness and a different relation to the dimension of time, exulting (to each other, and to themselves, really, more than to you): "Periods at the ends of sentences are great! They enable these strange blobs of protoplasm who call themselves 'human' to more easily map the states of these internal things they call 'neurons' into new states. No wonder we've been having such trouble trying to tell them what to do, using our intuitively obvious gestaltic visual patterns that allow us to grasp, with our quantum-computer brains, a whole meaning at once, through our retinas." (Yes, I am riffing a little here on the SF movie Arrival.)

Math is this world where we can feel a little like that. Instead of thinking of a function as a machine that takes inputs, grinds away for a little while, and extrudes outputs, you can think of all inputs and outputs as already existing, and reason logically from the resulting properties. Haskell's laziness helps you fake what you can't really have: infinite workspace and the ability to do everything in an instant. It's a fake-it-till-you-make-it language, and you don't have to be Buzz Lightyear going to infinity and beyond, perhaps all the way to the galaxy Aleph-omega. You just have to make a Haskell program do something that people like as much as their Buzz Lightyear doll.

The math view is a useful view at one level, but it can only help you so much. I greatly enjoyed the level where I could define a Turing machine as a mapper from one infinite tape to another, then imagine all possible Turing machines as a countably infinite set. I could then prove, with Cantor diagonalization (abstractly, at least, though not very intuitively), that there are things Turing machines can't do. I loved that.
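For what it's worth, that "mapper from one tape-state to the next" view fits in a few lines of Haskell. This is only my own sketch, with made-up names (stepTM, Rule, flipZeros), but it shows the machine step as a pure function from one configuration (internal state plus tape) to the next, and laziness giving you the whole run as one value:

data Move = L | R

-- The transition table: given the internal state and the symbol under
-- the head, produce a new state, a symbol to write, and a head move.
type Rule q = q -> Char -> (q, Char, Move)

-- A tape: symbols left of the head, the symbol under it, symbols to
-- the right (both sides conceptually infinite, padded with '_').
data Tape = Tape [Char] Char [Char]

stepTM :: Rule q -> (q, Tape) -> (q, Tape)
stepTM rule (q, Tape ls c rs) =
  let (q', c', mv) = rule q c
  in case mv of
       L -> case ls of
              (l:ls') -> (q', Tape ls' l (c':rs))
              []      -> (q', Tape []  '_' (c':rs))
       R -> case rs of
              (r:rs') -> (q', Tape (c':ls) r rs')
              []      -> (q', Tape (c':ls) '_' [])

-- Iterating the step function lazily gives the machine's entire run
-- as a (possibly infinite) list of configurations.
run :: Rule q -> (q, Tape) -> [(q, Tape)]
run rule = iterate (stepTM rule)

-- A toy rule: in the single state (), overwrite '0' with '1', move right.
flipZeros :: Rule ()
flipZeros () '0' = ((), '1', R)
flipZeros () c   = ((), c,   R)

main :: IO ()
main = mapM_ showTape (take 4 (run flipZeros ((), Tape [] '0' "01_")))
  where
    showTape (_, Tape ls c rs) = putStrLn (reverse ls ++ [c] ++ rs)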
But I also listened to grad students mutter about how recursive function theory may be beautiful and all, but it's now classical, and therefore dead for their own practical purposes (getting a PhD, getting onto the tenure track). And by then I knew I wasn't good enough at math to make a go of computing theory in their world, as a career, no matter what. I think Simon Peyton-Jones had a similar experience of academia -- he wasn't a John Hughes -- but he found a way through anyway.

Bryan: "I can understand why that [puzzled newbie] viewpoint might be long-forgotten by those who were involved in the effort to solve it. :) And I appreciate that it was solved in such a general way that it can be applied to so many seemingly unrelated things! That's the beauty of mathematics!"

Yes, it's there. But in user-experience design (and writings about Haskell ARE part of its user experience) there's a saying: the intuitive is the familiar. At some point in gaining sophistication, it's no longer intuitively obvious to you why something wasn't intuitively obvious to you before. You're so absorbed in where you are that you don't quite remember how you got there. You might even assume that everyone who has gotten where you are got there by the same route. And, if you can only dimly remember how you got a flash of inspiration, you might end up writing yet another monad tutorial that few people understand.

"Workflow" helped me. And now "programmable semicolon" is helping a little, in another way.

I presented a strategy at the NSMCon-2021 conference: think of NSM reductive paraphrasing as writing very complete and precise code for natural-language terms and grammar rules, using NSM primes and syntax as a kind of logic programming language -- one that might even be very simply translatable to Prolog. I did some coding experiments and saw a problem: viewed more procedurally -- i.e., with each line of an NSM explication adding to a conceptual schema of a natural-language meaning -- NSM wasn't single-assignment. A Prolog interpreter couldn't just try a match, looking at each statement of an explication in relative isolation, and bail out when a statement failed to be true in the given context of the attempted match. It had to drag some state along just to resolve what the NSM prime "this" referred to. (Think of "it" in GHCi.)

OK: effects. But wait, there's more: scoped effects. You want to strip some schema-stuff back out if a match failed midway through, then try another match somewhere else. OK: I need a context stack to store the assertions that succeeded. Uh-oh, Prolog doesn't give me that paper trail. Dang. And I still want the one-for-one translation of sentences in an NSM reductive paraphrase to something like Prolog. OK: I need to grab control of sequencing, so that I can still use the intrinsic Horn-clause resolution without pain, but also have it hold onto the assertions that succeeded, knit them into the growing conceptual schema that's part of the state, AND roll those back out, transactionally, if a match didn't succeed. It turns out that Prolog makes this not too much harder than, say, a Lisp with transactional memory.

So: I think I might now return to my Prolog code and see if I was unconsciously reinventing some kind of monad, because I felt forced by Prolog's limits to write my own semicolon. It wasn't too hard, though the resulting code required for NSM explications looks clunkier now. (Prolog is very limited in syntax extensibility.) Maybe there's some way in which Haskell makes it easier still, and nicer-looking too.
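Here is roughly what I imagine that would look like in Haskell: a hand-rolled "semicolon" that threads the growing schema along, stops at the first failed statement, and discards a failed branch's assertions when trying an alternative. Again, this is only a sketch under my own names (Match, Schema, assert), not code from the talk or from any library:

import Control.Applicative (Alternative (..))

type Schema = [String]            -- the growing conceptual schema

-- A match step: given the schema built so far, either fail (Nothing)
-- or succeed with a value and an extended schema.
newtype Match a = Match { runMatch :: Schema -> Maybe (a, Schema) }

instance Functor Match where
  fmap f (Match m) = Match $ \s -> fmap (\(a, s') -> (f a, s')) (m s)

instance Applicative Match where
  pure a = Match $ \s -> Just (a, s)
  Match mf <*> Match ma = Match $ \s -> do
    (f, s')  <- mf s
    (a, s'') <- ma s'
    pure (f a, s'')

instance Monad Match where
  -- The "semicolon": thread the schema through, but stop at the
  -- first statement that fails.
  Match m >>= k = Match $ \s -> do
    (a, s') <- m s
    runMatch (k a) s'

instance Alternative Match where
  empty = Match (const Nothing)
  -- Scoped effects: if the first match fails, its additions to the
  -- schema are dropped, and the alternative restarts from the
  -- schema as it was before the branch.
  Match m <|> Match n = Match $ \s -> m s <|> n s

-- A successful statement adds an assertion to the schema.
assert :: String -> Match ()
assert fact = Match $ \s -> Just ((), fact : s)

-- Try one explication; if it fails midway, fall back to another,
-- with the first one's partial assertions rolled back automatically.
example :: Match ()
example =
      (assert "someone X did something" >> empty)
  <|> assert "something happened to someone X"

main :: IO ()
main = print (fmap snd (runMatch example []))
  -- Just ["something happened to someone X"]

The transactional rollback comes for free here: a failed branch's schema additions live only inside its own attempt, so the alternative simply starts again from the original schema.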
None of which is to say that "programmable semicolon" will immediately help everyone pick the conceptual lock of monads. But for those who just bonk on "programmable semicolon", it might help to point out that the load-operate-store model of CPUs is a lie these days. Say a little about why it's not true. Then again pose the semicolon as a kind of operator. People like to think, "OK, this statement will finish, and the one after the semicolon will start." It's a convenient view, even though in fact, while the imperative statement is in progress, the CPU might have already fetched instructions beyond it and scheduled them for speculative execution.

Or you can tell them, "Just learn lambda calculus, then study monads algebraically, and here's a side dish of category theory while I'm at it. Bon appetit." How's that working for you, guys? It doesn't work for me. And I don't think it's because I can't do the math. It's that I often write code to see whether I'm thinking about a problem the right way. And is that so bad? I believe it was in Stewart Brand's "II Cybernetic Frontiers" that a researcher at Xerox PARC, Steve ("Slug") Russell, one of their star hackers, was described by his peers as "thinking with his fingertips."

I'm no star hacker, but "workflow" feels like it will make my fingertips smarter. "Programmable semicolon" feels like it will make my fingertips smarter. You think top-down, mathematically? Well, bon appetit. It's not to my taste. No single way is likely to work for all. Math is not some fundamental reality. It's not even reality -- in fact, it gains a lot of its power precisely from being imaginary. And it's just a bunch of metaphors too.

Regards,
Michael Turner
Executive Director
Project Persephone
1-25-33 Takadanobaba
Shinjuku-ku Tokyo 169-0075
Mobile: +81 (90) 5203-8682
turner@projectpersephone.org
Understand - http://www.projectpersephone.org/
Join - http://www.facebook.com/groups/ProjectPersephone/
Donate - http://www.patreon.com/ProjectPersephone
Volunteer - https://github.com/ProjectPersephone

"Love does not consist in gazing at each other, but in looking outward together in the same direction." -- Antoine de Saint-Exupéry