
David Tweed wrote:
> I'd like to respectfully disagree with some of this :-)
I figured someone would. Though I thought it would be Fergus. :)
>> The most important thing about functional languages is that we know _what they are_, and that they can be characterized in a regular fashion which is amenable to analysis; in other words, we know how to reason about them, and what sorts of properties functional programs satisfy.
> Umm, I wouldn't agree with that exactly. If you ignore stupidities like specifying the mod operator % to be implementation defined, I would claim that C has/can be given semantics which are virtually as simple as Haskell. Equally, I'd argue that we know as much about how to reason about imperative languages as we do functional ones. The point is that those semantics don't split the problem of reasoning about what's going on into nice independent subproblems like the semantics of Haskell do. Equally, I'd agree we don't know how to perform _effective_ reasoning on them, probably because it's pretty much impossible without restricting the set of allowable programs.
Effectively reason, OK. This is what I meant when I wrote "amenable to analysis."
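(As a side note, the % point above has since been nailed down: C99 and C++11 made the previously implementation-defined case fully specified. A minimal check, assuming nothing beyond a conforming C++11 compiler:)

```cpp
// C++11 (and C99) require integer division to truncate toward zero and the
// identity (a/b)*b + a%b == a to hold, so a % b takes the sign of a.
// Pre-C99, -7 % 2 was allowed to be either -1 or +1.
inline int mod(int a, int b) { return a % b; }

// The identity the standard guarantees for all b != 0:
inline bool euclid_identity(int a, int b) { return (a / b) * b + a % b == a; }
```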
> Again, as a C++ programmer I have some grasp of what program rearrangements are valid (e.g., I can't move an assignment whose right-hand side reads another variable, say v, from before an assignment to v to after it), and I use it implicitly every time I think about structuring or rearranging my programs. The problem is that that doesn't really help, because determining whether there's an assignment to variable v between where an assignment is now and where I want to move it to is `decidedly non-trivial'.
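(The rearrangement rule in question can be made concrete in a few lines; the variable names are mine, purely for illustration:)

```cpp
// Moving a read of v across a write to v changes the program's meaning.
int before_move() {
    int v = 1;
    int x = v + 1;  // reads v while it is still 1
    v = 10;
    return x;       // yields 2
}

int after_move() {
    int v = 1;
    v = 10;         // the write now precedes the read...
    int x = v + 1;  // ...so x sees v == 10
    return x;       // yields 11
}
```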
I think doing source-level reasoning on C++ programs is virtually impossible. I always have to think about the implementation in terms of memory blocks and vtables and pointers. One reason must be that assignment is used all over the place when local let-binding would suffice. If C++ actually enforced encapsulation of state, that would not be so bad, but in practice you never know what sort of aliasing and sharing is going on.
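(The aliasing worry is easy to make concrete; this two-line function is a contrived example of mine, not from the thread:)

```cpp
// Whether p and q alias is invisible in the function's type, so no
// purely local reasoning about the body can tell you what it returns.
int store_two(int* p, int* q) {
    *p = 1;
    *q = 2;
    return *p;  // 1 if p and q are distinct cells, 2 if they alias
}
```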
>> Of course, this argument holds for any "simple" language, not just functional ones, where by "simple" I mean "easy to reason about". For
> As I say, `easy to _effectively_ reason about' =/= simple semantics. I'm making this point because the natural rejoinder from a C programmer being told that C (as an imperative language) has more complex semantics than Haskell is to say `no it doesn't', which I agree with, the point being that Haskell has a semantics which makes it easier to reason about effectively. (I'm glossing over the distinction between the various kinds of semantics for a language such as denotational and operational here.)
Again, this is what I meant by "simple". Actually, I have to admit that when I think about Haskell programs, I don't really reason using Haskell's semantics, but rather the equational theory of a polymorphic lambda-calculus, and I hardly ever think about the operational side. If you idealized C in the same way, you would probably ignore things like pointers and aliasing which cause problems. But the thing is that in Haskell I can pretty much get away with this idealization and still write interesting programs, but in C you really need these complications to do useful things. So I think FP languages have a better separation of concerns.
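(For instance, the equational law f(x) + f(x) = 2*f(x), which holds unconditionally in the idealized view, already fails in C once hidden state is involved; a contrived sketch of mine:)

```cpp
// f looks like a function of x alone, but it reads and bumps a global,
// so substituting equals for equals is unsound.
int calls = 0;
int f(int x) { return x + calls++; }

int lhs() { calls = 0; return f(3) + f(3); }  // 3 + 4 == 7
int rhs() { calls = 0; return 2 * f(3); }     // 2 * 3 == 6
```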
>> example, there are concurrent calculi of processes, object calculi and first-order algebraic languages which also share this property, and I prefer any of them to warty conventional languages. (And this is why I also like SML: even though it's not referentially transparent like Haskell, I know what sorts of properties it satisfies.)
> Now this to me is a really interesting statement. IIRC from many years ago, there's not necessarily an indication in the type system when a function involves references. Do you actually reason (albeit subconsciously) using a full semantics which includes possible effects due to aliasing of references or changes to their contents somewhere deep in the functions, or do you use a rule `well this is 99% likely to be referentially transparent so I'll assume it is, but I'll keep in the back of my mind the possibility this assumption may be wrong'? (That's what I'd end up doing, I think.)
Frankly, I do my best to avoid side-effecting code in ML. The fact that you can do this and still write useful programs is, again, an indication that the separation of concerns is better than in C, where it is pretty much impossible to go without side-effects and aliasing. But when I have to use effects, I guess I make conservative judgements about substitutability and try to localize them as much as possible. I also have started thinking about it "monadically" to some extent, so, yes, I guess I am doing a bit of informal extra type inference in my head. Finally, ML is not completely oblivious to effects; for example, to preserve soundness, it disallows generalizing type variables in certain places which fail a syntactic condition, and that is something you can use yourself to figure out which expressions are likely to be impure.
>> Haskell (and ML) really is different. First, it's not implementation-defined.
> Umm, again I'd say this is debatable: the volume of e-mail to/from Simon PJ about the Haskell 98 report, even though there are various Haskell 98 interpreters/compilers out there, suggests this isn't true. Likewise, C and C++ have specifications which are met to a reasonable degree by many compilers out there, which prompts the question: when was the last time a bug in your code in an imperative language was due to an implementation-defined part of the language? In 5 years of intensive C++ programming I'd guess I've made maybe 25 bugs due to this, but the number of bugs I've fixed in my code must be in the tens of thousands.
First, this is not what I mean by "implementation-defined" (my fault). What I mean is the model of computation: in C, you are always thinking about blocks on the heap, pointers between them, whether something is on the stack or not, what order things are happening in and what order this other code you wrote last year assumes them to be happening in, etc. So you are still thinking, essentially, at the level of the von Neumann architecture. Sure, some features which affect portability are hidden away, e.g., how long a word of memory is, but those are details, and you can name things and refer to them more nicely, but it's still just assembly programming in disguise. In an FP language, and more generally languages which support an abstract semantics, I don't have to think about those things. Instead, I can think about a problem at a whole different level of abstraction which is closer to the problem domain.

Second, I am aware that this is an oversimplification and that C/C++ are not completely implementation-defined, and you can do some reasoning about sequence points and local stores and so on. But does that happen in practice? They just aren't taught that way, and that's also part of the problem.

Third, I don't think it is true that there are many (any?) compilers which meet the C++ specification to a reasonable degree. C++ must be the single worst implemented language in computer history.

--
Frank Atanassow, Information & Computing Sciences, Utrecht University
Padualaan 14, PO Box 80.089, 3508TB Utrecht, The Netherlands
Tel +31 (0)30 253-3261  Fax +31 (0)30 251-3791