
On Thu, 26 Jul 2001, Frank Atanassow wrote:
Again, as a C++ programmer I have some grasp of what program rearrangements are valid (e.g., I can't move an assignment whose right-hand side involves another variable, say v, from before an assignment to v to after an assignment to v), and I use this implicitly every time I think about structuring or rearranging my programs. The problem is that this doesn't really help, because determining whether there's an assignment to variable v between where an assignment is now and where I want to move it to is `decidedly non-trivial'.
I think doing source-level reasoning on C++ programs is virtually impossible. I always have to think about the implementation in terms of memory blocks and vtables and pointers.
One reason must be that assignment is used all over the place when local let-binding would suffice. If C++ actually enforced encapsulation of state, that would not be so bad, but in practice you never know what sort of aliasing and sharing is going on.
Note: I'm trying to compare like with like, so when I talk about reasoning below it's in the context of the same sort of semi-formal, in-my-head reasoning that I do in my everyday development of C++ or Haskell. Yes, I guess it's time for a confession: I'm making a rather sweeping assumption that the patterns in which I do and don't program are in some way `average' or `typical', even though they probably aren't. For instance, I don't even use patterns like `a[b++]=c;', just because it requires too many `mental cycles' for me to figure out what's going on when I reread the code; most other C programmers probably would use that construct, I expect.

Having said that, within this sort of style I find that I don't really (have to) go beyond the idea of a machine with a store of named variables, some of which are actually arrays or have other structure, and tend to use the simple rule: `if you assign/read outside array bounds, or through a pointer which does not map to a store-defined variable/variable array, the program may (in a way which is so dependent on history as to be effectively non-deterministic) place gibberish in any other stored variable cell, return gibberish, return sensible values and have no observable ill effects, or seg-fault'.

Whilst you're certainly right to say that a proper semantic model (even an informal one) ought to incorporate the differences between stack-allocated variables and heap-allocated variables, I genuinely don't think I use that distinction whilst thinking about my programs, because it's too complicated for me. Likewise, I've had one bug in 5 years due to a vtable problem, and that was when I knew I was pushing my luck trying to do a really low-level optimisation for efficiency. The bugs that I have tend to fall into three big categories: incorrect algorithms (i.e., I'd misunderstood the problem even at a high level), writing outside array bounds, and misaccounting for assignments and their ordering.
Class 1 is a problem even when I write in Haskell; classes 2 and 3 tend to be analysed by sticking print statements everywhere and then reasoning, using my very simple mental model, to figure out why the observed behaviour is happening and fix it. The reason I'm belabouring this point is that I don't think imperative programmers are going to be convinced by an argument that `functional programming is more effective because there are a huge number of complexities, and even some downright ill-definedness/inconsistency, in the semantics of imperative languages compared to functional ones', because even my natural response is `that's entirely true about the semantics but, _given that those problems all tend to lie in areas of the "program state space" where I don't write programs_, why should that motivate me to use a functional language?' I still think the compelling argument is not that there are all these nasty tar-pits of complexity in the semantics of an imperative language, say C++, that aren't there in functional languages, but that, in the code which makes up 99.99% of imperative programs and where most of the bugs occur, there are simple semantics yet you just can't use them in any effective way. In contrast, you _can_ do effective reasoning about the same sorts of things in functional languages, because their roughly equally simple semantics are more effective.
Actually, I have to admit that when I think about Haskell programs, I don't really reason using Haskell's semantics, but rather the equational theory of a polymorphic lambda-calculus, and I hardly ever think about the operational side. If you idealized C in the same way, you would probably ignore things like pointers and aliasing which cause problems.
Umm, as I stated above, I think I use a very simple model that lets me think about aliasing and pointers. I guess the key point is that I don't try and reason about what goes wrong in detail when `undefined pointer access occurs', merely that all sorts of nasty things can happen unpredictably.
First, this is not what I mean by "implementation-defined" (my fault). What I mean is the model of computation: in C, you are always thinking about blocks on the heap, pointers between them, whether something is on the stack or not, what order things are happening in and what order this other code you wrote last year assumes them to be happening in, etc. So you are still thinking, essentially, at the level of the von Neumann architecture.
Again, I agree in principle, but I don't see that reasoning using a denotational semantics for Haskell is less `implementation defined' than thinking about a C program in terms of local stores and sequence points; arguably, when you're using your model of Haskell you're thinking at the level of a non-deterministic-rewriting-machine architecture :-) .
Sure, some features which affect portability are hidden away, e.g., how long a word of memory is, but those are details, and you can name things and refer to them more nicely, but it's still just assembly programming in disguise. In an FP language, and more generally languages which support an abstract semantics, I don't have to think about those things. Instead, I can think about a problem at a whole different level of abstraction which is closer to the problem domain.
Again: agree in principle; in practice
Third, I don't think it is true that there are many (any?) compilers which meet the C++ specification to a reasonable degree. C++ must be the single worst-implemented language in computer history.
If you assume that, in any given language, all programs in that language are equally likely to be written, then I'd agree that C++ would easily give you the highest ratio of rejected/implementation-defined programs of any language when you consider a random sample over this distribution. What I think is a more interesting question is: I'm a (cautious) human being who doesn't produce programs from all over the program state space, but clustered in certain areas that are both useful and more understandable. That being the case, is the above ratio now significantly higher than for other languages? In my (very limited) experience it doesn't seem that much higher. I still think trying to `sell' FP on the basis of huge problems with imperative languages that largely don't affect programmers, because they don't write programs that involve those problems, is a mistake; what makes more sense is the fact that even in the nice, well-defined area of imperative programming where most programmers live, things are much more complicated than in a functional language.

___cheers,_dave________________________________________________________
www.cs.bris.ac.uk/~tweed/pi.htm |tweed's law:  however many computers
email: tweed@cs.bris.ac.uk      |you have, half your time is spent
work tel: (0117) 954-5250       |waiting for compilations to finish.