
On Dec 8, 2009, at 1:27 PM, Henning Thielemann wrote:
On Tue, 8 Dec 2009, Richard O'Keefe wrote:
On Dec 8, 2009, at 12:28 PM, Henning Thielemann wrote:
It is the responsibility of the programmer to choose number types that are appropriate for the application. If I address pixels on today's screens, I have to choose at least Word16. On 8-bit computers, bytes were enough. Thus, this sounds like an error.
That kind of attitude might have done very well in the 1960s.
I don't quite understand. If it is not the responsibility of the programmer to choose numbers of the right size, who else?
It is the responsibility of the computer to support the numbers that the programmer needs. It is the responsibility of the computer to compute CORRECTLY with the numbers that it claims to support. I repeat: the programmer might not KNOW what the integer sizes are. In Classic C and C89, the size of an 'int' was whatever the compiler chose to give you. In C89, limits.h means that the program can find out at run time what range it got, but that's too late.
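By way of comparison, here is a tiny Haskell sketch of the same late discovery (GHC assumed): the language report only guarantees that the default Int covers at least -2^29 .. 2^29 - 1, and the bounds you actually got can only be printed by the already-running program.

    main :: IO ()
    main = do
      -- The report guarantees only about 30 bits for Int; whether you
      -- got 32 or 64 bits depends on the compiler and platform.
      print (minBound :: Int, maxBound :: Int)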
If the operating system uses Int32 for describing files sizes and Int16 for screen coordinates, I'm safe to do so as well.
In a very trivial sense, this is true. In any less trivial sense, not hardly. Suppose I am calculating screen coordinates using

    / x \   / a11 a12 a13 \ / u \
    | y | = | a21 a22 a23 | | v |
    \ 1 /   \  0   0   1  / \ 1 /

which is a fairly standard kind of transformation. It is not in general sufficient that u, v, x, and y should all fit into Int16; the intermediate results must _also_ fit. (Assume for the moment that overflow is checked.) If I need to represent *differences* of Int32 pairs, I know perfectly well what type I need: Int33. But there is no such type.
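To make the intermediate-overflow point concrete, here is a small Haskell sketch (the coefficients are made up, not taken from any real transform): one row of such a transform computed directly in Int16 wraps silently, while widening the intermediates to Integer gives the intended value.

    import Data.Int (Int16)

    -- One row of the affine transform, computed naively in Int16:
    -- the products a1*u and a2*v can exceed the Int16 range and wrap.
    rowNaive :: (Int16, Int16, Int16) -> Int16 -> Int16 -> Int16
    rowNaive (a1, a2, a3) u v = a1 * u + a2 * v + a3

    -- The same row with the intermediates widened to Integer.
    rowWide :: (Int16, Int16, Int16) -> Int16 -> Int16 -> Integer
    rowWide (a1, a2, a3) u v =
      toInteger a1 * toInteger u + toInteger a2 * toInteger v + toInteger a3

    main :: IO ()
    main = do
      let coeffs = (300, 2, 10)        -- made-up a11, a12, a13
      -- 300 * 200 = 60000, which does not fit in Int16 (max 32767):
      print (rowNaive coeffs 200 5)    -- wraps to -5516
      print (rowWide  coeffs 200 5)    -- 60020, the intended result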
The interface to the operating system could use type synonyms FileSize and ScreenCoordinate that scale with future sizes. But the programmer remains responsible for using ScreenCoordinate actually for coordinates and not for file sizes.
Consider this problem: we are given a whole bunch of files, and want to determine the total space used by all of them. Smalltalk:

    fileNames detectSum: [:each | (FileStatus fromFile: each) size]

The answer is correct, however many file names there are in the collection. But if we're using C, or Pascal, or something like that, we want to do

    FileCollectionSize fcs = 0;
    for (i = 0; i < n; i++) {
        fcs += OS_File_System_Size(filename[i]);
    }

and how on earth do we compute the type FileCollectionSize? Remember, it has to be big enough to hold the sum of the sizes of an *unknown* and quite possibly large number of quite possibly extremely large files, not necessarily confined to a single disc, so the total could well exceed what will fit in FileSize. This is especially so when you realise that there might be many repetitions of the same file names. I can _easily_ set this up to overflow 64 bits on a modern machine. In Haskell, you'd want to switch straight over to Integer and stay there.
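In Haskell that might look something like the sketch below (using getFileStatus and fileSize from the unix package's System.Posix.Files; totalSize is my name, not a library function): the running sum lives in Integer from the start, so it cannot overflow.

    import System.Posix.Files (fileSize, getFileStatus)

    -- Total space used by a list of files. The sum is accumulated in
    -- Integer, so it cannot overflow however many (or however large)
    -- the files are.
    totalSize :: [FilePath] -> IO Integer
    totalSize paths = do
      sizes <- mapM (fmap fileSize . getFileStatus) paths
      return (sum (map toInteger sizes))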
In an age when Intel have demonstrated 48 full x86 cores on a single chip, when it's possible to get a single-chip "DSP" with >240 cores that's fast enough to *calculate* MHz radio signals in real time, typical machine-oriented integer sizes run out _really_ fast. For example, a simple counting loop runs out in well under a second using 32-bit integers.
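As a hedged illustration of how quickly 32-bit counters run out (the timing obviously depends on the machine and on optimisation flags):

    {-# LANGUAGE BangPatterns #-}
    import Data.Int (Int32)

    -- A strict counting loop in Int32: one step past maxBound wraps
    -- silently. Those roughly 2.1 * 10^9 steps take on the order of a
    -- second on current hardware when compiled with optimisation.
    countPast :: Int32 -> Int32 -> Int32
    countPast !n limit
      | n == limit = n + 1              -- the wrapping step
      | otherwise  = countPast (n + 1) limit

    main :: IO ()
    main = print (countPast 0 maxBound)   -- -2147483648, not 2147483648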
The programmer doesn't always have the information necessary to choose machine-oriented integer sizes. Or the language might not offer a choice. Or the choice the programmer needs might not be available: if I want to compute sums of products of 64-bit integers, where are the 128-bit integers I need?
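In Haskell the usual dodge is, again, Integer; a sketch (dot is my name for it, not a library function):

    import Data.Int (Int64)

    -- Sum of products of two Int64 vectors. Each product can need up
    -- to 127 bits, so the arithmetic is carried out in Integer.
    dot :: [Int64] -> [Int64] -> Integer
    dot xs ys = sum (zipWith (\x y -> toInteger x * toInteger y) xs ys)

    main :: IO ()
    main = do
      let big = maxBound :: Int64
      print (dot [big, big] [big, big])   -- exact, about 1.7 * 10^38
      print (big * big)                   -- the same product wraps in Int64 to 1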
And the consequence is to ship a program that raises an exception about problems with the size of integers?
Yes. Just because the programmer has TRIED to ensure that all the numbers will fit into the computer's arbitrary and application-irrelevant limits doesn't mean s/he succeeded. For that matter, it doesn't mean that the compiler won't use an optimisation that breaks the program. (Yes, I have known this happen, with integer arithmetic, and recently.)
I'm afraid I don't understand what you are arguing for.
I'm arguing for three things.

(1) If you look at the CERT web site, you'll discover that there have been enough security breaches due to integer overflow that they recommend working very hard to prevent it, and there's an "As-if Infinitely Ranged" project to enable writing C _as if_ it could be trusted to raise exceptions about problems with the size of integers. It is better to raise an exception than to let organised crime take over your computer.

(2) In a language like Ada, where the programmer can *say* exactly what range they need, and the bounds can be computed at compile time, and the compiler either does it right or admits right away that it can't, it's the programmer's fault if it's wrong. Otherwise it isn't.

(3) Be very worried any time you multiply or divide integers. (INT_MIN / (-1) is an overflow, and C is allowed to treat INT_MIN % (-1) as undefined even though it is mathematically 0.) No, make that "be terrified". If you cannot formally verify that the results will be in range, use Haskell's Integer type.
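A last sketch on point (3), GHC assumed (the "arithmetic overflow" exception is what GHC happens to raise; the language report does not promise it): the one division that overflows a fixed-width two's-complement type traps in Int, while the same computation in Integer simply gives the exact answer.

    import Control.Exception (ArithException, evaluate, try)

    main :: IO ()
    main = do
      -- minBound `div` (-1) is the one Int division that overflows;
      -- GHC raises an arithmetic-overflow exception rather than wrapping.
      r <- try (evaluate (minBound `div` (-1) :: Int))
             :: IO (Either ArithException Int)
      print r                                        -- Left arithmetic overflow
      -- In Integer the same computation is just the exact answer.
      print (toInteger (minBound :: Int) `div` (-1)) -- 2^63 on a 64-bit GHC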