A new form of newtype

Haskell is very nearly a high level language. One rather unpleasant way in which it lets the underlying machine show through is integral types. Aside from the unbounded Integer type, which is fine, there are integral types bounded by machine sizes: Int, size unspecified, Data.Int.Int{8,16,32,64}, and Data.Word.Word{8,16,32,64}. There are a number of problems with these. The principal one is that in order to figure out which size is needed, a programmer must do arithmetic. I still remember a student's bitter complaint about being required to find the base 2 logarithm of 32 in an examination where calculators were forbidden...

I propose a small addition to 'newtype' syntax:

    'newtype' id '=' '[' expr {',' expr}... ']' [deriving-part]

The expressions are Integral expressions made from constants and suitable Prelude functions.

Each Haskell implementation supports some sublist of the types

    [Int1,Word1,Int2,Word2,Int3,Word3,...,Int64,Word64, ...,Int256,Word256,...,Integral, _except

(It's almost interesting that the Natural type missing from Haskell would go after Integral in this list, and so would never be used.)

The effect of the declaration would be to make "id" a copy of the first type in the list of supported types that is big enough to include the value of every expression on the right hand side, except that

    minBound = the smallest value on the right hand side
    maxBound = the largest value on the right hand side

rather than being the bounds for the underlying type.

The instances that could be derived would include Bits, Bounded, Data, Enum, Eq, Integral, Ix, Num, Ord, PrintfArg, Read, Real, Show, Storable, Typable. I'd actually prefer that the default list of derived instances be short, but it's hard to draw an arbitrary line, so the default instances should be the same as the default instances for Int in the same Haskell system.
Suppose I knew that I had a 2TB disc (I don't, and wish I did) and wanted to be able to hold not only any file size but the sums and differences of up to 10,000,000 file sizes. (There could be one giant file, and all the file sizes could be the size of that file.)

    newtype FileSize = [-10000000*2^41,10000000*2^41]

I don't see any simple way to get named values into the expressions defining a new integral type, but the use of cpp with Haskell has been common practice for a long time, so

    #define Tera 2^40
    #define Many 10000000
    newtype FileSize = [-Many*2*Tera,Many*2*Tera]

In this case, FileSize would be a copy of Integer.
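
The selection rule in the proposal can be approximated in today's Haskell as a plain function over the listed values. This is only a sketch, assuming an implementation that supports the 8, 16, 32 and 64 bit types plus Integer and nothing else; `Repr`, `candidates` and `chooseRepr` are hypothetical names invented for the illustration, not part of the proposal or of any library.

```haskell
-- Hypothetical description of the representation picked for a newtype.
data Repr = ReprInt Int | ReprWord Int | ReprInteger
  deriving (Eq, Show)

-- Candidate types in the order the proposal lists them (IntN before WordN),
-- restricted to the widths this sketch assumes are supported.
candidates :: [(Repr, Integer, Integer)]
candidates = concat
  [ [ (ReprInt n,  negate (2 ^ (n - 1)), 2 ^ (n - 1) - 1)
    , (ReprWord n, 0,                    2 ^ n - 1)
    ]
  | n <- [8, 16, 32, 64]
  ]

-- Pick the first candidate whose range covers every listed value,
-- falling back to the unbounded type when none does.
-- Assumes a non-empty list, as the proposed syntax requires.
chooseRepr :: [Integer] -> Repr
chooseRepr xs =
  case [ r | (r, lo, hi) <- candidates, lo <= minimum xs, maximum xs <= hi ] of
    (r : _) -> r
    []      -> ReprInteger
```

Under these assumptions, `chooseRepr [-10000000*2^41, 10000000*2^41]` gives `ReprInteger`, because 10000000*2^41 is larger than 2^64 - 1, while `chooseRepr [0, 255]` gives `ReprWord 8`.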

What would these types be used for? If your students are complaining about having to perform logarithms to store integers, inform them of the "Integer" type.

The existing sized types -- Word/Int [8, 16, 32, 64] -- are useful primarily because they match up with "standard" integral data types in other languages. They're great for FFI, (un)marshaling via ByteString, and occasional low-level code. In contrast, there is (to my knowledge) no machine with 42-bit integers, so there's no point to defining such intermediate types.

In your FileSize example, it would be much easier to simply use:

    newtype FileSize = FileSize Integer

since then your code will cope nicely when everybody upgrades to PB disks in a few years. If you need particular bounds, create your own instance of Bounded.
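
For concreteness, wrapping Integer and writing the Bounded instance by hand might look roughly like the sketch below. The bounds are taken from the FileSize example earlier in the thread; `mkFileSize` is a hypothetical helper added for the illustration, since a hand-written Bounded instance by itself does not stop out-of-range values from being constructed.

```haskell
-- Unbounded representation with hand-written bounds.
newtype FileSize = FileSize Integer
  deriving (Eq, Ord, Show)

instance Bounded FileSize where
  minBound = FileSize (negate (10000000 * 2 ^ 41))
  maxBound = FileSize (10000000 * 2 ^ 41)

-- Hypothetical smart constructor: range checking is opt-in,
-- because the FileSize constructor itself accepts any Integer.
mkFileSize :: Integer -> Maybe FileSize
mkFileSize n
  | candidate >= minBound && candidate <= maxBound = Just candidate
  | otherwise                                      = Nothing
  where
    candidate = FileSize n
```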

On Dec 9, 2009, at 1:58 PM, John Millikin wrote:
> What would these types be used for? If your students are complaining about having to perform logarithms to store integers, inform them of the "Integer" type.
I mentioned one student who couldn't compute log_2 32 *himself* to point out that >> people are bad at arithmetic <<. As my later example made clear, base 2 logarithms are not the only calculations one may need to perform. I would want to use these types practically any time that I want to choose bounded integer types for my own purposes.
> The existing sized types -- Word/Int [8, 16, 32, 64] -- are useful primarily because they match up with "standard" integral data types in other languages. They're great for FFI, (un)marshaling via ByteString, and occasional low-level code.
Well, no. If you want to match up with Java's byte, short, int, or long, then yes, Int8, Int16, Int32, or Int64 are a good match. If you want to interface with C, you really really ought to be using Foreign.C.Types.{CChar,CSChar,CUChar,CShort,CUShort,CInt,CUInt,CLong,CULong,CPtrdiff,CSize,CWchar,CSigAtomic,CLLong,CULLong,CIntPtr,CUIntPtr,CIntMax,CUIntMax,CTime} or Foreign.C.Error.Errno, or System.Posix.Types.{you can read the list yourself} and you may well find yourself wishing for others. _These_ are the '"standard" integral data types' in C, and they exist precisely because the programmer has no other portable way of specifying them.

Mac OS X gives me library support for integers up to 1024 bits; if I want to interface to those, what do I do? Never mind not getting what I ask for, as things stand I can't even _ask_.

People are now writing EDSLs using Haskell to generate code for all sorts of interesting things. What if you want to use Haskell as a host for an EDSL targeted at a 24-bit DSP?
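
As a small illustration of why the C interface types are kept distinct from the fixed-width ones, the sketch below uses `sizeOf` from Foreign.Storable to report how wide the local platform makes a few of the Foreign.C.Types types. `cTypeBits` is a hypothetical name used only here, and the multiplication by 8 assumes 8-bit bytes; the numbers differ between platforms, which is the point.

```haskell
import Foreign.C.Types (CChar, CShort, CInt, CLong, CLLong, CSize)
import Foreign.Storable (sizeOf)

-- Width in bits of some C integral types on the current platform.
-- sizeOf only needs the type of its argument, so undefined is never evaluated.
cTypeBits :: [(String, Int)]
cTypeBits =
  [ ("CChar",  8 * sizeOf (undefined :: CChar))
  , ("CShort", 8 * sizeOf (undefined :: CShort))
  , ("CInt",   8 * sizeOf (undefined :: CInt))
  , ("CLong",  8 * sizeOf (undefined :: CLong))
  , ("CLLong", 8 * sizeOf (undefined :: CLLong))
  , ("CSize",  8 * sizeOf (undefined :: CSize))
  ]
```

Printing the list with `mapM_ print cTypeBits` on two different systems is an easy way to see that, for example, CLong need not be the same width everywhere.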
> In contrast, there is (to my knowledge) no machine with 42-bit integers, so there's no point to defining such intermediate types.
There may not be MACHINES with 42-bit integers, but that doesn't mean there aren't PROBLEMS that need them. This whole idea of "let the machine dictate the sizes" is precisely what I'm complaining of. The Mac OS X library, as I noted above, has

    /System/Library/Frameworks/Accelerate.framework/Frameworks/\
    vecLib.framework/Headers/vBigNum.h

    vS128 vS256 vS512 vS1024    signed
    vU128 vU256 vU512 vU1024    unsigned

and I'm pretty sure the hardware doesn't do 1024 bit integers directly.
> In your FileSize example, it would be much easier to simply use:
>
>     newtype FileSize = FileSize Integer
Sure it would, BUT IT WOULDN'T BE BOUNDED. By this argument, the Haskell type 'Int' should not exist; except for using System.* and Foreign.* interface types we should not be using bounded integers at all.
> since then your code will cope nicely when everybody upgrades to PB disks in a few years. If you need particular bounds, create your own instance of Bounded.
Sorry, but creating instances of Bounded is what a compiler is for. If I don't write it, I won't wrong it. Put together Data.Int, Data.Word, Foreign.*, System.*, and Haskell _already_ has oodles of integral newtypes. All I'm asking for is a human-oriented way of doing more of the same. If wanting a human-oriented way of doing things were unreasonable, we would be unreasonable to use Haskell, wouldn't we?
participants (2)
- John Millikin
- Richard O'Keefe