
Dear GHC developers,

Probably, it is better to provide Integer or Integral a => a instead of Int in the function

  sizeFM :: FiniteMap k e -> Int

What do you think of this?

Copy, please, the answer to mechvel@botik.ru

---------------------
Dr. Serge Mechveliani
mechvel@botik.ru

On Sun, 2004-04-25 at 14:32, Serge D. Mechveliani wrote:
Dear GHC developers,
Probably, it is better to provide Integer or Integral a => a instead of Int in the function sizeFM :: FiniteMap k e -> Int
What do you think of this?
Are you planning to put more than 2^31 entries into your FiniteMap? I don't think I could afford a machine with the >16GB of RAM necessary to do that.

I guess this same argument took place over the Prelude.length function. The conclusion was to add

  List.genericLength :: Num a => [b] -> a

Duncan
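
For reference, a minimal sketch of the two functions Duncan mentions. The wrapper names countInt and countInteger are hypothetical and exist only to pin down the result types; length and genericLength are the standard functions.

```haskell
import Data.List (genericLength)

-- Prelude.length is fixed to Int.
countInt :: [a] -> Int
countInt = length

-- genericLength lets the caller choose the numeric type, here Integer.
countInteger :: [a] -> Integer
countInteger = genericLength
```

Both return 3 for a three-element list; they differ only in the type of the result, and therefore in what happens once a count no longer fits in Int.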

On Sun, Apr 25, 2004 at 02:45:26PM +0100, Duncan Coutts wrote:
On Sun, 2004-04-25 at 14:32, Serge D. Mechveliani wrote:
Dear GHC developers,
Probably, it is better to provide Integer or Integral a => a instead of Int in the function sizeFM :: FiniteMap k e -> Int
What do you think of this?
Are you planning to put more than 2^31 entries into your FiniteMap?
Why not? I never use Int. When writing a program, I imagine it running on various machines, including ones that will appear in the future. To my mind, the program itself should not change from machine to machine.
I don't think I could afford a machine with the >16GB of ram necessary to do that.
I guess this same argument took place over the Prelude.length function. The conclusion was to add List.genericLength :: Num a => [b] -> a
So, we probably need genericSizeFM ...

-----------------
Serge Mechveliani
mechvel@botik.ru
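
A rough sketch of what such a function could look like, by analogy with genericLength. It assumes Data.Map from the containers package as a stand-in for FiniteMap, and the name genericSize is hypothetical, not an existing library function.

```haskell
import qualified Data.Map as Map

-- Hypothetical helper, not part of any library: the name mirrors
-- genericLength. It only widens the Int returned by Map.size, so it
-- cannot report more entries than Int itself can count.
genericSize :: Num a => Map.Map k e -> a
genericSize = fromIntegral . Map.size
```

This only changes the type seen by the caller. A version that is correct beyond the Int range would need the map itself to track its size in a wider type.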

QUESTION: "I read in a newspaper that in 1981 you said '640K of memory should be enough for anybody.' What did you mean when you said this?"

ANSWER: "I've said some stupid things and some wrong things, but not that. No one involved in computers would ever say that a certain amount of memory is enough for all time."

http://www.wired.com/news/print/0,1294,1484,00.html

Dell's Poweredge servers address up to 32GB of memory today! There are already 5.7 billion people on the planet (>2^31) and 741 million phone lines. In my mind, there is NO QUESTION that 2^31 keys is a reasonable size for a FiniteMap or will be in the very very near future.

Moreover, it is not clear that the CPU/memory overhead of returning Integer rather than Int for sizeFM is sufficiently high to be worth bothering the programmer about.

-Alex-

_________________________________________________________________
S. Alexander Jacobson
mailto:me@alexjacobson.com
tel:917-770-6565
http://alexjacobson.com

On Sun, 25 Apr 2004, Duncan Coutts wrote:
On Sun, 2004-04-25 at 14:32, Serge D. Mechveliani wrote:
Dear GHC developers,
Probably, it is better to provide Integer or Integral a => a instead of Int in the function sizeFM :: FiniteMap k e -> Int
What do you think of this?
Are you planning to put more than 2^31 entries into your FiniteMap? I don't think I could afford a machine with the >16GB of ram necessary to do that.
I guess this same argument took place over the Prelude.length function. The conclusion was to add List.genericLength :: Num a => [b] -> a
Duncan
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

On Sun, Apr 25, 2004 at 03:20:42PM -0400, S. Alexander Jacobson wrote:
QUESTION: "I read in a newspaper that in 1981 you said '640K of memory should be enough for anybody.' What did you mean when you said this?"
ANSWER: "I've said some stupid things and some wrong things, but not that. No one involved in computers would ever say that a certain amount of memory is enough for all time."
http://www.wired.com/news/print/0,1294,1484,00.html
Dell's Poweredge servers address up to 32GB of memory today! There are already 5.7 billion people on the planet (>2^31) and 741 million phone lines. In my mind, there is NO QUESTION that 2^31 keys is a reasonable size for a FiniteMap or will be in the very very near future.
On the other hand, since they are still 32 bit computers, any given application can still only access 4G of memory. This issue will only be a problem on 64 bit platforms which have a 32 bit Int.
Moreover, it is not clear that the CPU/memory overhead of returning Integer rather than Int for sizeFM is sufficiently high to be worth bothering the programmer about.
I'd say that rather than returning an Integer, we'd be better off just using a 64 bit Int on 64-bit platforms.
--
David Roundy
http://www.abridgegame.org

On Sun, Apr 25, 2004 at 04:12:25PM -0400, David Roundy wrote:
On the other hand, since they are still 32 bit computers, any given application can still only access 4G of memory. This issue will only be a problem on 64 bit platforms which have a 32 bit Int.
Here is a funny program that gives a wrong result because length returns Int on a 32-bit computer.

  import Data.List
  main = print (length (genericTake 5000000000 (repeat ())))

Running it shows

  $ ./A
  705032704

But this is a strange piece of code, and I agree that it will hardly be a problem on a 32-bit platform. I believe it's impossible to write a similarly behaving program for Data.FiniteMap with current compilers and libraries.

Best regards,
Tom

--
.signature: Too many levels of symbolic links
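
The printed value is what remains of 5000000000 after wrapping at 32 bits: 5000000000 - 2^32 = 5000000000 - 4294967296 = 705032704. A small sketch that reproduces the arithmetic on any platform; the names trueCount, wrapped and asInt32 are illustrative only.

```haskell
import Data.Int (Int32)

-- The true element count from Tomasz's example.
trueCount :: Integer
trueCount = 5000000000

-- What a 32-bit counter retains: the value modulo 2^32.
wrapped :: Integer
wrapped = trueCount `mod` (2 ^ (32 :: Int))

-- The same truncation, performed by converting to a fixed 32-bit type.
asInt32 :: Int32
asInt32 = fromIntegral trueCount
```

Both wrapped and asInt32 evaluate to 705032704, matching the output of ./A.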

On Sun, Apr 25, 2004 at 11:38:19PM +0200, Tomasz Zielonka wrote:
On Sun, Apr 25, 2004 at 04:12:25PM -0400, David Roundy wrote:
On the other hand, since they are still 32 bit computers, any given application can still only access 4G of memory. This issue will only be a problem on 64 bit platforms which have a 32 bit Int.
Here is a funny program that gives wrong result because of length returning Int on a 32-bit computer.
  import Data.List
  main = print (length (genericTake 5000000000 (repeat ())))
Running it shows
  $ ./A
  705032704
But this is a strange piece of code and I agree that it will hardly be a problem on a 32 bit platform.
In fact, the only way this will be a problem is if your list is lazy and consumed by the length function, but it's hard to see how that could happen except in strange example code. That is, it's hard to imagine when you'd need to know the length of a data structure you don't need...

A perhaps slightly less contrived example (although far slower) would be to try to write hFileSize as:

  import Control.Monad (liftM)
  import System.IO

  hStupidFileSize f = do h <- openFile f ReadMode
                         length `liftM` hGetContents h

which would give wrong results for large files. But of course this is why hFileSize returns Integer...
--
David Roundy
http://www.abridgegame.org

On Mon, Apr 26, 2004 at 06:42:20AM -0400, David Roundy wrote:
On Sun, Apr 25, 2004 at 11:38:19PM +0200, Tomasz Zielonka wrote:
On Sun, Apr 25, 2004 at 04:12:25PM -0400, David Roundy wrote:
On the other hand, since they are still 32 bit computers, any given application can still only access 4G of memory. This issue will only be a problem on 64 bit platforms which have a 32 bit Int.
Here is a funny program that gives wrong result because of length returning Int on a 32-bit computer.
  import Data.List
  main = print (length (genericTake 5000000000 (repeat ())))
Running it shows
  $ ./A
  705032704
But this is a strange piece of code and I agree that it will hardly be a problem on a 32 bit platform.
In fact, the only way this will be a problem is if your list is lazy and consumed by the length function, but it's hard to see how that could happen except in strange example code. That is, it's hard to imagine when you'd need to know the length of a data structure you don't need...
Think `wc -l'.

Greetings,
Carsten

--
Carsten Schultz (2:38, 33:47), FB Mathematik, FU Berlin
http://carsten.codimi.de/
PGP/GPG key on the pgp.net key servers, fingerprint on my home page.
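
Carsten's case can be sketched as a line counter over lazily read input, where the data is consumed and discarded while only the count is kept. This is a minimal illustration, assuming a strict Integer accumulator is acceptable; countLines is a hypothetical name.

```haskell
import Data.List (foldl')

-- Count newline characters with a strict Integer accumulator, so the
-- count cannot wrap however long the (lazily read) input is.
countLines :: String -> Integer
countLines = foldl' step 0
  where
    step n c = if c == '\n' then n + 1 else n
```

Applied to the result of getContents or readFile, this processes the input incrementally, which is exactly the situation where the number of elements can exceed what fits in memory at once.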

Sun and Dell both sell 64-bit boxes. But the core question is why have two different types at all?

This issue is timely because I just got an error in code that looks vaguely like:

  h <- openFile "foo" AppendMode
  pos <- hFileSize h
  hPutStr h $ show something
  hClose h
  content <- readFile "foo"
  return $ take pos content

This code produces an error because (madness!):

  hFileSize::Handle -> IO Integer
  take::forall a. Int -> [a]->[a]

I have to assume conversion between files and lists is not all that rare even in beginner code.

Note: I don't really care whether everything is 64bit Int or Integer. I just find having to care about this point in such trivial code ridiculous.

And re sizeFM, I would note that Google has more than 2^31 pages indexed in memory.

-Alex-

_________________________________________________________________
S. Alexander Jacobson
mailto:me@alexjacobson.com
tel:917-770-6565
http://alexjacobson.com

On Sun, 25 Apr 2004, David Roundy wrote:
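
Two common ways to reconcile the Integer from hFileSize with the Int expected by take, shown as a hedged sketch rather than a recommendation. The names prefixGeneric and prefixInt are hypothetical; genericTake and fromIntegral are standard.

```haskell
import Data.List (genericTake)

-- Option 1: genericTake accepts any Integral count, so an Integer
-- position (as returned by hFileSize) can be used directly.
prefixGeneric :: Integer -> String -> String
prefixGeneric = genericTake

-- Option 2: convert explicitly. fromIntegral silently wraps if the
-- position does not fit in Int, so this is only safe for small values.
prefixInt :: Integer -> String -> String
prefixInt pos = take (fromIntegral pos)
```

For example, prefixGeneric 3 "hello" yields "hel". Note that hFileSize counts bytes while a String holds characters, so the two only line up for single-byte encodings.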
On Sun, Apr 25, 2004 at 03:20:42PM -0400, S. Alexander Jacobson wrote:
QUESTION: "I read in a newspaper that in 1981 you said '640K of memory should be enough for anybody.' What did you mean when you said this?"
ANSWER: "I've said some stupid things and some wrong things, but not that. No one involved in computers would ever say that a certain amount of memory is enough for all time."
http://www.wired.com/news/print/0,1294,1484,00.html
Dell's Poweredge servers address up to 32GB of memory today! There are already 5.7 billion people on the planet (>2^31) and 741 million phone lines. In my mind, there is NO QUESTION that 2^31 keys is a reasonable size for a FiniteMap or will be in the very very near future.
On the other hand, since they are still 32 bit computers, any given application can still only access 4G of memory. This issue will only be a problem on 64 bit platforms which have a 32 bit Int.
Moreover, it is not clear that the CPU/memory overhead of returning Integer rather than Int for sizeFM is sufficiently high to be worth bothering the programmer about.
I'd say that rather than returning an Integer, we'd be better off just using a 64 bit Int on 64-bit platforms.
--
David Roundy
http://www.abridgegame.org

G'day all.
Quoting "S. Alexander Jacobson"
Sun and Dell both sell 64-bit boxes. But the core question is why have two different types at all?
Operations on an Integer (e.g. addition, multiplication) are not O(1). On an Int, they are (for all intents and purposes). This is a sufficiently important (IMO) difference that it's worthwhile to distinguish between integral types which the hardware can handle natively and others.

In this case, I think the types are both wrong:
hFileSize::Handle -> IO Integer
Should be :: Handle -> IO Word64

Yes, Word64 is a dirty type. In general, anything which interfaces directly to the operating system is going to have a dirty type. Such is life.
take::forall a. Int -> [a]->[a]
Should be :: (Integral a) => a -> [b] -> [b]
And re sizeFM, I would note that Google has more than 2^31 pages indexed in memory.
I would note, in addition, that they don't have that many pages indexed on a single machine. Almost nobody has a database with that many records on a single machine, even those who have clusters of 64 bit machines.

I would make sizeFM an Int, but define Int to be the most reasonable integral type for the underlying platform, and at least 32 bits in size.

Cheers,
Andrew Bromage

David Roundy
I'd say that rather than returning an Integer, we'd be better off just using a 64 bit Int on 64-bit platforms.
| 7.19.2. GHC's interpretation of undefined behaviour in Haskell 98
|
| This section documents GHC's take on various issues that are left
| undefined or implementation specific in Haskell 98.
|
| Sized integral types
|
| In GHC the Int type follows the size of an address on the host
| architecture; in other words it holds 32 bits on a 32-bit
| machine, and 64-bits on a 64-bit machine.

Looks like a reasonable way to go about it.

-kzm
--
If I haven't seen further, it is by standing in the footprints of giants
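
The width of Int on a given installation can be checked directly. A minimal sketch using standard functions from base; the names intBits and intMax are illustrative.

```haskell
import Data.Bits (finiteBitSize)

-- Number of bits in Int for the compiler and platform in use.
intBits :: Int
intBits = finiteBitSize (0 :: Int)

-- Largest Int, widened to Integer so it can be compared safely.
intMax :: Integer
intMax = toInteger (maxBound :: Int)
```

Under the behaviour quoted from the GHC documentation, intBits would be 32 on a 32-bit machine and 64 on a 64-bit machine, with intMax equal to 2^(intBits - 1) - 1 in either case.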
participants (8)
- ajb@spamcop.net
- Carsten Schultz
- David Roundy
- Duncan Coutts
- Ketil Malde
- S. Alexander Jacobson
- Serge D. Mechveliani
- Tomasz Zielonka