
We had a short discussion on the IRC channel the other day about Arrays. I advocated that we do some refactoring work and didn't meet with overwhelming disagreement, so I wanted to propose that the Arrays interfaces be refactored in Haskell'.
As a Haskell new-ish-bie, the various Array interfaces seem a bit inconsistent and make learning/using arrays complicated. I *do* understand how to use arrays in Haskell, but I think that the interface could be cleaned up.
Examples of current confusions:
* IArray and Array are dupes (obvious);
* listArray for IArray, but newListArray for MArrays;
* "!" for IArray, but readArray for MArrays.
Proposal: I would propose the following for Haskell':
* "Array" is the interface for arrays; drop IArray;
* "MArray" is the interface for arrays in monads (IO, ST, etc);
* As much as possible, functions that operate on arrays will be the same for both Array and MArray (e.g. "!" for element access in both Array and MArray);
* In cases in which confusion would result from identical functions or in which it is beneficial to differentiate, functions will be minimally changed to identify the intended function type (e.g. listToArray for Array; listToMArray for MArray);
* Descriptive names ("toArray"; "toMArray") instead of termsOfArt ("freeze", "thaw").
---
Perhaps this e-mail could be read more generally as a request to consistencify/update the (Data) libraries in general, e.g.:
* only use "unsafe*" if referential transparency is broken;
* group refs into Data.Ref.IO, Data.Ref.ST rather than Data.STRef, Data.IORef;
* create a ListInterface class to propagate "toList" & "fromList" functions consistently;
* append "_s" to functions that are strictified: "foldl_s" instead of "foldl'";
* "//" performs very different functions in List and IArray.
Is this possible for Haskell'? Or is this too much of a break? If it's possible, I'm happy to build a wiki page for discussion (I noticed that a short page has been started.)
- Alson (A presumptuous new-ish-bie? Why yes, how did you guess?)
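
For readers who have not hit the mismatch described in this message, here is a minimal sketch of the same three-element array built and indexed through both interfaces. It assumes the `array` package as it ships with GHC today; the names `firstPure` and `firstMutable` are made up for illustration.

```haskell
import Control.Monad.ST (ST, runST)
import Data.Array (Array, listArray, (!))
import Data.Array.ST (STArray, newListArray, readArray)

-- Immutable array: built with listArray, indexed with (!).
firstPure :: Int
firstPure = arr ! 0
  where
    arr :: Array Int Int
    arr = listArray (0, 2) [10, 20, 30]

-- Mutable array in ST: built with newListArray, indexed with readArray.
firstMutable :: Int
firstMutable = runST (do
  marr <- newListArray (0, 2) [10, 20, 30] :: ST s (STArray s Int Int)
  readArray marr 0)
```

Both definitions compute the same value; only the construction and access names differ, which is the inconsistency the proposal targets.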

alson:
We had a short discussion on the IRC channel the other day about Arrays. I advocated that we do some refactoring work and didn't meet with overwhelming disagreement, so I wanted to propose that the Arrays interfaces be refactored in Haskell'.
As a Haskell new-ish-bie, the various Array interfaces seem a bit inconsistent and make learning/using arrays complicated. I *do* understand how to use arrays in Haskell, but I think that the interface could be cleaned up.
Examples of current confusions: IArray and Array are dupes (obvious); listArray for IArray, but newListArray for MArrays; "!" for IArray, but readArray for MArrays.
And unsafeRead/unsafeWrite are too verbose. They are usually (almost always?) safe (since the code does its own checks), so perhaps this essential-for-performance interface should have nicer names? They're not in the same unsafe league that unsafePerformIO is. Just something I pondered during the shootout massacre a couple of weeks back. Cheers, Don

On Wed, Feb 22, 2006 at 03:39:48PM +1100, Donald Bruce Stewart wrote:
And unsafeRead/unsafeWrite are too verbose. They are usually (almost always?) safe (since the code does its own checks),
The same can be said about most uses of unsafePerformIO - you wouldn't be using it if you weren't certain that your program will behave properly.
so perhaps this essential-for-performance interface should have nicer names?
Any primitive which can destroy the nice properties of Haskell when *misused* should be marked as unsafe. The point is that you can do anything with other nice, non-unsafe functions and you will still stay within the semantics of the language.
If you don't like those long names, nobody is stopping you from defining your own local bindings. Thanks to inlining, it should be as efficient as using unsafeWrite/unsafeRead directly.
They're not in the same unsafe league that unsafePerformIO is.
Why not? With unsafeWrite you can write to any address in memory, so you can crash the program, change values which should be constant, etc. Perhaps unsafeRead is not that dangerous, but you can surely cause SEGV with it. Or am I missing something? Best regards Tomasz -- I am searching for programmers who are good at least in (Haskell || ML) && (Linux || FreeBSD || math) for work in Warsaw, Poland
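
The "local bindings" suggestion in this message could look roughly like the sketch below. The short names `rd` and `wr` and the function `sumSquares` are hypothetical; `unsafeRead` and `unsafeWrite` are imported from `Data.Array.Base` in the `array` package. Whether the aliases really cost nothing depends on the optimiser, so the INLINE pragmas are a hint, not a guarantee.

```haskell
import Control.Monad.ST (ST, runST)
import Data.Array.Base (unsafeRead, unsafeWrite)
import Data.Array.ST (STUArray, newArray)

-- Hypothetical short aliases for the unchecked primitives.
-- Both take 0-based offsets and perform no bounds check.
rd :: STUArray s Int Int -> Int -> ST s Int
rd = unsafeRead
{-# INLINE rd #-}

wr :: STUArray s Int Int -> Int -> Int -> ST s ()
wr = unsafeWrite
{-# INLINE wr #-}

-- Fill an n-element array with squares, then sum it.
-- Every index comes from [0 .. n - 1], so it is in range by construction.
sumSquares :: Int -> Int
sumSquares n = runST (do
  arr <- newArray (0, n - 1) 0
  mapM_ (\i -> wr arr i (i * i)) [0 .. n - 1]
  xs <- mapM (rd arr) [0 .. n - 1]
  return (sum xs))
```

The word "unsafe" stays visible at the definition site while the call sites get the short names.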

tomasz.zielonka:
On Wed, Feb 22, 2006 at 03:39:48PM +1100, Donald Bruce Stewart wrote:
And unsafeRead/unsafeWrite are too verbose. They are usually (almost always?) safe (since the code does its own checks),
The same can be said about most uses of unsafePerformIO - you wouldn't be using it if you weren't certain that your program will behave properly.
so perhaps this essential-for-performance interface should have nicer names?
Any primitive which can destroy the nice properties of Haskell when *misused* should be marked as unsafe. The point is that you can do anything with other nice, non-unsafe functions and you will still stay within the semantics of the language.
If you don't like those long names, nobody is stopping you from defining your own local bindings. Thanks to inlining, it should be as efficient as using unsafeWrite/unsafeRead directly.
They're not in the same unsafe league that unsafePerformIO is.
Why not? With unsafeWrite you can write to any address in memory, so you can crash the program, change values which should be constant, etc. Perhaps unsafeRead is not that dangerous, but you can surely cause SEGV with it.
It's not a terribly serious suggestion ;) I just found that using unsafeRead/Write is very important for shootout entries (we used it a lot -- it's the only way to beat C), but a lot uglier on the page than (the equally dangerous) peek/poke, which get nice short names for some reason. Cheers, Don

Bruce Stewart wrote:
And unsafeRead/unsafeWrite are too verbose. They are usually (almost always?) safe (since the code does its own checks),
The same can be said about most uses of unsafePerformIO - you wouldn't be using it if you weren't certain that your program will behave properly.
so perhaps this essential-for-performance interface should have nicer names?
Any primitive which can destroy the nice properties of Haskell when *misused* should be marked as unsafe. The point is that you can do anything with other nice, non-unsafe functions and you will still stay within the semantics of the language.
Based on a ShootoutEntry discussion that Don and I had, I was under the mistaken impression that "unsafeWrite" broke an ST assumption because "unsafePerformIO" broke an IO assumption. However, I think that I agree with both of you because we're using multiple definitions of "unsafe". I see these as different degrees of "unsafe": unsafePerformIO - breaks an IO assumption; unsafeWrite - doesn't do a bounds check...
With unsafeWrite you can write to any address in memory, so you can crash the program
hmm... If I put an incorrect index into IArray.write, Ix.index errors and the program exits/dies/crashes (without SEGV). This doesn't seem much "safer". To be "safe":
    readArray :: (MArray a e m, Ix i) => a i e -> i -> m e
    writeArray :: (MArray a e m, Ix i) => a i e -> i -> e -> m ()
could be
    readArray :: (MArray a e m, Ix i) => a i e -> i -> m (Maybe e)
    writeArray :: (MArray a e m, Ix i) => a i e -> i -> e -> m (Maybe ())
...but this seems to be carrying it a bit far.
I think that I'd prefer clear markings for different specializations:
    unsafePerformIO - UNSAFE; use with caution;
    writeArray - write to an array; returns m (Maybe ()); very safe;
    writeArray_q - write _quickly without bounds check; moderately less safe;
    foldl - blah blah;
    foldl_s - "foldl" made more _strict.
- Alson
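
The Maybe-returning signatures floated in this message can be approximated as wrappers over the existing functions. This is only a sketch: `readArrayMaybe`, `writeArrayMaybe` and `demo` are hypothetical names, and it assumes a current `array` package in which `getBounds` is available from `Data.Array.MArray`.

```haskell
import Control.Monad.ST (ST, runST)
import Data.Array.MArray (MArray, getBounds, readArray, writeArray)
import Data.Array.ST (STArray, newArray)
import Data.Ix (Ix, inRange)

-- Hypothetical checked read: Nothing instead of an error when out of range.
readArrayMaybe :: (MArray a e m, Ix i) => a i e -> i -> m (Maybe e)
readArrayMaybe arr i = do
  bnds <- getBounds arr
  if inRange bnds i
    then Just <$> readArray arr i
    else return Nothing

-- Hypothetical checked write: Just () on success, Nothing when out of range.
writeArrayMaybe :: (MArray a e m, Ix i) => a i e -> i -> e -> m (Maybe ())
writeArrayMaybe arr i x = do
  bnds <- getBounds arr
  if inRange bnds i
    then Just <$> writeArray arr i x
    else return Nothing

-- One in-range and one out-of-range access of each kind.
demo :: (Maybe (), Maybe (), Maybe Int, Maybe Int)
demo = runST (do
  arr <- newArray (0, 2) 0 :: ST s (STArray s Int Int)
  w1 <- writeArrayMaybe arr 1 42
  w2 <- writeArrayMaybe arr 7 99
  r1 <- readArrayMaybe arr 1
  r2 <- readArrayMaybe arr 7
  return (w1, w2, r1, r2))
```

Note that the wrappers check the bounds themselves and then call the already-checked `readArray`/`writeArray`, so each access is checked twice; that cost is part of why this may be "carrying it a bit far".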

On Wed, Feb 22, 2006 at 09:59:07AM -0800, Alson Kemp wrote:
With unsafeWrite you can write to any address in memory, so you can crash the program hmm... If I put an incorrect index into IArray.write, Ix.index errors and the program exits/dies/crashes (without SEGV). This doesn't seem much "safer".
There is a huge difference - you know it will fail and how it will fail. Also, in GHC it will throw an exception which can be caught. On the other hand, with incorrectly used unsafeWrite your program *may* fail, but it can also do something bizarre, like sending an email to Santa Claus or simply returning wrong results. Best regards Tomasz -- I am searching for programmers who are good at least in (Haskell || ML) && (Linux || FreeBSD || math) for work in Warsaw, Poland
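
To illustrate the point about catchable failures: with GHC, an out-of-range index on a checked array raises an exception that ordinary code can intercept. The sketch below uses the immutable `(!)` for brevity; `checkedIndex` is a hypothetical helper, and catching `SomeException` is deliberately broad because the exact exception type is an implementation detail.

```haskell
import Control.Exception (SomeException, evaluate, try)
import Data.Array (Array, listArray, (!))

-- Hypothetical helper: force the indexed element and turn a bounds
-- failure into a Left carrying the exception's message.
checkedIndex :: Array Int Int -> Int -> IO (Either String Int)
checkedIndex arr i = do
  r <- try (evaluate (arr ! i)) :: IO (Either SomeException Int)
  return (either (Left . show) Right r)
```

Nothing comparable is possible for an out-of-range unsafeWrite, since no exception is raised in that case.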

While we're on the topic, I have a couple of problems with the current array system that cut deeper than the naming:
* The function for getting the bounds of an MArray is pure, so the interface can't accommodate resizable arrays.
* unsafeAt, unsafeRead and unsafeWrite take 0-based indices, and the bounds checking and conversion is handled externally, based on the bounds you return. This means the interfaces can't support array windowing, at least in the multidimensional case. I'd be happy with windowing for one-dimensional arrays only, but there's no way to restrict your array type to one-dimensional index types.
-- Ben

Hello Ben,
Wednesday, February 22, 2006, 9:47:19 PM, you wrote:
BRG> While we're on the topic, I have a couple of problems with the current array system that cut deeper than the naming:
BRG> * The function for getting the bounds of an MArray is pure, so the interface can't accommodate resizable arrays.
i think that it is because such arrays can be implemented more efficiently. then you can implement dynamic arrays on top of MArray interface (although i'm not sure that this will be efficient. GHC's classes efficiency is black magic :)
BRG> * unsafeAt, unsafeRead and unsafeWrite take 0-based indices, and the bounds checking and conversion is handled externally, based on the bounds you return. This means the interfaces can't support array windowing, at least in the multidimensional case. I'd be happy with windowing for one-dimensional arrays only, but there's no way to restrict your array type to one-dimensional index types.
for one-dimensional arrays it's easy to implement. i agree with you, though, that we can move more operations to the class interface
--
Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Wed, Feb 22, 2006 at 06:47:19PM +0000, Ben Rudiak-Gould wrote:
While we're on the topic, I have a couple of problems with the current array system that cut deeper than the naming:
* The function for getting the bounds of an MArray is pure, so the interface can't accommodate resizable arrays.
Indeed, this has bothered me a whole lot too. I keep on trying to implement an expanding circular buffer and then being sad when it can't be done. We could fix it fairly easily: we just need to get rid of HasBounds as a superclass of MArray, add a new method 'getBounds' that returns the bounds in the monad, and then modify the default methods to use getBounds rather than bounds. Since they are all already in the monad, this will work just fine. It would be almost perfectly backwards compatible; the only change would be that some code might need to list HasBounds in its type signatures separately. If we can get this in before the next release of GHC, that would be really great.
John
--
John Meacham - ⑆repetae.net⑆john⑈
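
A rough sketch of why monadic bounds matter for resizable arrays, assuming a current `array` package in which `getBounds` is exported from `Data.Array.IO`. The `Growable` type and its functions are hypothetical and only illustrate the idea: the bounds are read inside IO, so they may differ from one call to the next.

```haskell
import Data.Array.IO (IOArray, getBounds, getElems, newArray, newListArray)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical growable array: a mutable reference to the current store.
newtype Growable e = Growable (IORef (IOArray Int e))

-- Allocate n elements indexed from 0, all set to x.
newGrowable :: Int -> e -> IO (Growable e)
newGrowable n x = Growable <$> (newArray (0, n - 1) x >>= newIORef)

-- Bounds are queried in IO, so they can change after a resize.
growableBounds :: Growable e -> IO (Int, Int)
growableBounds (Growable ref) = readIORef ref >>= getBounds

-- Double the capacity: copy the old elements, pad the rest with `fill`.
-- (An empty array stays empty, since doubling zero is zero.)
grow :: e -> Growable e -> IO ()
grow fill (Growable ref) = do
  old <- readIORef ref
  (lo, hi) <- getBounds old
  xs <- getElems old
  let size = hi - lo + 1
  new <- newListArray (lo, hi + size) (xs ++ replicate size fill)
  writeIORef ref new
```

With a pure `bounds` function, `growableBounds` could not be written, because the answer before and after `grow` would have to be the same.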
participants (6)
- Alson Kemp
- Ben Rudiak-Gould
- Bulat Ziganshin
- dons@cse.unsw.edu.au
- John Meacham
- Tomasz Zielonka