
Peter Simons wrote:
Ben Rudiak-Gould writes:
  start  :: IO ctx
  feed   :: ctx -> Buffer -> IO ()
  commit :: ctx -> IO a
'feed' cannot have this signature because it needs to update the context.
Sure it can -- it's just like writeIORef :: IORef a -> a -> IO ().
I guess it's moot to argue that point. I don't want a stream processor to have a global state, so using an internally encapsulated IORef is not an option for me.
I am looking for a more _general_ API, not one that forces implementation details on the stream processor. That's what my StreamProc data type does already. :-)
I'm not arguing about generality; I simply don't understand how your interface is supposed to be used. E.g.:

  do ctx  <- start
     ctx1 <- feed ctx array1
     ctx2 <- feed ctx array2
     val1 <- commit ctx1
     val2 <- commit ctx2
     return (val1,val2)

Should this return (MD5 of array1, MD5 of array2), or (MD5 of array1+array2, MD5 of array1+array2), or cause a runtime error? Any of these three might be reasonable, but for your interface to be well-defined you need to stipulate which one is correct. Once you've decided which one is correct, there's no reason not to change the interface so that no one can misinterpret it. My two interfaces are only less general than yours in that they don't have multiple interpretations -- which is a good thing.
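To make the ambiguity concrete, here is a minimal sketch of two processors that both fit the signature feed :: ctx -> Buffer -> IO ctx but answer the question above differently. Everything in it is an assumption made for illustration: the Proc record, the list-based Buffer, and a byte sum standing in for MD5. None of it is taken from the StreamProc type under discussion.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import Data.Word (Word8)

-- Hypothetical stand-in for a real buffer type.
type Buffer = [Word8]

-- Hypothetical record bundling the three operations under discussion.
data Proc ctx a = Proc
  { start  :: IO ctx
  , feed   :: ctx -> Buffer -> IO ctx
  , commit :: ctx -> IO a
  }

-- A byte sum stands in for the MD5 computation.
bufferSum :: Buffer -> Int
bufferSum = sum . map fromIntegral

-- Value semantics: the context is an immutable accumulator.
valueProc :: Proc Int Int
valueProc = Proc
  { start  = return 0
  , feed   = \acc buf -> return (acc + bufferSum buf)
  , commit = return
  }

-- Destructive-update semantics: the context is a mutable reference,
-- and feed hands back the very same reference it was given.
refProc :: Proc (IORef Int) Int
refProc = Proc
  { start  = newIORef 0
  , feed   = \ref buf -> modifyIORef' ref (+ bufferSum buf) >> return ref
  , commit = readIORef
  }

-- The snippet from the message, parameterised over the processor.
forked :: Proc ctx a -> Buffer -> Buffer -> IO (a, a)
forked p array1 array2 = do
  ctx  <- start p
  ctx1 <- feed p ctx array1
  ctx2 <- feed p ctx array2
  val1 <- commit p ctx1
  val2 <- commit p ctx2
  return (val1, val2)
```

With these definitions, forked valueProc [1,2] [10] yields (3,10), while forked refProc [1,2] [10] yields (13,13). The caller's code is identical in both cases, so the type alone does not say which result to expect.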
  start  :: ctx
  feed   :: ctx -> Buffer -> IO ctx
  commit :: ctx -> a
In this interface contexts are supposed to be immutable Haskell values, so there's no meaning in creating new ones or finalizing old ones.
I don't want to restrict the API to immutable contexts. A context could be anything, _including_ an IORef or an MVar. But the API shouldn't enforce that.
It doesn't. Even (length :: [a] -> Int) is likely to cause destructive updating of thunks when it's called, but that's not a reason to change the interface to [a] -> IO Int. The important thing is whether, from the caller's perspective, the function is pure. If it's pure, it shouldn't be in the IO monad, even if that forces some implementations to use unsafePerformIO under the hood.

I think you're hoping to have it both ways, capturing destructive-update semantics and value semantics in a single interface. That's not going to work, unfortunately. You must decide whether to enforce single-threading or not.
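As an illustration of a context that mutates internally yet looks pure to the caller, here is a small sketch. It makes several assumptions: Buffer is a plain list of bytes rather than a pointer-based buffer (which is why feed can be pure here, unlike the IO-returning signature above), a byte sum stands in for MD5, and the hidden mutation uses runST instead of unsafePerformIO, because ST makes the same point without an unsafe primitive.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)
import Data.Word (Word8)

-- Hypothetical stand-in for a real buffer type.
type Buffer = [Word8]

-- The context is an ordinary immutable value.
newtype Ctx = Ctx Int

start :: Ctx
start = Ctx 0

-- feed updates a mutable cell internally, but the mutation cannot
-- escape runST, so callers only ever see a pure function.
feed :: Ctx -> Buffer -> Ctx
feed (Ctx acc) buf = Ctx (runST (do
  ref <- newSTRef acc
  mapM_ (\b -> modifySTRef' ref (+ fromIntegral b)) buf
  readSTRef ref))

commit :: Ctx -> Int
commit (Ctx acc) = acc

-- Feeding two different buffers from the same starting context gives
-- two independent results, because no context is ever overwritten.
forked :: Buffer -> Buffer -> (Int, Int)
forked array1 array2 =
  (commit (feed start array1), commit (feed start array2))
```

Here forked [1,2] [10] evaluates to (3,10), and it can only do so: with value semantics there is no second interpretation for the caller to worry about.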
I would implement feedSTUArray and friends as wrappers around the Ptr interface, not as primitive computations of the stream processor.
I think it's impossible to do this safely, but it would be great if I were wrong.
  wrap :: (Storable a, MArray arr a IO) => Ptr a -> Int -> IO (arr Int a)
  wrap ptr n = peekArray n ptr >>= newListArray (0, n - 1)
Isn't this going in the wrong direction? I think what we want is something like

  withArrayPtr :: (MArray arr Word8 IO, Ix i) => arr i Word8 -> (Ptr Word8 -> IO a) -> IO a

You're right, though, this can be written safely:

  withArrayPtr arr act = getElems arr >>= flip withArray act

It's terribly slow, though. Ideally one wants a pointer into the original array together with a guarantee that it won't be moved by the garbage collector during the execution of your IO action. I think current versions of GHC will never move the array if your IO action performs no heap allocation, but I can easily imagine that changing in other/future implementations.

I suppose you could also have

  withArrayPtrM :: (MArray arr Word8 m, Ix i) => arr i Word8 -> (Ptr Word8 -> m b) -> m b
  withArrayPtrI :: (IArray arr Word8, Ix i) => arr i Word8 -> (Ptr Word8 -> IO b) -> IO b

though I'm not sure how much sense those types (or names) make. The first one would force the use of unsafeIOToST if you wanted to use it with ST arrays, but probably that's unavoidable.

-- Ben
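For reference, the safe but slow withArrayPtr can be tried out with the sketch below. The helper names sumBytes and demo are hypothetical, and the FlexibleContexts pragma is there because of the MArray arr Word8 IO constraint. Note that the pointer refers to a temporary copy made by withArray, so anything written through it is not reflected in the original array.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Data.Array.IO (IOUArray)
import Data.Array.MArray (MArray, getBounds, getElems, newListArray)
import Data.Ix (Ix, rangeSize)
import Data.Word (Word8)
import Foreign.Marshal.Array (peekArray, withArray)
import Foreign.Ptr (Ptr)

-- Copies the elements into temporary storage and passes a pointer to
-- that copy. Safe with respect to the garbage collector, but O(n) per call.
withArrayPtr :: (MArray arr Word8 IO, Ix i)
             => arr i Word8 -> (Ptr Word8 -> IO a) -> IO a
withArrayPtr arr act = getElems arr >>= flip withArray act

-- Hypothetical consumer: reads the bytes back through the pointer and sums them.
sumBytes :: IOUArray Int Word8 -> IO Int
sumBytes arr = do
  n <- rangeSize <$> getBounds arr
  withArrayPtr arr $ \ptr -> do
    bytes <- peekArray n ptr
    return (sum (map fromIntegral bytes))

-- Hypothetical usage: a four-element array whose bytes sum to 10.
demo :: IO Int
demo = do
  arr <- newListArray (0, 3) [1, 2, 3, 4]
  sumBytes arr
```

The action receives only the pointer, so the element count has to be obtained separately (here via getBounds). A real interface might want to pass the length along with the pointer.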