New subject: Fwd: Semantics of iteratees, enumerators, enumeratees?

24 Aug 2010

      I think the big problem with chunking is that many useful iteratees need to
be able to inspect the length of the chunk.  The most obvious is "drop", but
there are many others.  Or if not inspect the length, have a new function on
the stream "dropReport :: Int -> s -> (s, Int)" which reports how much was
dropped.  Either way, chunking adds a deal of implementation burden.

I suspect that the proper vocabulary for iteratees wouldn't include chunks
at all, only single elements.  This discussion has prompted me to consider
the implications of such an implementation, as it would be much simpler.  I
have one idea that I think will at least maintain performance for many
operations, although there will be performance hits too.  If the drawbacks
are in areas that aren't particularly useful, though, it may be acceptable.

John
...
From: Conal Elliott 
Here's a way I've been tinkering with to think about iteratees clearly.
For simplicity, I'll stick with pure, error-free iteratees for now, and
take
chunks to be strings.  Define a function that runs the iteratee:
...
runIter :: Iteratee a -> [String] -> (a, [String])
Note that chunking is explicit here.
Next, a relation that an iteratee implements a given specification, defined
by a state transformer:
...
sat :: Iteratee a -> State String a -> Bool
Define sat in terms of concatenating chunks:
...
sat it st =
  second concat . runIter it == runState st . second concat
where the RHS equality is between functions (pointwise/extensionally), and
runState uses the representation of State directly
...
runState :: State s a -> s -> (a,s)
(I think this sat definition is what Conrad was alluding to.)
Now use sat to specify and verify operations on iteratees and to
*synthesize* those operations from their specifications.  Some iteratees
might not satisfy *any* (State-based) specification.  For instance, an
iteratee could look at the lengths or number of its chunks and produce
results accordingly.  I think of such iteratees as abstraction leaks.  Can
the iteratee vocabulary be honed to make only well-behaved (specifiable)
iteratees possible to express?  If so, can we preserve performance
benefits?
If indeed the abstraction leaks can be fixed, I expect there will be a
simpler & more conventional semantics than sat above.
- Conal
On Tue, Aug 24, 2010 at 2:55 PM, Conrad Parker 
wrote:
...
On 24 August 2010 14:47, Jason Dagit  wrote:
...
On Mon, Aug 23, 2010 at 10:37 PM, Conrad Parker 
wrote:
...
On 24 August 2010 14:14, Jason Dagit  wrote:
...
I'm not a semanticist, so I apologize right now if I say something
stupid or
incorrect.
On Mon, Aug 23, 2010 at 9:57 PM, Conal Elliott 
wrote:
...
...
...
...
>
> So perhaps this could be a reasonable semantics?
>
> Iteratee a = [Char] -> Maybe (a, [Char])
I've been tinkering with this model as well.
However, it doesn't really correspond to the iteratee interfaces
I've
seen, since those interfaces allow an iteratee to notice size and
number of
chunks.  I suspect this ability is an accidental abstraction leak,
which
raises the question of how to patch the leak.
From a purely practical viewpoint I feel that treating the chunking
as
an
abstraction leak might be missing the point.  If you said, you
wanted
the
semantics to acknowledge the chunking but be invariant under the
size
or
number of the chunks then I would be happier.
I think that's the point, ie. to specify what the invariants should
be. For example (to paraphrase, very poorly, something Conal wrote on
the whiteboard behind me):
run [concat [chunk]] == run [chunk]
ie. the (a, [Char]) you maybe get from running an iteratee over any
partitioning of chunks should be the same, ie. the same as from
running it over the concatenation of all chunks, which is the whole
input [Char].
I find this notation foreign.  I get [Char], that's the Haskell String
type, but what is [chunk]?  I doubt you mean a list of one element.
sorry, that was just my way of writing "the list of chunks" or perhaps
"the stream of chunks that represents the input".
Conrad.
...
...
...
I use iteratees when I need to be explicit about chunking and when I
don't
want the resources to "leak outside" of the stream processing.  If
...
...
...
took
those properties away, I wouldn't want to use it anymore because
you
then
it
...
...
...
would just be an inelegant way to do things.
Then I suppose the model for Enumerators is different than that for
Iteratees; part of the point of an Enumerator is to control the size
of the chunks, so that needs to be part of the model. An Iteratee, on
the other hand, should not have to know the size of its chunks. So you
don't want to be able to know the length of a chunk (ie. a part of the
stream), but you do want to be able to, say, fold over it, and to be
able to stop the computation at any time (these being the main point
of iteratees ...).
I think I agree with that.
Jason

Re: [Haskell-cafe] Re: Fwd: Semantics of iteratees, enumerators, enumeratees?

John Lato

Conal Elliott

John Lato

tags

participants (2)