The container problem

newer
Re: [Haskell-cafe] Re: Haddock +...

older
[Haskell-cafe] Version 0.4.3 of...

Andrew Coppin

26 Sep 2008 26 Sep '08

6:15 p.m.

Take a look around you. Haskell provides several sorts of container. We have: Data.List Data.Set Data.Map Data.Hashtable Data.ByteString Data.Sequence Data.Array Data.Tree Data.IntSet Data.IntMap ... In other words, we have *a lot* of different data containers. And yet, each one provides its own unique API. To anybody used to OO languages, that sounds pretty crazy. In something like Java or Smalltalk or Eiffel, you have an abstract class that represents "container", and maybe a seperate one for "ordered container", and then concrete subclasses for each kind of container. Each one may add a few unique methods of its own, but basically all the containers have a uniform API, and you can write functions that work for any arbitrary [ordered] container. In Haskell, you can't do this. Why is that? To me, it seems that there are two sticking points: 1. Historical reasons. 2. The Haskell '98 type system. (1) is obviously solvable. (2) is harder. Some containers can contain *any* type of data. Haskell permits parametric polymorphism, so this is no problem: Data.List.map :: (a -> b) -> [a] -> [b] Other containers only support *one* type of data: Data.ByteString.Char8.map :: (Char -> Char) -> ByteString -> ByteString The type has a different kind, and the function parameter's type is more constrained. Yet still this poses no problem. However... now try writing a class that both of these functions could be methods of. Good luck with that, by the way... This is AFAIK also the reason why, e.g., Set is *not* an instance of Monad; you can't write a valid instance. The type checker won't have it. To ears accustomed to the OO way, all this makes it sound like Haskell's type system sucks. (Which is rich, because in half the OO languages, you can't write a type-safe container that works for arbitrary element types in the first place! Haskell is a Big Win here.) If I understand this correctly, to solve this problem you need either Functional Dependencies or Associated Types. Is that correct? I also gather that "FDs have problems" - although I have no idea what those problems are. Everybody's hoping that ATs will fix this, but ATs are still kinda new. (Are they even fully implemented in GHC yet?) Can anybody correct/expand on this state of affires? I just want to make sure I understand our position correctly...

Show replies by date

Albert Y. C. Lai

26 Sep 26 Sep

6:57 p.m.

Andrew Coppin wrote:

...

If I understand this correctly, to solve this problem you need either Functional Dependencies or Associated Types. Is that correct?

A motivating example in papers on FD is exactly typeclasses for containers. Okasaki puts this into practice in the Edison library. Despite its comprehensiveness, elegance, and the Okasaki name brand, it did not become mainstream. I don't know why.

Andrew Coppin

27 Sep 27 Sep

5:23 p.m.

Albert Y. C. Lai wrote:

...

Andrew Coppin wrote:

...
If I understand this correctly, to solve this problem you need either Functional Dependencies or Associated Types. Is that correct?

A motivating example in papers on FD is exactly typeclasses for containers. Okasaki puts this into practice in the Edison library. Despite its comprehensiveness, elegance, and the Okasaki name brand, it did not become mainstream. I don't know why.

Can anybody actually demonstrate concretely how FDs and/or ATs would solve this problem? (I.e., enable you to write a class that any container can be a member of, despite constraints on the element types.)

Antoine Latter

5:39 p.m.

On Sat, Sep 27, 2008 at 12:23 PM, Andrew Coppin wrote:

...

Can anybody actually demonstrate concretely how FDs and/or ATs would solve this problem? (I.e., enable you to write a class that any container can be a member of, despite constraints on the element types.)

Sure! Using type-families:

...

class Container c where type Elem c insert :: Elem c -> c -> c

...

instance Container [a] where type Elem [a] = a insert = (:)

...

instance Container ByteString where type Elem ByteString = Word8 insert = BS.cons

...

instance Ord a => Container (Set a) where type Elem (Set a) = a insert = Set.insert

In GHCi:

...

:t insert insert :: forall c. (Container c) => Elem c -> c -> c

Now the hard part is coming up with a proper API and class hierarchy. -Antoine

Andrew Coppin

6:34 p.m.

New subject: ATs [The container problem]

Antoine Latter wrote:

...

Sure! Using type-families:

...
class Container c where type Elem c insert :: Elem c -> c -> c

...
instance Container [a] where type Elem [a] = a insert = (:)

...
instance Container ByteString where type Elem ByteString = Word8 insert = BS.cons

...
instance Ord a => Container (Set a) where type Elem (Set a) = a insert = Set.insert

That's more or less how I was hoping it works. (Was unsure of the actual syntax, and the documentation is rather terse.) So there's a class called Container that has a _type_ that is _associated_ with it, representing the type of the elements? And you can set that type to be either a type variable or an explicit type?

...

Now the hard part is coming up with a proper API and class hierarchy.

So... exactly like in every OOP language in existence then? ;-)

Bulat Ziganshin

6:08 p.m.

New subject: Re[2]: The container problem

Hello Andrew, Saturday, September 27, 2008, 9:23:47 PM, you wrote:

...

Can anybody actually demonstrate concretely how FDs and/or ATs would solve this problem? (I.e., enable you to write a class that any container can be a member of, despite constraints on the element types.)

you may find comprehensive explanation in ghc user manual, it's chapter about FDs use this as motivating example :) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Andrew Coppin

6:43 p.m.

New subject: FDs [The container problem]

Bulat Ziganshin wrote:

...

Hello Andrew,

Saturday, September 27, 2008, 9:23:47 PM, you wrote:

...
Can anybody actually demonstrate concretely how FDs and/or ATs would solve this problem? (I.e., enable you to write a class that any container can be a member of, despite constraints on the element types.)

you may find comprehensive explanation in ghc user manual, it's chapter about FDs use this as motivating example :)

Section 8.6.2, "Functional Dependencies": "There should be more documentation, but there isn't (yet). Yell if you need it." Yeah, that's real helpful. :-P But hey, there's an academic paper... *sigh* Ooo, wait a sec, section 8.6.2.2. That helps... Mmm, OK. Now can somebody explain the "FDs cause problems" part?

David Menendez

29 Sep 29 Sep

2:04 a.m.

New subject: FDs [The container problem]

On Sat, Sep 27, 2008 at 2:43 PM, Andrew Coppin wrote:

...

Ooo, wait a sec, section 8.6.2.2. That helps...

Mmm, OK. Now can somebody explain the "FDs cause problems" part?

It's not so much that fundeps cause problems, as they're less convenient than associated types/type families in some cases. If you look at the collection example again, class CollFD e c | c -> e where emptyFD :: c insertFD :: e -> c -> c class CollTF c where type Elem c :: * emptyTF :: c insertTF :: Elem c -> c -> c If I want to write code that's polymorphic over instances of CollFD, I need to use two type variables, even though one of them is completely determined by the other. ==== If I want to write a function that, say, adds 0 to an arbitrary Int-collection, I would write insertZeroFD :: (CollFD Int c) => c -> c insertZeroFD = insertFD 0 insertZeroTF :: (CollTF c, Elem c ~ Int) => c -> c insertZeroTF = insertTF 0 But the class context for insertZeroFD isn't valid, because it doesn't have the form CollFD e c, where e and c are type variables. So we need the FlexibleContexts extension to make that work. (I think this is in section 8.7.1.1 of the GHC manual, although it doesn't refer to FlexibleContexts by name. See also http://hackage.haskell.org/trac/haskell-prime/wiki/FlexibleContexts.) ==== Let's say I have a collection wrapper that stores the size of the collection, like in Edison. data SizedFD1 e c = SizedFD1 Int c instance (CollFD e c) => CollFD e (SizedFD1 e c) where ... data SizedFD2 c = SizedFD2 Int c instance (CollFD e c) => CollFD e (SizedFD2 c) where ... data SizedTF c = SizedTF Int c instance (CollTF c) => CollTF (SizedTF c) where type Elem (SizedTF c) = Elem c ... The two versions of SizedFD each have annoyances. First, they both have a type variable directly in the instance head, which requires another extension, FlexibleInstances. (Section 8.6.3.1 of the GHC manual.) SizedFD1 has a phantom parameter to capture the element type, so that the element type can be connected to the collection type. Without that, we have SizedFD2, which will not work without yet another extension, UndecidableInstances. (Section 8.6.3.2 of the GHC manual.) -- Dave Menendez http://www.eyrie.org/~zednenem/

Achim Schneider

26 Sep 26 Sep

7:01 p.m.

Andrew Coppin wrote:

...

[...] I completely agree that Hashtable should instance Map and alike.

...

Data.List.map :: (a -> b) -> [a] -> [b]

Other containers only support *one* type of data:

Data.ByteString.Char8.map :: (Char -> Char) -> ByteString -> ByteString

The type has a different kind, and the function parameter's type is more constrained. Yet still this poses no problem.

However... now try writing a class that both of these functions could be methods of. Good luck with that, by the way...

Well the type _could_ be map :: (a -> b) -> ByteString a -> ByteString b and have a hell a lot of plumbing that utterly destroys ByteString's optimisation. But then you'll be happy to know that there's already Data.Stream.List, with more coming at the same speed as we can order pizza for dons. http://hackage.haskell.org/trac/ghc/ticket/915 -- (c) this sig last receiving data processing entity. Inspect headers for copyright history. All rights reserved. Copying, hiring, renting, performance and/or broadcasting of this signature prohibited.

Dougal Stanton

7:23 p.m.

On Fri, Sep 26, 2008 at 8:01 PM, Achim Schneider wrote:

...

But then you'll be happy to know that there's already Data.Stream.List, with more coming at the same speed as we can order pizza for dons.

http://hackage.haskell.org/trac/ghc/ticket/915

I was hoping that ticket would reveal the delivery address that we had to send pizza to, but instead it revealed that streams stuff is not scheduled for inclusion until GHC 6.12. :-( Cheers, D

John Van Enk

7:25 p.m.

...

...revealed that streams stuff is not scheduled for inclusion until GHC 6.12

How many pizzas will it take to bump that to 6.10? :D On Fri, Sep 26, 2008 at 3:23 PM, Dougal Stanton wrote:

...

On Fri, Sep 26, 2008 at 8:01 PM, Achim Schneider wrote:

...
But then you'll be happy to know that there's already Data.Stream.List, with more coming at the same speed as we can order pizza for dons.

http://hackage.haskell.org/trac/ghc/ticket/915

I was hoping that ticket would reveal the delivery address that we had to send pizza to, but instead it revealed that streams stuff is not scheduled for inclusion until GHC 6.12. :-(

Cheers,

D _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

-- /jve

Andrew Coppin

7:39 p.m.

John Van Enk wrote:

...

...
...revealed that streams stuff is not scheduled for inclusion until GHC 6.12

How many pizzas will it take to bump that to 6.10? :D So basically Don is like the dining philosophers, except instead of turning spaghetti into philosophy, he turns pizza into world-beating Haskell awesomeness?

Damn, we need to pool our resources and buy more pizza, people!! o_O

John Van Enk

7:42 p.m.

Does the quality of the pizza matter, or is it a matter of sheer quantity? If it's a balance of the two, then we need to do some experimentation. Lets start donating a specific quantity of pizzas to Don every week, but vary the quality of the pizza. We'll check his Hackage submissions and then tune the pizza algorithm to the desired output. :) On Fri, Sep 26, 2008 at 3:39 PM, Andrew Coppin wrote:

...

John Van Enk wrote:

...
...
...revealed that streams stuff is not scheduled for inclusion until GHC 6.12

How many pizzas will it take to bump that to 6.10? :D

So basically Don is like the dining philosophers, except instead of turning spaghetti into philosophy, he turns pizza into world-beating Haskell awesomeness?

Damn, we need to pool our resources and buy more pizza, people!! o_O

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

-- /jve

Brandon S. Allbery KF8NH

11:03 p.m.

On 2008 Sep 26, at 15:42, John Van Enk wrote:

...

Lets start donating a specific quantity of pizzas to Don every week, but vary the quality of the pizza. We'll check his Hackage submissions and then tune the pizza algorithm to the desired output. :)

Premature optimization is the root of all evil. :) -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Achim Schneider

8:40 p.m.

Andrew Coppin wrote:

...

John Van Enk wrote:

So basically Don is like the dining philosophers, except instead of turning spaghetti into philosophy, he turns pizza into world-beating Haskell awesomeness?

Is that the problem where you have to calculate the probability of the pizza arriving pre-cut as ordered based on the true and pretended availability of knives and pizza cutters at the delivery address? -- (c) this sig last receiving data processing entity. Inspect headers for copyright history. All rights reserved. Copying, hiring, renting, performance and/or broadcasting of this signature prohibited.

Andrew Coppin

7:40 p.m.

Dougal Stanton wrote:

...

I was hoping that ticket would reveal the delivery address that we had to send pizza to, but instead it revealed that streams stuff is not scheduled for inclusion until GHC 6.12. :-(

As I understand it, the [list] stream-fusion library is on Hackage *now*, you can use it *now*. They just want to wait awhile and benchmark it a bit more before making it the GHC default. ...is how *I* understood things. Clarification, anybody?

Derek Elkins

9:12 p.m.

On Fri, 2008-09-26 at 19:15 +0100, Andrew Coppin wrote:

...

Take a look around you. Haskell provides several sorts of container. We have:

Data.List Data.Set Data.Map Data.Hashtable Data.ByteString Data.Sequence Data.Array Data.Tree Data.IntSet Data.IntMap ...

In other words, we have *a lot* of different data containers. And yet, each one provides its own unique API.

To anybody used to OO languages, that sounds pretty crazy. In something like Java or Smalltalk or Eiffel, you have an abstract class that represents "container", and maybe a seperate one for "ordered container", and then concrete subclasses for each kind of container. Each one may add a few unique methods of its own, but basically all the containers have a uniform API, and you can write functions that work for any arbitrary [ordered] container.

In Haskell, you can't do this. Why is that?

Obviously you certainly can. That there isn't a "standard" form is in my opinion not really historical or limitations of H98, though there are certainly some aspect of those. You can do a not horrible job in just H98. You can do a much better job using common extensions (and yes, in particular MPTCs and fundeps.) As Albert Lai alluded to, you can use Edison if you want this, right now, today. I think the problem is again that the Perfect Interface hasn't been found and for several reasons which I'll enumerate, there is not a pressing desire for a reasonable compromise. One aspect of it is a bit of a You Aren't Going To Need It. Particularly for applications, there is usually very little gain in practice and for Haskell many of the container libraries have identical interface subsets so that you do end up being able to change implementation by changing a single import. This is further reinforced by there being a single obvious choice for common data structures. Admittedly, it would still be nice to be more explicit about this and to program to interfaces, especially for library code. Another reason is that often you can use an intermediate data structure, especially lists, for operations. Lists are the iterators/IEnumerable/generators of Haskell. So if I just need to traverse a data structure, I can just write an interface that accepts a list. Of course, since Haskell isn't Java, I'm not subject to the choice of interfaces the data structure implementer decided to implement or not. When this happens in Java the "standard" solution is to use an adapter class. In Haskell, I can just write the instance. In particular, I can decide on whatever interface I need, write my code to that interface and just instantiate it for the relevant data types and users can instantiate it for their data types. If you want an OrderableCollection class, you can simply write one today. You don't need to ask anyone's permission or coordinate with anyone. There are some general reasons too. Typically asymptotic complexity guarantees are considered part of the interface for a data structure. If you do this you either end up with loose constraints that aren't very useful or tight ones that provide nice guarantees but exclude all but a few data structures. This leads back to the first reason, YAGNI. You often have particular properties that you want and thus end up with particular data structures. Admittedly, you can still require tight complexity constraints and if someone wants to violate them, the performance problems are their fault but maybe convenience outweighs performance in that case. Usually, though, there are convenient conversions between the types. Finally, there -are- several more or less standard classes that capture different general operations on data structures (though there certainly could be more.) They, of course, have different names and different interfaces and different factorings from imperative equivalents. We have Functor, Applicative, Monad, MonadPlus, Monoid, Foldable, Traversable, IArray, MArray and others. Notice how the ridiculous proliferation of array types in Haskell has pressed the issue and led to the creation of IArray and MArray. Ultimately, it would still be beneficial to have some more standard interfaces for this sort of thing. There just hasn't been enough of a fire under the community's rear. This again suggests that YAGNI.

Andrew Coppin

9:37 p.m.

Derek Elkins wrote:

...

One aspect of it is a bit of a You Aren't Going To Need It.

Personally, I haven't had a huge problem with this in practice. What it basically means is that if you write a library function, *you* have to decide what containers you're going to use. It's not really possible to let the user decide. You kind of have to hard-wire it yourself. That's the only real problem with it. Sometimes it's irritating to convert an array to a list, feed it to some function, and then immediately convert it back to an array again, for example. (Not only annoying, but presumably not that efficient.)

...

Another reason is that often you can use an intermediate data structure, especially lists, for operations. Lists are the iterators/IEnumerable/generators of Haskell. So if I just need to traverse a data structure, I can just write an interface that accepts a list.

This is the other part of it. In Haskell, a list in some sense "is" a control-flow loop. If the compiler's inlining is half as good as it's supposed to be, converting an array to a list and then feeding it to some function hopefully ends up being inlined so that you end up with the function directly iterating over the array. Hopefully the function's output ends up being similar. So it's not like you build a while list in memory and then consume it. It's not even like the GC has to go round and free all the list cells. The list itself never actually exists as such at runtime. Alternatively, I'm talking complete nonesense... o_O

...

Of course, since Haskell isn't Java, I'm not subject to the choice of interfaces the data structure implementer decided to implement or not. When this happens in Java the "standard" solution is to use an adapter class. In Haskell, I can just write the instance. In particular, I can decide on whatever interface I need, write my code to that interface and just instantiate it for the relevant data types and users can instantiate it for their data types. If you want an OrderableCollection class, you can simply write one today. You don't need to ask anyone's permission or coordinate with anyone.

Haskell class membership is "open" in this way - a useful feature, IMHO.

...

Finally, there -are- several more or less standard classes that capture different general operations on data structures (though there certainly could be more.) They, of course, have different names and different interfaces and different factorings from imperative equivalents. We have Functor, Applicative, Monad, MonadPlus, Monoid, Foldable, Traversable, IArray, MArray and others. Notice how the ridiculous proliferation of array types in Haskell has pressed the issue and led to the creation of IArray and MArray.

As already noted, Data.Set *should* be a Monad, but can't be. The type system won't allow it. (And I know I'm not the first person to notice this...) Similar fun and frolics with Functor, and presumably Applicative and Foldable (I haven't actually heard of these until just now). Frankly, the whole "array" thing is slightly crazy to me. There are several things which the array libraries ought to support, but don't: - Making "slices" of arrays. (I.e., generating a subarray in O(1) by using transparent reindexing.) - Linked lists of arrays that provide an array-like interface. (ByteString.Lazy does this, but only for Word8 or Char.) - It really ought to be possible to unbox *any* type. Technically this is implementable now, but I can't find details of how... - Performing "map" in-place for mutable arrays. (This must surely be a very common operation.) - Build-in functions for joining arrays together, and splitting at a given index. - Array sorting. [Arrays have O(1) indexing, which has big implications for what sorting algorithm to choose.] - Lists have about 5,000,000 functions for processing them. Arrays have, like, a dozen. Just how efficient is it to convert an array to a list, process it, and then convert it back?

...

Ultimately, it would still be beneficial to have some more standard interfaces for this sort of thing. There just hasn't been enough of a fire under the community's rear. This again suggests that YAGNI.

I see... ;-)

David Menendez

10:12 p.m.

On Fri, Sep 26, 2008 at 5:37 PM, Andrew Coppin wrote:

...

As already noted, Data.Set *should* be a Monad, but can't be. The type system won't allow it. (And I know I'm not the first person to notice this...)

I wouldn't say that. It's important to remember that Haskell class Monad does not, and can not, represent *all* monads, only (strong) monads built on a functor from the category of Haskell types and functions to itself. Data.Set is a functor from the category of Haskell types *with decidable ordering* and *order-preserving* functions to itself. That's not the same category, although it is closely related. -- Dave Menendez http://www.eyrie.org/~zednenem/

Jason Dusek

10:25 p.m.

Can someone explain, why is it that Set can not be a Monad? -- _jsn

Jonathan Cast

10:27 p.m.

On Fri, 2008-09-26 at 15:25 -0700, Jason Dusek wrote:

...

Can someone explain, why is it that Set can not be a Monad?

It can't even be a functor (which all monads are). You can't implement fmap (+) $ Set.fromList [1, 2, 3] with Data.Set, because you can't order functions of type Integer -> Integer in a non-arbitrary way. So you can't have a balanced binary tree of them in a non-arbitrary way, either. Something like fmap putStrLn $ Set.fromList ["Hello", "world"] is similar. Since Data.Set is implemented in Haskell, it can only use facilities available to Haskell libraries. So it can't work for arbitrary elements; but a Functor instance requires that it does work. jcc

Dan Weston

10:50 p.m.

More specifically, although a set is a perfectly good (lowercase) functor, Set is not a (Haskell) Functor. Set's map has an Ord constraint, but the Functor type constructor is parametric over *all* types, not just that proper subset of them that have a total ordering. But see attempts to fix this: http://okmij.org/ftp/Haskell/types.html#restricted-datatypes http://www.randomhacks.net/articles/2007/03/15/data-set-monad-haskell-macros Dan Jonathan Cast wrote:

...

On Fri, 2008-09-26 at 15:25 -0700, Jason Dusek wrote:

...
Can someone explain, why is it that Set can not be a Monad?

It can't even be a functor (which all monads are). You can't implement

fmap (+) $ Set.fromList [1, 2, 3]

with Data.Set, because you can't order functions of type Integer -> Integer in a non-arbitrary way. So you can't have a balanced binary tree of them in a non-arbitrary way, either. Something like

fmap putStrLn $ Set.fromList ["Hello", "world"]

is similar.

Since Data.Set is implemented in Haskell, it can only use facilities available to Haskell libraries. So it can't work for arbitrary elements; but a Functor instance requires that it does work.

Andrew Coppin

27 Sep 27 Sep

1:24 p.m.

David Menendez wrote:

...

I wouldn't say that. It's important to remember that Haskell class Monad does not, and can not, represent *all* monads, only (strong) monads built on a functor from the category of Haskell types and functions to itself.

Data.Set is a functor from the category of Haskell types *with decidable ordering* and *order-preserving* functions to itself. That's not the same category, although it is closely related.

I nominate this post for the September 2008 Most Incomprehensible Cafe Post award! :-D Seriously, that sounded like gibberish. (But then, you're talking to somebody who can't figure out the difference between a set and a class, so...) All I know is that sometimes I write stuff in the list monad when the result really ought to be *sets*, not lists, because 1. there is no senamically important ordering 2. there should be no duplicates But Haskell's type system forbids me. (It also forbids me from making Set into a Functor, actually... so no fmap for you!) PS. Text is unpredictable, so just in case... If this post sounds like a flame, it isn't meant to be. ;-)

Matthieu Sozeau

1:53 p.m.

Le 27 sept. 08 à 15:24, Andrew Coppin a écrit :

...

David Menendez wrote:

...
I wouldn't say that. It's important to remember that Haskell class Monad does not, and can not, represent *all* monads, only (strong) monads built on a functor from the category of Haskell types and functions to itself.

Data.Set is a functor from the category of Haskell types *with decidable ordering* and *order-preserving* functions to itself. That's not the same category, although it is closely related.

I nominate this post for the September 2008 Most Incomprehensible Cafe Post award! :-D

Seriously, that sounded like gibberish. (But then, you're talking to somebody who can't figure out the difference between a set and a class, so...)

All I know is that sometimes I write stuff in the list monad when the result really ought to be *sets*, not lists, because

1. there is no senamically important ordering

2. there should be no duplicates

But Haskell's type system forbids me. (It also forbids me from making Set into a Functor, actually... so no fmap for you!)

Think about it this way: fmap is supposed to be an homomorphism on the functor's structure, it just changes the type of the holes in the structure. To implement such map function in Set (not debating if Set should require Ord or not here!) and keep the structure invariants, the function you give to map should be order-preserving. Actually, Set.map accepts any function but it must construct the new Set using a fold behind the scenes because otherwise the function may break the internal balancing invariants. But map_monotonous is there for the case where it does respect the orders and the map can be done much more naturally and efficiently. There's simply no way to state that a function must be monotonous using haskell's limited type system. except by using a new datatype that represents only the order-preserving functions between any two types A and B (is that even possible?). So you only see the [Ord] constraint getting in the way of defining a functor on Sets, but it's more profound than that, the functions themselves don't fit exactly. Otherwise, to implement Sets correctly I think you need at least [Eq] (and give [Eq] preserving functions to fmap). You can certainly declare a new EqFunctor (f : * -> *) where eqfmap : Eq a, Eq b => (a -> b) -> f a -> f b and assume that functions are [Eq]-preserving there (similarly with [Ord]). Hope this helps, -- Matthieu

Brandon S. Allbery KF8NH

4:29 p.m.

On 2008 Sep 27, at 9:24, Andrew Coppin wrote:

...

David Menendez wrote:

...
I wouldn't say that. It's important to remember that Haskell class Monad does not, and can not, represent *all* monads, only (strong) monads built on a functor from the category of Haskell types and functions to itself.

Data.Set is a functor from the category of Haskell types *with decidable ordering* and *order-preserving* functions to itself. That's not the same category, although it is closely related.

I nominate this post for the September 2008 Most Incomprehensible Cafe Post award! :-D

Seriously, that sounded like gibberish. (But then, you're talking to somebody who can't figure out the difference between a set and a class, so...)

That response required a certain amount of category theory to grok. When you have a typeclass, the constraints (that is, the (Foo a =>) contexts) on it are the exact constraints on members of the class. You can't add more or leave some out. Set and Map both require an additional constraint over those of Functor and Monad: (Ord a =>). Since you can't add constraints on top of a typeclass, you can't make them members of Functor or Monad. (Unless you use some Oleg-style hackery.) -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Andrew Coppin

4:41 p.m.

Brandon S. Allbery KF8NH wrote:

...

On 2008 Sep 27, at 9:24, Andrew Coppin wrote:

...
I nominate this post for the September 2008 Most Incomprehensible Cafe Post award! :-D

Seriously, that sounded like gibberish. (But then, you're talking to somebody who can't figure out the difference between a set and a class, so...)

That response required a certain amount of category theory to grok.

No kidding. ;-)

...

When you have a typeclass, the constraints (that is, the (Foo a =>) contexts) on it are the exact constraints on members of the class. You can't add more or leave some out.

Set and Map both require an additional constraint over those of Functor and Monad: (Ord a =>). Since you can't add constraints on top of a typeclass, you can't make them members of Functor or Monad. (Unless you use some Oleg-style hackery.)

Yes. This I understand. And ByteString constrains the element type to just Word8. Or Char, depending which one you use. And in principle other containers might exist with other constraints. (E.g., hashtables require hash functions.) I'm not sure how that qualifies set as "not really a true monad anyway" - but then, I don't know what a monad is, originally. I only know what it means in Haskell. Also... Who or what is an Oleg, and why do I keep hearing about it? ;-)

Brandon S. Allbery KF8NH

5:06 p.m.

On 2008 Sep 27, at 12:41, Andrew Coppin wrote:

...

I'm not sure how that qualifies set as "not really a true monad anyway" - but then, I don't know what a monad is, originally. I only know what it means in Haskell.

I think you read him backwards: Map and Set are category-theory ("true") monads, but they can't be Haskell Monads because Haskell isn't expressive enough to represent more than a subset of category- theoretical monads.

...

Also... Who or what is an Oleg, and why do I keep hearing about it? ;-)

Oleg Kiselyov. http://okmij.org/ftp/ He's somewhat legendary in the Haskell community for his ability to make Haskell do what people think it can't, and his tendency to program at the type level instead of at the value level like most people. :) -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Andrew Coppin

5:21 p.m.

Brandon S. Allbery KF8NH wrote:

...

On 2008 Sep 27, at 12:41, Andrew Coppin wrote:

...
I'm not sure how that qualifies set as "not really a true monad anyway" - but then, I don't know what a monad is, originally. I only know what it means in Haskell.

I think you read him backwards: Map and Set are category-theory ("true") monads, but they can't be Haskell Monads because Haskell isn't expressive enough to represent more than a subset of category-theoretical monads.

Ah, OK. That makes more sense then... What (if anything) do we do about that? I'm not actually bothered about every possible monad being representable as such in Haskell. I'd just like Set to work. ;-)

...

...
Also... Who or what is an Oleg, and why do I keep hearing about it? ;-)

Oleg Kiselyov. http://okmij.org/ftp/ He's somewhat legendary in the Haskell community for his ability to make Haskell do what people think it can't, and his tendency to program at the type level instead of at the value level like most people. :)

Ah - so the "Prolog programs as type signatures" thing is *his* fault?! ;-)

Ariel J. Birnbaum

10:46 p.m.

...

I'm not actually bothered about every possible monad being representable as such in Haskell. I'd just like Set to work. ;-)

What would "work" mean in this case? I see two different meanings: 1. Use monadic operations (mapM, guard) on Sets. How would you decide which operations are allowed and which aren't? A possible answer would be: if you can add an implicit Ord constraint for every argument of m (where m is constrained to be a Monad), you can instantiate m as Set. So sequence :: (Monad m) => [m a] -> m [a] would work since [a] is an instance of Ord whenever a is but ap :: (Monad m) => m (a -> b) -> m a -> m b wouldn't since we can't have a (meaningful) Ord instance for a -> b even if a and b are themselves instances. Such a mechanism is, of course, broken. Consider the following alternative definition for liftM2: liftM2 :: (Monad m) => (a -> b -> c) -> m a -> m b -> m c liftM2 f ma mb mc = return f `ap` ma `ap` mb `ap` mc -- deliberately avoiding Applicative and Functor While the type of liftM2 indicates it should work (and the definition found on GHC actually does), in this case it would utterly break at the "return f" and the "ap"s. In other words, one can't rely on the type alone to know whether a monadic operation is applicable to Set. In OOP, I think they'd call this a violation of Liskov's Substitution Principle. 2. Make the nice monadic syntax work for sets. In this case I'd restate the problem as not being able to extend Haskell's syntax within your program (a problem shared by most non-Lisp languages). While TH provides a fairly decent solution in this respect, it's still far from Lisp's flexibility. In this respect, does anyone know how the Liskell project is doing? The site and mailing list seem pretty silent... -- Ariel J. Birnbaum

Achim Schneider

28 Sep 28 Sep

8:57 a.m.

...

...
I'm not actually bothered about every possible monad being representable as such in Haskell. I'd just like Set to work. ;-)

What would "work" mean in this case? I see two different meanings:

1. Use monadic operations (mapM, guard) on Sets. 2. Make the nice monadic syntax work for sets.

"Ariel J. Birnbaum" wrote: 3. implement >>=, guardS and mapS for data MyHack Set = MyHack Set and abuse the do-notation for your own wicked purposes. -- (c) this sig last receiving data processing entity. Inspect headers for copyright history. All rights reserved. Copying, hiring, renting, performance and/or broadcasting of this signature prohibited.

Aaron Denney

8:42 a.m.

On 2008-09-27, Andrew Coppin wrote:

...

Brandon S. Allbery KF8NH wrote:

...
Oleg Kiselyov. http://okmij.org/ftp/ He's somewhat legendary in the Haskell community for his ability to make Haskell do what people think it can't, and his tendency to program at the type level instead of at the value level like most people. :)

Ah - so the "Prolog programs as type signatures" thing is *his* fault?! ;-)

No, he merely takes advantage of it. The fault is that constraint satisfaction is natural match for type-inference, because it's essentially what type-inference is. Given that that's essentially what Prolog is too, it shouldn't be surprising that you can express quite a lot with the type system. -- Aaron Denney -><-

Andrew Coppin

8:47 a.m.

New subject: Type-level programming [The container problem]

Aaron Denney wrote:

...

On 2008-09-27, Andrew Coppin wrote:

...
Ah - so the "Prolog programs as type signatures" thing is *his* fault?! ;-)

No, he merely takes advantage of it.

Heh. OK. ;-) By the way... I've seen a lot of type-level programs that allow you to express (and therefore verify) some pretty extreme properties of your code. In other words, you can make the compiler do more checking than it normally would. But the actual compiled code (assuming it does indeed compile) works exactly the same way as before. Is there any way to use type-level programming to actually alter the behaviour of the program in a useful/interesting way?

Brandon S. Allbery KF8NH

2:02 p.m.

New subject: Type-level programming [The container problem]

On 2008 Sep 28, at 4:47, Andrew Coppin wrote:

...

By the way... I've seen a lot of type-level programs that allow you to express (and therefore verify) some pretty extreme properties of your code. In other words, you can make the compiler do more checking than it normally would. But the actual compiled code (assuming it does indeed compile) works exactly the same way as before. Is there any way to use type-level programming to actually alter the behaviour of the program in a useful/interesting way?

Aren't phantom types an example of this? Absent the phantoms the program would (if it worked at all) treat expressions the same that it treats differently with them. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Jason Dusek

8:22 p.m.

Aaron Denney wrote:

...

Andrew Coppin wrote:

...
Brandon S. Allbery KF8NH wrote:

...
Oleg Kiselyov. http://okmij.org/ftp/ He's somewhat legendary in the Haskell community for his ability to make Haskell do what people think it can't, and his tendency to program at the type level instead of at the value level like most people. :)

Ah - so the "Prolog programs as type signatures" thing is *his* fault?! ;-)

No, he merely takes advantage of it. The fault is that constraint satisfaction is natural match for type-inference, because it's essentially what type-inference is. Given that that's essentially what Prolog is too, it shouldn't be surprising that you can express quite a lot with the type system.

Where can I find a discussion of this? -- _jsn

David Menendez

29 Sep 29 Sep

12:59 a.m.

On Sat, Sep 27, 2008 at 9:24 AM, Andrew Coppin wrote:

...

David Menendez wrote:

...
I wouldn't say that. It's important to remember that Haskell class Monad does not, and can not, represent *all* monads, only (strong) monads built on a functor from the category of Haskell types and functions to itself.

Data.Set is a functor from the category of Haskell types *with decidable ordering* and *order-preserving* functions to itself. That's not the same category, although it is closely related.

I nominate this post for the September 2008 Most Incomprehensible Cafe Post award! :-D

Seriously, that sounded like gibberish. (But then, you're talking to somebody who can't figure out the difference between a set and a class, so...)

Sorry about that. I was rushing out the door at the time.

...

All I know is that sometimes I write stuff in the list monad when the result really ought to be *sets*, not lists, because

1. there is no senamically important ordering

2. there should be no duplicates

But Haskell's type system forbids me. (It also forbids me from making Set into a Functor, actually... so no fmap for you!)

I understand your frustration. The point that I was trying to make is that this isn't just some arbitrary limitation in Haskell's type system. Data.Set and [] can both be thought of as monads, but they aren't the same kind of monad. ==== Incidentally, there are other ways to simulate a set monad. Depending on your usage pattern, you may find this implementation preferable to using the list monad:

...

{-# LANGUAGE PolymorphicComponents #-}

import Control.Monad import qualified Data.Set as Set type Set = Set.Set

newtype SetM a = SetM { unSetM :: forall b. (Ord b) => (a -> Set b) -> Set b }

toSet :: (Ord a) => SetM a -> Set a toSet m = unSetM m Set.singleton

fromSet :: (Ord a) => Set a -> SetM a fromSet s = SetM (\k -> Set.unions (map k (Set.toList s)))

instance Monad SetM where return a = SetM (\k -> k a) m >>= f = SetM (\k -> unSetM m (\a -> unSetM (f a) k))

instance MonadPlus SetM where mzero = SetM (\_ -> Set.empty) mplus m1 m2 = SetM (\k -> Set.union (unSetM m1 k) (unSetM m2 k))

It will still duplicate work. For example, if you write, return x `mplus` return x >>= f then "f x" will get evaluated twice. You can minimize that by inserting "fromSet . toSet" in strategic places. -- Dave Menendez http://www.eyrie.org/~zednenem/

Albert Y. C. Lai

5:17 p.m.

Andrew Coppin wrote:

...

Seriously, that sounded like gibberish.

Please don't say that. I think we are too polite to rudeness. While we shouldn't condemn people to "RTFM", we shouldn't tolerate people calling us "gibberish" either. I mean unless we say something objectively gibberish.

Bulat Ziganshin

26 Sep 26 Sep

10:36 p.m.

New subject: Re[2]: The container problem

Hello Andrew, Saturday, September 27, 2008, 1:37:12 AM, you wrote: answering your questions 1) there is 2 libs providing common Java-like interfaces to containers: Edison and Collections. almost noone uses it 2) having common type class for various things is most important when you write library that whould be able to deal with any if these things. when you just write application program, having the same interface plus ability to change imports in most cases are enough. and such meta-libraries are rather rare in Haskell world 3) as laready said, we have classes for traversing containers that probably covers most of usage scenarios for Java too now about arrays - they are much less used in Haskell than in imperative languages, especially mutable ones. to some degree, you may use parallel arrays, which are still informally supported, to some degree you may add required operations yourself (array package is pretty basic), and for some of your operations you need to provide more advanced array datastructure supporting slicing. try to use lists when something you need cannot be implemented with arrays. of my 10kloc "realworld" program, i had only one place when arrays are used -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Don Stewart

10:58 p.m.

bulat.ziganshin:

...

Hello Andrew,

Saturday, September 27, 2008, 1:37:12 AM, you wrote:

answering your questions

1) there is 2 libs providing common Java-like interfaces to containers: Edison and Collections. almost noone uses it

2) having common type class for various things is most important when you write library that whould be able to deal with any if these things. when you just write application program, having the same interface plus ability to change imports in most cases are enough. and such meta-libraries are rather rare in Haskell world

3) as laready said, we have classes for traversing containers that probably covers most of usage scenarios for Java too

now about arrays - they are much less used in Haskell than in imperative languages, especially mutable ones. to some degree, you may use parallel arrays, which are still informally supported, to some degree you may add required operations yourself (array package is pretty basic), and for some of your operations you need to provide more advanced array datastructure supporting slicing. try to use lists when something you need cannot be implemented with arrays. of my 10kloc "realworld" program, i had only one place when arrays are used

Bulat, have you looked at any of the newer array libraries, such as uvector, vector, carray or hmatrix? I'd be interested what you think of them. Especially uvector's interface. -- Don

Derek Elkins

27 Sep 27 Sep

12:01 a.m.

On Fri, 2008-09-26 at 22:37 +0100, Andrew Coppin wrote:

...

Derek Elkins wrote:

...
One aspect of it is a bit of a You Aren't Going To Need It.

Personally, I haven't had a huge problem with this in practice.

I suspected as much. Personally I'd recomend worrying about the problems you actually encounter, rather than worrying about problems that maybe you'll have later. Solving problems that you don't have isn't very gratifying for you.

...

...
Finally, there -are- several more or less standard classes that capture different general operations on data structures (though there certainly could be more.) They, of course, have different names and different interfaces and different factorings from imperative equivalents. We have Functor, Applicative, Monad, MonadPlus, Monoid, Foldable, Traversable, IArray, MArray and others. Notice how the ridiculous proliferation of array types in Haskell has pressed the issue and led to the creation of IArray and MArray.

As already noted, Data.Set *should* be a Monad, but can't be.

No it shouldn't. Data.Set forms a categorical monad but not one over Haskell which is what the Monad class expresses. Data.Set doesn't meet the interface of Monad and doesn't provide the same guarantees. Incidentally, Java would have the same problem if it was capable of expressing something equivalent to the Monad type class; the "issue" is with parametric polymorphism not type classes. So unsurprisingly the type system is right because, in my opinion, parametricity is a property to valuable to lose. This does have the effect, however, that join corresponds to the useful function unions with it's same definition only using different "monad" operations. Note that, for this particular example there is a beautiful solution. We don't really need to take the union of a -Set- of Sets, all we need to be able to do is traverse the outer structure. Taking a hint from my previous reply, we could specialize to lists and we would end up with mconcat from the Data.Set instance of Data.Monoid. If we didn't feel like imposing the conversion to lists on the user we could write combine = mconcat . toList. Conveniently, Data.Foldable has a generic toList function, however, even more conveniently the function we're looking for is simply Data.Foldable.fold.

...

The type system won't allow it. (And I know I'm not the first person to notice this...) Similar fun and frolics with Functor, and presumably Applicative and Foldable (I haven't actually heard of these until just now).

Frankly, the whole "array" thing is slightly crazy to me. There are several things which the array libraries ought to support, but don't:

...

- Making "slices" of arrays. (I.e., generating a subarray in O(1) by using transparent reindexing.) - Linked lists of arrays that provide an array-like interface. (ByteString.Lazy does this, but only for Word8 or Char.) - It really ought to be possible to unbox *any* type. Technically this is implementable now, but I can't find details of how... - Performing "map" in-place for mutable arrays. (This must surely be a very common operation.) - Build-in functions for joining arrays together, and splitting at a given index. - Array sorting. [Arrays have O(1) indexing, which has big implications for what sorting algorithm to choose.] - Lists have about 5,000,000 functions for processing them. Arrays have, like, a dozen. Just how efficient is it to convert an array to a list, process it, and then convert it back?

With the exception of slicing, none of these are interface issues and thus are irrelevant to the topic of this thread. All the functions you want can be implemented with reasonable efficiency and correct asymptotic complexity in terms of the provided interface. Whether these functions are in the standard library or not has no effect on the contractual obligations between chunks of code. Slicing can't be implemented with the correct asymptotic behaviour in terms of these operations. So then it comes down to a cost/benefit analysis of whether the cost of adding it to the interface is justified by the benefits of being able to slice generically. In this case, I think the scales tilt in favor of adding such support.

6130

Age (days ago)

6133

Last active (days ago)

List overview

Download

38 comments

17 participants

participants (17)

Aaron Denney
Achim Schneider
Albert Y. C. Lai
Andrew Coppin
Antoine Latter
Ariel J. Birnbaum
Brandon S. Allbery KF8NH
Bulat Ziganshin
Dan Weston
David Menendez
Derek Elkins
Don Stewart
Dougal Stanton
Jason Dusek
John Van Enk
Jonathan Cast
Matthieu Sozeau