Abstraction in data types

older
searching a function by providing...

Darrin Chandler

18 Mar 2010 18 Mar '10

4:20 a.m.

Hi, Trying to get up to speed in Haskell, I'm playing with doing some abstraction in data types. Specifically, I have this: type Cartesian_coord = Float type Latitude = Float type Longitude = Float data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Spherical (Latitude, Longitude) type Center = Point type Radius = Float data Shape = Circle Center Radius | Polygon [Point] This obviously stinks since a Polygon could contain mixed Cartesian and Spherical points. Polygon needs to be one or the other, but not mixed. I could define seperate types for Cartesian and Spherical and seperate CartesianPoly and SphericalPoly, but that doesn't seem very elegant and also increases as I add more coordinate systems and shapes. I read a little on GADTs, et al, but I'm not sure if that's what I want for this or not. Any help appreciated! -- Darrin Chandler | Phoenix BSD User Group | MetaBUG dwchandler@stilyagin.com | http://phxbug.org/ | http://metabug.org/ http://www.stilyagin.com/ | Daemons in the Desert | Global BUG Federation

Attachments:

attachment.sig (application/pgp-signature — 195 bytes)

Show replies by date

Alexander Solla

18 Mar 18 Mar

4:56 a.m.

I wrote this to Darrin, but didn't CC cafe: On Mar 17, 2010, at 9:20 PM, Darrin Chandler wrote:

...

type Cartesian_coord = Float

type Latitude = Float type Longitude = Float

data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Spherical (Latitude, Longitude)

type Center = Point type Radius = Float

data Shape = Circle Center Radius | Polygon [Point]

This obviously stinks since a Polygon could contain mixed Cartesian and Spherical points. Polygon needs to be one or the other, but not mixed.

My suggestion would be to use an alternate representation of "spherical" points in terms of polar coordinates, and then to normalize and mix at will: type Theta = Float type Radius = Float data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Polar (Theta, Radius) normalize_point :: Point -> Point normalize_point Cartesian x y = Cartesian x y normalize_point Polar t r = Cartesian x y where x = r * cos t; y = r * sin t; It really depends on what you want to do with your points. If you want to do linear algebra, you might want your points to depend on a basis, for example. But your "spherical" points don't really form a basis in three-space, or even over all of two-space.

Lyndon Maydwell

5:06 a.m.

You could probably also use a typeclass for pointy things rather than a data type, this would then require you to use existential quantification to construct a hetrogenous list. For example: Class Point where getCartesian :: ... getPolar :: ... data Shape = Point p => ... | Polygon [p] Correct me if this is wrong though :-) On Thu, Mar 18, 2010 at 12:56 PM, Alexander Solla wrote:

...

I wrote this to Darrin, but didn't CC cafe: On Mar 17, 2010, at 9:20 PM, Darrin Chandler wrote:

type Cartesian_coord = Float

type Latitude = Float type Longitude = Float

data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Spherical (Latitude, Longitude)

type Center = Point type Radius = Float

data Shape = Circle Center Radius | Polygon [Point]

This obviously stinks since a Polygon could contain mixed Cartesian and Spherical points. Polygon needs to be one or the other, but not mixed.

My suggestion would be to use an alternate representation of "spherical" points in terms of polar coordinates, and then to normalize and mix at will: type Theta = Float type Radius = Float data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Polar (Theta, Radius) normalize_point :: Point -> Point normalize_point Cartesian x y = Cartesian x y normalize_point Polar t r = Cartesian x y where x = r * cos t; y = r * sin t; It really depends on what you want to do with your points. If you want to do linear algebra, you might want your points to depend on a basis, for example. But your "spherical" points don't really form a basis in three-space, or even over all of two-space. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Darrin Chandler

5:27 a.m.

On Thu, Mar 18, 2010 at 01:06:25PM +0800, Lyndon Maydwell wrote:

...

You could probably also use a typeclass for pointy things rather than a data type, this would then require you to use existential quantification to construct a hetrogenous list.

For example:

Class Point where getCartesian :: ... getPolar :: ...

data Shape = Point p => ... | Polygon [p]

Correct me if this is wrong though :-)

So in "normal" use Polygon list would be homogeneous, but could be made heterogeneous with effort? If I have that right it's closer, but I'd love to have the compiler whine if someone tried to mix them.

...

On Thu, Mar 18, 2010 at 12:56 PM, Alexander Solla wrote:

...
I wrote this to Darrin, but didn't CC cafe: On Mar 17, 2010, at 9:20 PM, Darrin Chandler wrote:

type Cartesian_coord = Float

type Latitude = Float type Longitude = Float

data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Spherical (Latitude, Longitude)

type Center = Point type Radius = Float

data Shape = Circle Center Radius | Polygon [Point]

This obviously stinks since a Polygon could contain mixed Cartesian and Spherical points. Polygon needs to be one or the other, but not mixed.

My suggestion would be to use an alternate representation of "spherical" points in terms of polar coordinates, and then to normalize and mix at will: type Theta = Float type Radius = Float data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Polar (Theta, Radius) normalize_point :: Point -> Point normalize_point Cartesian x y = Cartesian x y normalize_point Polar t r = Cartesian x y where x = r * cos t; y = r * sin t; It really depends on what you want to do with your points. If you want to do linear algebra, you might want your points to depend on a basis, for example. But your "spherical" points don't really form a basis in three-space, or even over all of two-space. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Lyndon Maydwell

5:45 a.m.

Well you need to define a new datatype to make a hertrogenous list, so I don't think there's any real way you can get around people doing that... On Thu, Mar 18, 2010 at 1:27 PM, Darrin Chandler wrote:

...

On Thu, Mar 18, 2010 at 01:06:25PM +0800, Lyndon Maydwell wrote:

...
You could probably also use a typeclass for pointy things rather than a data type, this would then require you to use existential quantification to construct a hetrogenous list.

For example:

Class Point where getCartesian :: ... getPolar :: ...

data Shape = Point p => ... | Polygon [p]

Correct me if this is wrong though :-)

So in "normal" use Polygon list would be homogeneous, but could be made heterogeneous with effort? If I have that right it's closer, but I'd love to have the compiler whine if someone tried to mix them.

...
On Thu, Mar 18, 2010 at 12:56 PM, Alexander Solla wrote:

...
I wrote this to Darrin, but didn't CC cafe: On Mar 17, 2010, at 9:20 PM, Darrin Chandler wrote:

type Cartesian_coord = Float

type Latitude = Float type Longitude = Float

data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Spherical (Latitude, Longitude)

type Center = Point type Radius = Float

data Shape = Circle Center Radius | Polygon [Point]

This obviously stinks since a Polygon could contain mixed Cartesian and Spherical points. Polygon needs to be one or the other, but not mixed.

My suggestion would be to use an alternate representation of "spherical" points in terms of polar coordinates, and then to normalize and mix at will: type Theta = Float type Radius = Float data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Polar (Theta, Radius) normalize_point :: Point -> Point normalize_point Cartesian x y = Cartesian x y normalize_point Polar t r = Cartesian x y where x = r * cos t; y = r * sin t; It really depends on what you want to do with your points. If you want to do linear algebra, you might want your points to depend on a basis, for example. But your "spherical" points don't really form a basis in three-space, or even over all of two-space. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

-- Darrin Chandler | Phoenix BSD User Group | MetaBUG dwchandler@stilyagin.com | http://phxbug.org/ | http://metabug.org/ http://www.stilyagin.com/ | Daemons in the Desert | Global BUG Federation

Brandon S. Allbery KF8NH

5:48 a.m.

On Mar 18, 2010, at 01:27 , Darrin Chandler wrote:

...

On Thu, Mar 18, 2010 at 01:06:25PM +0800, Lyndon Maydwell wrote:

...
You could probably also use a typeclass for pointy things rather than a data type, this would then require you to use existential quantification to construct a hetrogenous list.

So in "normal" use Polygon list would be homogeneous, but could be made heterogeneous with effort? If I have that right it's closer, but I'd love to have the compiler whine if someone tried to mix them.

They can be mixed only with significant effort. And you can't really prevent it once your users know the magic of existential quantification; *but* those lists won't typecheck when passed to your routines expecting a Polygon, because you will be expecting a (Point a => Polygon [a]) but they will be passing a (Polygon [forall a. Point a => a]) or something similar. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Stephen Tetley

9:27 a.m.

On 18 March 2010 05:48, Brandon S. Allbery KF8NH wrote:

...

They can be mixed only with significant effort. And you can't really prevent it once your users know the magic of existential quantification; *but* those lists won't typecheck when passed to your routines expecting a Polygon, because you will be expecting a (Point a => Polygon [a]) but they will be passing a (Polygon [forall a. Point a => a]) or something similar.

If you can live without constructors, an ADT view is another possibility... Jeremy Gibbons gives a clear explanation of using an ADT to implement complex numbers with polar and cartesian representations. Both representations share the same type, so can be stored together in lists - figuratively speaking the the existential type has been moved "further down" inside the type. See section 2: http://www.comlab.ox.ac.uk/jeremy.gibbons/publications/adt.pdf Best wishes Stephen

Alexander Solla

6:36 a.m.

On Mar 17, 2010, at 10:27 PM, Darrin Chandler wrote: Let's go back to your original code: data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Spherical (Latitude, Longitude) type Center = Point type Radius = Float data Shape = Circle Center Radius | Polygon [Point] normalize_shape :: Shape -> Shape normalize_shape Circle c r = Circle c r normalize_shape Polygon ps = Polygon $ fmap normalize_point ps where normalize_point = something appropriate for the function. In fact, you could lift this into a higher order function, that takes a normalize_point function as an argument: normalize_shape :: (Point -> Point) -> Shape -> Shape normalize_shape f (Circle c r)= Circle (f c) r normalize_shape f (Polygon ps) = Polygon $ fmap f ps Now, I'm not suggesting that you should always normalize shapes, as I had with normalize_point before. But this combinator captures some nice, generic logic. For example, you can do stuff like: cartesian_shape :: Shape -> Shape cartesian_shape = normalize_shape cartesian_point where ... normalize_shape is the sort of function you would use while defining a function, and possibly provide function specific behavior in the function's where clause. double_shape :: Shape -> Shape double_shape (Circle c r) = Circle c (2 * r) double_shape (Polygon ps) = Polygon $ normalize_shape (double_point . cartesian_point) ps where double_point Cartesian (x, y) = Cartesian (sqrt(2) * x, sqrt(2) * y)

Alexander Solla

5:20 a.m.

On Mar 17, 2010, at 9:56 PM, Alexander Solla wrote:

...

But your "spherical" points don't really form a basis in three- space, or even over all of two-space.

I'll take this back. Lattitude and longitude is enough to "form a basis" on R^2, by taking a basis for the surface of the sphere in terms of latitude and longitude and projecting it stereographically. So if you wanted to use the normalization idea, you could use the stereographic projection formulas to turn a spherical point into a Cartesian point. http://en.wikipedia.org/wiki/Stereographic_projection

Darrin Chandler

5:22 a.m.

On Wed, Mar 17, 2010 at 10:20:02PM -0700, Alexander Solla wrote:

...

On Mar 17, 2010, at 9:56 PM, Alexander Solla wrote:

...
But your "spherical" points don't really form a basis in three- space, or even over all of two-space.

I'll take this back. Lattitude and longitude is enough to "form a basis" on R^2, by taking a basis for the surface of the sphere in terms of latitude and longitude and projecting it stereographically. So if you wanted to use the normalization idea, you could use the stereographic projection formulas to turn a spherical point into a Cartesian point.

http://en.wikipedia.org/wiki/Stereographic_projection

Darrin Chandler

5:20 a.m.

On Wed, Mar 17, 2010 at 09:56:14PM -0700, Alexander Solla wrote:

...

I wrote this to Darrin, but didn't CC cafe:

On Mar 17, 2010, at 9:20 PM, Darrin Chandler wrote:

...
type Cartesian_coord = Float

type Latitude = Float type Longitude = Float

data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Spherical (Latitude, Longitude)

type Center = Point type Radius = Float

data Shape = Circle Center Radius | Polygon [Point]

This obviously stinks since a Polygon could contain mixed Cartesian and Spherical points. Polygon needs to be one or the other, but not mixed.

My suggestion would be to use an alternate representation of "spherical" points in terms of polar coordinates, and then to normalize and mix at will:

type Theta = Float type Radius = Float

data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Polar (Theta, Radius)

normalize_point :: Point -> Point normalize_point Cartesian x y = Cartesian x y normalize_point Polar t r = Cartesian x y where x = r * cos t; y = r * sin t;

It really depends on what you want to do with your points. If you want to do linear algebra, you might want your points to depend on a basis, for example. But your "spherical" points don't really form a basis in three-space, or even over all of two-space.

I see what you mean, but I don't think that's what I need. I want to have keep Lat/Lon, as I may have large groups of shapes in Lat/Lon and want to do things with them as is. And the same for cartesian coords. Sometimes I will translate betweem lat/lon and cartesian, but many times I will be doing calculations in "native" coordinates. But it's a nice technique you show, and it will come in handy elsewhere. -- Darrin Chandler | Phoenix BSD User Group | MetaBUG dwchandler@stilyagin.com | http://phxbug.org/ | http://metabug.org/ http://www.stilyagin.com/ | Daemons in the Desert | Global BUG Federation

Tillmann Rendel

1:58 p.m.

Darrin Chandler wrote:

...

data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Spherical (Latitude, Longitude)

type Center = Point type Radius = Float

data Shape = Circle Center Radius | Polygon [Point]

This obviously stinks since a Polygon could contain mixed Cartesian and Spherical points. Polygon needs to be one or the other, but not mixed.

The problem is that you cannot distinguish in the type system whether a Point was created by the Cartesian or the Spherical constructor. These constructors have the following types: Cartesian :: (Cartesian_coord, Cartesion_coor) -> Point Spherical :: (Latitude, Latitude) -> Point The types of both constructors end with the same type, the type Point. If you manage to change your code so that the constructors end in different types, then the difference between them will be visible to the type system, and you will have the static knowledge you want. I know of two strategies to make the constructors end in different types. They could belong to different datatypes, or they could belong to different instances of a family of datatypes. You could split the type Point into two types as follows. data Cartesian = Cartesian Cartesian_coord Cartesian_coord data Spherical = Spherical Latitude Longitude Now the constructors have the following types: Cartesian :: Cartesian_coord -> Cartesian_coord -> Cartesian Spherical :: Latitude -> Longitude -> Spherical Since the types of the constructors end in different types, we can distinguish values created with Cartesian from values created with Spherical in the type system. (Note that I dropped the use of tuples from your types to make the example more idiomatic Haskell). However, how are we going to write the Shape datatype now that we have two different types of points? We write a family of Shape datatypes parameterized in the type of points we want to use. data Shape point = Circle point Radius | Polygon [point] Note that point is a type variable, which can be instantiated with whatever type we like, including Cartesian and Spherical. For example, Shape Cartesian is the type of shapes described with cartesian coordinates, and Shape Spherical is the type of shapes described with spherical coordinates. But we could also have a type like Shape Integer, which is the type of shapes described with Integer points, whatever that means. This generality of the Shape family of types has advantages and disadvantages. An advantage is that we can make Shape an instance of the type class Functor as follows. instance Functor Shape where fmap f (Circle p r) = Circle (f p) r fmap f (Polygon ps) = Polygon (map f ps) Note that fmap applies a function to every point in the shape, but does not change the overall shape structure. This could be used, for example, to convert between a spherical and a cartesian shape, given some function to convert between spherical and cartesian points. convertPoint :: Spherical -> Cartesian convertPoint = ... convertShape :: Shape Spherical -> Shape Cartesian convertShape = fmap convertPoint Another example could be a shape with all the cartesian points stored in same database. Lets assume that we have a function to lookup which point is stored in the database under some key. lookupPoint :: Database -> Key -> Cartesian lookupPoint = ... We can use fmap to write the function which takes a Shape with keys as points, and looks up all the keys in the database, returning a Shape of cartesian coordinates. lookupShape :: Database -> Shape Key -> Shape Cartesian lookupShape db = fmap (lookupPoint db) The Functor typeclass is further extended and specialized by typeclasses like Traversable, Foldable, Applicative, Alternative, Monad, etc. Probably some of them are applicable here. So an advantage of this kind of open family Shape is that we can write instances for standard typeclasses, and another advantage is that we may find good uses for types such as (Shape Key). However, a disadvantage may be that we cannot restrict a priori which types of points are useable in our program. That also means that we cannot use pattern matching to discover at runtime which type of point was used in some shape. If we make cartesian and spherical points belong to the same family of types, we relate them somewhat, but keep their distinction visible for the type system. We want to introduce a family of types with two members, one for spherical, and one for cartesian coordinates. We introduce empty data types to index the family. data Cartesian data Spherical And we express Point as a GADT using Spherical and Cartesian as follows. data Point i = Cartesian :: Cartesian_coord -> Cartesian_coord -> Point Cartesian | Spherical :: Latitude -> Longitude -> Point Spherical Again, the constructors have different types. The types differ in the type argument to Point. Since the type argument i is never used in the definition of Point to describe values, it is called a phantom type. Now, we can use pattern matching to figure out the type of points used in some statically unknown shape. Lets again assume that we know how to convert a spherical into an cartesian point. convertPoint :: Point Spherical -> Point Cartesian convertPoint (Spherical latitude longitude) = ... Note that we do not have to write a clause for Cartesian points, because a (Point Spherical) value could never have been created by the Cartesian constructor. We can now write a normalizePoint function which converts spherical points into cartesian points, but which does not change already cartesian points. normalize :: Point i -> Point Cartesian normalize p@(Spherical _ _) = convertPoint p normalize p@(Cartesian _ _) = p Note that after matching p with Spherical, we are allowed to call convertPoint, and after matching p with Cartesian, we are allowed to return p as a Point Cartesian. This is allowed by the type checker, because the type checker tracks which statically unknown information we rediscover at runtime. We can use the same phantom types Cartesian and Spherical to write the Shape datatype. data Shape i = Circle (Point i) Radius | Polygon [Point i] But we cannot make Shape an instance of Functor. So if you want the type system to dinstiguish between values created by different constructors of the same datatype, you have to make the difference between the constructors visible to the type system. You could either split the datatype into a bunch of unrelated types, or you could rewrite the datatype into a family of datatypes. The most important difference between these approaches is whether you want to have an open or a closed family of types. Tillmann

Ozgur Akgun

2:10 p.m.

I was just reading through the discussion, and Tillmann, your reply is one of the best written descriptions I've ever seen here. (or even in any other mail list!) Of course, I see many good replies here, but they almost always turn out to be irrelevant to the original question. Yours on the other hand directly answers the original question. Just wanted to thank. -- Ozgur On 18 March 2010 13:58, Tillmann Rendel wrote:

...

Darrin Chandler wrote:

...
data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Spherical (Latitude, Longitude)

type Center = Point type Radius = Float

data Shape = Circle Center Radius | Polygon [Point]

This obviously stinks since a Polygon could contain mixed Cartesian and Spherical points. Polygon needs to be one or the other, but not mixed.

The problem is that you cannot distinguish in the type system whether a Point was created by the Cartesian or the Spherical constructor. These constructors have the following types:

Cartesian :: (Cartesian_coord, Cartesion_coor) -> Point Spherical :: (Latitude, Latitude) -> Point

The types of both constructors end with the same type, the type Point. If you manage to change your code so that the constructors end in different types, then the difference between them will be visible to the type system, and you will have the static knowledge you want.

I know of two strategies to make the constructors end in different types. They could belong to different datatypes, or they could belong to different instances of a family of datatypes.

You could split the type Point into two types as follows.

data Cartesian

= Cartesian Cartesian_coord Cartesian_coord

data Spherical = Spherical Latitude Longitude

Now the constructors have the following types:

Cartesian :: Cartesian_coord -> Cartesian_coord -> Cartesian Spherical :: Latitude -> Longitude -> Spherical

Since the types of the constructors end in different types, we can distinguish values created with Cartesian from values created with Spherical in the type system.

(Note that I dropped the use of tuples from your types to make the example more idiomatic Haskell).

However, how are we going to write the Shape datatype now that we have two different types of points? We write a family of Shape datatypes parameterized in the type of points we want to use.

data Shape point = Circle point Radius | Polygon [point]

Note that point is a type variable, which can be instantiated with whatever type we like, including Cartesian and Spherical. For example, Shape Cartesian is the type of shapes described with cartesian coordinates, and Shape Spherical is the type of shapes described with spherical coordinates. But we could also have a type like Shape Integer, which is the type of shapes described with Integer points, whatever that means.

This generality of the Shape family of types has advantages and disadvantages. An advantage is that we can make Shape an instance of the type class Functor as follows.

instance Functor Shape where fmap f (Circle p r) = Circle (f p) r fmap f (Polygon ps) = Polygon (map f ps)

Note that fmap applies a function to every point in the shape, but does not change the overall shape structure. This could be used, for example, to convert between a spherical and a cartesian shape, given some function to convert between spherical and cartesian points.

convertPoint :: Spherical -> Cartesian convertPoint = ...

convertShape :: Shape Spherical -> Shape Cartesian convertShape = fmap convertPoint

Another example could be a shape with all the cartesian points stored in same database. Lets assume that we have a function to lookup which point is stored in the database under some key.

lookupPoint :: Database -> Key -> Cartesian lookupPoint = ...

We can use fmap to write the function which takes a Shape with keys as points, and looks up all the keys in the database, returning a Shape of cartesian coordinates.

lookupShape :: Database -> Shape Key -> Shape Cartesian lookupShape db = fmap (lookupPoint db)

The Functor typeclass is further extended and specialized by typeclasses like Traversable, Foldable, Applicative, Alternative, Monad, etc. Probably some of them are applicable here.

So an advantage of this kind of open family Shape is that we can write instances for standard typeclasses, and another advantage is that we may find good uses for types such as (Shape Key).

However, a disadvantage may be that we cannot restrict a priori which types of points are useable in our program. That also means that we cannot use pattern matching to discover at runtime which type of point was used in some shape.

If we make cartesian and spherical points belong to the same family of types, we relate them somewhat, but keep their distinction visible for the type system. We want to introduce a family of types with two members, one for spherical, and one for cartesian coordinates. We introduce empty data types to index the family.

data Cartesian data Spherical

And we express Point as a GADT using Spherical and Cartesian as follows.

data Point i = Cartesian :: Cartesian_coord -> Cartesian_coord -> Point Cartesian | Spherical :: Latitude -> Longitude -> Point Spherical

Again, the constructors have different types. The types differ in the type argument to Point. Since the type argument i is never used in the definition of Point to describe values, it is called a phantom type.

Now, we can use pattern matching to figure out the type of points used in some statically unknown shape. Lets again assume that we know how to convert a spherical into an cartesian point.

convertPoint :: Point Spherical -> Point Cartesian convertPoint (Spherical latitude longitude) = ...

Note that we do not have to write a clause for Cartesian points, because a (Point Spherical) value could never have been created by the Cartesian constructor.

We can now write a normalizePoint function which converts spherical points into cartesian points, but which does not change already cartesian points.

normalize :: Point i -> Point Cartesian normalize p@(Spherical _ _) = convertPoint p normalize p@(Cartesian _ _) = p

Note that after matching p with Spherical, we are allowed to call convertPoint, and after matching p with Cartesian, we are allowed to return p as a Point Cartesian. This is allowed by the type checker, because the type checker tracks which statically unknown information we rediscover at runtime.

We can use the same phantom types Cartesian and Spherical to write the Shape datatype.

data Shape i = Circle (Point i) Radius | Polygon [Point i]

But we cannot make Shape an instance of Functor.

So if you want the type system to dinstiguish between values created by different constructors of the same datatype, you have to make the difference between the constructors visible to the type system. You could either split the datatype into a bunch of unrelated types, or you could rewrite the datatype into a family of datatypes. The most important difference between these approaches is whether you want to have an open or a closed family of types.

Tillmann

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

-- Ozgur Akgun

Darrin Chandler

4:45 p.m.

Thanks to those who responded. Solutions that didn't work for my specific case still taught me more about expressing things in the Haskell type system. And... this particular response is extremely well written and useful. You've made the issues involved very clear and understandable. I really appreciate it. Thanks! On Thu, Mar 18, 2010 at 02:58:04PM +0100, Tillmann Rendel wrote:

...

Darrin Chandler wrote:

...
data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Spherical (Latitude, Longitude)

type Center = Point type Radius = Float

data Shape = Circle Center Radius | Polygon [Point]

This obviously stinks since a Polygon could contain mixed Cartesian and Spherical points. Polygon needs to be one or the other, but not mixed.

The problem is that you cannot distinguish in the type system whether a Point was created by the Cartesian or the Spherical constructor. These constructors have the following types:

Cartesian :: (Cartesian_coord, Cartesion_coor) -> Point Spherical :: (Latitude, Latitude) -> Point

The types of both constructors end with the same type, the type Point. If you manage to change your code so that the constructors end in different types, then the difference between them will be visible to the type system, and you will have the static knowledge you want.

I know of two strategies to make the constructors end in different types. They could belong to different datatypes, or they could belong to different instances of a family of datatypes.

You could split the type Point into two types as follows.

data Cartesian = Cartesian Cartesian_coord Cartesian_coord

data Spherical = Spherical Latitude Longitude

Now the constructors have the following types:

Cartesian :: Cartesian_coord -> Cartesian_coord -> Cartesian Spherical :: Latitude -> Longitude -> Spherical

Since the types of the constructors end in different types, we can distinguish values created with Cartesian from values created with Spherical in the type system.

(Note that I dropped the use of tuples from your types to make the example more idiomatic Haskell).

However, how are we going to write the Shape datatype now that we have two different types of points? We write a family of Shape datatypes parameterized in the type of points we want to use.

data Shape point = Circle point Radius | Polygon [point]

Note that point is a type variable, which can be instantiated with whatever type we like, including Cartesian and Spherical. For example, Shape Cartesian is the type of shapes described with cartesian coordinates, and Shape Spherical is the type of shapes described with spherical coordinates. But we could also have a type like Shape Integer, which is the type of shapes described with Integer points, whatever that means.

This generality of the Shape family of types has advantages and disadvantages. An advantage is that we can make Shape an instance of the type class Functor as follows.

instance Functor Shape where fmap f (Circle p r) = Circle (f p) r fmap f (Polygon ps) = Polygon (map f ps)

Note that fmap applies a function to every point in the shape, but does not change the overall shape structure. This could be used, for example, to convert between a spherical and a cartesian shape, given some function to convert between spherical and cartesian points.

convertPoint :: Spherical -> Cartesian convertPoint = ...

convertShape :: Shape Spherical -> Shape Cartesian convertShape = fmap convertPoint

Another example could be a shape with all the cartesian points stored in same database. Lets assume that we have a function to lookup which point is stored in the database under some key.

lookupPoint :: Database -> Key -> Cartesian lookupPoint = ...

We can use fmap to write the function which takes a Shape with keys as points, and looks up all the keys in the database, returning a Shape of cartesian coordinates.

lookupShape :: Database -> Shape Key -> Shape Cartesian lookupShape db = fmap (lookupPoint db)

The Functor typeclass is further extended and specialized by typeclasses like Traversable, Foldable, Applicative, Alternative, Monad, etc. Probably some of them are applicable here.

So an advantage of this kind of open family Shape is that we can write instances for standard typeclasses, and another advantage is that we may find good uses for types such as (Shape Key).

However, a disadvantage may be that we cannot restrict a priori which types of points are useable in our program. That also means that we cannot use pattern matching to discover at runtime which type of point was used in some shape.

If we make cartesian and spherical points belong to the same family of types, we relate them somewhat, but keep their distinction visible for the type system. We want to introduce a family of types with two members, one for spherical, and one for cartesian coordinates. We introduce empty data types to index the family.

data Cartesian data Spherical

And we express Point as a GADT using Spherical and Cartesian as follows.

data Point i = Cartesian :: Cartesian_coord -> Cartesian_coord -> Point Cartesian | Spherical :: Latitude -> Longitude -> Point Spherical

Again, the constructors have different types. The types differ in the type argument to Point. Since the type argument i is never used in the definition of Point to describe values, it is called a phantom type.

Now, we can use pattern matching to figure out the type of points used in some statically unknown shape. Lets again assume that we know how to convert a spherical into an cartesian point.

convertPoint :: Point Spherical -> Point Cartesian convertPoint (Spherical latitude longitude) = ...

Note that we do not have to write a clause for Cartesian points, because a (Point Spherical) value could never have been created by the Cartesian constructor.

We can now write a normalizePoint function which converts spherical points into cartesian points, but which does not change already cartesian points.

normalize :: Point i -> Point Cartesian normalize p@(Spherical _ _) = convertPoint p normalize p@(Cartesian _ _) = p

Note that after matching p with Spherical, we are allowed to call convertPoint, and after matching p with Cartesian, we are allowed to return p as a Point Cartesian. This is allowed by the type checker, because the type checker tracks which statically unknown information we rediscover at runtime.

We can use the same phantom types Cartesian and Spherical to write the Shape datatype.

data Shape i = Circle (Point i) Radius | Polygon [Point i]

But we cannot make Shape an instance of Functor.

So if you want the type system to dinstiguish between values created by different constructors of the same datatype, you have to make the difference between the constructors visible to the type system. You could either split the datatype into a bunch of unrelated types, or you could rewrite the datatype into a family of datatypes. The most important difference between these approaches is whether you want to have an open or a closed family of types.

Tillmann

Brandon S. Allbery KF8NH

19 Mar 19 Mar

4:36 a.m.

Only one small nit here: On Mar 18, 2010, at 09:58 , Tillmann Rendel wrote:

...

And we express Point as a GADT using Spherical and Cartesian as follows.

data Point i = Cartesian :: Cartesian_coord -> Cartesian_coord -> Point Cartesian | Spherical :: Latitude -> Longitude -> Point Spherical

The actual syntax for this is

...

data Point i where Cartesian :: Cartesian_coord -> Cartesian_coord -> Point Cartesian Spherical :: Latitude -> Longitude -> Point Spherical

-- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Job Vranish

18 Mar 18 Mar

4:54 p.m.

A phantom type might do what you want: -- notice the type parameter on point that isn't used in the type data Point a = Cartesian (Cartesian_coord, Cartesian_coord) | Spherical (Latitude, Longitude) -- make some dummy types data SphericalP data CartesianP --make some constructors that add a restriction to the phantom type -- notice the CartesianP type restriction, this isn't needed but allows us to restrict our type later if we want mkCartesian :: (Cartesian_coord, Cartesian_coord) -> Point CartesianP mkCartesian = Cartesian mkSherical :: (Latitude, Longitude) -> Point SphericalP mkSherical = Spherical type Center = Point type Radius = Float -- now the shape type doesn't care which type of point you have, but requires that all the points are the same data Shape a = Circle Center Radius | Polygon [Point a] The main problem here, is that you want to hide the Cartesian and Spherical constructors and only use mkCartesian and mkSherical to make Points (so that they have the proper restrictions). But this prevents you from using pattern matching where you have the constructors hidden. GADTs however will solve that: data Point a where Cartesian :: (Cartesian_coord, Cartesian_coord) -> Point CartesianP Spherical :: (Latitude, Longitude)-> Point SphericalP Hope that helps :) - Job On Thu, Mar 18, 2010 at 12:20 AM, Darrin Chandler wrote:

...

Hi,

Trying to get up to speed in Haskell, I'm playing with doing some abstraction in data types. Specifically, I have this:

type Cartesian_coord = Float

type Latitude = Float type Longitude = Float

data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Spherical (Latitude, Longitude)

type Center = Point type Radius = Float

data Shape = Circle Center Radius | Polygon [Point]

This obviously stinks since a Polygon could contain mixed Cartesian and Spherical points. Polygon needs to be one or the other, but not mixed.

I could define seperate types for Cartesian and Spherical and seperate CartesianPoly and SphericalPoly, but that doesn't seem very elegant and also increases as I add more coordinate systems and shapes. I read a little on GADTs, et al, but I'm not sure if that's what I want for this or not.

Any help appreciated!

-- Darrin Chandler | Phoenix BSD User Group | MetaBUG dwchandler@stilyagin.com | http://phxbug.org/ | http://metabug.org/ http://www.stilyagin.com/ | Daemons in the Desert | Global BUG Federation

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

John Meacham

6:17 p.m.

On Wed, Mar 17, 2010 at 09:20:49PM -0700, Darrin Chandler wrote:

...

data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Spherical (Latitude, Longitude)

Just a quick unrelated note, though you are probably aware of this, doing

...

data Foo = Foo (X,Y) means something subtly different than data Foo = Foo X Y and can be less efficient.

A quick way to see they are different is to count the bottoms, in the first case (where _ is bottom and X is some value) you have the cases

...

Foo _ Foo (_,_) Foo (X,_) Foo (_,X) Foo (X,X) and in the other case you have Foo _ _ Foo X _ Foo _ X Foo X X

so one has 5 distinct values, and the other has 4, hence they are not isomorphic. All things being equal, this means the second case will be more efficient as there is one less case it needs to distinguish (every potential bottom implys an 'eval' somewhere). Depending on your code, all things may not be equal and there are rare times when the tupled version is more efficient however. John -- John Meacham - ⑆repetae.net⑆john⑈ - http://notanumber.net/

Luke Palmer

6:25 p.m.

On Thu, Mar 18, 2010 at 12:17 PM, John Meacham wrote:

...

On Wed, Mar 17, 2010 at 09:20:49PM -0700, Darrin Chandler wrote:

...
data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Spherical (Latitude, Longitude)

Just a quick unrelated note, though you are probably aware of this, doing

...
data Foo = Foo (X,Y) means something subtly different than data Foo = Foo X Y and can be less efficient.

On the other hand, the latter is equivalent to: newtype Foo = Foo (X,Y)

...

A quick way to see they are different is to count the bottoms,

in the first case (where _ is bottom and X is some value) you have the cases

...
Foo _ Foo (_,_) Foo (X,_) Foo (_,X) Foo (X,X) and in the other case you have Foo _ _ Foo X _ Foo _ X Foo X X

so one has 5 distinct values, and the other has 4, hence they are not isomorphic. All things being equal, this means the second case will be more efficient as there is one less case it needs to distinguish (every potential bottom implys an 'eval' somewhere). Depending on your code, all things may not be equal and there are rare times when the tupled version is more efficient however.

John

-- John Meacham - ⑆repetae.net⑆john⑈ - http://notanumber.net/ _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

John Meacham

6:26 p.m.

On Wed, Mar 17, 2010 at 09:20:49PM -0700, Darrin Chandler wrote:

...

Trying to get up to speed in Haskell, I'm playing with doing some abstraction in data types. Specifically, I have this:

type Cartesian_coord = Float

type Latitude = Float type Longitude = Float

data Point = Cartesian (Cartesian_coord, Cartesian_coord) | Spherical (Latitude, Longitude)

type Center = Point type Radius = Float

data Shape = Circle Center Radius | Polygon [Point]

phantom types can help you, providing the ability to distinguish the two without the run-time overhead of checking the Cartesioan and Spherical constructors

...

data Cartesian -- empty, just used for the type constructor data Spherical data Point a = Point Float Float data Shape a = Circle (Point a) Radius | Polygon [Point a]

...

spPoint :: Latitude -> Longitude -> Point Spherical cPoint :: Cartesian_coord -> Cartesian_coord -> Point Cartesian to create points of each, yet you can still have functions on 'Point a'

now you can have routines like that will work on any type of point. You may want to create a class that converts between the two

...

class Coordinated f where toCartesian :: f Spherical -> f Cartesian toSpherical :: f Cartesian -> f Spherical

instance Coordinated Point where ... instance Coordinated Shape where ...

John -- John Meacham - ⑆repetae.net⑆john⑈ - http://notanumber.net/

5596

Age (days ago)

5597

Last active (days ago)

List overview

Download

18 comments

10 participants

participants (10)

Alexander Solla
Brandon S. Allbery KF8NH
Darrin Chandler
Job Vranish
John Meacham
Luke Palmer
Lyndon Maydwell
Ozgur Akgun
Stephen Tetley
Tillmann Rendel