Re: I just don't get it (data structures and OO)

Let's say i have a deep nested data structure. Universe containing galaxies, containing solar systems, containing planets, containing countries, containing inhabitants, containing ...whatever.
Oh. I had /exactly/ this problem. If you use separate types (i.e. a newtyped integer, acting kindof like a pointer) to represent (1) the identity and (2) the state of the object, you can use a separate data structure that remembers which object is inside which other object. http://www.downstairspeople.org/darcs/unstable/roguestar-engine/src/Insidene... Where "InsidenessMap a b c" represents a relationship where b's are inside a's, and b's have a state of c. Then, you need to declare a separate InsidenessMap for each possible relationship, but this ensures that you'll never put a galaxy inside a solar system. Or you can make 'a' be a reference to any type of object; there are options. But, you just update this structure once, with no thought of recursively updating a heterogenous tree of data. It may not be the best solution, but if I knew of something better I would be using the better thing instead. --Lane

You could also use 'compositional functional references'. These are introduced in the paper "A Functional Programming Technique for Forms in Graphical User Interfaces" by Sander Evers, Peter Achten and Jan Kuper. === Introduction === There are two things one typically wants to do when working with a substructure of some larger data structure: (1) extract the substructure; and (2) change the larger structure by acting on the substructure. A 'Ref cx t' encodes both of these functions (for a substructure of type 't' and larger structure (context) of type 'cx').
data Ref cx t = Ref { select :: cx -> t , update :: (t -> t) -> cx -> cx }
A Ref is a bit like a typed and composable incarnation of apfelmus's indices, or a wrapper around Tillmann's change* functions, containing not only a setter but also the accompanying getter. === Use === These Refs are compositional: given 'x :: Ref a b' and 'y :: Ref b c', we can form
z :: Ref a c z = Ref { select = select b . select a , update = update a . update b }
. In fact we can almost make 'Ref' into an arrow. (Only 'pure :: (a -> b) -> Ref a b' does not make sense, because a pure function 'f : a -> b' doesn't give information about how to transform a change in 'b' into a change in 'a'.) I've written a template haskell function to derive Refs from a data structure definition (with record syntax): given
data Universe = Universe { galaxies :: [Galaxy] }
, it creates
galaxiesRef :: Ref Universe [Galaxy]
. Together with some Ref functions for container types (List, Map, etc.) this eliminates most (all?) necessary boilerplate code. Greetings, Arie

Arie Peterson writes:
There are two things one typically wants to do when working with a substructure of some larger data structure: (1) extract the substructure; and (2) change the larger structure by acting on the substructure. A 'Ref cx t' encodes both of these functions (for a substructure of type 't' and larger structure (context) of type 'cx').
data Ref cx t = Ref { select :: cx -> t , update :: (t -> t) -> cx -> cx }
A Ref is a bit like a typed and composable incarnation of apfelmus's indices, or a wrapper around Tillmann's change* functions, containing not only a setter but also the accompanying getter.
That's a neat idiom. I wonder how far one could usefully generalize it.
For example,
type Ref cx t = forall f. Functor f => (t -> f t) -> cx -> f cx
newtype Id a = Id { unId :: a }
instance Functor Id where fmap f = Id . f . unId
newtype K t a = K { unK :: t }
instance Functor (K t) where fmap = K . unK
select :: Ref cx t -> cx -> t
select ref = unK . ref K
update :: Ref cx t -> (t -> t) -> cx -> cx
update ref f = unId . ref (Id . f)
rfst :: Ref (a,b) a
rfst f (x,y) = fmap (\x' -> (x',y)) (f x)
In this implementation, composition of Refs is just function
composition.
select (rfst . rfst) :: ((a,b),c) -> a
--
David Menendez

|> > data Ref cx t |> > = Ref |> > { |> > select :: cx -> t |> > , update :: (t -> t) -> cx -> cx |> > } |> |> A Ref is a bit like a typed and composable incarnation of apfelmus's |> indices, or a wrapper around Tillmann's change* functions, containing |> not only a setter but also the accompanying getter. | |That's a neat idiom. I wonder how far one could usefully generalize it. you might find Koji Kagawa's papers interesting: http://guppy.eng.kagawa-u.ac.jp/~kagawa/publication/index-e.html in particular Mutable Data Structures and Composable References in a Pure Functional Language , Koji Kagawa, In SIPL '95: State in Programming Languages, San Francisco, USA, January 1995. Compositional References for Stateful Functional Programming, Koji Kagawa, ICFP 1997, June 1997, Amsterdam, the Netherlands. claus

David Menendez wrote: | That's a neat idiom. I wonder how far one could usefully generalize it. | | For example, | | type Ref cx t = forall f. Functor f => (t -> f t) -> cx -> f cx | | newtype Id a = Id { unId :: a } | instance Functor Id where fmap f = Id . f . unId | | newtype K t a = K { unK :: t } | instance Functor (K t) where fmap = K . unK | | | select :: Ref cx t -> cx -> t | select ref = unK . ref K | | update :: Ref cx t -> (t -> t) -> cx -> cx | update ref f = unId . ref (Id . f) | | | rfst :: Ref (a,b) a | rfst f (x,y) = fmap (\x' -> (x',y)) (f x) | | In this implementation, composition of Refs is just function | composition. | | select (rfst . rfst) :: ((a,b),c) -> a Cool! Reversely, you can usefully convert "my" Refs to yours: fRef :: Original.Ref cx t -> David.Ref cx t fRef r h cx = fmap (($ cx) . update r . const) $ h (select r cx) Claus Reinke wrote: | you might find Koji Kagawa's papers interesting: | | [...] Mm, more reading... :-) Greetings, Arie

Christopher Lane Hinson wrote:
Where "InsidenessMap a b c" represents a relationship where b's are inside a's, and b's have a state of c. Then, you need to declare a separate InsidenessMap for each possible relationship, but this ensures that you'll never put a galaxy inside a solar system. Or you can make 'a' be a reference to any type of object; there are options.
Ketil Malde wrote:
Identity can be emulated by relatively straightforward means: store all planets in a Map indexed by something that is useful as an identifier (i.e. stays constant and is unique), and have a Galaxy keep a list of identifiers.
So basically you guys are saying I should rethink the data structure into a relational model instead of sticking to the OO model... I think i could do this pretty easily. a table would be a map of id to instance ...then another map for foreign keys, or maybe just as a member of each data Is the relational model a better fit than the object model for functional programming ? Sacha

Phlex wrote:
Christopher Lane Hinson wrote:
Where "InsidenessMap a b c" represents a relationship where b's are inside a's, and b's have a state of c. Then, you need to declare a separate InsidenessMap for each possible relationship, but this ensures that you'll never put a galaxy inside a solar system. Or you can make 'a' be a reference to any type of object; there are options.
Ketil Malde wrote:
Identity can be emulated by relatively straightforward means: store all planets in a Map indexed by something that is useful as an identifier (i.e. stays constant and is unique), and have a Galaxy keep a list of identifiers.
So basically you guys are saying I should rethink the data structure into a relational model instead of sticking to the OO model... I think i could do this pretty easily. a table would be a map of id to instance ...then another map for foreign keys, or maybe just as a member of each data
Is the relational model a better fit than the object model for functional programming ?
Of course it depends on what you are doing, but I actually have never needed to encode a relational model like this, even when I have deeply nested structures. I find that my usual solution involves doing my transformations on the data structure "all at once". By that I mean that instead of performing a number of steps from the root of the data structure, I do all the operations in one pass. To keep the algorithms decoupled I usually end up passing the operations to perform as an argument. Higher-order functions are your friend. Because Haskell is lazy I don't really worry about doing "too much" and if I really need it, I can use the result as part of the transformation (its like recursion, but with values). Between laziness and HOF I rarely need any kind of state. Its not directly related to your question, but I found the iterative root-finding and differentiation examples in "Why Functional Programming Matters" [1] to be eye-opening about the "functional way" because they are algorithms that are almost always described as stateful computations, but are shown to be very elegant in a pure functional implementation. [1] http://www.math.chalmers.se/~rjmh/Papers/whyfp.html

Phlex wrote:
Ketil Malde wrote:
Identity can be emulated by relatively straightforward means: store all planets in a Map indexed by something that is useful as an identifier (i.e. stays constant and is unique), and have a Galaxy keep a list of identifiers.
So basically you guys are saying I should rethink the data structure into a relational model instead of sticking to the OO model... I think i could do this pretty easily. a table would be a map of id to instance ....then another map for foreign keys, or maybe just as a member of each data
Is the relational model a better fit than the object model for functional programming ?
Well, not really. I mean, if the problem is indeed to store all known planets in the universe, then it's indeed a database in nature and you have to support fine grained operations like delete :: Key -> Database -> Database insert :: Key -> Item -> Database -> Database ... and so on ... (Note that some proposals like changeGalaxies $ changePlanet 0 $ changeName $ const "first" or functional references can be interpreted as keys for 'insert' or 'delete'. I mean that this expression already is the key to look up a planet inside the universe, it's just that this key has a rather unusual type. And that you can compose keys.) But if the problem at hand is perhaps a binary search tree or some other data structure, you can implement many operations without using per-element 'delete' or 'insert' although every operation can in principle be built up from those. Maybe it helps if you elaborate on your concrete problem? Regards, apfelmus

apfelmus wrote:
Phlex wrote:
Ketil Malde wrote:
Is the relational model a better fit than the object model for functional programming ?
Well, not really. I mean, if the problem is indeed to store all known planets in the universe, then it's indeed a database in nature and you have to support fine grained operations like
delete :: Key -> Database -> Database insert :: Key -> Item -> Database -> Database ... and so on ...
(Note that some proposals like
changeGalaxies $ changePlanet 0 $ changeName $ const "first"
or functional references can be interpreted as keys for 'insert' or 'delete'. I mean that this expression already is the key to look up a planet inside the universe, it's just that this key has a rather unusual type. And that you can compose keys.)
But if the problem at hand is perhaps a binary search tree or some other data structure, you can implement many operations without using per-element 'delete' or 'insert' although every operation can in principle be built up from those. Maybe it helps if you elaborate on your concrete problem?
Regards, apfelmus
Well the current state of my haskell knowledge doesn't allow any kind of serious work, so I'm indeed currently playing with Universe, Galaxies, Systems, Planets and Moons =P. I'm an application programmer, and i thought that if I can model a moving universe (a small one though!), I guess i'll be one step closer to translating this knowledge to building statefull server applications. I'm worried that my OO view of things might interfere with the haskell ways. So yes, what i'm looking for right now is more of a high level "how-to", and that's precisely what you guys provided to me. Thanks for that ! So here is one more question : Let's say I want unique System names across the Universe ... that would mean i need to have a Data.Map in the Universe, with Name keys and System values. Since all data are values instead of references, would i end up with two copies of each System (in Universe's Data.Map and in its Galaxy), or would these be shared somehow ? In other words, should i go for integer (or maybe access key/tuple) identified objects or just put the System in both Data.Maps it belongs to ? regards, Sacha

Hi, On 06.06.2007, at 07:00, Phlex wrote:
So here is one more question :
Let's say I want unique System names across the Universe ... that would mean i need to have a Data.Map in the Universe, with Name keys and System values. Since all data are values instead of references, would i end up with two copies of each System (in Universe's Data.Map and in its Galaxy), or would these be shared somehow ? In other words, should i go for integer (or maybe access key/tuple) identified objects or just put the System in both Data.Maps it belongs to ?
That depends. In an OO world everything has its own implicit identity. The new operator in Java provides you with an object with a new and unique key which is not easy to observe. But most formal semantics have it. One can think of it as the address of the allocated object. In FP there are no objects, there are only terms. Whether two terms are identically is answered by a structural traversal over the values and subvalues. Now, I return to your question. What makes your "objects" galaxies, planets, stars, etc. unique? Is it their coordinate in space, their name, their structural position in your tree? What is it? Let's assume you say their names are unique. Then you only have to store a set of all names used in your universe. If you want a planet which orbits around star "a" to be different form another planet that is in orbit of star "b", although both planets are the same in every other aspect. Then you might think about introducing arbitrary unique integer keys. This is similar to database design. There are those normal form laws [1] which guide you to improve your db schema. In database design there are people who introduce artificial primary keys almost always. Although there are natural primary keys most of the time. I hope these random thoughts help a bit to change the perspective. Regards, Jean-Marie [1] http://en.wikipedia.org/wiki/Database_normalization

apfelmus wrote:
I mean, if the problem is indeed to store all known planets in the universe, then it's indeed a database in nature and you have to support fine grained operations like
delete :: Key -> Database -> Database insert :: Key -> Item -> Database -> Database ... and so on ...
(Note that some proposals like
changeGalaxies $ changePlanet 0 $ changeName $ const "first"
or functional references can be interpreted as keys for 'insert' or 'delete'. I mean that this expression already is the key to look up a planet inside the universe, it's just that this key has a rather unusual type. And that you can compose keys.)
Here's an elaboration of the last remark which gives a Haskell-98 way to compose keys for "nested" finite maps (like Map k (Map k' a)). It's inspired by functional references and similar to the type-class approach. The problem we want to solve is to access/change/delete values of type 'a' in a nested map like say Data.Map k (Data.Map k' a). Here, k and k' are known as "keys", but we will redefine the notion "key" shortly. The basic observation is that we need a pair (k,k') to access a value of type 'a', just like accessing a file via an absolute path requires a sequence of directory names. Directories can be "composed" via the well known slash '/' and that's what we're going to do as well: > :: Key m m' -> Key m' a -> Key m a Here, a value of type data Key m a = ... -- abstract for now means that it's a key to access values of type 'a' in a finite map of type 'm' that stores theses values. In other words, we are given operations lookup :: Key m a -> m -> Maybe a insert :: Key m a -> a -> m -> m delete :: Key m a -> m -> m ... to change the finite map at a particular key. Of course, we already have concrete keys k for Data.Map k a. We can turn those into abstract keys via a given embedding at :: k -> Key (Data.Map k a) a Now, how to implement Key? The observation is that all operations on keys take them as their first argument which means that we can implement them as record selectors! For maximum generality, we use the following data Key m a = Key { lookup :: m -> Maybe a , singleton :: a -> m , alter :: (Maybe a -> Maybe a) -> m -> m } Here, 'alter' combines the functionality of insert, delete and adjust, which can be implemented as follows: insert k x = update k $ const $ Just x delete k = update k $ const $ Nothing adjust k f = update k $ fmap f Key composition is readily implemented as k > k' = Key { lookup = \m -> lookup k m >>= lookup k' , singleton = \x -> singleton k (singleton k' x) , alter = \f m -> alter k (alter' f) m } where alter' f (Just m') = Just $ alter k' f m' alter' f Nothing = f Nothing >>= singleton k' Finally, we can implement keys for a concrete finite map implementation at :: k -> Key (Data.Map k a) a at k = Key { lookup = Data.Map.lookup k , singleton = Data.Map.singleton k , alter = flip Data.Map.alter k } As an example, we have Just "Earth" == lookup (at "Milky Way" > at "Sun") universe assuming that universe :: Data.Map String (Data.Map String String) Regards, apfelmus

apfelmus
As an example, we have
Just "Earth" == lookup (at "Milky Way" > at "Sun") universe
assuming that
universe :: Data.Map String (Data.Map String String)
All this hard work for something that in a lesser language would be the unimpressive: universe["Milky Way"]["Sun"] ;-P -- G.

Grzegorz wrote:
apfelmus
writes: [ .. lengthy discussion and implementation .. ]
As an example, we have
Just "Earth" == lookup (at "Milky Way" > at "Sun") universe
assuming that
universe :: Data.Map String (Data.Map String String)
All this hard work for something that in a lesser language would be the unimpressive: universe["Milky Way"]["Sun"] ;-P
Yo, and the lesser language surely has first-class keys as in do let worstCaseScenario k = forM_ parallelUniverses $ destroySystem k worstCaseScenario ["Milky Way"]["Sun"] , right. Right? Hopefully not ;) The Haskell version is safe though, as the old universes are still in memory because there's no update in place. Very useful in case the imperative imperator decides to segfault the space-time continuum and thus destroys this and all parallel worlds. Don't worry, the old ones are still there. Regards, apfelmus

On 6/7/07, Grzegorz
All this hard work for something that in a lesser language would be the unimpressive: universe["Milky Way"]["Sun"] ;-P
Well, if you want to get picky there is the '!' operator defined in Data.Map:
import Data.Map
universe :: Map String (Map String String) universe = fromList [("Milky Way", fromList [("Sun", "It's home!")])]
earth :: String earth = universe ! "Milky Way" ! "Sun"
Justin

Justin Bailey wrote:
On 6/7/07, Grzegorz
wrote: All this hard work for something that in a lesser language would be the unimpressive: universe["Milky Way"]["Sun"] ;-P
Well, if you want to get picky there is the '!' operator defined in Data.Map:
universe ! "Milky Way" ! "Sun"
I second that. I particularly like the elimination of ]'s. We certainly need some symbol to separate the map and the key; but we do really need to also mark "here be the end of the key"? (Evidently, this carries over to array indexing and function application too, i.e., a!!i vs a[i], f$x or f x vs f(x).)

I second that. I particularly like the elimination of ]'s. We certainly need some symbol to separate the map and the key; but we do really need to also mark "here be the end of the key"?
and how (arr ! "key" ++ "data") should be parsed? :) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On 6/8/07, Bulat Ziganshin
I second that. I particularly like the elimination of ]'s. We certainly need some symbol to separate the map and the key; but we do really need to also mark "here be the end of the key"?
and how (arr ! "key" ++ "data") should be parsed? :)
Roughly the same way you'd parse 2 * 3 + 4. /g -- The man who'd introduced them didn't much like either of them, though he acted as if he did, anxious as he was to preserve good relations at all times. One never knew, after all, now did one now did one now did one.
participants (13)
-
Al Falloon
-
Albert Y. C. Lai
-
apfelmus
-
Arie Peterson
-
Bulat Ziganshin
-
Christopher Lane Hinson
-
Claus Reinke
-
David Menendez
-
Grzegorz
-
J. Garrett Morris
-
Jean-Marie Gaillourdet
-
Justin Bailey
-
Phlex