Question about data structures

Russ Abbott

24 Nov 2010 24 Nov '10

7:40 p.m.

I would appreciate some advice about the best way to manipulate data structures in Haskell. Let's assume I have what in an OO language would be a class with a number of instance variables. When I want to change one of the values of one of those instance variables in Haskell I must rebuild the entire structure. Even worse, if one of those instance variables is a reference to another data structure, then when I change that referenced data structure, I am forced to rebuild my top level variable. For example. data Struct1 = Struct1 {var1 :: Struct11, var2 :: Struct1, ... } data Struct11 = Struct11 {var11 :: ... } Let's assume I have x :: Struct1 and that I change the value of var11 . var1 $ x. Doesn't that require that I rebuild x? Is there a better way? Thanks. * -- Russ*

Attachments:

attachment.html (text/html — 2.3 KB)

Show replies by date

Daniel Fischer

24 Nov 24 Nov

8:26 p.m.

On Wednesday 24 November 2010 20:40:02, Russ Abbott wrote:

...

I would appreciate some advice about the best way to manipulate data structures in Haskell.

Let's assume I have what in an OO language would be a class with a number of instance variables. When I want to change one of the values of one of those instance variables in Haskell I must rebuild the entire structure. Even worse, if one of those instance variables is a reference to another data structure, then when I change that referenced data structure, I am forced to rebuild my top level variable. For example.

data Struct1 = Struct1 {var1 :: Struct11, var2 :: Struct1, ... } data Struct11 = Struct11 {var11 :: ... }

Let's assume I have x :: Struct1 and that I change the value of var11 . var1 $ x. Doesn't that require that I rebuild x?

No, x doesn't change. What you probably mean is y = x{ var1 = (var1 x){ var11 = somethingNew } } Then y shares all fields except var1 with x, the var1 field of y must be built, but it shares everything except the var11 field with (var1 x). Since values are immutable, the bulk of the structure is shared between an original value and a modified one (except if the structures are very small so there's little to share, but then building is cheap, or the modification changes very much of the structure, but then the modification would be expensive also in a language with mutable values).

...

Is there a better way?

Thanks. * -- Russ*

Russ Abbott

9:12 p.m.

Cool. I wasn't aware of that notation. It doesn't quite get to the issue though. The problem I'm concerned about is the need to define y in the first place. If one is chasing through a data structure and finds a need to change something buried within it, it seems necessary to rebuild everything that includes the changed thing. That is, I can't change a component of somethingNew without creating y. The point is there's nothing about x that changed, and there may be nothing about (var1 x) that changed, and there may be nothing about var11 . var1 $ x that changed, etc. Yet one is apparently forced to keep track of and reconstruct all those elements. Another example is to imagine a Tree in which the leaves contain "objects." If I want to change a property of one of those leaf objects, I am forced to rebuild all the ancestor nodes of that leaf down to rebuilding the root. One way to avoid that is for the leaves to refer to their objects through a Map. Then changing a leaf object requires only that the value associated with key represented by the leaf be (re-)inserted into the Map. The Tree itself need not change at all. But that trick isn't always available. In the example we are talking about we can't make a Map where they keys are the instance variable names and the values are their values. That would seem to do the job, but since the values are of different types, we can't create such a Map. So now what? * -- Russ * On Wed, Nov 24, 2010 at 12:26 PM, Daniel Fischer wrote:

...

On Wednesday 24 November 2010 20:40:02, Russ Abbott wrote:

...
I would appreciate some advice about the best way to manipulate data structures in Haskell.

Let's assume I have what in an OO language would be a class with a number of instance variables. When I want to change one of the values of one of those instance variables in Haskell I must rebuild the entire structure. Even worse, if one of those instance variables is a reference to another data structure, then when I change that referenced data structure, I am forced to rebuild my top level variable. For example.

data Struct1 = Struct1 {var1 :: Struct11, var2 :: Struct1, ... } data Struct11 = Struct11 {var11 :: ... }

Let's assume I have x :: Struct1 and that I change the value of var11 . var1 $ x. Doesn't that require that I rebuild x?

No, x doesn't change. What you probably mean is

y = x{ var1 = (var1 x){ var11 = somethingNew } }

Then y shares all fields except var1 with x, the var1 field of y must be built, but it shares everything except the var11 field with (var1 x).

Since values are immutable, the bulk of the structure is shared between an original value and a modified one (except if the structures are very small so there's little to share, but then building is cheap, or the modification changes very much of the structure, but then the modification would be expensive also in a language with mutable values).

...
Is there a better way?

Thanks. * -- Russ*

Daniel Schoepe

9:26 p.m.

Russ Abbott writes:

...

Cool. I wasn't aware of that notation. It doesn't quite get to the issue though.

The problem I'm concerned about is the need to define y in the first place. If one is chasing through a data structure and finds a need to change something buried within it, it seems necessary to rebuild everything that includes the changed thing. That is, I can't change a component of somethingNew without creating y. The point is there's nothing about x that changed, and there may be nothing about (var1 x) that changed, and there may be nothing about var11 . var1 $ x that changed, etc. Yet one is apparently forced to keep track of and reconstruct all those elements.

The data-accessor package makes those changes more syntactically lightweight: http://hackage.haskell.org/package/data-accessor http://hackage.haskell.org/package/data-accessor-template

...

Another example is to imagine a Tree in which the leaves contain "objects." If I want to change a property of one of those leaf objects, I am forced to rebuild all the ancestor nodes of that leaf down to rebuilding the root.

You could define a function like mapTree, or mapLeaves along with the data structure, that recurses over the tree and applies a function to its components. This also allows changing a leaf without going through the entire tree manually:

...

mapLeaves (\leaf -> if leaf == wantedLeaf then modifiedLeaf else leaf) tree

If you're concerned about memory usage when "rebuilding" the data structures, Daniel Fischer's previous statements still apply: The modified data structure will share everything with the old one except for the changed parts. Regards, Daniel

Daniel Fischer

9:52 p.m.

On Wednesday 24 November 2010 22:12:37, Russ Abbott wrote:

...

Cool. I wasn't aware of that notation. It doesn't quite get to the issue though.

The problem I'm concerned about is the need to define y in the first place. If one is chasing through a data structure and finds a need to change something buried within it, it seems necessary to rebuild everything that includes the changed thing.

In general, values are immutable, so you can't "change something buried within it". You have to build a new value containing some of the old stuff and a new part. Building the new value usually consists mostly of copying a couple of pointers (plus building the new part of course), so isn't too expensive normally. You can have mutable values in the IO or (ST s) monads, if you need them.

...

That is, I can't change a component of somethingNew without creating y. The point is there's nothing about x that changed,

The thing with the changed component is not x anymore.

...

and there may be nothing about (var1 x) that changed, and there may be nothing about var11 . var1 $ x that changed, etc. Yet one is apparently forced to keep track of and reconstruct all those elements.

The compiler takes care of that.

...

Another example is to imagine a Tree in which the leaves contain "objects." If I want to change a property of one of those leaf objects,

You can't in general, the thing with a different property is a different object.

...

I am forced to rebuild all the ancestor nodes of that leaf down to rebuilding the root.

Yes (well, not you, the compiler does it), except if your tree contains mutable objects (IORefs/STRefs for example).

...

One way to avoid that is for the leaves to refer to their objects through a Map. Then changing a leaf object requires only that the value associated with key represented by the leaf be (re-)inserted into the Map. The Tree itself need not change at all.

Oh, it will. If you change a Map, you get a new one, thus you get a new tree containing the new Map.

...

But that trick isn't always available. In the example we are talking about we can't make a Map where they keys are the instance variable names and the values are their values. That would seem to do the job, but since the values are of different types, we can't create such a Map.

So now what?

Well, what's the problem with the compiler copying some nodes? Normally, that doesn't cost very much performance, if it does in your case, we'd need to know more to suggest the best way to go.

...

* -- Russ *

Russ Abbott

10:02 p.m.

OK. So putting a Map at Leaf nodes doesn't solve the problem. (Apparently I haven't been able to communicate what I see as the problem.) The problem that I'm trying to get to is the need to write excessive code for something that would require a lot less code in an OO world. It's not a matter of execution time or space. It's a matter of the amount of code one is required to write. * -- Russ * On Wed, Nov 24, 2010 at 1:52 PM, Daniel Fischer wrote:

...

On Wednesday 24 November 2010 22:12:37, Russ Abbott wrote:

...
Cool. I wasn't aware of that notation. It doesn't quite get to the issue though.

The problem I'm concerned about is the need to define y in the first place. If one is chasing through a data structure and finds a need to change something buried within it, it seems necessary to rebuild everything that includes the changed thing.

In general, values are immutable, so you can't "change something buried within it". You have to build a new value containing some of the old stuff and a new part. Building the new value usually consists mostly of copying a couple of pointers (plus building the new part of course), so isn't too expensive normally.

You can have mutable values in the IO or (ST s) monads, if you need them.

...
That is, I can't change a component of somethingNew without creating y. The point is there's nothing about x that changed,

The thing with the changed component is not x anymore.

...
and there may be nothing about (var1 x) that changed, and there may be nothing about var11 . var1 $ x that changed, etc. Yet one is apparently forced to keep track of and reconstruct all those elements.

The compiler takes care of that.

...
Another example is to imagine a Tree in which the leaves contain "objects." If I want to change a property of one of those leaf objects,

You can't in general, the thing with a different property is a different object.

...
I am forced to rebuild all the ancestor nodes of that leaf down to rebuilding the root.

Yes (well, not you, the compiler does it), except if your tree contains mutable objects (IORefs/STRefs for example).

...
One way to avoid that is for the leaves to refer to their objects through a Map. Then changing a leaf object requires only that the value associated with key represented by the leaf be (re-)inserted into the Map. The Tree itself need not change at all.

Oh, it will. If you change a Map, you get a new one, thus you get a new tree containing the new Map.

...
But that trick isn't always available. In the example we are talking about we can't make a Map where they keys are the instance variable names and the values are their values. That would seem to do the job, but since the values are of different types, we can't create such a Map.

So now what?

Well, what's the problem with the compiler copying some nodes? Normally, that doesn't cost very much performance, if it does in your case, we'd need to know more to suggest the best way to go.

...
* -- Russ *

matthew coolbeth

10:19 p.m.

May we see a code sample of what you're describing? Generally, when you're writing haskell programs, you will not approach the problem in the same way that you would appoach it in a language with mutable objects. If you can provide a reasonably simple example of a computing task that someone would want to perform, I'm sure someone will be able to show you how it would be best approached in haskell, and how you need not write too many lines of code, provided that you use the proper, idiomatic style. On 2010-11-24, Russ Abbott wrote:

...

OK. So putting a Map at Leaf nodes doesn't solve the problem. (Apparently I haven't been able to communicate what I see as the problem.)

The problem that I'm trying to get to is the need to write excessive code for something that would require a lot less code in an OO world. It's not a matter of execution time or space. It's a matter of the amount of code one is required to write. * -- Russ *

On Wed, Nov 24, 2010 at 1:52 PM, Daniel Fischer wrote:

...
On Wednesday 24 November 2010 22:12:37, Russ Abbott wrote:

...
Cool. I wasn't aware of that notation. It doesn't quite get to the issue though.

The problem I'm concerned about is the need to define y in the first place. If one is chasing through a data structure and finds a need to change something buried within it, it seems necessary to rebuild everything that includes the changed thing.

In general, values are immutable, so you can't "change something buried within it". You have to build a new value containing some of the old stuff and a new part. Building the new value usually consists mostly of copying a couple of pointers (plus building the new part of course), so isn't too expensive normally.

You can have mutable values in the IO or (ST s) monads, if you need them.

...
That is, I can't change a component of somethingNew without creating y. The point is there's nothing about x that changed,

The thing with the changed component is not x anymore.

...
and there may be nothing about (var1 x) that changed, and there may be nothing about var11 . var1 $ x that changed, etc. Yet one is apparently forced to keep track of and reconstruct all those elements.

The compiler takes care of that.

...
Another example is to imagine a Tree in which the leaves contain "objects." If I want to change a property of one of those leaf objects,

You can't in general, the thing with a different property is a different object.

...
I am forced to rebuild all the ancestor nodes of that leaf down to rebuilding the root.

Yes (well, not you, the compiler does it), except if your tree contains mutable objects (IORefs/STRefs for example).

...
One way to avoid that is for the leaves to refer to their objects through a Map. Then changing a leaf object requires only that the value associated with key represented by the leaf be (re-)inserted into the Map. The Tree itself need not change at all.

Oh, it will. If you change a Map, you get a new one, thus you get a new tree containing the new Map.

...
But that trick isn't always available. In the example we are talking about we can't make a Map where they keys are the instance variable names and the values are their values. That would seem to do the job, but since the values are of different types, we can't create such a Map.

So now what?

Well, what's the problem with the compiler copying some nodes? Normally, that doesn't cost very much performance, if it does in your case, we'd need to know more to suggest the best way to go.

...
* -- Russ *

-- mac

Russ Abbott

11:11 p.m.

Thanks, I'm still struggling with it. But here is what it looks like now. I'm working on a KenKen solver. One issue I'm having is that I'm not sure how many types to declare -- where by types I mean a "type X = ..." declaration. The more there are the better documented the code, but the more there are the harder it seems to be to remember what they all are. I've pasted the current version of my declarations below. By way of explanation, here's the representation strategy. The puzzle board is called a Grid. It consists primarily of a Map CellRef Cells. I'm using a Map instead of an array (list of lists) as a way to avoid rebuilding the list of lists of cells when a Cell gets a value. A CellRef is (Int, Int) for row and column. A Cell contains its own CellRef along with a reference to the Cage that refers to it. It also has a place to put its actual value, an Int (called type CellValue). (In the current version a Cell has one value. In a subsequent version it will be more like a FiniteDomain variable with a Set of still possible values.) A Grid also contains two other Maps. Each is of the type Map Int ValueSet. One of them represents the unused values for the rows; the other the unused values for the columns. A ValueSet is a Set of allowable values. If the puzzle is (size x size), its ValueSet will be Set.fromList [1 .. size]. The other main structure is the puzzle presentation, which is called a KenKen. It is a list of Cages, where a Cage is: a target value (what the operation on the cells has to come out to), an Op (the Operation to be performed), and a list of CellRef (which are (row, col) references to Cells). When I give a Cell a value, I am forced to rebuild: the Map of row ValueSets, the Map of column ValueSets, the Map of Cells themselves, and the Grid that contains all these pieces. In this case, the pieces do change. (The cell changes since it gets a value; the candidates still left for the row and column also change since that value is no longer available.) But still it's a pain rebuilding it all. So perhaps the first question is whether there is a cleaner and simpler way to represent all this. type CellValueList = [CellValue] type CellRef = (Int, Int) type CellValue = Int data Grid = Grid { rowCandidatesSets :: CandidatesSets , colCandidatesSets :: CandidatesSets , gridCells :: (Map CellRef Cell) } type KenKen = [Cage] type ValueSet = Set CellValue type CandidatesSets = Map Int ValueSet data Cell = Cell { cellRef :: CellRef , cage:: Cage , value:: CellValue } deriving (Eq, Show) row cell = fst (cellRef cell) col cell = snd (cellRef cell) data Cage = Cage {target :: CellValue , op :: Op , cageCells :: [CellRef] } deriving (Eq, Show) data Op = Add | Subtract | Multiply | Divide deriving (Show, Eq) * -- Russ * On Wed, Nov 24, 2010 at 2:19 PM, matthew coolbeth wrote:

...

May we see a code sample of what you're describing?

Generally, when you're writing haskell programs, you will not approach the problem in the same way that you would appoach it in a language with mutable objects.

If you can provide a reasonably simple example of a computing task that someone would want to perform, I'm sure someone will be able to show you how it would be best approached in haskell, and how you need not write too many lines of code, provided that you use the proper, idiomatic style.

...
OK. So putting a Map at Leaf nodes doesn't solve the problem. (Apparently I haven't been able to communicate what I see as the problem.)

The problem that I'm trying to get to is the need to write excessive code for something that would require a lot less code in an OO world. It's not a matter of execution time or space. It's a matter of the amount of code one is required to write. * -- Russ *

On Wed, Nov 24, 2010 at 1:52 PM, Daniel Fischer wrote:

...
On Wednesday 24 November 2010 22:12:37, Russ Abbott wrote:

...
Cool. I wasn't aware of that notation. It doesn't quite get to the issue though.

The problem I'm concerned about is the need to define y in the first place. If one is chasing through a data structure and finds a need to change something buried within it, it seems necessary to rebuild everything that includes the changed thing.

In general, values are immutable, so you can't "change something buried within it". You have to build a new value containing some of the old stuff and a new part. Building the new value usually consists mostly of copying a couple of pointers (plus building the new part of course), so isn't too expensive normally.

You can have mutable values in the IO or (ST s) monads, if you need

On 2010-11-24, Russ Abbott wrote: them.

...
...
...
That is, I can't change a component of somethingNew without creating y. The point is there's nothing about x that changed,

The thing with the changed component is not x anymore.

...
and there may be nothing about (var1 x) that changed, and there may be nothing about var11 . var1 $ x that changed, etc. Yet one is apparently forced to keep track of and reconstruct all those elements.

The compiler takes care of that.

...
Another example is to imagine a Tree in which the leaves contain "objects." If I want to change a property of one of those leaf

objects,

...
You can't in general, the thing with a different property is a different object.

...
I am forced to rebuild all the ancestor nodes of that leaf down to rebuilding the root.

Yes (well, not you, the compiler does it), except if your tree contains mutable objects (IORefs/STRefs for example).

...
One way to avoid that is for the leaves to refer to their objects through a Map. Then changing a leaf object requires only that the

value

...
...
associated with key represented by the leaf be (re-)inserted into the Map. The Tree itself need not change at all.

Oh, it will. If you change a Map, you get a new one, thus you get a new tree containing the new Map.

...
But that trick isn't always available. In the example we are talking about we can't make a Map where they keys are the instance variable names and the values are their values. That would seem to do the job, but since the values are of different types, we can't create such a

Map.

...
So now what?

Well, what's the problem with the compiler copying some nodes? Normally, that doesn't cost very much performance, if it does in your case, we'd need to know more to suggest the best way to go.

...
* -- Russ *

-- mac

Russ Abbott

25 Nov 25 Nov

1:47 a.m.

Here's a simple example that arises in the KenKen problem. Let's assume that instead of keeping for each cage the Operation and target number we pre-compute all the combinations of values that will satisfy the cage. For example, assume that in a 4x4 game, we have a cage that refers to cells C1 and C2 and that the cage requires those two cells to produce a value of 2 using division, Pre-computing the possibilities, we store [(1, 2), (2, 1), (2, 4), (4, 2)] in this cage. The cage points to the two cells it constrains, and each cell points to the cage that constrains it. When a cell (say C1) gets a value of (say) 2, we want to change the cage to be [(2, 1), (2, 4)]. But when that happens, it's necessary to change cell C2 to point to the new cage -- and of course since the cell was changed the construct that held that cell has to be changed also. It's a bit of a pain to write the code to do all that. This isn't to say that the code will take that long to run or that it will use excessive memory -- only that it feels like one is being forced to write code that one shouldn't have to write. * -- Russ * On Wed, Nov 24, 2010 at 3:11 PM, Russ Abbott wrote:

...

Thanks, I'm still struggling with it. But here is what it looks like now.

I'm working on a KenKen solver. One issue I'm having is that I'm not sure how many types to declare -- where by types I mean a "type X = ..." declaration. The more there are the better documented the code, but the more there are the harder it seems to be to remember what they all are. I've pasted the current version of my declarations below. By way of explanation, here's the representation strategy.

The puzzle board is called a Grid. It consists primarily of a Map CellRef Cells. I'm using a Map instead of an array (list of lists) as a way to avoid rebuilding the list of lists of cells when a Cell gets a value. A CellRef is (Int, Int) for row and column. A Cell contains its own CellRef along with a reference to the Cage that refers to it. It also has a place to put its actual value, an Int (called type CellValue). (In the current version a Cell has one value. In a subsequent version it will be more like a FiniteDomain variable with a Set of still possible values.)

A Grid also contains two other Maps. Each is of the type Map Int ValueSet. One of them represents the unused values for the rows; the other the unused values for the columns. A ValueSet is a Set of allowable values. If the puzzle is (size x size), its ValueSet will be Set.fromList [1 .. size].

The other main structure is the puzzle presentation, which is called a KenKen. It is a list of Cages, where a Cage is: a target value (what the operation on the cells has to come out to), an Op (the Operation to be performed), and a list of CellRef (which are (row, col) references to Cells).

When I give a Cell a value, I am forced to rebuild: the Map of row ValueSets, the Map of column ValueSets, the Map of Cells themselves, and the Grid that contains all these pieces. In this case, the pieces do change. (The cell changes since it gets a value; the candidates still left for the row and column also change since that value is no longer available.) But still it's a pain rebuilding it all.

So perhaps the first question is whether there is a cleaner and simpler way to represent all this.

type CellValueList = [CellValue]

type CellRef = (Int, Int)

type CellValue = Int

data Grid = Grid { rowCandidatesSets :: CandidatesSets , colCandidatesSets :: CandidatesSets , gridCells :: (Map CellRef Cell) }

type KenKen = [Cage]

type ValueSet = Set CellValue

type CandidatesSets = Map Int ValueSet

data Cell = Cell { cellRef :: CellRef , cage:: Cage , value:: CellValue } deriving (Eq, Show)

row cell = fst (cellRef cell) col cell = snd (cellRef cell)

data Cage = Cage {target :: CellValue , op :: Op , cageCells :: [CellRef] } deriving (Eq, Show)

data Op = Add | Subtract | Multiply | Divide deriving (Show, Eq)

* -- Russ *

On Wed, Nov 24, 2010 at 2:19 PM, matthew coolbeth
...
wrote:

...
May we see a code sample of what you're describing?

Generally, when you're writing haskell programs, you will not approach the problem in the same way that you would appoach it in a language with mutable objects.

If you can provide a reasonably simple example of a computing task that someone would want to perform, I'm sure someone will be able to show you how it would be best approached in haskell, and how you need not write too many lines of code, provided that you use the proper, idiomatic style.

...
OK. So putting a Map at Leaf nodes doesn't solve the problem. (Apparently I haven't been able to communicate what I see as the problem.)

The problem that I'm trying to get to is the need to write excessive code for something that would require a lot less code in an OO world. It's not a matter of execution time or space. It's a matter of the amount of code one is required to write. * -- Russ *

On Wed, Nov 24, 2010 at 1:52 PM, Daniel Fischer wrote:

...
On Wednesday 24 November 2010 22:12:37, Russ Abbott wrote:

...
Cool. I wasn't aware of that notation. It doesn't quite get to the issue though.

The problem I'm concerned about is the need to define y in the first place. If one is chasing through a data structure and finds a need to change something buried within it, it seems necessary to rebuild everything that includes the changed thing.

In general, values are immutable, so you can't "change something buried within it". You have to build a new value containing some of the old stuff and a new part. Building the new value usually consists mostly of copying a couple of pointers (plus building the new part of course), so isn't too expensive normally.

You can have mutable values in the IO or (ST s) monads, if you need

On 2010-11-24, Russ Abbott wrote: them.

...
...
...
That is, I can't change a component of somethingNew without creating y. The point is there's nothing about x that changed,

The thing with the changed component is not x anymore.

...
and there may be nothing about (var1 x) that changed, and there may be nothing about var11 . var1 $ x that changed, etc. Yet one is apparently forced to keep track of and reconstruct all those elements.

The compiler takes care of that.

...
Another example is to imagine a Tree in which the leaves contain "objects." If I want to change a property of one of those leaf

objects,

...
You can't in general, the thing with a different property is a

different

...
object.

...
I am forced to rebuild all the ancestor nodes of that leaf down to rebuilding the root.

Yes (well, not you, the compiler does it), except if your tree contains mutable objects (IORefs/STRefs for example).

...
One way to avoid that is for the leaves to refer to their objects through a Map. Then changing a leaf object requires only that the

value

...
associated with key represented by the leaf be (re-)inserted into the Map. The Tree itself need not change at all.

Oh, it will. If you change a Map, you get a new one, thus you get a new tree containing the new Map.

...
But that trick isn't always available. In the example we are talking about we can't make a Map where they keys are the instance variable names and the values are their values. That would seem to do the

job,

...
but since the values are of different types, we can't create such a Map.

So now what?

Well, what's the problem with the compiler copying some nodes? Normally, that doesn't cost very much performance, if it does in your case, we'd need to know more to suggest the best way to go.

...
* -- Russ *

-- mac

Russ Abbott

1:54 a.m.

Actually using a Map does solve the problem. The Map has to be kept at the level of the Tree rather than have each leaf node point to it. So instead of just a Tree one has, say (Map, Tree). Then when one wants to change the property of something associated with a leaf node, one can just change the map. The Tree is unchanged. * -- Russ* *** * ** On Wed, Nov 24, 2010 at 2:02 PM, Russ Abbott wrote:

...

OK. So putting a Map at Leaf nodes doesn't solve the problem. (Apparently I haven't been able to communicate what I see as the problem.)

The problem that I'm trying to get to is the need to write excessive code for something that would require a lot less code in an OO world. It's not a matter of execution time or space. It's a matter of the amount of code one is required to write. * -- Russ *

On Wed, Nov 24, 2010 at 1:52 PM, Daniel Fischer wrote:

...
On Wednesday 24 November 2010 22:12:37, Russ Abbott wrote:

...
Cool. I wasn't aware of that notation. It doesn't quite get to the issue though.

The problem I'm concerned about is the need to define y in the first place. If one is chasing through a data structure and finds a need to change something buried within it, it seems necessary to rebuild everything that includes the changed thing.

In general, values are immutable, so you can't "change something buried within it". You have to build a new value containing some of the old stuff and a new part. Building the new value usually consists mostly of copying a couple of pointers (plus building the new part of course), so isn't too expensive normally.

You can have mutable values in the IO or (ST s) monads, if you need them.

...
That is, I can't change a component of somethingNew without creating y. The point is there's nothing about x that changed,

The thing with the changed component is not x anymore.

...
and there may be nothing about (var1 x) that changed, and there may be nothing about var11 . var1 $ x that changed, etc. Yet one is apparently forced to keep track of and reconstruct all those elements.

The compiler takes care of that.

...
Another example is to imagine a Tree in which the leaves contain "objects." If I want to change a property of one of those leaf objects,

You can't in general, the thing with a different property is a different object.

...
I am forced to rebuild all the ancestor nodes of that leaf down to rebuilding the root.

Yes (well, not you, the compiler does it), except if your tree contains mutable objects (IORefs/STRefs for example).

...
One way to avoid that is for the leaves to refer to their objects through a Map. Then changing a leaf object requires only that the value associated with key represented by the leaf be (re-)inserted into the Map. The Tree itself need not change at all.

Oh, it will. If you change a Map, you get a new one, thus you get a new tree containing the new Map.

...
But that trick isn't always available. In the example we are talking about we can't make a Map where they keys are the instance variable names and the values are their values. That would seem to do the job, but since the values are of different types, we can't create such a Map.

So now what?

Well, what's the problem with the compiler copying some nodes? Normally, that doesn't cost very much performance, if it does in your case, we'd need to know more to suggest the best way to go.

...
* -- Russ *

Russ Abbott

2:58 a.m.

My previous two messages suggest a Haskell coding principle: distinguish between fixed and (quasi-)mutable data structures. (I know values don't change, but I hope you understand what I mean. The cage and the cell in my previous example are quasi-mutable. They are conceptually mutable in the context of the problem.) The fixed data structures can be organized any way one wants. The quasi-/conceptually-mutable elements should be referred to symbolically and stored in a Map. The maps themselves should be stored at a global level of the system's data structure so that it is easy to replace them when values change. * -- Russ * On Wed, Nov 24, 2010 at 5:54 PM, Russ Abbott wrote:

...

Actually using a Map does solve the problem. The Map has to be kept at the level of the Tree rather than have each leaf node point to it. So instead of just a Tree one has, say (Map, Tree). Then when one wants to change the property of something associated with a leaf node, one can just change the map. The Tree is unchanged. * -- Russ* *** * **

On Wed, Nov 24, 2010 at 2:02 PM, Russ Abbott wrote:

...
OK. So putting a Map at Leaf nodes doesn't solve the problem. (Apparently I haven't been able to communicate what I see as the problem.)

The problem that I'm trying to get to is the need to write excessive code for something that would require a lot less code in an OO world. It's not a matter of execution time or space. It's a matter of the amount of code one is required to write. * -- Russ *

On Wed, Nov 24, 2010 at 1:52 PM, Daniel Fischer
...
wrote:

...
On Wednesday 24 November 2010 22:12:37, Russ Abbott wrote:

...
Cool. I wasn't aware of that notation. It doesn't quite get to the issue though.

The problem I'm concerned about is the need to define y in the first place. If one is chasing through a data structure and finds a need to change something buried within it, it seems necessary to rebuild everything that includes the changed thing.

In general, values are immutable, so you can't "change something buried within it". You have to build a new value containing some of the old stuff and a new part. Building the new value usually consists mostly of copying a couple of pointers (plus building the new part of course), so isn't too expensive normally.

You can have mutable values in the IO or (ST s) monads, if you need them.

...
That is, I can't change a component of somethingNew without creating y. The point is there's nothing about x that changed,

The thing with the changed component is not x anymore.

...
and there may be nothing about (var1 x) that changed, and there may be nothing about var11 . var1 $ x that changed, etc. Yet one is apparently forced to keep track of and reconstruct all those elements.

The compiler takes care of that.

...
Another example is to imagine a Tree in which the leaves contain "objects." If I want to change a property of one of those leaf objects,

You can't in general, the thing with a different property is a different object.

...
I am forced to rebuild all the ancestor nodes of that leaf down to rebuilding the root.

Yes (well, not you, the compiler does it), except if your tree contains mutable objects (IORefs/STRefs for example).

...
One way to avoid that is for the leaves to refer to their objects through a Map. Then changing a leaf object requires only that the value associated with key represented by the leaf be (re-)inserted into the Map. The Tree itself need not change at all.

Oh, it will. If you change a Map, you get a new one, thus you get a new tree containing the new Map.

...
But that trick isn't always available. In the example we are talking about we can't make a Map where they keys are the instance variable names and the values are their values. That would seem to do the job, but since the values are of different types, we can't create such a

Map.

...
So now what?

Well, what's the problem with the compiler copying some nodes? Normally, that doesn't cost very much performance, if it does in your case, we'd need to know more to suggest the best way to go.

...
* -- Russ *

Patrick LeBoutillier

2:17 p.m.

Russ, If I understand correctly KenKen is something like Sudoku except that the (more complicated) "cages" constraints replace the usual "square" constraints. Have you seen these sudoku solvers: http://www.haskell.org/haskellwiki/Sudoku ? Maybe you can get ideas there on how to attack the problem in a more functional fashion. Patrick On Wed, Nov 24, 2010 at 9:58 PM, Russ Abbott wrote:

...

My previous two messages suggest a Haskell coding principle: distinguish between fixed and (quasi-)mutable data structures. (I know values don't change, but I hope you understand what I mean. The cage and the cell in my previous example are quasi-mutable. They are conceptually mutable in the context of the problem.) The fixed data structures can be organized any way one wants. The quasi-/conceptually-mutable elements should be referred to symbolically and stored in a Map. The maps themselves should be stored at a global level of the system's data structure so that it is easy to replace them when values change.

-- Russ

On Wed, Nov 24, 2010 at 5:54 PM, Russ Abbott wrote:

...
Actually using a Map does solve the problem. The Map has to be kept at the level of the Tree rather than have each leaf node point to it. So instead of just a Tree one has, say (Map, Tree). Then when one wants to change the property of something associated with a leaf node, one can just change the map. The Tree is unchanged.

-- Russ

On Wed, Nov 24, 2010 at 2:02 PM, Russ Abbott wrote:

...
OK. So putting a Map at Leaf nodes doesn't solve the problem. (Apparently I haven't been able to communicate what I see as the problem.) The problem that I'm trying to get to is the need to write excessive code for something that would require a lot less code in an OO world. It's not a matter of execution time or space. It's a matter of the amount of code one is required to write. -- Russ

On Wed, Nov 24, 2010 at 1:52 PM, Daniel Fischer wrote:

...
On Wednesday 24 November 2010 22:12:37, Russ Abbott wrote:

...
Cool. I wasn't aware of that notation. It doesn't quite get to the issue though.

The problem I'm concerned about is the need to define y in the first place. If one is chasing through a data structure and finds a need to change something buried within it, it seems necessary to rebuild everything that includes the changed thing.

In general, values are immutable, so you can't "change something buried within it". You have to build a new value containing some of the old stuff and a new part. Building the new value usually consists mostly of copying a couple of pointers (plus building the new part of course), so isn't too expensive normally.

You can have mutable values in the IO or (ST s) monads, if you need them.

...
That is, I can't change a component of somethingNew without creating y. The point is there's nothing about x that changed,

The thing with the changed component is not x anymore.

...
and there may be nothing about (var1 x) that changed, and there may be nothing about var11 . var1 $ x that changed, etc. Yet one is apparently forced to keep track of and reconstruct all those elements.

The compiler takes care of that.

...
Another example is to imagine a Tree in which the leaves contain "objects." If I want to change a property of one of those leaf objects,

You can't in general, the thing with a different property is a different object.

...
I am forced to rebuild all the ancestor nodes of that leaf down to rebuilding the root.

Yes (well, not you, the compiler does it), except if your tree contains mutable objects (IORefs/STRefs for example).

...
One way to avoid that is for the leaves to refer to their objects through a Map. Then changing a leaf object requires only that the value associated with key represented by the leaf be (re-)inserted into the Map. The Tree itself need not change at all.

Oh, it will. If you change a Map, you get a new one, thus you get a new tree containing the new Map.

...
But that trick isn't always available. In the example we are talking about we can't make a Map where they keys are the instance variable names and the values are their values. That would seem to do the job, but since the values are of different types, we can't create such a Map.

So now what?

Well, what's the problem with the compiler copying some nodes? Normally, that doesn't cost very much performance, if it does in your case, we'd need to know more to suggest the best way to go.

...
* -- Russ *

_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners

-- ===================== Patrick LeBoutillier Rosemère, Québec, Canada

Russ Abbott

5:12 p.m.

Thanks, Patrick. I'm disappointed, though, that no one has actually responded to my question. It wasn't how to solve KenKen. It was how best to deal with quasi-mutable data structures. * -- Russ * On Thu, Nov 25, 2010 at 6:17 AM, Patrick LeBoutillier < patrick.leboutillier@gmail.com> wrote:

...

Russ,

If I understand correctly KenKen is something like Sudoku except that the (more complicated) "cages" constraints replace the usual "square" constraints.

Have you seen these sudoku solvers: http://www.haskell.org/haskellwiki/Sudoku ?

Maybe you can get ideas there on how to attack the problem in a more functional fashion.

Patrick

...
My previous two messages suggest a Haskell coding principle: distinguish between fixed and (quasi-)mutable data structures. (I know values don't change, but I hope you understand what I mean. The cage and the cell in my previous example are quasi-mutable. They are conceptually mutable in the context of the problem.) The fixed data structures can be organized any way one wants. The quasi-/conceptually-mutable elements should be referred to symbolically and stored in a Map. The maps themselves should be stored at a global level of the system's data structure so that it is easy to replace them when values change.

-- Russ

On Wed, Nov 24, 2010 at 5:54 PM, Russ Abbott wrote:

...
Actually using a Map does solve the problem. The Map has to be kept at

...
...
level of the Tree rather than have each leaf node point to it. So instead of just a Tree one has, say (Map, Tree). Then when one wants to change

...
...
property of something associated with a leaf node, one can just change

...
...
map. The Tree is unchanged.

-- Russ

On Wed, Nov 24, 2010 at 2:02 PM, Russ Abbott wrote:

...
OK. So putting a Map at Leaf nodes doesn't solve the problem. (Apparently I haven't been able to communicate what I see as the

...
...
...
The problem that I'm trying to get to is the need to write excessive code for something that would require a lot less code in an OO world. It's not a matter of execution time or space. It's a matter of the amount of code one is required to write. -- Russ

On Wed, Nov 24, 2010 at 1:52 PM, Daniel Fischer wrote:

...
On Wednesday 24 November 2010 22:12:37, Russ Abbott wrote:

...
Cool. I wasn't aware of that notation. It doesn't quite get to the issue though.

The problem I'm concerned about is the need to define y in the first place. If one is chasing through a data structure and finds a need

to

...
...
change something buried within it, it seems necessary to rebuild everything that includes the changed thing.

In general, values are immutable, so you can't "change something buried within it". You have to build a new value containing some of the old stuff and a new part. Building the new value usually consists mostly of copying a couple of pointers (plus building the new part of course), so isn't too expensive normally.

You can have mutable values in the IO or (ST s) monads, if you need them.

...
That is, I can't change a component of somethingNew without creating y. The point is there's nothing about x that changed,

The thing with the changed component is not x anymore.

...
and there may be nothing about (var1 x) that changed, and there may be nothing about var11 . var1 $ x that changed, etc. Yet one is apparently forced to keep track of and reconstruct all those elements.

The compiler takes care of that.

...
Another example is to imagine a Tree in which the leaves contain "objects." If I want to change a property of one of those leaf objects,

You can't in general, the thing with a different property is a different object.

...
I am forced to rebuild all the ancestor nodes of that leaf down to rebuilding the root.

Yes (well, not you, the compiler does it), except if your tree contains mutable objects (IORefs/STRefs for example).

...
One way to avoid that is for the leaves to refer to their objects through a Map. Then changing a leaf object requires only that the value associated with key represented by the leaf be (re-)inserted into

On Wed, Nov 24, 2010 at 9:58 PM, Russ Abbott wrote: the the the problem.) the

...
...
...
...
...
Map. The Tree itself need not change at all.

Oh, it will. If you change a Map, you get a new one, thus you get a new tree containing the new Map.

...
But that trick isn't always available. In the example we are

talking

...
about we can't make a Map where they keys are the instance variable names and the values are their values. That would seem to do the job, but since the values are of different types, we can't create such a Map.

So now what?

Well, what's the problem with the compiler copying some nodes? Normally, that doesn't cost very much performance, if it does in your case, we'd need to know more to suggest the best way to go.

...
* -- Russ *

_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners

-- ===================== Patrick LeBoutillier Rosemère, Québec, Canada

Ozgur Akgun

5:42 p.m.

On 25 November 2010 17:12, Russ Abbott wrote:

...

Thanks, Patrick. I'm disappointed, though, that no one has actually responded to my question. It wasn't how to solve KenKen. It was how best to deal with quasi-mutable data structures.

You gave an example about how your data structure look like, but I haven't seen an example case of "quasi-mutating" an expression of such data structures. If you give a concrete example, people can suggest better ways of doing it, it there are any. Best, Ozgur

Russ Abbott

5:53 p.m.

I gave this concrete example a couple of messages ago. Let's assume that instead of keeping for each cage the Operation and target number we pre-compute all the combinations of values that will satisfy the cage. For example, assume that in a 4x4 game, we have a cage that refers to cells C1 and C2 and that the cage requires those two cells to produce a value of 2 using division, Pre-computing the possibilities, we store [(1, 2), (2, 1), (2, 4), (4, 2)] in this cage. The cage points to the two cells it constrains, and each cell points to the cage that constrains it. When a cell (say C1) gets a value of (say) 2, we want to change the cage to be [(2, 1), (2, 4)]. But when that happens, it's necessary to change cell C2 to point to the new cage -- and of course since the cell was changed the construct that held that cell has to be changed also. It's a bit of a pain to write the code to do all that. This isn't to say that the code will take that long to run or that it will use excessive memory -- only that it feels like one is being forced to write code that one shouldn't have to write. * -- Russ * On Thu, Nov 25, 2010 at 9:42 AM, Ozgur Akgun wrote:

...

On 25 November 2010 17:12, Russ Abbott wrote:

...
Thanks, Patrick. I'm disappointed, though, that no one has actually responded to my question. It wasn't how to solve KenKen. It was how best to deal with quasi-mutable data structures.

You gave an example about how your data structure look like, but I haven't seen an example case of "quasi-mutating" an expression of such data structures. If you give a concrete example, people can suggest better ways of doing it, if there are any.

Best, Ozgur

Brent Yorgey

8:40 p.m.

On Thu, Nov 25, 2010 at 09:12:05AM -0800, Russ Abbott wrote:

...

Thanks, Patrick. I'm disappointed, though, that no one has actually responded to my question. It wasn't how to solve KenKen. It was how best to deal with quasi-mutable data structures. * -- Russ *

Someone did respond to your question, with a link to the data-accessor package on Hackage (you could also take a look at fclabels or lenses). Do those help address your issue? -Brent

...

On Thu, Nov 25, 2010 at 6:17 AM, Patrick LeBoutillier < patrick.leboutillier@gmail.com> wrote:

...
Russ,

If I understand correctly KenKen is something like Sudoku except that the (more complicated) "cages" constraints replace the usual "square" constraints.

Have you seen these sudoku solvers: http://www.haskell.org/haskellwiki/Sudoku ?

Maybe you can get ideas there on how to attack the problem in a more functional fashion.

Patrick

...
My previous two messages suggest a Haskell coding principle: distinguish between fixed and (quasi-)mutable data structures. (I know values don't change, but I hope you understand what I mean. The cage and the cell in my previous example are quasi-mutable. They are conceptually mutable in the context of the problem.) The fixed data structures can be organized any way one wants. The quasi-/conceptually-mutable elements should be referred to symbolically and stored in a Map. The maps themselves should be stored at a global level of the system's data structure so that it is easy to replace them when values change.

-- Russ

On Wed, Nov 24, 2010 at 5:54 PM, Russ Abbott wrote:

...
Actually using a Map does solve the problem. The Map has to be kept at

...
...
level of the Tree rather than have each leaf node point to it. So instead of just a Tree one has, say (Map, Tree). Then when one wants to change

...
...
property of something associated with a leaf node, one can just change

...
...
map. The Tree is unchanged.

-- Russ

On Wed, Nov 24, 2010 at 2:02 PM, Russ Abbott wrote:

...
OK. So putting a Map at Leaf nodes doesn't solve the problem. (Apparently I haven't been able to communicate what I see as the

...
...
...
The problem that I'm trying to get to is the need to write excessive code for something that would require a lot less code in an OO world. It's not a matter of execution time or space. It's a matter of the amount of code one is required to write. -- Russ

On Wed, Nov 24, 2010 at 1:52 PM, Daniel Fischer wrote:

...
On Wednesday 24 November 2010 22:12:37, Russ Abbott wrote: > Cool. I wasn't aware of that notation. It doesn't quite get to the > issue though. > > The problem I'm concerned about is the need to define y in the first > place. If one is chasing through a data structure and finds a need

to

...
> change something buried within it, it seems necessary to rebuild > everything that includes the changed thing.

In general, values are immutable, so you can't "change something buried within it". You have to build a new value containing some of the old stuff and a new part. Building the new value usually consists mostly of copying a couple of pointers (plus building the new part of course), so isn't too expensive normally.

You can have mutable values in the IO or (ST s) monads, if you need them.

> That is, I can't change a > component of somethingNew without creating y. The point is there's > nothing about x that changed,

The thing with the changed component is not x anymore.

> and there may be nothing about (var1 x) > that changed, and there may be nothing about var11 . var1 $ x that > changed, etc. Yet one is apparently forced to keep track of and > reconstruct all those elements.

The compiler takes care of that.

> > Another example is to imagine a Tree in which the leaves contain > "objects." If I want to change a property of one of those leaf > objects,

You can't in general, the thing with a different property is a different object.

> I am forced to rebuild all the ancestor nodes of that leaf down to > rebuilding the root.

Yes (well, not you, the compiler does it), except if your tree contains mutable objects (IORefs/STRefs for example).

> > One way to avoid that is for the leaves to refer to their objects > through a Map. Then changing a leaf object requires only that the > value > associated with key represented by the leaf be (re-)inserted into

On Wed, Nov 24, 2010 at 9:58 PM, Russ Abbott wrote: the the the problem.) the

...
...
...
...
> Map. The Tree itself need not change at all.

Oh, it will. If you change a Map, you get a new one, thus you get a new tree containing the new Map.

> > But that trick isn't always available. In the example we are talking > about we can't make a Map where they keys are the instance variable > names and the values are their values. That would seem to do the job, > but since the values are of different types, we can't create such a > Map. > > So now what?

Well, what's the problem with the compiler copying some nodes? Normally, that doesn't cost very much performance, if it does in your case, we'd need to know more to suggest the best way to go.

> * > -- Russ * >

_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners

-- ===================== Patrick LeBoutillier Rosemère, Québec, Canada

...

_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners

Stephen Tetley

24 Nov 24 Nov

10:19 p.m.

On 24 November 2010 21:12, Russ Abbott wrote: [Trunc]

...

Another example is to imagine a Tree in which the leaves contain "objects." If I want to change a property of one of those leaf objects, I am forced to rebuild all the ancestor nodes of that leaf down to rebuilding the root.

Specifically for re-writing trees, attribute grammars are probably more concise than OO or functional methods. Haskell has an attribute grammar pre-processor UUAG available. http://hackage.haskell.org/package/uuagc Zippers too may or may not be a convenience for this sort of manipulation.

5342

Age (days ago)

5343

Last active (days ago)

List overview

Download

16 comments

8 participants

participants (8)

Brent Yorgey
Daniel Fischer
Daniel Schoepe
matthew coolbeth
Ozgur Akgun
Patrick LeBoutillier
Russ Abbott
Stephen Tetley