Re: [Haskell-cafe] relational data representation in memory using haskell?

22 May 2008

      On Thu, May 22, 2008 at 08:16:54AM +0200, Salvatore Insalaco wrote:
...
2008/5/22 Marc Weber :
...
I'd like to illustrate two different ideas using a small example:
(A)
       data CD = CD { title :: String, tracks :: [ Track ] }
       data Track = Track { track :: String, cd :: CD }
       data PDB = PDB { cds :: Set CD, tracks :: Set Track }
because it's not using foreign ids but kind of pointers I'll call this
the pointer method
This doesn't look like a relational structure at all in Haskell.
Let's take the CD and Track relations. In a relational database you
have something like:
CD (1, 'Querying about you')
Track (1, 'Inserting some love', 1)
Track (2, 'Updating my feelings', 1)
Track (3, 'Deleting my hopes', 1)
In an imperative language you can do something similar in memory using
objects (you can in haskell to with IORefs and so on, but let's stay
on "data"). You get something like:
0x000 CD('Querying about you')
0x004 Track('Inserting some love, 0x004)
You are right. But ghc does represent those things as pointers.. :)
So it is indeed unless you ask ghc to not use (eg by using strict fields
etc), correct? IORefs are not that good because you can't read them
within STM. But you are right: Using IORefs was my first idea.
...
similar to a CD, you are storing a *value* (low level you probably
have a pointer, but you have not pointer semantics). As you noticed,
you cannot "update" the CD title without changing each Track. That's a
way to store information, and a good way too, but it's not a
relational structure by any extent.
I agree it's not in general and higly implementation dependant
Maybe I'm totally wrong. I imagine ghc having some internal
representation of data types which come close to
struct CD {
  Track * tracks;
  title ** char;
} cd;

struct Track {
  cd * CD;
  title ** char;
} track;

(does'nt compile but you get the idea. My C knowldge is good enough for
reading only).

So if you start seeing the whole database as list of connected structs
vio pointers adding / deleting/ inserting is quite the same as adding
some nodes to a Data.Map. You replace the nodes you have to replace and
finally get a poiter pointing to te new database state. As long as you
don't loose the original "pointer" you can easily rollpack.
Consider 

let x = Cd ...
forkIO $ ( do something with x } -- (1)
print x -- (2) 

How can ghc know when running line (2) that (1) hasen't changed the
record? I see two solutions:
a) give the forked process a copy (Then my design will collapse)
   but this is expensive to copy data without knowing you ned to
b) use pointers and replace x ony on updating. Thus if (1) changes the
   title a new struct wil be created poiting to the old list but a new
   title String. line (2) doesn't have to care at all.
I'm guessing that analyzing a) when values have to be copied is
complicated - thus b) is implemented ? quicksilver has told me so as
well - But I'm not sure that's why I'm asking.
...
If you want to use "normal" Haskell ADT, are you sure that a
relational storage is what you want? Keeping that in memory doesn't
give you some advantages of relational databases (e.g. uniform
representation), and the impedance between the functional and the
relational world is not easy to manage.
Maybe I misunderstood what you are trying to accomplish, and you only
want to do a generic data structure with fast lookups on the content
of the items?
Exactly.. Of course I only want a generic data structure with fast
lookups and content of items.. But I don't want to write the insert
delete and update functions for each table again and again..
Does this already exist?
Marc Weber