
Some time ago I stumbled upon SQLAlchemy which is a great ORM wrapper library for python. It has a nice syntax I'd like to see in a haskell library as well. SQLAlchemy already provides some lazy features such as loading subitems on access etc. All haskell SQL libraries I know only let you run SQL statements but not much more. To start real business you no longer want to write many SQL commands. Example why it matters: schools - 1:n - teachers - 1:n - pupils If you want to list all schools which have a pupil with age > 3 you'd write an sql query like this: SELECT dictinct * FROM schools as s JOIN teachers t ON (t.school_id = s.id) JOIN pupils as p ON (p.teacher_id = t.id) WHERE p.age > 3 in SQLAlchemy it looks like this: session.query(School).join(School.teachers).join(Teacher.pupils).filter(Pupil.age > 3).all() difference? Because SQLAlchemy knows about the relations you don't have to remember alias names. So there is no chance to get that wrong. Another example: Updating the age of a pupil: row = SELECT * FROM pupils where age = 13; UPDATE pupils SET age = 14 WHERE id = <the id you got above> p = session.query(Pupil).filter(Pupil.age==13).one().age=14 session.commit() difference? You don't have to care about ids. you just assign a new value and tell the engine that it should commit. So again less chances to get something wrong. What about trees (eg web site navigation) id | title | parent_id 1 | top | null 2 | submenu | 1 3 | submenu2 | 1 should result in top - submenu - submenu2 using SQLAlchemy you can just do parent = session.query('nodes').filter(Node.id = 1) def print(node): print node.title print node.subnodes # this will run a subquery automatically for you returning submenu{,2} Again no sql. No chance to get something wrong? You can skim the manual to get a better idea how SQLAlchemy works http://www.sqlalchemy.org/docs/05/sqlalchemy_0_5_5.pdf I have to admit that I haven't used SQLAlchemy in a real project yet. However I can imagine doing so. Comparing this to what we have on hackage I'd say some work has to be done to get close to SQLAlchemy. The backend doesn't have to be a relational database. However I'd like to use this kind of abstraction in haskell. Is there anyone interested in helping building a library which a) let's you define kind of model of you data b) let's you store you model in any backend (maybe a relational database) c) does static checking of your queries at compilation time? Right now I'd say the best way to go is define the model in the application and not get the scheme from an existing database because there is not way to store all scheme details within a relational model. I think SQLAlchemy does it right by providing a way to define the model in python. Of course haskell doesn't have "objects" to store. But GADTs could be stored (data Foo = ...) So are there any volunteers who are interested in helping writing this kind of storage solution for haskell which could be used in real world business apps? Maybe this does already exist and I've missed it? Marc Weber

I don't have a good answer to your question, but I'm curious of how
lazy loading of SQL-based records would work. It seems like having
another user of the database modify your record before you've loaded
it all could lead to an inconsistent record (assuming you've already
loaded and memoized some fields and not others).
On Tue, Jun 30, 2009 at 1:52 PM, Marc Weber
Some time ago I stumbled upon SQLAlchemy which is a great ORM wrapper library for python. It has a nice syntax I'd like to see in a haskell library as well.
SQLAlchemy already provides some lazy features such as loading subitems on access etc.
All haskell SQL libraries I know only let you run SQL statements but not much more. To start real business you no longer want to write many SQL commands.
Example why it matters: schools - 1:n - teachers - 1:n - pupils
If you want to list all schools which have a pupil with age > 3 you'd write an sql query like this:
SELECT dictinct * FROM schools as s JOIN teachers t ON (t.school_id = s.id) JOIN pupils as p ON (p.teacher_id = t.id) WHERE p.age > 3
in SQLAlchemy it looks like this: session.query(School).join(School.teachers).join(Teacher.pupils).filter(Pupil.age > 3).all()
difference? Because SQLAlchemy knows about the relations you don't have to remember alias names. So there is no chance to get that wrong.
Another example: Updating the age of a pupil:
row = SELECT * FROM pupils where age = 13; UPDATE pupils SET age = 14 WHERE id = <the id you got above>
p = session.query(Pupil).filter(Pupil.age==13).one().age=14 session.commit()
difference? You don't have to care about ids. you just assign a new value and tell the engine that it should commit. So again less chances to get something wrong.
What about trees (eg web site navigation)
id | title | parent_id 1 | top | null 2 | submenu | 1 3 | submenu2 | 1
should result in
top - submenu - submenu2
using SQLAlchemy you can just do
parent = session.query('nodes').filter(Node.id = 1)
def print(node): print node.title print node.subnodes # this will run a subquery automatically for you returning submenu{,2}
Again no sql. No chance to get something wrong?
You can skim the manual to get a better idea how SQLAlchemy works http://www.sqlalchemy.org/docs/05/sqlalchemy_0_5_5.pdf
I have to admit that I haven't used SQLAlchemy in a real project yet. However I can imagine doing so. Comparing this to what we have on hackage I'd say some work has to be done to get close to SQLAlchemy.
The backend doesn't have to be a relational database. However I'd like to use this kind of abstraction in haskell.
Is there anyone interested in helping building a library which a) let's you define kind of model of you data b) let's you store you model in any backend (maybe a relational database) c) does static checking of your queries at compilation time?
Right now I'd say the best way to go is define the model in the application and not get the scheme from an existing database because there is not way to store all scheme details within a relational model. I think SQLAlchemy does it right by providing a way to define the model in python.
Of course haskell doesn't have "objects" to store. But GADTs could be stored (data Foo = ...)
So are there any volunteers who are interested in helping writing this kind of storage solution for haskell which could be used in real world business apps?
Maybe this does already exist and I've missed it?
Marc Weber _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

It has to do with treating groups of records from a table like an object.
You have the object Employee, which consists of rows from the Person table,
the Account table and the Building table.
When you instantiate the object. if you don't ask to view the Employee's
building information, it doesn't bother to retrieve it.
In the case you mention, the data hasn't yet been loaded, and even if it
were, it can be loaded with traditional transactional semantics (read only,
read with possible write, write, etc)
On Tue, Jun 30, 2009 at 2:29 PM, Daniel Peebles
I don't have a good answer to your question, but I'm curious of how lazy loading of SQL-based records would work. It seems like having another user of the database modify your record before you've loaded it all could lead to an inconsistent record (assuming you've already loaded and memoized some fields and not others).
On Tue, Jun 30, 2009 at 1:52 PM, Marc Weber
wrote: Some time ago I stumbled upon SQLAlchemy which is a great ORM wrapper library for python. It has a nice syntax I'd like to see in a haskell library as well.
SQLAlchemy already provides some lazy features such as loading subitems on access etc.
All haskell SQL libraries I know only let you run SQL statements but not much more. To start real business you no longer want to write many SQL commands.
Example why it matters: schools - 1:n - teachers - 1:n - pupils
If you want to list all schools which have a pupil with age > 3 you'd write an sql query like this:
SELECT dictinct * FROM schools as s JOIN teachers t ON (t.school_id = s.id) JOIN pupils as p ON (p.teacher_id = t.id) WHERE p.age > 3
in SQLAlchemy it looks like this:
session.query(School).join(School.teachers).join(Teacher.pupils).filter(Pupil.age
3).all()
difference? Because SQLAlchemy knows about the relations you don't have to remember alias names. So there is no chance to get that wrong.
Another example: Updating the age of a pupil:
row = SELECT * FROM pupils where age = 13; UPDATE pupils SET age = 14 WHERE id = <the id you got above>
p = session.query(Pupil).filter(Pupil.age==13).one().age=14 session.commit()
difference? You don't have to care about ids. you just assign a new value and tell the engine that it should commit. So again less chances to get something wrong.
What about trees (eg web site navigation)
id | title | parent_id 1 | top | null 2 | submenu | 1 3 | submenu2 | 1
should result in
top - submenu - submenu2
using SQLAlchemy you can just do
parent = session.query('nodes').filter(Node.id = 1)
def print(node): print node.title print node.subnodes # this will run a subquery automatically for you returning submenu{,2}
Again no sql. No chance to get something wrong?
You can skim the manual to get a better idea how SQLAlchemy works http://www.sqlalchemy.org/docs/05/sqlalchemy_0_5_5.pdf
I have to admit that I haven't used SQLAlchemy in a real project yet. However I can imagine doing so. Comparing this to what we have on hackage I'd say some work has to be done to get close to SQLAlchemy.
The backend doesn't have to be a relational database. However I'd like to use this kind of abstraction in haskell.
Is there anyone interested in helping building a library which a) let's you define kind of model of you data b) let's you store you model in any backend (maybe a relational database) c) does static checking of your queries at compilation time?
Right now I'd say the best way to go is define the model in the application and not get the scheme from an existing database because there is not way to store all scheme details within a relational model. I think SQLAlchemy does it right by providing a way to define the model in python.
Of course haskell doesn't have "objects" to store. But GADTs could be stored (data Foo = ...)
So are there any volunteers who are interested in helping writing this kind of storage solution for haskell which could be used in real world business apps?
Maybe this does already exist and I've missed it?
Marc Weber _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- "The greatest obstacle to discovering the shape of the earth, the continents, and the oceans was not ignorance but the illusion of knowledge." - Daniel J. Boorstin

Hi Marc
Example why it matters: schools - 1:n - teachers - 1:n - pupils
If you want to list all schools which have a pupil with age > 3 you'd write an sql query like this:
SELECT dictinct * FROM schools as s JOIN teachers t ON (t.school_id = s.id) JOIN pupils as p ON (p.teacher_id = t.id) WHERE p.age > 3
in SQLAlchemy it looks like this: session.query(School).join(School.teachers).join(Teacher.pupils).filter(Pupil.age > 3).all()
difference? Because SQLAlchemy knows about the relations you don't have to remember alias names. So there is no chance to get that wrong.
I do not get this explanation, could you expand? I would have thought it should be: "difference? Because SQLAlchemy knows about the relationships (not relations, but relation_ships_), it do not have to explicitly join on foreign keys.". Actually SQL has natural joins, where you can do without explicit join conditions. Unfortunately, natural joins seems like they were explicitly designed to create trouble. It would be nice if "they" fixed SQL to consider relationships. Greetings, Mads Lindstrøm

Mads Lindstrøm
I do not get this explanation, could you expand? I would have thought it should be: "difference? Because SQLAlchemy knows about the relationships (not relations, but relation_ships_), it do not have to explicitly join on foreign keys.".
I think this is a common terminology issue - lots of people think "relational" refers linking tables together (foreign keys), and not the mathematical definition of a relation as a subset of T1 x T2 x ... x Tn -- i.e. a table. -k -- If I haven't seen further, it is by standing in the footprints of giants

Hi Marc Weber
Another example: Updating the age of a pupil:
row = SELECT * FROM pupils where age = 13; UPDATE pupils SET age = 14 WHERE id = <the id you got above>
p = session.query(Pupil).filter(Pupil.age==13).one().age=14 session.commit()
difference? You don't have to care about ids. you just assign a new value and tell the engine that it should commit. So again less chances to get something wrong.
Could you not do in SQL: UPDATE pupils SET age = 14 WHERE age = 13 That is, without using ids. Greetings, Mads Lindstrøm

Hi Marc Weber
Another example: Updating the age of a pupil:
row = SELECT * FROM pupils where age = 13; UPDATE pupils SET age = 14 WHERE id = <the id you got above>
p = session.query(Pupil).filter(Pupil.age==13).one().age=14 session.commit()
difference? You don't have to care about ids. you just assign a new value and tell the engine that it should commit. So again less chances to get something wrong.
Could you not do in SQL:
UPDATE pupils SET age = 14 WHERE age = 13 Of course. But: you can pass around that pupil object to another function and still assign a new age
Hi Mads! On Tue, Jun 30, 2009 at 11:49:40PM +0200, Mads Lindstrøm wrote: then run session.commit(). When passing around the pupile you can follow the relation_ships (relations?) back to school. def doSomething(pupil): pupil['age'] = 13 pupil.teacher.school.rating += 1 doSomething(session.query(Pupil).filter(Pupil.age==13)) session.commit() Now how would you do this using SQL? Sorry about the confustion (relation / relation-ship). I mixed up the terminology. Anyway I guess you can see here how powerful an ORM can be and why we should write such a library for haskell. I think it's very hard to invent such a short synax in haskell cause you have to take monads into account etc.. And it matters how much time you have to spend writing code. Thanks for your feedback. I hope there will be some more. Marc Weber

Hi Marc Weber
Hi Mads!
Hi Marc Weber
Another example: Updating the age of a pupil:
row = SELECT * FROM pupils where age = 13; UPDATE pupils SET age = 14 WHERE id = <the id you got above>
p = session.query(Pupil).filter(Pupil.age==13).one().age=14 session.commit()
difference? You don't have to care about ids. you just assign a new value and tell the engine that it should commit. So again less chances to get something wrong.
Could you not do in SQL:
UPDATE pupils SET age = 14 WHERE age = 13 Of course. But: you can pass around that pupil object to another function and still assign a new age
On Tue, Jun 30, 2009 at 11:49:40PM +0200, Mads Lindstrøm wrote: then run session.commit(). When passing around the pupile you can follow the relation_ships (relations?) back to school.
def doSomething(pupil): pupil['age'] = 13 pupil.teacher.school.rating += 1
doSomething(session.query(Pupil).filter(Pupil.age==13)) session.commit()
Now how would you do this using SQL?
As far as I know, you cannot. And it is very nice. On the other hand you sometimes want to execute more of the logic on the DBMS, as it leads to better performance. But maybe we can somehow have our cake and eat it too. E.g. if the user still have some control about where the logic is executed.
Sorry about the confustion (relation / relation-ship). I mixed up the terminology. Anyway I guess you can see here how powerful an ORM can be and why we should write such a library for haskell.
Don't be sorry about that. I know people often confuse the terms and I could have replied just asking if you had not swapped the two terms. In my native language relation can mean both relation and to relationship. Guess, it is the same in other languages...
I think it's very hard to invent such a short synax in haskell cause you have to take monads into account etc..
And it matters how much time you have to spend writing code.
Yes and yes. I think (my gut tells me so) you will need to use Template Haskell, if you want something as succinct as the Python code you showed. Or maybe if you give up type safety, but I guess your are not willing to do that. But it could be fun (and challenging) coming up with something really nice.
Thanks for your feedback. I hope there will be some more.
Marc Weber
I may have sounded a bit negative in my previous mails. But I really can see something cool about what you describe. That said, I think people are sometimes too eager to replace SQL. As you properly are already aware, SQL+Relational databases has some very nice properties (list below is form the top of my head, there are other advantages): * They are accessed with a declarative language (SQL) * They can make high-level optimization automatically and guided by the user * They can, transparently, execute queries/updates using multiple servers/CPUs. It may require some changes to the database, but it can be done without changing your SQL * They are based on a nice theoretical foundation * If databases are normalized properly, data are a lot more transparent than other ways of organizing data. At least other ways I have seen. * Normalization provides a lot less ambiguous guidance, than other "development methodologies". * Transaction support But as you point out yourself, everything is not rosy. And in addition to what you write, all the (un)marshaling you need when accessing databases from Haskell is quite cumbersome. And I realize that you are not trying to replace RDBs, just building a nicer interface to them. I am just concerned that some of the nice properties are lost in the process. I think my main concern comes from seeing people create databases, by automatically generating tables from OO-classes. They invariably ends up with something not nearly as nice, as if they had constructed the database in a more traditional fashion. To summarize, what you propose is cool. Just do not throw the baby out with the bathwater. Greetings, Mads Lindstrøm

And I realize that you are not trying to replace RDBs, just building a nicer interface to them. I am just concerned that some of the nice properties are lost in the process. I think my main concern comes from seeing people create databases, by automatically generating tables from OO-classes. They invariably ends up with something not nearly as nice, as if they had constructed the database in a more traditional fashion.
To summarize, what you propose is cool. Just do not throw the baby out with the bathwater.
Hi Mad. Maybe I want to replace a RDBMS. But this will cost. You can fire arbitrary SQL statements in no time at RDBMS. You can't do things like that that easy using a haskell (in memory) only solutions. So SQL wins here at the moment if you want to be productive. If you read the .pdf I posted you saw that SQLAlchemy is that coold that a) you can use the shortcut style and define objects and tables at the same time but you als can b) define both layers independent of each other. And: Most applications today don't have to scale to an extend forcing you to move all logic into a database system. If you can one request a sec on a web application .. That's already very much for most small shop system. (I'm not talking about Amazon or ebay or such!) I'm talking about customers who want some more customizations as you can do with oscommerce. My main purpose was to see wether someone else would be interested in spending more effort into such a solution. Anyway I have still have to do some other work before I can jump into such a project. And Mad, I think you already know that RDBMS are not the best solution because it's very hard to ask a RDBMS wether a query retuns a nullable value or not. And that makes a huge difference wether you have to use a unsafeFromNull like function all the time or not. Time will tell how much energy I can spend on such this topic in the future. cu all Marc Weber

At Tue, 30 Jun 2009 19:52:08 +0200, Marc Weber wrote:
Is there anyone interested in helping building a library which a) let's you define kind of model of you data b) let's you store you model in any backend (maybe a relational database) c) does static checking of your queries at compilation time?
Maybe this does already exist and I've missed it?
Happstack-state is close in many respects. Your persistent state can basically be any haskell data type for which you could create instances of Read/Show where, read . show == id. (Note, happstack state doesn't actually use Read/Show, but it does use a similar class with similar restrictions). Your queries are just normal functions in the Reader or State monad. Reader if you only want to read the data, State if you want to read and update the data. So, you don't have to learn any special query language or DSL. And you get all your favorite (pure) haskell functions, static type checking, etc. If you want a relation type storage you can use happstack-ixset. You also get write-ahead logging and checkpointing for recovery, data migration, and more. Additionaly, there is multimater support (though not quite suitable for real deployment yet), and plans for sharding support. It does not currently provide a mechanism for mapping the data to a relational database. In theory it would be possible to write a layer that mapped the current state into a SQL database. That would allow other programs to use SQL to read the values. However, it would not be able to support other apps writing updates to the database. - jeremy

Hey Marc, On 30 jun 2009, at 19:52, Marc Weber wrote:
Is there anyone interested in helping building a library which a) let's you define kind of model of you data b) let's you store you model in any backend (maybe a relational database) c) does static checking of your queries at compilation time?
[...]
Maybe this does already exist and I've missed it?
I've something working that sort of does this. You define your model in the following way: data User = User {name :: String, password :: String, age :: Int, post :: BelongsTo Post} data Post = Post {title :: String, body :: String} Then there's some boilerplate code (that ultimately will be generated by TH), and from that moment on you can do things like this: test = do conn <- connectSqlite3 "example.sqlite3" runDB conn $ do user <- fromJust <$> find typeUser 1 user' <- fillBelongsTo user relPost return (post user') By default, no relations will be fetched, but by doing the fillBelongsTo the user will be updated. I currently have support for new, update and find. All of this code is very alpha, and only works using HDBC and Sqlite3, but still. You can find and fork my code on http://github.com/chriseidhof/ generics. I'll be happy to answer any questions about the code or the ideas behind it. -chris

Chris Eidhof wrote:
I've something working that sort of does this. You define your model in the following way:
data User = User {name :: String, password :: String, age :: Int, post :: BelongsTo Post} data Post = Post {title :: String, body :: String}
Then there's some boilerplate code (that ultimately will be generated by TH), and from that moment on you can do things like this:
test = do conn <- connectSqlite3 "example.sqlite3" runDB conn $ do user <- fromJust <$> find typeUser 1 user' <- fillBelongsTo user relPost return (post user')
By default, no relations will be fetched, but by doing the fillBelongsTo the user will be updated. I currently have support for new, update and find. All of this code is very alpha, and only works using HDBC and Sqlite3, but still.
So in this example, both user and user' are of type User, but if I ask for "post user", this is undefined? I have done something similar as you, except that I filled the related field with an unsafePerformIO fetching the related data from the database. Regards, -- Jochem Berndsen | jochem@functor.nl GPG: 0xE6FABFAB

On 3 jul 2009, at 11:28, Jochem Berndsen wrote:
Chris Eidhof wrote:
I've something working that sort of does this. You define your model in the following way:
data User = User {name :: String, password :: String, age :: Int, post :: BelongsTo Post} data Post = Post {title :: String, body :: String}
Then there's some boilerplate code (that ultimately will be generated by TH), and from that moment on you can do things like this:
test = do conn <- connectSqlite3 "example.sqlite3" runDB conn $ do user <- fromJust <$> find typeUser 1 user' <- fillBelongsTo user relPost return (post user')
By default, no relations will be fetched, but by doing the fillBelongsTo the user will be updated. I currently have support for new, update and find. All of this code is very alpha, and only works using HDBC and Sqlite3, but still.
So in this example, both user and user' are of type User, but if I ask for "post user", this is undefined? I have done something similar as you, except that I filled the related field with an unsafePerformIO fetching the related data from the database.
No, it will never be undefined. The BelongsTo datatype is defined as following: data BelongsTo a = BTNotFetched | BTId Int | BTFetched (Int, a) So either there is no information available (BTNotFetched), we know the id (the foreign key, BTId) or we know the id and the value (BTFetched). It is currently just a proof of concept, but for me, an extend version of this will do. I think that almost every mapping from Haskell datatypes to a RDBMS will be slightly awkward, this is my way to find a balance ;) -chris

And I realize that you are not trying to replace RDBs, just building a
nicer interface to them. I am just concerned that some of the nice properties are lost in the process. I think my main concern comes from seeing people create databases, by automatically generating tables from OO-classes. They invariably ends up with something not nearly as nice, as if they had constructed the database in a more traditional fashion.
for web applications in the Internet, due to security reasons, 99% of the databases are handled exlusively by te web application. This increases the arguments in favor of spending less time in database design. Moreover, since there are no concurrent updates from different applications, (communication with other applications are done trough the middle tier of the web application rather than trough the database), the database just provides transaction coherence (for the single application) and storage. Then it is much faster to perform transactions in the application trough STM and leave the database for storage purposes. At this time the database can be substituted with advantage by files. All of this gives credit to ORM solutions and HappStack or SQLalchemy, Database design and maintenance don't worth the pain in this scenario. My package TCache http://hackage.haskell.org/package/TCache is made also around this philosophy. If the ORM has an interface such is SQLalchemy, it would be nice to have two "drivers" one for pure SQL databases, where all the primitives would be executed in the database, and other pure Haskell where the primitives are executed in memory. For example, the transactions would be executed trough STM. This driver would have configurable persistence (either in files, manualy designed databases or whatever) . Perhaps mixed drivers can be added later. This would unite the best of both worlds. It would be flexible enough to permit the change of scenario without breaking the code. This would be nice for prototyping for example.
participants (9)
-
Alberto G. Corona
-
Chris Eidhof
-
Daniel Peebles
-
Jeremy Shaw
-
Jochem Berndsen
-
Ketil Malde
-
Mads Lindstrøm
-
Marc Weber
-
Rick R