SQL join feature in Persistent

Hi all,
I'm new to this mailing list. I previously did web development in Python with the Django framework, but I really like the Haskell language, so I am considering building my next site (an LBS-SN-like service) in Haskell. I tried Snap for a while and think it's not yet so mature or powerful. After working through the Yesod docs in one day and reading some of the Yesod and haskellers.com code over the following few days, I appreciate it and am trying to use it seriously.
Now I need a database query across two tables, that is, a SQL join. Currently, Persistent doesn't provide this functionality. I think that adding a function to the PersistBackend class and implementing it in the PersistBackend (SqlPersist m) instance may solve the problem. However, I am not sure. Would it be difficult to implement this feature in a consistent way?
Thanks for any help.
Sincerely,
--
James Deng
department of computer science, school of information science & technology, Sun-yat-sen University, Guangzhou, China
mailTo: cnJamesDeng@gmail.com
homepage: http://cnjdeng.appspot.com

Hi James,
It's funny you mention this now. I recently got an email from Антон
Чешков about this very topic. Hopefully he can share the code he sent
me with the list. The one improvement that we're planning on making is
to write it using enumerators instead of lists so that the memory
requirements are lower.
For now, however, you might have noticed that the Haskellers code does
a number of "pseudo-joins" all over the place, which currently works
well enough. Антон's suggested code basically automates a lot of this
process.
Michael
On Tue, Feb 22, 2011 at 11:25 AM, James
_______________________________________________ web-devel mailing list web-devel@haskell.org http://www.haskell.org/mailman/listinfo/web-devel
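The "pseudo-join" idea from Michael's reply can be sketched roughly as follows. This is not the Haskellers code or Антон's code; the types `UserId`, `User`, `Post` and the function `joinPostsWithAuthors` are hypothetical stand-ins, and plain lists take the place of the results of two separate Persistent queries.

```haskell
import qualified Data.Map.Strict as Map
import Data.Maybe (mapMaybe)

-- Hypothetical stand-ins for two Persistent entities and a key type.
newtype UserId = UserId Int deriving (Eq, Ord, Show)

data User = User { userName :: String } deriving (Eq, Show)

data Post = Post { postAuthor :: UserId, postTitle :: String } deriving (Eq, Show)

-- Join posts to their authors in application code: index the users by key
-- once, then look up each post's author. Posts whose author is missing are
-- dropped, which mirrors an inner join.
joinPostsWithAuthors :: [(UserId, User)] -> [Post] -> [(Post, User)]
joinPostsWithAuthors users = mapMaybe attach
  where
    userIndex = Map.fromList users
    attach post = (,) post <$> Map.lookup (postAuthor post) userIndex
```

Both inputs are held in memory as lists here, which is the memory cost that a streaming (enumerator-based) version would aim to reduce.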

I strongly disagree.
Because Persistent doesn't do joins, it happens to be a good candidate for a "web-scale" horizontally sharded solution, which I'll get to in a minute.
It's perfectly easy to perform joins on the app server in Haskell code rather than relying on the database to do them. Facebook does a lot of this. We do this. Note that it's much easier to replicate app servers than DB servers.
About making Persistent web-scale: there are two features whose addition would basically accomplish this. Michael, I am thinking of offering a small bounty if you want to put them together.
One is sharding. In the mkPersist function, if you could add a partition function which is something like: PersistEntity e => e -> SqlConnection and then alter the PersistBackend instance to respect that function, you get sharding almost for free. It's a fair bit more complicated than I am describing here, but I believe it's very possible. Joins would make this inordinately more complicated.
The other is integrated memcached. It would be great if you could annotate tables to be stored in memcached. When you set up your connection, you would optionally include a memcached connection. Then Persistent would check on each select, get, and getBy call whether the value is in memcached before going to the DB.
I'm not very familiar with Persistent internals, so I could be wrong, but I think these shouldn't be too hard to add.
</rant>
Max
On Feb 22, 2011, at 5:37 PM, Michael Snoyman wrote:
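The partition function Max describes (`PersistEntity e => e -> SqlConnection`) might look something like the sketch below. Everything here is an assumption for illustration: `Connection` stands in for a real connection handle, the `Sharded` class plays the role of the `PersistEntity` constraint, and `shardFor` takes the list of shards as an explicit argument. None of these names come from Persistent.

```haskell
-- Hypothetical stand-in for a database connection handle.
newtype Connection = Connection String deriving (Eq, Show)

-- Hypothetical class playing the role of the PersistEntity constraint:
-- each entity exposes an integer key to shard on.
class Sharded e where
  shardKey :: e -> Int

data User = User { userId :: Int, userName :: String }

instance Sharded User where
  shardKey = userId

-- The partition function: map an entity to the connection of its shard.
-- Returns Nothing when no shards are configured. `mod` with a positive
-- divisor is never negative, so the index is always in range.
shardFor :: Sharded e => [Connection] -> e -> Maybe Connection
shardFor [] _ = Nothing
shardFor shards e = Just (shards !! (shardKey e `mod` length shards))
```

With a scheme like this, two entities that should be joined can land on different connections, so the join cannot simply be delegated to one database. That is one way to read the remark that joins would complicate sharding.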
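The memcached idea, checking the cache on each get before going to the DB, is essentially a read-through cache. A minimal sketch follows, assuming an in-memory `IORef` map as a stand-in for a memcached connection; `Cache`, `newCache` and `cachedGet` are hypothetical names, not part of Persistent or any memcached client.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import qualified Data.Map.Strict as Map

-- Hypothetical in-memory stand-in for a memcached connection.
type Cache k v = IORef (Map.Map k v)

newCache :: IO (Cache k v)
newCache = newIORef Map.empty

-- Read-through lookup: consult the cache first and only run the database
-- action on a miss, storing whatever the database returns for next time.
cachedGet :: Ord k => Cache k v -> (k -> IO (Maybe v)) -> k -> IO (Maybe v)
cachedGet cache fetchFromDb key = do
  hit <- Map.lookup key <$> readIORef cache
  case hit of
    Just value -> pure (Just value)
    Nothing -> do
      result <- fetchFromDb key
      case result of
        Just value -> modifyIORef' cache (Map.insert key value)
        Nothing -> pure ()
      pure result
```

This sketch leaves out invalidation: an update or delete would also have to refresh or drop the cached entry, which a real integration would need to handle.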

We all want to achieve the same goal: doing the data join in the application layer.
From my point of view, the main problem is finding a join algorithm that scales effectively. For example, I have two datasets, "A" and "B". I want to join them by some rule and take only the first "N" results. In this case I do not want to process both datasets in their entirety; I would like to do as little work as possible.
I think such algorithms exist.
2011/2/22 Max Cantor
-- Best regards, Cheshkov Anton Phone: +7 909 005 18 82 Phone: +7 931 511 47 37 Skype: cheshkov_anton
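One algorithm of the kind Антон has in mind is a lazy merge join. The sketch below is an illustration under stated assumptions, not code from this thread: both inputs are assumed to arrive sorted by key in strictly ascending order (no duplicate keys), and `mergeJoin` and `firstN` are hypothetical names.

```haskell
-- Lazily merge-join two inputs sorted by key in strictly ascending order.
-- Each step inspects only the heads of the two lists, so a consumer that
-- stops early never forces the rest of either input.
mergeJoin :: Ord k => [(k, a)] -> [(k, b)] -> [(k, a, b)]
mergeJoin xs@((kx, a) : xs') ys@((ky, b) : ys') =
  case compare kx ky of
    LT -> mergeJoin xs' ys              -- left key has no match yet, skip it
    GT -> mergeJoin xs ys'              -- right key has no match yet, skip it
    EQ -> (kx, a, b) : mergeJoin xs' ys'
mergeJoin _ _ = []

-- Take only the first n joined rows.
firstN :: Ord k => Int -> [(k, a)] -> [(k, b)] -> [(k, a, b)]
firstN n as bs = take n (mergeJoin as bs)
```

Because the result list is produced lazily, `firstN` reads each input only as far as needed to find the first n matches. It even terminates on infinite inputs, provided n matches exist.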

Agreed. As long as it's done at the app level, it doesn't affect scaling.
Max
On Feb 22, 2011, at 7:19 PM, Антон Чешков wrote:
participants (4)
- James
- Max Cantor
- Michael Snoyman
- Антон Чешков