Finally tagless and abstract relational Algebra

Hi, I'm still trying to build an *abstract* Relation Algebra using the finally tagless style. My guess is that finally tagless style allows one to create a syntax without any initial dependency to an implementation. Ie. once one has created the syntax in this style one can then proceed to construct terms. So this is my goal, create a syntax for relational algebra, express an abstract relational algorithm in this syntax and worry about implementation / compilation / evaluation *later*. But at least being able to express a correctly typed term. I presume I will need to employ HList at some point, but I'm not entirely certain where. Will I need it at the very beginning, as constrains in the syntax so that only correct abstract terms can be built, or will I need it in on of the interpreters / compilers later? Günther

Günther Schmidt wrote:
My guess is that finally tagless style allows one to create a syntax without any initial dependency to an implementation. Ie. once one has created the syntax in this style one can then proceed to construct terms. Yes.
So this is my goal, create a syntax for relational algebra, express an abstract relational algorithm in this syntax and worry about implementation / compilation / evaluation *later*. But at least being able to express a correctly typed term. Good plan. But syntax design is hard, whatever style you choose.
I presume I will need to employ HList at some point, but I'm not entirely certain where. Will I need it at the very beginning, as constrains in the syntax so that only correct abstract terms can be built, or will I need it in on of the interpreters / compilers later? You will not need HList for constraining the syntax -- Haskell's type system should already provide all you need to constrain the syntax. In fact, in tagless final style (rather than in initial style), for the lambda calculus you don't even need GADTs to deal with exotic terms!
Why do you think you'll have to use HList? While HList is great, in this particular instance, I don't think you'll need it. Jacques

Hi Jacques, well in short my post is supposed to pretty much lay bare my lack of understanding of the problem I try to solve, with the hope that someone is willing to fill the gaps. I do know that I could express my algorithms via list-comprehension or in a List Monad, all using tuples. And that would be concrete and grossly inefficient. But I also wouldn't be able to express an incorrectly typed term *thanks* to using tuples. So how would it be possible to express selecting /field/ b from /record/ x and field c from record y, creating record z, while making sure that record x does have field b and record y does have field c? I mean design a syntax for it? Günther Am 28.12.09 16:15, schrieb Jacques Carette:
Günther Schmidt wrote:
My guess is that finally tagless style allows one to create a syntax without any initial dependency to an implementation. Ie. once one has created the syntax in this style one can then proceed to construct terms.
Yes.
So this is my goal, create a syntax for relational algebra, express an abstract relational algorithm in this syntax and worry about implementation / compilation / evaluation *later*. But at least being able to express a correctly typed term.
Good plan. But syntax design is hard, whatever style you choose.
I presume I will need to employ HList at some point, but I'm not entirely certain where. Will I need it at the very beginning, as constrains in the syntax so that only correct abstract terms can be built, or will I need it in on of the interpreters / compilers later?
You will not need HList for constraining the syntax -- Haskell's type system should already provide all you need to constrain the syntax. In fact, in tagless final style (rather than in initial style), for the lambda calculus you don't even need GADTs to deal with exotic terms!
Why do you think you'll have to use HList? While HList is great, in this particular instance, I don't think you'll need it.
Jacques

I do know that I could express my algorithms via list-comprehension or in a List Monad, all using tuples. And that would be concrete and grossly inefficient. You should probably tell us what these algorithms accomplish, rather
Günther Schmidt wrote: than how one implementation goes. From a higher-level view of what you're trying to do [but not as high as saying 'implement abstract relational algebra'], it will be easier to give concrete advice.
So how would it be possible to express selecting /field/ b from /record/ x and field c from record y, creating record z, while making sure that record x does have field b and record y does have field c? I mean design a syntax for it? Perhaps you should tell us why you think you need records at all, and record sub-typing to boot. You might well be right, but the higher-level requirements will have a much bigger influence on the design than anything else.
Jacques

Dear Jacques, I'll try to explain by a concrete example and what I'm hoping to achieve. my app imports 4 CSV files, generated from a hospital's IT system and nationally standardized (Germany). Theses 4 files contain records of how much money the hospital received for a patient, when he was initial admitted, what kind of procedures a patient had and when, his transfers and stays from departments within the hospital and his initial admission date. The common key in all these different kind of records is a patient's case-id. From all this information and more I calculate through a very complicated scheme a departments share of the revenue created for a individual patient. Well not even patients as such but rather hospital-case. The scheme is rather complicated and a bit of a moving target, because the auditors in a hospital do have the possibility to redistribute shares from one department to another. For instance revenue shares on surgical procedures may be redistributed from a non-surgical department to a surgical one and so on. Initially I had simply imported the CSV files into empty tables in a database and done the calculations directly in SQL, never ever again! The algorithms for looking up values in one table and matching them with the next, and sometimes creating new values, for instance figuring out in which department a patient had which procedures, those I'd like to express concisely and abstractly. Because I may or may not choose to evaluate the algorithm then to in-memory haskell code, using Maps or what-have-you or to compile to SQL, or whatever. But my 1st goal here is to express the algorithm. I figure if I have to change the abstract algorithm due to new requirements (but not the syntax), the concrete and probably more elaborate implementation follows suit automatically. Günther Am 28.12.09 17:49, schrieb Jacques Carette:
Günther Schmidt wrote:
I do know that I could express my algorithms via list-comprehension or in a List Monad, all using tuples. And that would be concrete and grossly inefficient.
You should probably tell us what these algorithms accomplish, rather than how one implementation goes. From a higher-level view of what you're trying to do [but not as high as saying 'implement abstract relational algebra'], it will be easier to give concrete advice.
So how would it be possible to express selecting /field/ b from /record/ x and field c from record y, creating record z, while making sure that record x does have field b and record y does have field c? I mean design a syntax for it?
Perhaps you should tell us why you think you need records at all, and record sub-typing to boot. You might well be right, but the higher-level requirements will have a much bigger influence on the design than anything else.
Jacques

Günther Schmidt wrote:
Initially I had simply imported the CSV files into empty tables in a database and done the calculations directly in SQL, never ever again!
[snip]
But my 1st goal here is to express the algorithm.
Sounds like you want a better DSL than SQL. You're in massive company. Conal gives a lot of useful advice on DSL design. One way to start is to articulate existing pain. Where and why is SQL painful? Another trick is to work backwards: What kind of code do you really want to write? Whether you employ GADTs, initial datatypes, finally-tagless codata, etc. isn't really relevant at this stage. Prematurely latching on to a particular tool gets everything treated like a nail, even when they're nowhere close. -- View this message in context: http://old.nabble.com/Finally-tagless-and-abstract-relational-Algebra-tp2694... Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.

Hello Kim-Ee, well right now I would even go for an abstract comprehension DSL. I do think there's a big difference between the various DSL techniques, most are designed with a particular evaluation in mind, tagless-style ones are focused on constructing typed terms first, and the evaluation / compilation is somewhat detached from that. Günther Am 29.12.09 14:36, schrieb Kim-Ee Yeoh:
Günther Schmidt wrote:
Initially I had simply imported the CSV files into empty tables in a database and done the calculations directly in SQL, never ever again!
[snip]
But my 1st goal here is to express the algorithm.
Sounds like you want a better DSL than SQL. You're in massive company.
Conal gives a lot of useful advice on DSL design. One way to start is to articulate existing pain. Where and why is SQL painful? Another trick is to work backwards: What kind of code do you really want to write?
Whether you employ GADTs, initial datatypes, finally-tagless codata, etc. isn't really relevant at this stage. Prematurely latching on to a particular tool gets everything treated like a nail, even when they're nowhere close.

On Tue, Dec 29, 2009 at 6:36 AM, Kim-Ee Yeoh
Conal gives a lot of useful advice on DSL design. One way to start is to articulate existing pain. Where and why is SQL painful? Another trick is to work backwards: What kind of code do you really want to write?
A bit of unsolicited opinion: Be careful with the latter tactic. It is a good way to get ideas flowing, but to cling to it is to commit yourself to a possibly inferior solution. The code we "want" to write is that which matches the way we think, and I suspect most people who have learned Haskell (at least me) have experienced their way of thinking being inferior. Haskell is more beautiful and powerful than the best language I could have designed before I learned it. My way is to think hard about what the best way to think about things is. FRP is a beautiful example: reasoning with functions of time. If there is not something immediately obvious, or the obvious thing is too complicated, scour the brains of smart people and wikipedia for mathematical abstractions which capture the same thing you are trying to capture -- mathematicians spend a lot of time thinking about how to think about things. The best API design will come from inventing the perfect abstraction -- if it has already been invented, chances are it has been written approximately as well as it can be (modulo minor details). The best API will teach its users to think better -- but the designer must have already retaught himself to have invented the API in the first place.
Whether you employ GADTs, initial datatypes, finally-tagless codata, etc. isn't really relevant at this stage. Prematurely latching on to a particular tool gets everything treated like a nail, even when they're nowhere close.
Amen! Luke

The code we "want" to write is that which matches the way we think .... [snip] My way is to think hard about what the best way to think about things is. I'm in two minds. On the one hand, we're in violent agreement: The code we /want/ to write is that which matches the way we /want/ to think, genuflecting, as it were, before the cold altar of mathematical perfection. On the other hand, with a view towards AI, I'd want to code just the way I now currently think, warts and all, except that I've designed the DSL to fix all my idiosyncrasies, hidden contexts, annoying ambiguities, utter silliness, etc. You might claim that the former is easier and more achievable. I don't see why. It seems likely that the kind of perfection you seek can only be obtained by piercing insights into the nature of the mind's present drosses (if indeed they be so). If you do reach such lofty heights of re-cognition, why not just program all of that into the DSL so that it corrects for all those mental lapses? Surely no more difficult than retooling the mind permanently. Plus you get to share the wealth: make it easy for others to reach programming nirvana too. (We're already there: it's that tarpit called Perl *smacks forehead*) The completed journey ends with a return to where one first began. Or something like that. mathematicians spend a lot of time thinking about how to think about things. G.C. Rota (RIP): The philosopher's role has always been that of stating facts that may have been on everybody's mind but that no one dared state clearly. -- View this message in context: http://old.nabble.com/Finally-tagless-and-abstract-relational-Algebra-tp2694... Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.

Hi guys, any suggestions on how to go about it then? I'm really still no step further on the DSL for Relational Algebra thingy, and I'd even settle for a "comprehension" DSL. I've spent months now, trying to figure it out by myself, studying HaskellDB, HList and many others. Yeah, I even had a look at LINQ. But there isn't really much around on this particular subject, so I figure that better minds than mine also haven't really solved this particular one yet, cause I'm sure it's been on the agenda. Günther

Günther Schmidt wrote:
Hi guys,
any suggestions on how to go about it then?
I'm really still no step further on the DSL for Relational Algebra thingy, and I'd even settle for a "comprehension" DSL.
I've spent months now, trying to figure it out by myself, studying HaskellDB, HList and many others. Yeah, I even had a look at LINQ.
But there isn't really much around on this particular subject, so I figure that better minds than mine also haven't really solved this particular one yet, cause I'm sure it's been on the agenda.
In my opinion, Dan Leijen pretty much nailed it with HaskellDB. The idea is that database queries aren't really different from working with plain old lists, in particular with functions such as filter and concatMap . See also chapter 5 in his PhD thesis: Dan Leijen. The λ Abroad. http://research.microsoft.decenturl.com/dan-leijen-phd-thesis :: pdf Luke Palmer wrote:
My way is to think hard about what the best way to think about things is. [...] The best API design will come from inventing the perfect abstraction
I agree with Luke. The downside of this approach is of course that it's quite difficult to perform. After all, searching for a perfect abstraction means trying to think in a way you never thought before. It involves a lot of time, experience, simplicity and "taste", in particular for judging when an abstraction doesn't cut it; and there's no guarantee that you'll actually succeed in finding one. That being said, I think that searching for abstractions is always worth it; and there is much to gain by setting the bar a bit lower than the "perfect" abstraction. Now, this doesn't mean to settle with a less elegant abstraction - elegance should never be sacrificed - it means to settle with a less comprehensive one. In Günther's case, I would reduce the scope from finding an abstraction that can solve every problem in relational algebra to one that can solve most of problems that are relevant to Günther. Judging from Günther's account of the problem domain, it even appears to me that list comprehensions solve the problem just fine? Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com

Hi Günther Have you looked at Daan Leijen's PhD thesis? There's a lot more stuff in it, than was covered in the "HaskellDB" paper. Best wishes Stephen

Hi Stephen, no I haven't, I only know of 2 papers on HaskellDB, chapter 5 from "The lambda calculus abroad" and a longer version, "Domain specific embedded compilers", both co-authored with Erik Meijer. Is there another one? Günther Am 29.12.09 22:03, schrieb Stephen Tetley:
Hi Günther
Have you looked at Daan Leijen's PhD thesis? There's a lot more stuff in it, than was covered in the "HaskellDB" paper.
Best wishes
Stephen

Hi Günther
The Lambda Calculus Abroad - is Daan Leijen's PhD (so you do already
know it...).
Best wishes
Stephen
2009/12/29 Günther Schmidt
Hi Stephen,
no I haven't, I only know of 2 papers on HaskellDB, chapter 5 from "The lambda calculus abroad" and a longer version, "Domain specific embedded compilers", both co-authored with Erik Meijer.
Is there another one?
Günther
participants (6)
-
Günther Schmidt
-
Heinrich Apfelmus
-
Jacques Carette
-
Kim-Ee Yeoh
-
Luke Palmer
-
Stephen Tetley