Re: practicality of typeful programming

Daniil Elovkov wrote:
I've recently asked some questions here about some little type hackery implementing an embedded dsl. But now I wonder if it's worth the effort at all...
Yes it is. Typed embedded DSLs are quite useful and widely used. For example, Lava (a high-level hardware description language) uses phantom types to prevent the designer from building meaningless circuits (e.g., connecting a Bool wire to an Int wire): http://citeseer.ist.psu.edu/69503.html

There are other such hardware design languages which profitably use types (and which ought to be popularized more). Using types can decrease the amount of error checking in the implementation.

I highly recommend the following _very_ good thesis on this topic: Morten Rhiger, Higher-Order Program Generation, http://www.brics.dk/DS/01/4/index.html
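[Editor's note: the phantom-type idea Oleg describes can be sketched in a few lines. The names below (Signal, and2, low, high) are illustrative, not Lava's actual API:]

```haskell
-- Minimal sketch of phantom-typed wires (hypothetical names, not
-- Lava's API). The phantom parameter 'a' exists only in the type:
-- the runtime representation is the same list of bits either way.
newtype Signal a = Signal [Bool]
  deriving (Eq, Show)

low, high :: Signal Bool
low  = Signal [False]
high = Signal [True]

-- Gates accept only Bool-typed wires, so feeding a 'Signal Int'
-- into 'and2' is rejected by the typechecker.
and2 :: Signal Bool -> Signal Bool -> Signal Bool
and2 (Signal xs) (Signal ys) = Signal (zipWith (&&) xs ys)
```

Here `and2 low high` is accepted, while passing a `Signal Int` to `and2` fails to compile, even though both wires are mere bit lists at run time.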

Thank you,
yes, I absolutely didn't question the usefulness of typeful
programming to _some_ degree. What is interesting is where the limits
are. And I have a feeling that they are quite close.
The very idea of how proofs are supplied in typeful Haskell and
dependently typed languages seems to put a serious burden on the
programmer. The Epigram authors stated 'pay as you go' (if I remember
the wording right). That's true, but still (awfully sorry if the
following is rubbish), when I choose to guarantee the sortedness of a
list, I introduce quite a bit of stuff to define the appropriate list
type and have to deal with it from then on, even if I don't care about
that property in other places. Same with typeful Haskell (but not
always, I think).
The fact that structure is mixed with properties seems to put some
limits on both the doability and, even more, the practicality of
encoding complex properties.
Oleg, do I remember it right that in your paper (with Lammel and
Schupke) "Strongly typed heterogeneous collections" you say that the
given approach only works for a statically specified SQL query, I mean
one encoded in the Haskell program, not parsed from an untyped input
string? (I've just flipped through it, but failed to find the
place...) Either in the case when the data schema is statically known,
or, even worse, when it's also dynamic.
Interesting, can all the assertions be proved in that case? Like
correspondence between select field types and resultset record types.

Daniil Elovkov wrote:
The fact that structure is mixed with properties seems to put some limits on both the doability and, even more, the practicality of encoding complex properties.
That's why phantom types, attached via a newtype wrapper, are so appealing. If we remove the wrapper, we get the original value to be used in other parts of the program. Adding the wrapper requires either a dynamic test or a special way of constructing a value. Please see the Non-empty list discussion and the example on Haskell Wiki.
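[Editor's note: Oleg's two routes into the wrapper (a dynamic test, or a special way of constructing) look roughly like this; the names are illustrative, not the wiki's:]

```haskell
-- Sketch of the non-empty-list wrapper (hypothetical names). The
-- newtype adds no runtime cost; keeping its constructor unexported
-- would make the two functions below the only ways in.
newtype NonEmpty a = NonEmpty { toOrdinary :: [a] }
  deriving (Eq, Show)

-- Route 1: a dynamic test at the boundary.
nonEmpty :: [a] -> Maybe (NonEmpty a)
nonEmpty [] = Nothing
nonEmpty xs = Just (NonEmpty xs)

-- Route 2: a construction that cannot produce an empty list.
cons :: a -> [a] -> NonEmpty a
cons x xs = NonEmpty (x:xs)

-- With the invariant established, head is total.
safeHead :: NonEmpty a -> a
safeHead (NonEmpty (x:_)) = x
safeHead _                = error "impossible by construction"
```

Removing the wrapper is just `toOrdinary`, so the rest of the program keeps working with an ordinary list.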
in the paper "Strongly typed heterogeneous collections" you say that the given approach only works for a statically specified SQL query, I mean one encoded in the Haskell program, not parsed from an untyped input string? (I've just flipped through it, but failed to find the place...) Either in the case when the data schema is statically known, or, even worse, when it's also dynamic.
Interesting, can all the assertions be proved in that case? Like correspondence between select field types and resultset record types.
Indeed, the assumption of the HList paper is that the query is embedded in the program (entered by the programmer alongside the code) and the database schema is known. This is quite a common use case.

There are cases however when the database schema is dynamic and so is the query: that is, the query is received in a string or file after (part of) the code has been compiled. Then we need to typecheck that query. The best way of doing this is via Template Haskell. We read the query from the string, make an AST, and then splice it in. The latter operation implicitly invokes the Haskell typechecker. If it is happy, the result can be used as if the query were static to start with.

Frederik Eaton used this approach for strongly-typed linear algebra: http://ofb.net/~frederik/stla/ In his system, the type of a matrix includes its size and dimensions, so invalid operations like improper matrix multiplication can be rejected statically. And yet, his library permits matrices read from files.
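[Editor's note: the "size in the type" idea can be sketched in a few lines of modern GHC Haskell. GADTs and DataKinds postdate Frederik's library, which encodes sizes differently; this is a stand-in, not his API:]

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
import Data.Kind (Type)

-- Type-level naturals and length-indexed vectors.
data Nat = Z | S Nat

data Vec (n :: Nat) (a :: Type) where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Zipping demands equal lengths; mismatched vectors are a type error,
-- analogous to rejecting an improper matrix multiplication.
vzip :: Vec n a -> Vec n b -> Vec n (a, b)
vzip VNil         VNil         = VNil
vzip (VCons x xs) (VCons y ys) = VCons (x, y) (vzip xs ys)

toListV :: Vec n a -> [a]
toListV VNil         = []
toListV (VCons x xs) = x : toListV xs
```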

On Wednesday 27 June 2007 23:28:44 oleg@pobox.com wrote:
In his system, the type of a matrix includes its size and dimensions, so invalid operations like improper matrix multiplication can be rejected statically. And yet, his library permits matrices read from files.
Read from files but still at compile time, correct? titto

2007/6/28, Pasqualino 'Titto' Assini
On Wednesday 27 June 2007 23:28:44 oleg@pobox.com wrote:
In his system, the type of a matrix includes its size and dimensions, so invalid operations like improper matrix multiplication can be rejected statically. And yet, his library permits matrices read from files.
Read from files but still at compile time, correct?
(titto, sorry for the duplicate) No, what is meant, I believe, is reading from a file at run-time and parsing this way:

    (u :: UnTyped) <- readFromFile
    case parse u of
      Nothing           -> -- failed to parse, because the data
                           -- wouldn't satisfy the constraints
      Just (t :: Typed) -> -- if we're here, we have the typed t, and
                           -- there are guarantees that it is well
                           -- formed in terms of our invariants

    parse :: UnTyped -> Maybe Typed

So deciding whether we have correct data is done at run-time, by calling parse and examining its return value.
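[Editor's note: Daniil's parse function can be made concrete with, say, the sortedness property he mentioned earlier in the thread; the names below are illustrative:]

```haskell
-- A concrete instance of the run-time parse sketched above: the
-- untyped side is a plain [Int]; 'Sorted' is the typed side, and
-- its constructor would be kept unexported.
newtype Sorted = Sorted [Int]
  deriving (Eq, Show)

parseSorted :: [Int] -> Maybe Sorted
parseSorted xs
  | and (zipWith (<=) xs (drop 1 xs)) = Just (Sorted xs)
  | otherwise                         = Nothing

-- Code receiving a 'Sorted' may rely on the invariant: the minimum,
-- if any, is simply the first element.
minSorted :: Sorted -> Maybe Int
minSorted (Sorted [])    = Nothing
minSorted (Sorted (x:_)) = Just x
```

The property is checked once, at the boundary; afterwards the type carries the evidence.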

Hi Daniil,

I had a look at the paper and associated code that Oleg refers to; there is no special parsing taking place. From Vector/read-examples.hs:

    v3 = do let m1 = $(dAM [[1,2],[3,4]])
            s <- readFile "Vector/example-data.txt"
            listMatRow (read s) (\(m2::AVector Double a) ->
                print $ m2 *> trans m1 )

It does not make any difference if the list that is used to populate the matrix is specified in the code or read from a file. In both cases, if the list length is incorrect, an error is generated at run-time (I think; I cannot run the actual code).

The TH trickery that Oleg refers to is there to solve a different problem:

<quote>
Note that in each example we print the matrix _inside_ the function argument to the list* functions. We cannot, for instance, just return it, because this causes a universally quantified type to escape:
    listVec_ [1,2,3] (\v -> v)

    <interactive>:1:0:
        Inferred type is less polymorphic than expected
          Quantified type variable `n' escapes
        In the second argument of `listVec', namely `(\ v -> v)'
        In the definition of `it': it = listVec [1, 2, 3] (\ v -> v)
This is why it is not possible to have a function which takes a list and returns a vector of unknown type. The 'fromList' member of the Vector class is only used when we want to turn a list into a vector whose type is known in advance. (see v4 below)
</quote>

So, in order to play around with matrices of unknown type in GHCi, what they do (if I read the code correctly) is to convert the matrix to TH, specifying the exact type, and compiling/splicing it back:

    liftVec :: (GetType e, Lift e, GetType a, Dom a, Vector v e,
                GetType (v a)) => v a -> ExpQ
    liftVec (v::v a) = do
        es <- lift (elems v)
        let at = getType (__::a)
        let et = getType (eltType v)
        let vt = getType (__::(v a))
        return $ (SigE (AppE (VarE $ mkName "fromList")
                             (SigE es (AppT ListT et)))
                       vt)

Crazy haskellers. Is it just me that sometimes thinks with nostalgia of Apple II Basic?

Best,

titto

On Thursday 28 June 2007 15:38:17 Daniil Elovkov wrote:
2007/6/28, Pasqualino 'Titto' Assini
: On Wednesday 27 June 2007 23:28:44 oleg@pobox.com wrote:
In his system, the type of a matrix includes its size and dimensions, so invalid operations like improper matrix multiplication can be rejected statically. And yet, his library permits matrices read from files.
Read from files but still at compile time, correct?
(titto, sorry for the duplicate)
No, what is meant, I believe, is reading from a file at run-time and parsing this way
    (u :: UnTyped) <- readFromFile
    case parse u of
      Nothing           -> -- failed to parse, because the data
                           -- wouldn't satisfy the constraints
      Just (t :: Typed) -> -- if we're here, we have the typed t, and
                           -- there are guarantees that it is well
                           -- formed in terms of our invariants
parse :: UnTyped -> Maybe Typed
so deciding whether we have correct data is done at run-time, by calling parse and examining its return value.
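[Editor's note: the escaping-type-variable issue titto quotes above is exactly why listVec takes a continuation. In modern GHC notation (a sketch, not the paper's encoding), the vector's length type is invented inside withVec, and rank-2 polymorphism stops it from leaking into the result type:]

```haskell
{-# LANGUAGE DataKinds, GADTs, RankNTypes, KindSignatures #-}
import Data.Kind (Type)

data Nat = Z | S Nat

data Vec (n :: Nat) (a :: Type) where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- The continuation must work for every length 'n', so 'n' cannot
-- appear in its result type 'r' -- the restriction the quoted GHC
-- error message enforces.
withVec :: [a] -> (forall n. Vec n a -> r) -> r
withVec []     k = k VNil
withVec (x:xs) k = withVec xs (\v -> k (VCons x v))

vlength :: Vec n a -> Int
vlength VNil         = 0
vlength (VCons _ xs) = 1 + vlength xs
```

Passing `\v -> v` as the continuation would make `n` escape, reproducing the error titto quotes; length-agnostic consumers like `vlength` are fine.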

Hello Titto
2007/6/28, Pasqualino 'Titto' Assini
Hi Daniil,
I had a look at the paper and associated code that Oleg refers to; there is no special parsing taking place.
From Vector/read-examples.hs:
    v3 = do let m1 = $(dAM [[1,2],[3,4]])
            s <- readFile "Vector/example-data.txt"
            listMatRow (read s) (\(m2::AVector Double a) ->
                print $ m2 *> trans m1 )
It does not make any difference if the list that is used to populate the matrix is specified in the code or read from a file.
In both cases, if the list length is incorrect, an error is generated at run-time (I think; I cannot run the actual code).
You're right. Let me explain how I see what's going on there. It looks like, indeed, in both cases (matrix read from a file and matrix given in the program) the error will be reported at run-time. But that is because in both those cases Frederik converts an untyped value to a typed one, via listVec*, listMat*, the untyped values being lists and lists of lists.

Compare that to what we see, for example, in Oleg's (and others') paper "Strongly typed heterogeneous collections", where the value is constructed as typed in the first place. In the case of reading from a file, an error obviously can't be reported at compile-time.

So, the 'parsing' takes place in both of those cases. But, indeed, there seems to be no specific parsing, in the sense of a graceful (UnTyped -> Maybe Typed). I failed to find the place, but I suspect there should be an 'error' call somewhere in a class method which builds the typed thing. I think the typechecker sees the constraint of (*>) and supposes the correct type for that lambda-bound parameter. Having guessed that, it calls the appropriate instance's method. The latter fails to build the right thing and fails with an error. This is only a guess.

Hello Oleg
2007/6/28, oleg@pobox.com
Daniil Elovkov wrote:
The fact that structure is mixed with properties seems to put some limits on both the doability and, even more, the practicality of encoding complex properties.
That's why phantom types, attached via a newtype wrapper, are so appealing. If we remove the wrapper, we get the original value to be used in other parts of the program. Adding the wrapper requires either a dynamic test or a special way of constructing a value. Please see the Non-empty list discussion and the example on Haskell Wiki.
Yes, but phantom types seem to only solve rather simple problems. In HList, every (needed) list function had to be lifted to the type level. No phantom type will let us work with an HList like an ordinary list, with ordinary list functions, right?
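[Editor's note: Daniil's point about lifting functions to the type level shows up even in a minimal heterogeneous list, sketched here in modern GHC notation (the HList paper itself, predating DataKinds, encodes this with type classes): even append needs its own type-level counterpart.]

```haskell
{-# LANGUAGE DataKinds, GADTs, TypeFamilies, TypeOperators,
             KindSignatures #-}
import Data.Kind (Type)

-- A heterogeneous list indexed by the list of its element types.
data HList (ts :: [Type]) where
  HNil  :: HList '[]
  HCons :: t -> HList ts -> HList (t ': ts)

-- Ordinary (++) does not apply: appending needs a type-level
-- function mirroring the value-level recursion.
type family Append (xs :: [Type]) (ys :: [Type]) :: [Type] where
  Append '[]       ys = ys
  Append (x ': xs) ys = x ': Append xs ys

happend :: HList xs -> HList ys -> HList (Append xs ys)
happend HNil         ys = ys
happend (HCons x xs) ys = HCons x (happend xs ys)

-- Head is total: the type index rules out the empty case.
hhead :: HList (t ': ts) -> t
hhead (HCons x _) = x
```

Every further list function one needs (map, filter, lookup, ...) requires the same treatment, which is exactly the burden being discussed.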
in the paper "Strongly typed heterogeneous collections" you say that the given approach only works for a statically specified SQL query, I mean one encoded in the Haskell program, not parsed from an untyped input string? (I've just flipped through it, but failed to find the place...) Either in the case when the data schema is statically known, or, even worse, when it's also dynamic.
Interesting, can all the assertions be proved in that case? Like correspondence between select field types and resultset record types.
Indeed, the assumption of the HList paper is that the query is embedded in a program (entered by the programmer alongside the code) and the database schema is known. This is quite a common use case. There are cases however when the database schema is dynamic and so is the query: that is, the query is received in a string or file after (part of) the code has been compiled. Then we need to typecheck that query. The best way of doing this is via Template Haskell. We read the query from the string, make an AST, and then splice it in. The latter operation implicitly invokes the Haskell typechecker. If it is happy, the result can be used as if the query were static to start with. ... <snip>
Well, it is expected to be _hard_. AnimalSchema is a type defined in the program. I really believe this particular thing is doable. But how acceptable is the price? It takes a lot of effort. It is non-trivial enough to be a good subject for a scientific paper. But how _practical_ is it?
participants (3)
- Daniil Elovkov
- oleg@pobox.com
- Pasqualino 'Titto' Assini