What is the state of the art in testing code generation?


Tom Ellis

11 Jul 2014

5:58 p.m.

I am implementing an EDSL that compiles to SQL and I am wondering what is the state of the art in testing code generation.

All the Haskell libraries I could find that deal with SQL generation are tested by implementing multiple one-off adhoc queries and checking that when either compiled to SQL or run against a database they give the expected, prespecified result.

* https://github.com/prowdsponsor/esqueleto/blob/master/test/Test.hs
* https://github.com/m4dc4p/haskelldb/blob/master/test/TestCases.hs
* https://github.com/yesodweb/persistent/blob/master/persistent-test/SumTypeTe...

I couldn't find any tests for groundhog.

* https://github.com/lykahb/groundhog

I also had a look at Javascript generators. They take a similar adhoc, one-off approach.

* https://github.com/valderman/haste-compiler/tree/master/Tests
* https://github.com/faylang/fay/tree/master/tests

Is this the best we can do in Haskell? Certainly it seems hard to use a QuickCheck/SmallCheck approach for this purpose. Is there any way this kind of testing can be automated or made more robust?

Thanks,

Tom


Alexander Solla

11 Jul

8:03 p.m.

On Fri, Jul 11, 2014 at 10:58 AM, Tom Ellis <tom-lists-haskell-cafe-2013@jaguarpaw.co.uk> wrote:

...

I am implementing an EDSL that compiles to SQL and I am wondering what is the state of the art in testing code generation.

All the Haskell libraries I could find that deal with SQL generation are tested by implementing multiple one-off adhoc queries and checking that when either compiled to SQL or run against a database they give the expected, prespecified result.

...

Is this the best we can do in Haskell? Certainly it seems hard to use a QuickCheck/SmallCheck approach for this purpose. Is there any way this kind of testing can be automated or made more robust?

Personally, I would test this in the same way I'd test a compiler: as purely as possible. You have an EDSL, and possibly an AST for it, and finally a target language. Figure out if any one of the layers is particularly "shallow" (and therefore "easy" to validate by inspection). Use the shallow layer to validate the other two.

The trouble with this approach is that you'll need to find a way to "interpret" raw SQL statements, since different statements can be equivalent modulo ordering of fields, subqueries, conditions, etc. So, as an architectural point, I would make the AST -> SQL layer the "easy" one to validate. Then, you can check that your EDSL -> AST layer produces the expected trees. You can even use QuickCheck for this validation.

Otherwise, you will have to do it impurely, like the other packages do.
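A minimal sketch of the layering described above, under stated assumptions: the `Query` and `Cond` types, `render`, `normalize` and `equivalent` are all hypothetical names invented for illustration, not part of any library mentioned in this thread. The AST -> SQL layer is kept small enough to validate by reading, and equivalence "modulo ordering of conditions" is decided on the tree rather than on SQL text.

```haskell
import Data.List (intercalate, sort)

-- Hypothetical miniature query AST; a real EDSL's AST would be richer.
data Cond = ColEq String String          -- column = string literal
  deriving (Eq, Ord, Show)

data Query = Query
  { qColumns :: [String]
  , qTable   :: String
  , qWhere   :: [Cond]                   -- conditions joined with AND
  } deriving (Eq, Show)

-- Deliberately shallow AST -> SQL layer, small enough to validate by reading.
-- No escaping of literals is attempted in this sketch.
render :: Query -> String
render (Query cols tbl conds) =
  "SELECT " ++ intercalate ", " cols ++ " FROM " ++ tbl ++ whereClause
  where
    whereClause
      | null conds = ""
      | otherwise  = " WHERE " ++ intercalate " AND " (map renderCond conds)
    renderCond (ColEq col lit) = col ++ " = '" ++ lit ++ "'"

-- AND is commutative, so sort the conditions into a canonical order.
normalize :: Query -> Query
normalize q = q { qWhere = sort (qWhere q) }

-- Compare trees up to condition order instead of comparing SQL strings.
equivalent :: Query -> Query -> Bool
equivalent a b = normalize a == normalize b
```

With something like this in place, tests of the EDSL -> AST layer can assert `equivalent expected actual` and never need to parse SQL. Only the ordering of conditions is normalized here; column order is left alone because it changes the shape of the result.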

João Cristóvão

8:33 p.m.

I'm not sure if this is related and/or applicable, but you didn't seem to mention:

https://hackage.haskell.org/package/hssqlppp

It's a SQL parser/checker.

Cheers,
João

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Tom Ellis

14 Jul

12:59 p.m.


On Fri, Jul 11, 2014 at 09:33:42PM +0100, João Cristóvão wrote:

...

I'm not sure if this is related and/or applicable, but you didn't seem to mention:

https://hackage.haskell.org/package/hssqlppp

It's a SQL parser/checker.

It seems to take the same ad hoc approach:

* https://github.com/JakeWheat/hssqlppp/tree/master/hssqlppp/tests/Database/Hs...
* https://github.com/JakeWheat/hssqlppp/tree/master/hssqlppp/tests/Database/Hs...

Tom

Tom Ellis

1:11 p.m.


On Fri, Jul 11, 2014 at 01:03:44PM -0700, Alexander Solla wrote:

...

On Fri, Jul 11, 2014 at 10:58 AM, Tom Ellis <tom-lists-haskell-cafe-2013@jaguarpaw.co.uk> wrote:

...
I am implementing an EDSL that compiles to SQL and I am wondering what is the state of the art in testing code generation.

All the Haskell libraries I could find that deal with SQL generation are tested by implementing multiple one-off adhoc queries and checking that when either compiled to SQL or run against a database they give the expected, prespecified result. [..]

Is this the best we can do in Haskell? Certainly it seems hard to use a QuickCheck/SmallCheck approach for this purpose. Is there any way this kind of testing can be automated or made more robust?

Personally, I would test this in the same way I'd test a compiler: as purely as possible. You have an EDSL, and possibly an AST for it, and finally a target language. Figure out if any one of the layers is particularly "shallow" (and therefore "easy" to validate by inspection). Use the shallow layer to validate the other two.

The trouble with this approach is that you'll need to find a way to "interpret" raw SQL statements, since different statements can be equivalent modulo ordering of fields, subqueries, conditions, etc. So, as an architectural point, I would make the AST -> SQL layer the "easy" one to validate. Then, you can check that your EDSL -> AST layer produces the expected trees. You can even use QuickCheck for this validation.

Right, this EDSL -> AST layer is exactly what I don't know how to test. How would you go about doing that in a non-trivial way? One complication is that the EDSL is typed, making it harder to generate terms with QuickCheck as far as I can tell.

Tom
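One common way around the typing complication is to write one generator per result type, so every generated term is well typed by construction. The sketch below assumes a hypothetical GADT `Expr` standing in for the typed EDSL; the constructors, `genInt`, `genBool`, `toSql` and `sampleSql` are illustrative names only. It uses QuickCheck's `Gen`, `choose`, `oneof`, `sized`, `arbitrary` and `generate`.

```haskell
{-# LANGUAGE GADTs #-}
import Test.QuickCheck (Gen, arbitrary, choose, generate, oneof, sized)

-- Hypothetical typed expression language; the index is the SQL result type.
data Expr a where
  IntLit  :: Int -> Expr Int
  BoolLit :: Bool -> Expr Bool
  Add     :: Expr Int -> Expr Int -> Expr Int
  Less    :: Expr Int -> Expr Int -> Expr Bool
  And     :: Expr Bool -> Expr Bool -> Expr Bool

-- Generator for integer-typed terms; the size bound guarantees termination.
genInt :: Int -> Gen (Expr Int)
genInt n
  | n <= 0    = IntLit <$> choose (0, 100)
  | otherwise = oneof
      [ IntLit <$> choose (0, 100)
      , Add <$> genInt half <*> genInt half
      ]
  where half = n `div` 2

-- Generator for boolean-typed terms; it calls genInt where an Int is needed.
genBool :: Int -> Gen (Expr Bool)
genBool n
  | n <= 0    = BoolLit <$> arbitrary
  | otherwise = oneof
      [ BoolLit <$> arbitrary
      , Less <$> genInt half <*> genInt half
      , And  <$> genBool half <*> genBool half
      ]
  where half = n `div` 2

-- Fully parenthesised rendering, so operator precedence never matters.
toSql :: Expr a -> String
toSql (IntLit i)  = show i
toSql (BoolLit b) = if b then "TRUE" else "FALSE"
toSql (Add x y)   = "(" ++ toSql x ++ " + " ++ toSql y ++ ")"
toSql (Less x y)  = "(" ++ toSql x ++ " < " ++ toSql y ++ ")"
toSql (And x y)   = "(" ++ toSql x ++ " AND " ++ toSql y ++ ")"

-- Draw one random well-typed boolean expression and render it.
sampleSql :: IO String
sampleSql = toSql <$> generate (sized genBool)
```

This scales linearly with the number of result types, which may or may not be acceptable for a large EDSL. Generating terms of a type chosen at random would additionally need an existential wrapper, which is left out of this sketch.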

Justin Bailey

11 Jul

11:52 p.m.

Do you want to share your library yet? Sounds pretty cool.

Tom Ellis

14 Jul

1:15 p.m.


On Fri, Jul 11, 2014 at 04:52:22PM -0700, Justin Bailey wrote:

...

Do you want to share your library yet? Sounds pretty cool.

It is pretty cool, but needs thorough testing before release :)


Jake Wheat

12 Jul

8:08 a.m.

...

Is this the best we can do in Haskell? Certainly it seems hard to use a QuickCheck/SmallCheck approach for this purpose. Is there any way this kind of testing can be automated or made more robust?

You could make a simple AST which can be used to generate SQL query text directly, and also the Haskell source code for your DSL. Then you can compare the results of the two code paths 'AST -> concrete SQL' and 'AST -> generate Haskell -> run the Haskell -> concrete SQL'. If you can get this working then this could make it easier to use QuickCheck, etc.
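A rough sketch of the shape of this two-path comparison, with one simplification that should be stated plainly: instead of generating Haskell source and running it, the second path interprets the simple AST straight into EDSL combinators. Everything here is hypothetical (`Spec`, the stand-in EDSL `Q`, `table`, `restrictEq`, `compile`); only `Arbitrary`, `elements`, `listOf` and `quickCheck` are real QuickCheck names.

```haskell
import Data.List (intercalate)
import Test.QuickCheck (Arbitrary (..), elements, listOf, quickCheck)

-- Hypothetical first-order description of a query (the "simple AST").
data Spec = Spec
  { specTable   :: String
  , specFilters :: [(String, Int)]       -- column = integer, joined with AND
  } deriving (Eq, Show)

instance Arbitrary Spec where
  arbitrary =
    Spec <$> elements ["person", "address"]
         <*> listOf ((,) <$> elements ["id", "age"] <*> elements [0 .. 9])

-- Stand-in for the EDSL under test; a real one would be typed and far larger.
data Q = Q String [String]

table :: String -> Q
table t = Q t []

restrictEq :: String -> Int -> Q -> Q
restrictEq col n (Q t cs) = Q t (cs ++ [col ++ " = " ++ show n])

compile :: Q -> String
compile (Q t cs)
  | null cs   = "SELECT * FROM " ++ t
  | otherwise = "SELECT * FROM " ++ t ++ " WHERE " ++ intercalate " AND " cs

-- Path 1: simple AST -> SQL text directly.
direct :: Spec -> String
direct (Spec t fs) = "SELECT * FROM " ++ t ++ whereClause
  where
    whereClause
      | null fs   = ""
      | otherwise =
          " WHERE " ++ intercalate " AND " [c ++ " = " ++ show n | (c, n) <- fs]

-- Path 2: simple AST -> EDSL term -> SQL text.
viaEdsl :: Spec -> String
viaEdsl (Spec t fs) =
  compile (foldl (\q (c, n) -> restrictEq c n q) (table t) fs)

-- The differential property: both paths must produce the same SQL.
prop_pathsAgree :: Spec -> Bool
prop_pathsAgree s = direct s == viaEdsl s

checkPaths :: IO ()
checkPaths = quickCheck prop_pathsAgree
```

In this toy the two paths are nearly the same code, so agreement is unsurprising. The technique only earns its keep when `direct` is much simpler than the EDSL's own compiler, and comparing SQL text directly assumes both paths emit conditions in the same order.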

Tom Ellis

14 Jul

1:20 p.m.


On Sat, Jul 12, 2014 at 11:08:51AM +0300, Jake Wheat wrote:

...

...
Is this the best we can do in Haskell? Certainly it seems hard to use a QuickCheck/SmallCheck approach for this purpose. Is there any way this kind of testing can be automated or made more robust?

You could make a simple AST which can be used to generate SQL query text directly, and also the Haskell source code for your DSL. Then you can compare the results of the two code paths 'AST -> concrete SQL' and 'AST -> generate Haskell -> run the Haskell -> concrete SQL'. If you can get this working then this could make it easier to use QuickCheck, etc.

Thanks Jake, that's an interesting idea, though I fear that any AST powerful enough to compile to both SQL and my EDSL would be no less complicated than my DSL in the first place.

Tom

Vo Minh Thu

9:26 a.m.

Hi Tom,

I think there is still some opportunity for something like QuickCheck in your case. Certainly you can use QC to generate expressions/statements in your EDSL. Then if you can also generate schemas and data you should be able to write down some properties, for instance that some class of queries on empty tables should return no rows.

A special case of this, and a more specific example, is very similar to the introductory examples to QC: it must be possible to retrieve a row after inserting it in an empty table, or deleting it after inserting it must leave the table unchanged.

Even if you don't go as far as writing properties that involve the schema/data generation, generating arbitrary valid ASTs that must compile successfully to SQL is interesting.

HTH,
Thu
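The properties suggested in this message could be written down roughly as follows. This is only a sketch: a pure in-memory `Table` stands in for a real database so the example is self-contained, and `insertRow`, `deleteKey` and `selectKey` are hypothetical names. In a real test suite those three operations would instead run SQL produced by the EDSL against a database.

```haskell
import Test.QuickCheck (quickCheck)

-- Pure stand-in for a one-table database: rows are (key, payload) pairs.
type Row = (Int, String)

newtype Table = Table [Row]
  deriving (Eq, Show)

emptyTable :: Table
emptyTable = Table []

insertRow :: Row -> Table -> Table
insertRow r (Table rs) = Table (r : rs)

deleteKey :: Int -> Table -> Table
deleteKey k (Table rs) = Table (filter ((/= k) . fst) rs)

selectKey :: Int -> Table -> [Row]
selectKey k (Table rs) = filter ((== k) . fst) rs

-- A query on an empty table returns no rows.
prop_emptySelect :: Int -> Bool
prop_emptySelect k = null (selectKey k emptyTable)

-- A row inserted into an empty table can be retrieved again.
prop_insertThenSelect :: Row -> Bool
prop_insertThenSelect r@(k, _) = selectKey k (insertRow r emptyTable) == [r]

-- Deleting a row right after inserting it leaves the table unchanged.
prop_insertThenDelete :: Row -> Bool
prop_insertThenDelete r@(k, _) =
  deleteKey k (insertRow r emptyTable) == emptyTable

checkAll :: IO ()
checkAll = do
  quickCheck prop_emptySelect
  quickCheck prop_insertThenSelect
  quickCheck prop_insertThenDelete
```

Against a real database the properties would live in `IO` (for example via QuickCheck's monadic support) and would need a clean table per test case.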

Tom Ellis

1:26 p.m.


The first impediment to using QuickCheck is that my EDSL is typed. I don't know how to generate random terms in the presence of types.

The second impediment is that, although I can indeed test trivial properties, I don't know how to test functionality to its full level of sophistication. How can I test a left join, for example, without reimplementing left join functionality within Haskell? (Indeed I may end up doing this, but would prefer to find an easier path ...)

Tom

On Mon, Jul 14, 2014 at 11:26:14AM +0200, Vo Minh Thu wrote:

...

I think there is still some opportunity for something like QuickCheck in your case. Certainly you can use QC to generate expressions/statements in your EDSL. Then if you can also generate schemas and data you should be able to write down some properties, for instance that some class of queries on empty tables should return no rows.

A special case of this, and a more specific example, is very similar to the introductory examples to QC: it must be possible to retrieve a row after inserting it in an empty table, or deleting it after inserting it must leave the table unchanged.

Even if you don't go as far as writing properties that involve the schema/data generation, generating arbitrary valid ASTs that must compile successfully to SQL is interesting.
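On the left-join question raised at the top of this message: if the fallback of reimplementing the join in Haskell is taken, the reference model can at least be small. The sketch below is one possible oracle over plain lists; `leftJoin`, `sameRows` and `agreesWithModel` are hypothetical names, and `runJoin` stands for whatever function would execute the EDSL-generated SQL and return its rows.

```haskell
import Data.List (sort)

-- Reference left join over lists: every left row appears at least once,
-- paired with Nothing when no right row matches.
leftJoin :: (a -> b -> Bool) -> [a] -> [b] -> [(a, Maybe b)]
leftJoin matches xs ys = concatMap joinOne xs
  where
    joinOne x = case filter (matches x) ys of
      []   -> [(x, Nothing)]
      hits -> [(x, Just y) | y <- hits]

-- SQL leaves row order unspecified without ORDER BY, so compare as multisets.
sameRows :: Ord r => [r] -> [r] -> Bool
sameRows actual expected = sort actual == sort expected

-- Oracle check: runJoin is a hypothetical stand-in for executing the
-- generated SQL on the given table contents.
agreesWithModel
  :: (Ord a, Ord b)
  => ([a] -> [b] -> [(a, Maybe b)])
  -> (a -> b -> Bool) -> [a] -> [b] -> Bool
agreesWithModel runJoin matches xs ys =
  sameRows (runJoin xs ys) (leftJoin matches xs ys)
```

The model ignores SQL's three-valued logic: a join condition involving NULL would need `matches` to return False for NULL comparisons, which this sketch leaves to the caller.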


