On Fri, Jul 11, 2014 at 10:58 AM, Tom Ellis <tom-lists-haskell-cafe-2013@jaguarpaw.co.uk> wrote:

> I am implementing an EDSL that compiles to SQL and I am wondering what is
> the state of the art in testing code generation.
>
> All the Haskell libraries I could find that deal with SQL generation are
> tested by implementing multiple one-off adhoc queries and checking that when
> either compiled to SQL or run against a database they give the expected,
> prespecified result.
>
> Is this the best we can do in Haskell? Certainly it seems hard to use a
> QuickCheck/SmallCheck approach for this purpose. Is there any way this kind
> of testing can be automated or made more robust?

Personally, I would test this in the same way I'd test a compiler: as purely as possible. You have an EDSL, and possibly an AST for it, and finally a target language. Figure out if any one of the layers is particularly "shallow" (and therefore "easy" to validate by inspection). Use the shallow layer to validate the other two.

The trouble with this approach is that you'll need to find a way to "interpret" raw SQL statements, since different statements can be equivalent modulo ordering of fields, subqueries, conditions, etc. So, as an architectural point, I would make the AST -> SQL layer the "easy" one to validate: keep it a thin, direct rendering of the tree that can be checked by inspection. Then you can check that your EDSL -> AST layer produces the expected trees, comparing trees rather than SQL strings. You can even use QuickCheck for this validation.
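
As a rough sketch of that layering (every name below, including Expr, Query, from, restrict and renderQuery, is made up for illustration and not taken from any existing library): the renderer is a direct, structural walk over a tiny AST, so it can be validated by reading it, and the property about the EDSL combinators is stated on trees rather than on SQL text. The sketch uses only base and checks the property over a small range by hand; assuming QuickCheck is installed, the same function could be passed to quickCheck unchanged, since it has type Int -> Int -> Bool.

```haskell
import Data.List (intercalate)

-- Hypothetical, deliberately tiny AST for the target language.
data Expr
  = Col String
  | IntLit Int
  | Gt Expr Expr
  | And Expr Expr
  deriving (Eq, Show)

data Query = Select
  { columns   :: [String]
  , table     :: String
  , condition :: Maybe Expr
  } deriving (Eq, Show)

-- The "shallow" layer: AST -> SQL. One equation per constructor,
-- no rewriting, so it is easy to validate by inspection.
renderExpr :: Expr -> String
renderExpr (Col c)    = c
renderExpr (IntLit n) = show n
renderExpr (Gt a b)   = "(" ++ renderExpr a ++ " > " ++ renderExpr b ++ ")"
renderExpr (And a b)  = "(" ++ renderExpr a ++ " AND " ++ renderExpr b ++ ")"

renderQuery :: Query -> String
renderQuery (Select cs t w) =
  "SELECT " ++ intercalate ", " cs ++ " FROM " ++ t
    ++ maybe "" (\e -> " WHERE " ++ renderExpr e) w

-- The EDSL layer: combinators that build the AST.
from :: String -> [String] -> Query
from t cs = Select cs t Nothing

-- Adding a restriction conjoins it with any existing condition.
restrict :: Expr -> Query -> Query
restrict e q = q { condition = Just (maybe e (\old -> And old e) (condition q)) }

-- Property on trees, not on SQL strings: two successive restrictions
-- produce a single And node with the conditions in application order.
prop_restrictComposes :: Int -> Int -> Bool
prop_restrictComposes m n =
  restrict (ageOver m) (restrict (ageOver n) base)
    == Select ["name"] "person" (Just (And (ageOver n) (ageOver m)))
  where
    base      = from "person" ["name"]
    ageOver k = Gt (Col "age") (IntLit k)

main :: IO ()
main = do
  -- Inspect the shallow layer on one example.
  putStrLn (renderQuery (restrict (Gt (Col "age") (IntLit 18))
                                  (from "person" ["name", "age"])))
  -- Poor man's property check over a small range of inputs.
  print (and [prop_restrictComposes m n | m <- [-3 .. 3], n <- [-3 .. 3]])
```

The design choice here is that nothing in the property ever parses or compares SQL text: equivalence "modulo ordering" questions only arise in the renderer, which is kept simple enough to trust by reading it.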

Otherwise, you will have to do it impurely, like the other packages do.