
So, I have a three layer stack: PostgreSQL data base, Haskell back end, PureScript front end. And there is a fault somewhere in this stack: sometimes some items present in the data base do not make it to the front end. A small fortune in programmer hours has been spent searching for the fault but there is no progress. No one can figure out even roughly where the fault is.
Are you running at the standard isolation level? If yes, your code may be affected by nonrepeatable read, phantom read, or serialization anomaly - in a nutshell, this is what happens if you expect to get the same results if the query is run twice, but the rows that affect this result are somehow not included in the transactions (there are edge cases that can do this, the isolation levels progressively exclude them, at the expense of performance loss). More details are on https://www.postgresql.org/docs/current/transaction-iso.html .
Theoretically, we can carefully reset the data base back to the initial state after each run of the property check. Given that only a small amount of data is modified, it would take imperceptibly little time. But it is hard to know what exactly should be reset, and it is easy to get it wrong if done by hand!
Have you tried nested transactions? I have no PSQL experience myself, but it looks like it should do the minimal-effort rollback you're after: Start a transaction, let the unit tests run (including their own transactions, just they're nested now), roll back the overarching initial transaction.
One trick we do use is SQL transactions. We take a single connexion, start a transaction, perform a check and then roll the transaction back. Unfortunately, even this is not completely safe: it looks as though sequences used to draw primary keys from are not reset to their previous values! _(Or maybe we are doing something wrong — I am not really a PostgreSQL guru so I am not too sure of myself.)_
No, this is documented (if surprising) behaviour. But TBH if your code is affected by the values you get from a sequence, that's not what sequences are intended for - they're for being unique, not for their numerical properties. See https://www.postgresql.org/message-id/501B1494.9040502@ringerc.id.au for more details.
But even ignoring this problem _(it is irrelevant to most checks)_,
there is another, more severe problem: transactions guarantee that uncommitted data is only visible inside the same connexion. So, there is no way to request it from the HTTP API. This trick is suitable for fuzzing the data base – back end layer, but nothing beyond that.
Yes, Postgresql gives you no way to read dirty data. This is pretty typical of MVCC databases. But even if there were a way, you'd never get a guarantee that you're actually getting the data that another transaction sees. If you wish to see what another transaction sees, you'll have to instrument the code that does the transaction. This should be doable in the backend.
Another thing I heard being done is to selectively drop and re-create exactly the tables affected by a run of a property check. This seems challenging to automate though. How can I know in a general way if a table has been touched? And then, how can I re-create exactly the affected tables given that the data base initialization code is an opaque blob?
I wonder if this problem is already solved. In any case, any advice is welcome!
I doubt it's a useful way forward. It's just not a use case that database are built for, and you have a high risk of hitting nonfunctional problems (performance, possibly from cold query optimization caches and such) or functional problems (sequencing of dropping/creating DB objects that depend on each other, for example, and you already saw the sequence number issue). Background is that databases are such complex beasts that the developer teams struggle to get the core goals done, nice-to-have goals get dropped pretty quickly. PGSQL is exceptional in that it tries to keep as many nice-to-have goals intact as possible, but I wouldn't count on that. Regards, Jo P.S.: This is concentrating on the database layer, which is unrelated to Haskell. Which might be because your problem is database-related, or because I happen to know more about databases than about Haskell. Or because you described your problems in terms of database issue. If it's really database-related, you may have more concrete advice on a PGQSL-related forum than here.