ANNOUNCE: hdbi-1.0.0 and hdbi-postgresql-1.0.0

Hello, haskellers. This is the first release of HDBI (Haskell Database Independent interface). It is a fork of HDBC. HDBI has some improvements in design, and it also has better testing and performance (mainly from using Text instead of String everywhere). HDBI is designed to be a more flexible and correct database interface. You can find more information in the documentation:

http://hackage.haskell.org/package/hdbi
http://hackage.haskell.org/package/hdbi-postgresql

I am planning to implement MySQL and SQLite drivers as well.

https://github.com/s9gf4ult/hdbi
https://github.com/s9gf4ult/hdbi-postgresql

Issues and pull requests are welcome, as well as collaboration.

On 2013-07-31 05:45, Alexey Uimanov wrote:
Hello, haskellers. This is the first release of HDBI (Haskell Database Independent interface). It is a fork of HDBC. HDBI has some improvements in design, and it also has better testing and performance (mainly from using Text instead of String everywhere). HDBI is designed to be a more flexible and correct database interface.
You can find more information in the documentation. http://hackage.haskell.org/package/hdbi http://hackage.haskell.org/package/hdbi-postgresql
I am planning to implement MySQL and SQLite drivers as well.
https://github.com/s9gf4ult/hdbi https://github.com/s9gf4ult/hdbi-postgresql Issues and pull requests are welcome, as well as collaboration.
Looks interesting. Just a couple of comments from skimming the documentation. Apologies if these are already addressed -- I didn't see any mention of it in my read-through.

Regarding parameterized SQL: It might be worth using named parameters (e.g. ":foo" and ":bar" or something like that) rather than "?" as placeholders in SQL/prepared SQL. This will make it slightly more flexible if you need to provide different SQL strings for different databases, but want to reuse the code which does the actual running of the SQL. It's also more flexible if you need to repeat parameters -- the latter is typical with PostgreSQL if you want to emulate "update-or-insert" in a single SQL statement.

Regarding migrations: If you haven't already, please have a look at Liquibase (http://www.liquibase.org/documentation/index.html) before attempting to implement migrations. The most important attributes of Liquibase are:

a) Migrations are *manually* specified deltas as opposed to deltas automatically derived from two snapshots of the database. Anything based on automatically getting from snapshot A to snapshot B *will* break or do undesirable/unpredictable things to data at some point.

b) It deliberately does NOT provide migrations in a Turing-complete language -- IME, if you need TC migrations on a routine basis you're already in deep trouble and need to think more about what you're doing (at a "process" or "engineering" level). If you *really* need to, you *can* extend it to do custom migrations via code, but the barrier to entry is sufficiently high that you'll rarely be tempted unless you *really* need it and the migration step is likely to be reusable.

c) The concept of "contexts" (see the documentation), which allows you to vary the migrations across different environments as needed. (This is definitely open to abuse, but when you need it you *really* need it, IME.)

d) It can also generate full SQL for "changes-about-to-be-applied" so that you can audit and/or apply them manually -- some devops teams need this, others may not care. A minor but important detail is that the generated SQL includes all the metadata for tracking the application of the migrations themselves.

Liquibase is the only system *I've* ever seen that is even close to doing migrations the Right Way According To Me(TM). (I've used it since the 1.x days using the XML format and still haven't come across anything that can really compete.) Liquibase certainly has flaws(*), but one should think *really* hard about whether there isn't some existing migrations system which is "good enough" before foisting yet another migrations system on the world, since it's highly likely to be at least as flawed as all the existing systems in one way or another.

(Of course, being implemented in Java, there is one aspect of Liquibase which is annoying if using it for Haskell projects: it's not very convenient to auto-apply all migrations at startup time without some sort of "run-this-executable-script-before-running-the-main-program" hack. That, and you need a Java Runtime Environment installed.)

Anyway, just throwing that out there...

Regards,

/b

(*) Sub-par internal consistency checking across combinations of contexts for changelog files being one of them. There's also the fact that it was until recently XML-based and that the new syntaxes all have their own problems. Maybe a Haskell DSL could lead to some real improvements here...
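For illustration, here is a minimal sketch of the repeated-parameter case in the named-parameter style suggested above; the table, column and parameter names are made up, and the ":name" syntax is only hypothetical, not hdbi's:

    {-# LANGUAGE OverloadedStrings #-}

    import Data.Text (Text)

    -- Hypothetical "update-or-insert" emulation for PostgreSQL as a single
    -- statement using a writable CTE (pre-9.5, i.e. without ON CONFLICT).
    -- The :name parameter has to appear twice, which is awkward to express
    -- with purely positional "?" placeholders.
    upsertCounter :: Text
    upsertCounter =
      "WITH updated AS \
      \  (UPDATE counters SET hits = hits + 1 WHERE name = :name RETURNING *) \
      \INSERT INTO counters (name, hits) \
      \SELECT :name, 1 \
      \WHERE NOT EXISTS (SELECT 1 FROM updated)"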

Regarding parameterized SQL: It might be worth using named parameters (e.g. ":foo" and ":bar" or something like that) rather than "?" as placeholders in SQL/prepared SQL. This will make it slightly more flexible if you need to provide different SQL strings for different databases, but want to reuse the code which does the actual running of the SQL. It's also more flexible if you need to repeat parameters -- the latter is typical with PostgreSQL if you want to emulate "update-or-insert" in a single SQL statement.
Named parameters might be more flexible, but one needs to think hard about how to implement this. If you are using named parameters, you need to pass not just a list [SqlValue] as parameters, but a Map Text SqlValue or something like it. So named parameters will not be compatible with unnamed ones and will need a separate query parser.
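A minimal sketch of what the two calling conventions could look like; the Connection, SqlValue and Query types below are simplified stand-ins for illustration, not hdbi's actual definitions:

    {-# LANGUAGE OverloadedStrings #-}

    import           Data.Map (Map)
    import qualified Data.Map as Map
    import           Data.Text (Text)

    -- Simplified stand-ins for the real types, just for the signatures below.
    data Connection = Connection
    data SqlValue   = SqlText Text | SqlInteger Integer
    type Query      = Text

    -- Positional placeholders ("?"): parameters are an ordered list.
    execute :: Connection -> Query -> [SqlValue] -> IO ()
    execute _ _ _ = return ()

    -- Named placeholders (":custName"): parameters must be keyed by name, so
    -- the caller passes a Map instead of a list, and the query text needs its
    -- own parser to locate the named holes.
    executeNamed :: Connection -> Query -> Map Text SqlValue -> IO ()
    executeNamed _ _ _ = return ()

    exampleArgs :: Map Text SqlValue
    exampleArgs = Map.fromList [("custName", SqlText "Smith")]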
Regarding migrations: If you haven't already, please have a look at Liquibase (http://www.liquibase.org/documentation/index.html) before attempting to implement migrations. The most important attributes of Liquibase are:
What I am trying to implement is not a new migration system, but just a common interface for simple schema actions. Here is my in-mind draft:

    newtype TableName = TableName Text

    data TableDescription = TableDescription
      { tableName   :: TableName
      , tableFields :: [FieldDescription]
      }

    class (Connection con) => Introspect con where
      getTableNames :: con -> IO [TableName]
      describeTable :: con -> TableName -> IO TableDescription
      getIndexes    :: con -> IO [IndexDescription]

    class (Connection con) => SchemaChange con where
      createTable :: con -> TableDescription -> IO ()
      dropTable   :: con -> TableName -> IO ()
      addColumn   :: con -> TableName -> FieldDescription -> IO ()
      ...............

These typeclasses must provide database-independent schema introspection and changes. The migration system can be anything you want.

I also have the idea of not throwing exceptions in IO but returning (Either SqlError a) from all the Connection and Statement methods for safe data processing. What do you think about it?
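As a rough usage sketch of such an interface (relying only on the draft classes above, assuming TableName has an Eq instance; the helper name ensureTable is made up):

    -- Create a table only if it does not exist yet, in a backend-independent
    -- way, using only the draft Introspect and SchemaChange classes.
    ensureTable :: (Introspect con, SchemaChange con)
                => con -> TableDescription -> IO ()
    ensureTable con td = do
      existing <- getTableNames con
      if tableName td `elem` existing
        then return ()
        else createTable con td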

Alexey,
Regarding named parameters - another option is to use numbered parameters
like :1, :2, etc. It will help with repeated parameters at least. I didn't
understand Bardur's first point about "different SQL strings" though.
Kind regards,
Kirill Zaborsky
2013/7/31 Alexey Uimanov
[--snip--]

On 2013-07-31 09:22, Alexey Uimanov wrote:
Regarding parameterized SQL: It might be worth using named parameters (e.g. ":foo" and ":bar" or something like that) rather than "?" as placeholders in SQL/prepared SQL. This will make it slightly more flexible if you need to provide different SQL strings for different databases, but want to reuse the code which does the actual running of the SQL. It's also more flexible if you need to repeat parameters -- the latter is typical with PostgreSQL if you want to emulate "update-or-insert" in a single SQL statement.
Named parameters might be more flexible, but one needs to think hard about how to implement this. If you are using named parameters, you need to pass not just a list [SqlValue] as parameters, but a Map Text SqlValue or something like it. So named parameters will not be compatible with unnamed ones and will need a separate query parser.
The use case I'm thinking of is something like this:

    reportSQL :: DatabaseType -> SQL
    reportSQL MySQL      = " SELECT ... custName = :custName ... INTERSECTION SELECT ... custName = :custName "
    reportSQL PostgreSQL = " SELECT ... AS cust WHERE cust.custName = :custName FROM SELECT ... AS foo WHERE foo.custName = cust.custName "

For this fictitious example we imagine that PostgreSQL can handle a nested query of some particular shape where we need an INTERSECTION query in MySQL. Obviously this is a made-up example, but you get the idea. The point is that the MySQL query may need to refer to the ":custName" parameter multiple times whereas the PostgreSQL one doesn't. Similarly, the positions in the SQL may need to be different. You perhaps still want to have a way to run both variants using the exact same code:

    runReport :: DatabaseType -> Text -> IO whatever
    runReport databaseType customerName = do
      result <- runSQLWithParameters (reportSQL databaseType)
                  [("custName", customerName)]
      ... do stuff with result ...
      return whatever

Of course, this being Haskell, you can always use higher-order functions (e.g. a function DatabaseType -> Text -> IO QueryResult which encompasses the runSQLWithParameters *and* reportSQL functions, but then you're mixing up the running of the query with the query itself) for similar purposes, but I tend to find named parameters of this type to be quite useful for readability. As Kirill mentioned, you can also use numbered parameters, but I tend to like named parameters for readability.

Implementation should be reasonably simple: replace all ":xyz" (or whatever syntax you choose) parameters in the input with "$n" and maintain a map which tells you which parameter "$1", "$2", etc. correspond to.

Anyway, this is an issue I've sometimes run across with JDBC (which uses ?) in particular, and it can be very annoying. Perhaps the best thing in Haskell would be to just avoid raw SQL entirely in favor of combinators, but then you often end up with suboptimal SQL which can't really exploit all the features of your chosen database. Even so, it would be nice to have a DB interface/library that can hit that sweet spot where you can write your own SQL but your program won't be too tied to a single DB backend (modulo the concrete SQL).
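A rough sketch of that named-to-positional rewriting, only for illustration (quoted literals, "::" type casts and deduplication of repeated names are all ignored; namedToPositional is a made-up name, not an hdbi function):

    import Data.Char (isAlphaNum)

    -- Replace ":name"-style parameters with "$1", "$2", ... and return the
    -- rewritten SQL together with the parameter names in positional order.
    namedToPositional :: String -> (String, [String])
    namedToPositional = go (1 :: Int)
      where
        go _ [] = ("", [])
        go n (':':rest) =
          let (name, rest') = span isAlphaNum rest
              (sql, names)  = go (n + 1) rest'
          in ('$' : show n ++ sql, name : names)
        go n (c:rest) =
          let (sql, names) = go n rest
          in (c : sql, names)

    -- namedToPositional "name = :custName OR alias = :custName"
    --   == ("name = $1 OR alias = $2", ["custName","custName"])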
Regarding migrations: If you haven't already, please have a look at Liquibase (http://www.liquibase.org/documentation/index.html) before attempting to implement migrations. The most important attributes of Liquibase are:
What I am trying to implement is not a new migration system, but just a common interface for simple schema actions. Here is my in-mind draft:
newtype TableName = TableName Text
[--snip--]
These typeclasses must provide database-independent schema introspection and changes. The migration system can be anything you want.
Ah, OK, I see I just misinterpreted the bit in the package description about migrations then :). You might end up having a little trouble reconciling metadata from the different database backends, but certainly there must be *some* useful common subset of table/index/etc. metadata :).
I also have the idea of not throwing exceptions in IO but returning (Either SqlError a) from all the Connection and Statement methods for safe data processing. What do you think about it?
I don't think I'm qualified to have an opinion either way, but perhaps http://www.randomhacks.net/articles/2007/03/10/haskell-8-ways-to-report-erro... and particularly http://hackage.haskell.org/package/errors can serve as inspiration :). Regards,

On Wed, Jul 31, 2013 at 01:22:42PM +0600, Alexey Uimanov wrote:
I also have the idea of not throwing exceptions in IO but returning (Either SqlError a) from all the Connection and Statement methods for safe data processing. What do you think about it?
I feel very strongly that you should use Either. One of the things I find worst about postgresql-simple is the exceptions it throws. The benefit of Haskell is that we can do things in a Haskelly way!

Tom
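One cheap way to offer both styles is sketched below, under the assumption that SqlError has an Exception instance (as its HDBC counterpart does); trySql is a made-up helper name:

    import Control.Exception (Exception, try)

    -- Wrap any throwing Connection/Statement action into an Either-returning
    -- one, so both flavours can be exposed without duplicating every method.
    trySql :: Exception e => IO a -> IO (Either e a)
    trySql = try

    -- e.g.  trySql (run conn "DELETE FROM users" []) :: IO (Either SqlError ())
    --       (run, conn and SqlError are placeholders here)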

On Wed, Jul 31, 2013 at 09:45:50AM +0600, Alexey Uimanov wrote:
Hello, haskellers. This is the first release of HDBI (Haskell Database Independent interface).
Hi, thanks for this Alexey. It's great that there is continued development of this really important infrastructure for Haskell.

I have a question about variable interpolation, that is, using "?" parameter placeholders in the query strings, as documented here: http://hackage.haskell.org/packages/archive/hdbi/1.0.0/doc/html/Database-HDB...

I know postgresql-simple does this, and presumably database access libraries in other languages do this too. What is the rationale for this when in Haskell we have safer methods of interpolation at our disposal (for example HoleyMonoid)? Is it simply a matter of using the most familiar interface, or is there a deeper reason this is necessary?

Thanks,

Tom

The rationale is that the low-level database interface accepts parameters directly instead of having them inserted into the query manually (like HoleyMonoid would do). postgresql-simple also does parameter substitution on the Haskell side. This is not safe and may lead to SQL injection (http://en.wikipedia.org/wiki/SQL_injection) because of improperly done quoting. The low-level database interface knows better how to work with parameters, so the driver must pass them to it instead of substituting the parameters itself.

hdbi-postgresql just replaces "?" with the "$1", "$2", ... sequence, properly parsing and ignoring question marks inside double-quoted identifiers, quoted literals and even dollar-quoted literals (see 4.1.2.2 "Dollar-Quoted String Constants", http://www.postgresql.org/docs/8.2/static/sql-syntax-lexical.html).
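A simplified sketch of that kind of rewriting (this is an illustration, not hdbi-postgresql's actual code; dollar-quoted strings and escaped quotes are omitted for brevity):

    -- Turn "?" placeholders into PostgreSQL's "$1", "$2", ... while leaving
    -- question marks inside 'string literals' and "quoted identifiers" alone.
    questionToDollar :: String -> String
    questionToDollar = go (1 :: Int)
      where
        go _ []          = []
        go n ('?':rest)  = '$' : show n ++ go (n + 1) rest
        go n ('\'':rest) = '\'' : copyUntil '\'' n rest
        go n ('"':rest)  = '"'  : copyUntil '"'  n rest
        go n (c:rest)    = c : go n rest

        -- Copy characters verbatim up to and including the closing quote.
        copyUntil q n (c:rest)
          | c == q    = c : go n rest
          | otherwise = c : copyUntil q n rest
        copyUntil _ _ [] = []

    -- questionToDollar "a = ? AND b = '?' AND c = ?"
    --   == "a = $1 AND b = '?' AND c = $2"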
2013/7/31 Tom Ellis
[--snip--]

On Wed, Jul 31, 2013 at 05:28:02PM +0600, Alexey Uimanov wrote:
The rationale is that the low-level database interface accepts parameters directly instead of having them inserted into the query manually. [...] The low-level database interface knows better how to work with parameters, so the driver must pass them to it instead of substituting the parameters itself.
Letting the low-level database interface (I'm guessing you're talking about a C library provided by the database vendor) do the escaping certainly makes a lot of sense. However, it would still be possible to make sure that the *number* of parameters supplied matches the number of placeholders in the query string. That would make sense, don't you think? Tom
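A tiny sketch of such a check (naive on purpose: it counts every "?", so a real version would reuse the quote-aware scanning the driver already does; checkParamCount is a made-up name):

    -- Fail early if the number of supplied parameters does not match the
    -- number of "?" placeholders in the query text.
    checkParamCount :: String -> [param] -> Either String ()
    checkParamCount sql params
      | holes == supplied = Right ()
      | otherwise         = Left ("query has " ++ show holes ++ " placeholders but "
                                  ++ show supplied ++ " parameters were supplied")
      where
        holes    = length (filter (== '?') sql)
        supplied = length params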
participants (4):
- Alexey Uimanov
- Bardur Arantsson
- Kirill Zaborsky
- Tom Ellis