Using Haskell types in a database

Suppose I need to represent a user in a database. I could make a table like this one (using Yesod's config/models syntax): User name Text email Text Maybe is_verified Bool default=false last_login UTCTime default=now() Instead, it would be better to do the following, so I couldn't swap name and email by accident or feed a wrong value to is_verified and last_login: User name UserName email UserEmail v Maybe last_login UserLastLogin where newtype UserName = UserName Text deriving (Show, Read) derivePersistField "UserName" newtype UserLastLogin = UserLastLogin UTCTime deriving (Show, Read) derivePersistField "UserLastLogin" data Verified data Unverified newtype UserEmail v = UserEmail Text deriving (Show, Read) derivePersistField "UserEmail" (So I could define functions like sendAnnouncement :: UserEmail Verified -> IO () sendVerification :: UserEmail Unverified -> IO ()) The Haskell part is not new and has been successfully used in the large (see http://code.haskell.org/~dons/talks/dons-google-2015-01-27.pdf, pp. 17--21), but I wonder whether it's reasonable in the database layer. The first table will be mapped to this in PostgreSQL: User name character varying not null email character varying is_verified boolean not null default false last_login timestamp with time zone not null default now() while the second will use character varying for everything. The only downsides of the latter approach I can think of: 1. You can no longer rely on the database to ensure (basic) type safety, so everything needs to be done in the Haskell layer. Otherwise, you will get runtime errors. 2. I haven't benchmarked this, but I assume that comparing, say, booleans should be faster than comparing varchars. So, is it worth it in practice? Do I need to worry about these issues? And are there any other problems?

On Tue, Feb 10, 2015 at 05:29:55PM +0300, Nikita Karetnikov wrote:
User name UserName email UserEmail v Maybe last_login UserLastLogin
where
data Verified data Unverified newtype UserEmail v = UserEmail Text deriving (Show, Read) derivePersistField "UserEmail"
I think you want the verification status to exist at the value level as well as the type level. What you currently have is essentially the same as defining your table to be [(UserName, Maybe (UserEmail v), UserLastLogin)] where I guess you're assuming the `v` will be existentially quantified. Presumably this isn't actually what you want. You could however have (the database version of) [(UserName, Maybe UserEmail, UserLastLogin)] and something like (the database version of) UserEmail -> Maybe UserEmailVerified UserEmail -> Maybe UserEmailUnverified which would give you the type safety you're looking for, I think. Opaleye supports user-created types for this kind of purpose, and I've found it invaluable. Tom

But does it (that is, having varchars everywhere) significantly affect performance in practice? If it depends, then when is it a reasonable price to pay?

On Tue, Feb 10, 2015 at 07:33:12PM +0300, Nikita Karetnikov wrote:
But does it (that is, having varchars everywhere) significantly affect performance in practice? If it depends, then when is it a reasonable price to pay?
That part I don't understand. Why does your proposal require varchars everywhere?

That part I don't understand. Why does your proposal require varchars everywhere?
That's what Persistent generates (using derivePersistField) for the second table. For example, the Haskell value MyText "foo" where MyText is newtype MyText = MyText Text deriving (Show, Read) derivePersistField "MyText" will be stored in PostgreSQL as a varchar 'MyText "foo"' I assume that Persistent (or its PostgreSQL backend) has a predefined mapping from a commonly used Haskell types to the database types. And everything else is represented as a varchar. So there are two choices, corresponding to the two tables shown in the original message: 1. Design tables with the (basic) database types in mind (integer, boolean, timestamp, varchar, etc.) then wrap and unwrap values each time you interact with the database. 2. Use Haskell types like MyText or something more complex directly and end up with varchar everywhere. I wonder how bad is the second option in practice because wrapping/unwrapping is error-prone. Of course, the varchar thing makes the alarm bells ring in my head, but when does it significantly affect performance? For example, maybe it's okay for a small shop, or is it a no go for everyone?

If you're using Postgres you can use types in the database
http://www.postgresql.org/docs/9.2/static/sql-createtype.html
I don't know what these look like from the Haskell drivers but you can have
typed SQL if you want it.
On Tuesday, February 10, 2015, Nikita Karetnikov
That part I don't understand. Why does your proposal require varchars everywhere?
That's what Persistent generates (using derivePersistField) for the second table. For example, the Haskell value
MyText "foo"
where MyText is
newtype MyText = MyText Text deriving (Show, Read) derivePersistField "MyText"
will be stored in PostgreSQL as a varchar
'MyText "foo"'
I assume that Persistent (or its PostgreSQL backend) has a predefined mapping from a commonly used Haskell types to the database types. And everything else is represented as a varchar.
So there are two choices, corresponding to the two tables shown in the original message:
1. Design tables with the (basic) database types in mind (integer, boolean, timestamp, varchar, etc.) then wrap and unwrap values each time you interact with the database.
2. Use Haskell types like MyText or something more complex directly and end up with varchar everywhere.
I wonder how bad is the second option in practice because wrapping/unwrapping is error-prone. Of course, the varchar thing makes the alarm bells ring in my head, but when does it significantly affect performance? For example, maybe it's okay for a small shop, or is it a no go for everyone?

You shouldn't be using derivePersistField here if you want another db type.
As you say there is a predefined mapping but you can override this by
defining custom instances for your newtype.
On Tue, Feb 10, 2015 at 7:11 PM, Bob Ippolito
If you're using Postgres you can use types in the database http://www.postgresql.org/docs/9.2/static/sql-createtype.html
I don't know what these look like from the Haskell drivers but you can have typed SQL if you want it.
On Tuesday, February 10, 2015, Nikita Karetnikov
wrote: That part I don't understand. Why does your proposal require varchars everywhere?
That's what Persistent generates (using derivePersistField) for the second table. For example, the Haskell value
MyText "foo"
where MyText is
newtype MyText = MyText Text deriving (Show, Read) derivePersistField "MyText"
will be stored in PostgreSQL as a varchar
'MyText "foo"'
I assume that Persistent (or its PostgreSQL backend) has a predefined mapping from a commonly used Haskell types to the database types. And everything else is represented as a varchar.
So there are two choices, corresponding to the two tables shown in the original message:
1. Design tables with the (basic) database types in mind (integer, boolean, timestamp, varchar, etc.) then wrap and unwrap values each time you interact with the database.
2. Use Haskell types like MyText or something more complex directly and end up with varchar everywhere.
I wonder how bad is the second option in practice because wrapping/unwrapping is error-prone. Of course, the varchar thing makes the alarm bells ring in my head, but when does it significantly affect performance? For example, maybe it's okay for a small shop, or is it a no go for everyone?
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Tue, Feb 10, 2015 at 09:02:58PM +0300, Nikita Karetnikov wrote:
That part I don't understand. Why does your proposal require varchars everywhere?
I assume that Persistent (or its PostgreSQL backend) has a predefined mapping from a commonly used Haskell types to the database types. And everything else is represented as a varchar.
OK, well that's surprising. Opaleye by contrast does not force everything into a varchar. Tom
participants (4)
-
Adam Bergmark
-
Bob Ippolito
-
Nikita Karetnikov
-
Tom Ellis