ANN: url-generic-0.1: Parse/format generic key/value URLs from record data types.

Hello, chaps. I thought that it would be an interesting idea to generate URLs from plain old Haskell record data types using Data.Data/Typeable. Repo: https://github.com/chrisdone/url-generic Docs: http://hackage.haskell.org/packages/archive/url-generic/0.1/doc/html/Web-URL... Seems like a success. Whether it's a good idea I'm still considering. Good points: * Works with standard data types. * Derives automatically and therefore predictably. * Disallows conflicts between URL representations (e.g. can't accidentally make up /1234 for "event 1234" and /1234 for "blog post 1234") * The data type, being simple, can therefore be nicely pattern-matched upon, something I use extensively is record wildcarding. * Ordering of parameters doesn't matter. * Fields of type Maybe a can be optionally provided or not. * Enum data types "just work", nice for providing choice in URLs. Bad points: * Sometimes you want custom URLs like "/1234" for convenience. * Not sure how this would interact with web-routes and those type of packages because they use only *one* type for *all* URLs. Let me know your thoughts. Ciao!

Greetings from a lurker
I thought that it would be an interesting idea to generate URLs from plain old Haskell record data types using Data.Data/Typeable.
Repo: https://github.com/chrisdone/url-generic
Docs: http://hackage.haskell.org/packages/archive/url-generic/0.1/doc/html/Web-URL...
Seems like a success. Whether it's a good idea I'm still considering. [snip] Let me know your thoughts.
I like it. Sorry, that is an understatement. Is it a good idea, definitely yes. A very similar approach for URLs is used in Drupal, sans the automatic generation of the url schemes. The benefit is the mapping between urls and type constructors. It makes all kinds of interesting URL expressions possible. For example having a compound page with several different posts in it, expressed in the url. This language gives control to the user to express exactly what content they want combined and how. This power is often under estimated. The scenarios are too many to mention.
From experience, it is good to combine this with named abstractions, both compile time and data driven ones.
The compile time ones are useful for abbreviations. They could probably be done using web-routes even now, but not sure how well that will work in practice. The data/value driven names are no more but a map from user specified paths to expessions in some data store. Both of the above can be used to provide clean, human readable urls, which are good both for usablility and SEO. In short - the url DSLs this package makes easy to create makes gives me a warm fuzzy feeling. You have a beer or a drink of your choice, or chocolate, on me if we meet somewhere. And a question: data Event = Event { eventId :: Maybe Integer -- ^ The event id. , eventScope :: Bool -- ^ Show the scope? , eventLayout :: Layout -- ^ Layout for the page. , eventRec :: Rec } deriving (Data,Typeable,Show) data Rec = Rec { one :: Int, two :: Int } deriving (Data,Typeable,Show) Would it be possible to express the above in a URL? I understand that at the moment it is not, but what are your thoughts on how far it would be reasonable to push the encoding. Cheers, Vlado

On 20 June 2011 09:54, Vladimir Zlatanov
expressions possible. For example having a compound page with several different posts in it, expressed in the url. This language gives control to the user to express exactly what content they want combined and how. This power is often under estimated. The scenarios are too many to mention.
For example? What's an example of a compound page, expressed in the url?
In short - the url DSLs this package makes easy to create makes gives me a warm fuzzy feeling. You have a beer or a drink of your choice, or chocolate, on me if we meet somewhere.
Haha, I'm glad I'm incuding warm fuzzy feelings.
And a question:
data Event = Event { eventId :: Maybe Integer -- ^ The event id. , eventScope :: Bool -- ^ Show the scope? , eventLayout :: Layout -- ^ Layout for the page. , eventRec :: Rec } deriving (Data,Typeable,Show)
data Rec = Rec { one :: Int, two :: Int } deriving (Data,Typeable,Show)
Would it be possible to express the above in a URL? I understand that at the moment it is not, but what are your thoughts on how far it would be reasonable to push the encoding.
How would you expect that to be represented? I thought a little about it… it seems like if it was
data Rec = Rec { one :: Int, two :: Int } deriving (Data,Typeable,Show)
Then you could represent it as /event/id/1/one/a/two/b or /event/id/1/rec-one/a/rec-two/b The former could be a problem as it may lead to ambiguities, unless you decided that sub-types' fields would have precedence?

expressions possible. For example having a compound page with several different posts in it, expressed in the url. This language gives control to the user to express exactly what content they want combined and how. This power is often under estimated. The scenarios are too many to mention.
For example? What's an example of a compound page, expressed in the url?
Ok. Let's try. I'll simplify it a bit. This will assume that we have a product combinator type data Comb a b = Comb { left::a, right::b } deriving (Data,Typeable,Show) Let's have a page which includes two different pieces of content side by side, let's say product comparison. The product type could be something like data Product = Prouct { productId::Integer } We should be able to form values of type type TwoProducts = Comb Product Product That type could be expressed in a URL like: http://example.com/Comb/Product/12/Product/14 The whole idea is to have the primitive resources as types, and to provide the the necessary combinators - product, and maybe a sum one.
data Event = Event { eventId :: Maybe Integer -- ^ The event id. , eventScope :: Bool -- ^ Show the scope? , eventLayout :: Layout -- ^ Layout for the page. , eventRec :: Rec } deriving (Data,Typeable,Show)
data Rec = Rec { one :: Int, two :: Int } deriving (Data,Typeable,Show)
Would it be possible to express the above in a URL? I understand that at the moment it is not, but what are your thoughts on how far it would be reasonable to push the encoding.
How would you expect that to be represented? I thought a little about it… it seems like if it was
data Rec = Rec { one :: Int, two :: Int } deriving (Data,Typeable,Show)
Then you could represent it as
/event/id/1/one/a/two/b
or
/event/id/1/rec-one/a/rec-two/b
The former could be a problem as it may lead to ambiguities, unless you decided that sub-types' fields would have precedence?
To be honest I'm not sure. But if we use '/' as white space, and have only left associativie combinators/constructors, with no variadic arguments, the result should be unambiguous. It will be possibly comprehendible (is it a word?) by humans for short urls. I would use the constructors like: /Event/id/1/Rec/one/a/two/b This makes it unambiguous what we mean, in case we have the same field names, but slightly more verbose. /Event/1/Rec/a/b would work as well, but you need the source or specs to read it

Hello,
How is this actually different from web-routes (which already supports
automatic url deriving via Template Haskell, generics, or
quasi-quotation)?
You suggest that web-routes requires a single URL type for all the
routes and that your code allows for multiple types. But I do not see
how that is actually done.
If you have two url types, Home and Event, and a URL comes in.. how do
you know if you need to call:
parseURLPath url :: Maybe Event
or
parseURLPath url :: Maybe Home
- jeremy
On Mon, Jun 20, 2011 at 2:18 AM, Christopher Done
Hello, chaps.
I thought that it would be an interesting idea to generate URLs from plain old Haskell record data types using Data.Data/Typeable.
Repo: https://github.com/chrisdone/url-generic
Docs: http://hackage.haskell.org/packages/archive/url-generic/0.1/doc/html/Web-URL...
Seems like a success. Whether it's a good idea I'm still considering. Good points:
* Works with standard data types. * Derives automatically and therefore predictably. * Disallows conflicts between URL representations (e.g. can't accidentally make up /1234 for "event 1234" and /1234 for "blog post 1234") * The data type, being simple, can therefore be nicely pattern-matched upon, something I use extensively is record wildcarding. * Ordering of parameters doesn't matter. * Fields of type Maybe a can be optionally provided or not. * Enum data types "just work", nice for providing choice in URLs.
Bad points:
* Sometimes you want custom URLs like "/1234" for convenience. * Not sure how this would interact with web-routes and those type of packages because they use only *one* type for *all* URLs.
Let me know your thoughts.
Ciao!
_______________________________________________ web-devel mailing list web-devel@haskell.org http://www.haskell.org/mailman/listinfo/web-devel

On 20 June 2011 20:05, Jeremy Shaw
Hello,
How is this actually different from web-routes (which already supports automatic url deriving via Template Haskell, generics, or quasi-quotation)?
1) Erm, it's actually documented? ;-)
You suggest that web-routes requires a single URL type for all the routes and that your code allows for multiple types. But I do not see how that is actually done.
2) {-# LANGUAGE RecordWildCards, DeriveDataTypeable #-} import Data.Data import Data.List import Data.Maybe import Web.URL.Generic data Event = Event { eventId :: Integer -- ^ The event id. , eventScope :: Bool -- ^ Show the scope? , eventLayout :: Layout -- ^ Layout for the page. } deriving (Data,Typeable,Show) data Layout = Wide | Thin | Collapsed deriving (Typeable,Data,Show,Enum) data Home = Home deriving (Data,Typeable,Show) router :: String -> IO String router url = fromMaybe (error "404!") (listToMaybe $ mapMaybe ($ url) routes) routes :: [String -> Maybe (IO String)] routes = [route eventPage ,route homePage] route :: Data a => (a -> IO String) -> (String -> Maybe (IO String)) route handler url = case parseURLPath url of Just a -> Just $ handler a Nothing -> Nothing eventPage :: Event -> IO String eventPage Event{..} = return $ "Event page! Layout: " ++ show eventLayout homePage :: Home -> IO String homePage Home = return $ "Home page! Click here to go to the event page: " ++ formatURLPath Event { eventId = 1, eventScope = False, eventLayout = Thin } λ> router "/event/id/1/layout/wide" Loading package syb-0.1.0.2 ... linking ... done. Loading package transformers-0.2.2.0 ... linking ... done. Loading package mtl-2.0.1.0 ... linking ... done. "Event page! Layout: Wide" λ> router "/home" "Home page! Click here to go to the event page: /event/id/1/layout/thin" λ> router "" *** Exception: 404!

On Mon, Jun 20, 2011 at 2:50 PM, Christopher Done
On 20 June 2011 20:05, Jeremy Shaw
wrote: Hello,
How is this actually different from web-routes (which already supports automatic url deriving via Template Haskell, generics, or quasi-quotation)?
1) Erm, it's actually documented? ;-)
I am planning to write documentation for web-routes this week or next week. There is an additional web-routes extension I need to finish up first. In fact, I was working on that code when I got distracted by this thread :p
You suggest that web-routes requires a single URL type for all the routes and that your code allows for multiple types. But I do not see how that is actually done.
The same approach can be done with the existing web-routes code. Here is a conversion: {-# LANGUAGE RecordWildCards, DeriveDataTypeable, TemplateHaskell #-} import Data.Data import Data.List import Data.Maybe import Web.Routes import Web.Routes.TH data Event = Event { eventId :: Integer -- ^ The event id. , eventScope :: Bool -- ^ Show the scope? , eventLayout :: Layout -- ^ Layout for the page. } deriving (Data,Typeable,Show) data Layout = Wide | Thin | Collapsed deriving (Typeable,Data,Show,Enum) data Home = Home deriving (Data,Typeable,Show) $(derivePathInfo ''Bool) $(derivePathInfo ''Layout) $(derivePathInfo ''Home) $(derivePathInfo ''Event) router :: String -> IO String router url = fromMaybe (error "404!") (listToMaybe $ mapMaybe ($ url) routes) routes :: [String -> Maybe (IO String)] routes = [route eventPage ,route homePage] route :: (PathInfo a) => (a -> IO String) -> (String -> Maybe (IO String)) route handler url = case fromPathInfo url of Right a -> Just $ handler a Left e -> Nothing eventPage :: Event -> IO String eventPage Event{..} = return $ "Event page! Layout: " ++ show eventLayout homePage :: Home -> IO String homePage Home = return $ "Home page! Click here to go to the event page: " ++ toPathInfo Event { eventId = 1, eventScope = False, eventLayout = Thin } Here I am using the existing template haskell code to derive the printer/parser pair. The template haskell code uses a different set of formatting rules than your SYB code. But there is no reason why the syb code could not be used with web-routes. In fact, it only requires a few small modifications to your existing code. I have made the changes here: http://hpaste.org/48039 The essence of the change is that it should work using [String] rather than String. Where [String] is the list of / separated path segments. That version will actually do the right thing if your record contains a String and the String contains special characters like / or non-ascii unicode characters. Plus it can be easily integrated with the rest of the web-routes framework. It is just another way of generating the printers/parsers. That was one of the goals of the web-routes library -- the base package does not provide any rules for how the urls should be formatted. Instead we provide extension packages that provide various methods and schemes. Or people can write their own -- there is nothing sacred about methods we provide. So, I would be happy to work with you to create a web-routes-syb package that uses your set of encoding rules. (Actually, that modified code on hpaste plus a little extra glue code is all you really need). That said, I do have a complaint about the particular set of rules you use. You allow the fields to be specified in any order -- but that makes it difficult to support record types that contain other record types, doesn't it? I am not sure why arbitrary ordering is more valuable than nested records. If you are using the library to generate the urls, then you already know the order. If you wanted to extend Layout with a constructor that takes an argument; data Layout = Wide | Thin | Collapsed | Arbitrary Integer Then you are kind of stuck? If you get rid of variable field ordering, then you can encode a much larger range of URL types. Anyway, I highly recommend you build on top of web-routes. web-routes is a pretty lightweight dependency and it already solves the problems of low-level URL encoding, escaping, etc. It also provides integration with yesod, happstack, hsp, and more. That means you can just focus on the high-level issue of deriving a nice looking URL scheme using SYB. If there is someway in which web-routes in not suitable to build on, I would love to fix it. - jeremy

On 21 June 2011 00:03, Jeremy Shaw
On Mon, Jun 20, 2011 at 2:50 PM, Christopher Done
wrote: 1) Erm, it's actually documented? ;-)
I am planning to write documentation for web-routes this week or next week. There is an additional web-routes extension I need to finish up first. In fact, I was working on that code when I got distracted by this thread :p
Ah, good. Honestly I've tried several times to figure out how to actually start using web-routes and failed. I understand each module individually, but I don't know how it's supposed to be used as a whole, I figured I'd wait for more documentation. I noticed that it allowed for custom URL-formatting, so I thought I'd do this. As I said up there, I didn't know how this would interact with web-routes. I thought I'd try to set an example a bit by putting extra effort into documenting url-generic. :-P
So, I would be happy to work with you to create a web-routes-syb package that uses your set of encoding rules. (Actually, that modified code on hpaste plus a little extra glue code is all you really need).
That would be cool. I was hoping it could be used with web-routes.
You allow the fields to be specified in any order -- but that makes it difficult to support record types that contain other record types, doesn't it?
Does it? Why? /foo/x/1/y/a/1/b/2/z/2 seems no more ambiguous than /foo/y/b/1/a/2/z/2/x/1 I'd think the problem stopping embedded types is optional fields.
I am not sure why arbitrary ordering is more valuable than nested records.
Mostly I'm thinking about users (and myself) composing these URLs. I
initially had strict ordering but then composing the URLs to (from the
user perspective) a strict "arbitrary" ordering is annoying and kind
of needless. This is partly why optionality is nice. Maybe I want to
go to /event/1234, and I don't want to specify
/event/1234/scope/false.
But yeah, I don't see why nested records can't be accomplished with
arbitrary ordering.
Vladimir also brought this up:
On 20 June 2011 13:37, Vladimir Zlatanov
To be honest I'm not sure. But if we use '/' as white space, and have only left associativie combinators/constructors, with no variadic arguments, the result should be unambiguous. It will be possibly comprehendible (is it a word?) by humans for short urls.
I would use the constructors like:
/Event/id/1/Rec/one/a/two/b
This makes it unambiguous what we mean, in case we have the same field names, but slightly more verbose.
/Event/1/Rec/a/b
would work as well, but you need the source or specs to read it
It seems like we don't need to mention the constructors, we can just write out the structure as in JSON. /foo/x/1/y/a/1/b/2/z/2 => {foo:{x:1,y:{a:1,b:2},z:2}} If you know how many fields a record should have you know how many to count. I suppose we could also stick the constructor in there for readability. /event/id/1/pagination/limit/5/offset/6/scope/true => {event:{id:1,pagination:{limit:5,offset:6}},scope:true} I guess that is more readable. :-)
If you are using the library to generate the urls, then you already know the order.
As above, I don't think order matters when you have the names and the count
If you wanted to extend Layout with a constructor that takes an argument;
data Layout = Wide | Thin | Collapsed | Arbitrary Integer
Then you are kind of stuck? If you get rid of variable field ordering, then you can encode a much larger range of URL types.
Again, is it relevant? The question seems more like how do you encode that field? Like this? /event/id/1/layout/arbitrary/1 Is that any harder to parse than /event/layout/arbitrary/1/id/1? I think if it is then one's parser is faulty. It's a bit late, I have an inexplicable headache. Maybe I'm missing something on this.
Anyway, I highly recommend you build on top of web-routes. web-routes is a pretty lightweight dependency and it already solves the problems of low-level URL encoding, escaping, etc. It also provides integration with yesod, happstack, hsp, and more. That means you can just focus on the high-level issue of deriving a nice looking URL scheme using SYB. If there is someway in which web-routes in not suitable to build on, I would love to fix it.
That's what I was hoping to do anyway. I've been putting off doing any URL thing for hpaste until I'm happy with the right library. Sadly there isn't a Snap web-routes thingie. Greg Collins said he was working on one but I think that never fruited. Once I grok how web-routes work I doubt I'll have a problem laying it ontop of Snap. I hope that I can merely derive MonadRoute (or whatever it was) for my monad stack and that I don't have to use lifts and stuff. I think there was some RouteT monad with functions with concrete RouteT types instead of classes which can be a pain but I can't remember. I'm going to bed! Anyway, looking forward to some documentation, even though it's my fault for being lazy and impatient. ;-) Ciao!

On Mon, Jun 20, 2011 at 6:12 PM, Christopher Done
You allow the fields to be specified in any order -- but that makes it difficult to support record types that contain other record types, doesn't it?
Does it? Why?
/foo/x/1/y/a/1/b/2/z/2 seems no more ambiguous than /foo/y/b/1/a/2/z/2/x/1
I'd think the problem stopping embedded types is optional fields.
Yeah actually, it should work out ok. But you will need to take a rather different approach to parsing. It just won't be fun to write the code. But that is your job not mine :p So you have: /foo/x/1/y/a/1/b/2/z/2 /foo/y/b/1/a/2/z/2/x/1 I believe the type that comes from is:
data Foo = Foo { x :: Int , y :: Y , z :: Int }
data Y = Y { a :: Int , b :: Int }
So, the current SYB stuff is going to have issues with this I think. It will first look at the 'foo' for the 'Foo' constructor, and then try to split the rest of the path segments up into key/value pairs via urlToAssoc: /foo /x/1 /y/a -- oops. things went wrong here /1/b /2/z /2 Well, that does not really work. We can hack around the off-by-one issue by introducing another y for the Y constructor: /foo/x/1/y/y/a/1/b/2/z/2 That gives us: /foo/x/1/y/y/a/1/b/2/z/2 so now we get: /foo /x/1 /y/y /a/1 /b/2 /z/2 which seems ok. But what if we extend Foo to have a second Y field:
data Foo = Foo { w :: Y , x :: Int , y :: Y , z :: Int }
giving us: /foo/w/w/a/3/b/4/x/1/y/y/a/1/b/2/z/2 now when split it up we get: /foo /w/w /a/3 /b/4 /x/1 /y/y /a/1 /b/2 /z/2 Oops. Now we have a & b twice. That doesn't really work. Clearly we need to know if the a & b are associated with the w field or the y field. So, that means urlToAssoc is not going to cut it -- we need to replace it with something that understands the types involved. In order to consume: /foo/w/w/a/3/b/4/x/1/y/y/a/1/b/2/z/2 We need to parse the segments in the order they appear. step 1: /foo We see /foo. We know now that we are looking for four fields, w, x, y and z. step 2: /w We know we are parsing Y type, so we call the parseCons for Y. That consumes: /w/w/a/3/b/4/ and leaves us with: /x/1/y/y/a/1/b/2/z/2 step 3: We do the same thing for /x, /y, and /z. step 4: Then we assemble the fields in the right order and apply the Foo constructor to them. We have to be careful to handle the cases where the incoming url is missing a field or (incorrectly) has a field twice. Or maybe I am missing something obvious. I have not thought about this too much. I'll read the rest of your reply later. I have some stuff I need to do now. - jeremy

On 21 June 2011 02:02, Jeremy Shaw
On Mon, Jun 20, 2011 at 6:12 PM, Christopher Done
wrote: You allow the fields to be specified in any order -- but that makes it difficult to support record types that contain other record types, doesn't it?
Does it? Why?
/foo/x/1/y/a/1/b/2/z/2 seems no more ambiguous than /foo/y/b/1/a/2/z/2/x/1
I'd think the problem stopping embedded types is optional fields.
Yeah actually, it should work out ok. But you will need to take a rather different approach to parsing.
It just won't be fun to write the code. But that is your job not mine :p
Sure. :-)
So, the current SYB stuff is going to have issues with this I think.
Of course, I had no intention for embedded types. That idea came from you guys later. The current parser clearly wouldn't work at all.
We have to be careful to handle the cases where the incoming url is missing a field or (incorrectly) has a field twice.
You mean accepting URLs with missing fields? I think missing fields could make the whole embedded types thing ambiguous: data Foo = Foo { fooRed :: Integer, fooBar :: Bar, fooBlue :: Integer } data Bar = Bar { barYellow :: Integer, barBlue :: Maybe Integer } /foo/red/123/bar/yellow/678/blue/999 What does blue refer to? If we take it to parse the inner one first, then suddenly order matters. So we either need to have order matter or remove optional parameters, or somehow put up with this. Tricky.
participants (3)
-
Christopher Done
-
Jeremy Shaw
-
Vladimir Zlatanov