On Thu, Mar 25, 2010 at 12:29 PM, Michael Snoyman <michael@snoyman.com> wrote:OK, here are my initial code comments:* Do we want to move everything into Web.URLT? More to the point, I'm not sure I see the point of calling this URLT, since it doesn't really require any monad transformers; maybe we should call it web-routes and then the module would be Web.Routes?I think Web.Routes is a fine name. I'll make it happen. In the rest of this post I refer to things by the old names, but I do intend to change the module names and rename the package to web-routes.* I like the PathInfo class and to/fromPathSegments. Perhaps we should bundle that with the decode/encodePathInfo into a single module?I put PathInfo in a separate module because I am a little dubious of classes these days. I find it a bit annoying that you can only have one PathInfo instance per type. And I think it helps show that using PathInfo is not actually required. But, in practice, I think having less modules is probably a good thing in this case, since it does not affect the dependency chain at all. Just because I *can* put every function in it's own module doesn't mean I should. ;) Also, we probably do want people to provide PathInfo instances, even if they don't have to..
* I'd like to minimize dependencies as much as possible for the basic package. The two dependencies I've noticed are Consumer and applicative-extras. I think the type signatures would be clearer *without* those packages included, eg:fromPathSegments :: [String] -> Either ErrMsg aExcept that is not a usable type. fromPathSegments may consume, some, but not all of the path segments. Consider the type:data SiteURL = Foo Int IntfromPathSegments is going to receive the path segments:["Foo","1","2"]If you wrote a parser by hand, you would want it to look a little something like:do string "Foo"slashi <- fromPathSegmentsslashj <- fromPathSegmentseolreturn (Foo i j)The key concept here is that when you call fromPathSegments to get the first argument of Foo you need to know how many of the path segments were consumed / are remaining, so you can pass only those segments to the second fromPathSegments.So you really need a type like:fromPathSegments :: [String] -> (Either ErrMsg a, [String])which outputs the unconsumed path segments.
But this is obviously a ripe target for a monad of some sort -- trying keep track of the unconsumed portions by hand seems like it would asking for trouble...
The Consumer monad takes care of that and provides the functions you would expect such as, next, peek, and poke. And it seems nice to be able to use Monad, MonadPlus, Applicative, Alternative, etc, for composing fromPathSegments into larger parsers ?But, perhaps there is a better choice of monad, or a better way of dealing with the problem? Or maybe it's not really a problem?I think Failing is a pretty nifty data-type for dealing with errors. But perhaps it is not a big win here.. The #1 thing that makes Failing better than (Either [String] a) is it's Applicative instance. Specifically, Failing will accumulate and return all the errors which have occurred, not the just first failure (which is the behavior for Applicative (Either e)).So for example, let's say you are doing are trying to lookup a bunch of keys from the query string. The key / value pairs in the query string are typically independent of each other. So let's say you do:(,) <$> lookup "foo" <*> lookup "bar"but neither of those keys exist. With Either you will only get the error 'could not find "foo"'. But with Failing you will get the error 'could not find "foo". could not find "bar"'. It is nice to get a report of all the things that are broken, instead of getting only one error at a time, fixing it, and then getting another error, etc.However, I am not sure if this property is all that useful which urlt. If you are trying to parse a url like:(string "Foo" *> Foo) <$> fromPathSegments <*> fromPathSegmentsAnd the parsing of "Foo" fails.. then there is no use in finding out if the other segments parse ok -- because they are likely to be garbage. Maybe it failed because it got the string "FOo" instead of "Foo", but more likely it got something completely unrelated like, /bar/c/2.4.So, perhaps Either is a better choice even with out considering dependencies... I think that Applicative / Alternative instances for Either are only defined in transformers in the Control.Monad.Error module -- which is a bit annoying. But we don't actually need those to implement urlt itself.This brings up another detail though.the fromPathSegments / Consumer stuff is basically implementing a parser. Except, unlike something like parsec, we do not keep track of the current position for reporting errors. I wonder if we should perhaps use a slightly richer parser environment. Within a web app, once you got your to/from instances debugged, you will never get a parse error, so having great error messages is not essential. But, for other people linking to your site it could be potentially helpful. Though, it seems like the current error messages out to be sufficient given how short the urls are..
I'm not certain what exactly the type of ErrMsg should be here; I don't really have a problem using [String], which would be close to the definition of Failing.* I think it's very important to allow users to supply customized 404 pages. Essentially, we need to augment handleWai (possibly others) with a (ErrMsg -> Application) parameter.Yeah, there are (at least) two possibilities, add an extra param for the handler. Or bubble the error up to the top:handleWai_1 :: (url -> String) -> (String -> Failing url) -> String -> ([ErrorMsg] -> Application) -> ((url -> String) -> url -> Application) -> ApplicationhandleWai_1 fromUrl toUrl approot handleError handler =\request ->do let fUrl = toUrl $ stripOverlap approot $ S.unpack $ pathInfo requestcase fUrl of(Failure errs) -> handleError errs request(Success url) -> handler (showString approot . fromUrl) url requesthandleWai_2 :: (url -> String) -> (String -> Failing url) -> String -> ((url -> String) -> url -> Application) -> (Request -> IO (Failing Response))handleWai_2 fromUrl toUrl approot handler =\request ->do let fUrl = toUrl $ stripOverlap approot $ S.unpack $ pathInfo requestcase fUrl of(Failure errs) -> return (Failure errs)(Success url) -> fmap Success $ handler (showString approot . fromUrl) url requestThe second choice is perhaps more flexible. Which do you prefer? In the first option, the handleError function could be a Maybe value -- and if you supply Nothing you get some default 404 page?
In happstack we have a third possiblity. The ServerMonad is an instance of MonadPlus so we can throw out the error message and just call mzero:implSite :: (Functor m, Monad m, MonadPlus m, ServerMonad m) => String -> FilePath -> Site url String (m a) -> m aimplSite domain approot siteSpec =do r <- implSite_ domain approot siteSpeccase r of(Failure _) -> mzero(Success a) -> return a
implSite_ :: (Functor m, Monad m, MonadPlus m, ServerMonad m) => String -> FilePath -> Site url String (m a) -> m (Failing a)implSite_ domain approot siteSpec =dirs approot $ do rq <- askRqlet pathInfo = intercalate "/" (rqPaths rq)f = runSite (domain ++ approot) siteSpec pathInfocase f of(Failure errs) -> return (Failure errs)(Success sp) -> Success <$> (localRq (const $ rq { rqPaths = [] }) sp)then we can do:msum [ implSite "domain" "approot" siteSpec, default404]if implSite calls mzero, then the next handler (in this case default404) is tried.* It might be nice to have "type WaiSite url = Site url String Application". By the way, are you certain you want to allow parameterization over the pathInfo type?I'm not certain I don't want to allow it... I have a vague notion that I might want to use Text sometimes instead of String. Though if I was really committed to that then I should make toPathInfo and fromPathInfo parameterized over pathInfo as well... So perhaps I will axe it from Site for now. I need to change the name of that type and it's record names too I think.
The only packages that I feel qualified to speak about then are urlt and urlt-wai, and my recommendation would be:urlt contains decode/encodePathInfo, PathInfo class and related functions, Site and related functions. If you agree on allowing the parameterization of 404 errors, then also provide a default 404 error.urlt-wai contains WaiSite, handleWai and related functions.Yeah, that is what I was thinking. urlt would contain what is currently in;URLT.BaseURLT.PathInfoURLT.HandleTURLT.MonadURLT.QuickCheck
QuickCheck module does not actually depend on QuickCheck, which is nice because QC1 vs QC2 is a big problem right now.
If I switch to Either instead of Failing I believe the dependencies would be:base, Consumer, template-haskell, network, utf8-stringurlt-wai would just include:URLT.Wai