On the state of Haskell web frameworks

Hi all,
Just subscribed, and after reading through the recent traffic on this list in the archives I have a couple of quick comments.
First, and I've discussed this with Michael Snoyman at length in private, I agree 100% with Chris Eidhof when he writes "I don't believe that there could be one big framework for everybody." Even getting consensus on what the prerequisites would be is basically impossible. To quote Gour:
Although it might be great framework, I'll be frank to say that Happstack with all technology that goes along is something which is either too complicated for me, not practical to be run on shared hosting (e.g. Webfaction), not documented properly and it belongs to 'noSQL' which is not for me at the moment.
That's why I'm interested to discuss possibility to somehow consolidate Haskell's web frameworks and provide something which is good for practical development, does not require VPS to be used, nice documentation etc.
A priori we've reached a fundamental disagreement about first principles
-- personally I couldn't give a rat's ass whether the framework I'm
using runs well on shared hosting, my focus is on being able to
efficiently service heavy loads in a production context. Obviously, on
the other side of the coin, being able to run in a CGI process hooked up
to Apache is really important to many people.
Michael wants to use YAML to define data models: more power to him, but
personally the idea of doing that makes me want to vomit. I'm not saying
that to disparage him at all, the idea isn't necessarily bad on its own
merits: it just isn't for me. Everyone has their own pet preferences.
I'm seeing a lot of energy being dedicated to standardizing interfaces,
which I suppose is an admirable goal, but to me it's putting the cart
before the horse: what's the point of having a standardized adapter when
there's nothing worth plugging into it, besides some middleware that
your framework should be providing for you anyways? Concentrating too
much on interop right now is a waste of time IMO - besides, on the level
that it's currently being discussed, that stuff is *easy*: making an
adapter between any two of the myriad different web stack
implementations (WAI, Happstack, Hyena, CGI, Hack, FastCGI, etc etc etc)
is trivial. I doubt there's anyone on this list who couldn't cook up a
WAI/CGI gateway in an afternoon.
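To put a rough shape on that claim: the whole job is basically the following untested sketch, where the Application type is a stand-in for whichever interface you're adapting, not the real WAI.

{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Char8 as B
import System.Environment (lookupEnv)
import Data.Maybe (fromMaybe)

-- Stand-in application type: method, path and body in; status code,
-- headers and body out.
type Application =
  B.ByteString -> B.ByteString -> B.ByteString
  -> IO (Int, [(B.ByteString, B.ByteString)], B.ByteString)

-- Run an Application as a CGI program: the request arrives via
-- environment variables and stdin, the response leaves on stdout.
runCGI :: Application -> IO ()
runCGI app = do
  method <- envOr "REQUEST_METHOD" "GET"
  path   <- envOr "PATH_INFO" "/"
  body   <- B.getContents
  (status, headers, out) <- app method path body
  -- A real gateway would append the reason phrase to the Status line.
  B.putStr (B.pack ("Status: " ++ show status ++ "\r\n"))
  mapM_ (\(k, v) -> B.putStr (B.concat [k, ": ", v, "\r\n"])) headers
  B.putStr "\r\n"
  B.putStr out
  where
    envOr name def = fmap (B.pack . fromMaybe def) (lookupEnv name)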
We need pluggable *applications* (blog, CMS, RSS feed generation,
administrative panels, forum, wiki, caching, user management, etc) that
understand how to talk to each other --- expecting "plug and play"
compatibility between frameworks on this level, when there's no
consensus on the *primitives*, is a pipe dream. The first framework that
cracks this particular nut, in a way that makes it easy and pleasurable
for people to build web apps that perform, is going to gain a lot of
traction. Code talks.
G
--
Gregory Collins

On Mon, Mar 15, 2010 at 5:38 PM, Gregory Collins wrote:
We need pluggable *applications* (blog, CMS, RSS feed generation, administrative panels, forum, wiki, caching, user management, etc) that understand how to talk to each other --- expecting "plug and play" compatibility between frameworks on this level, when there's no consensus on the *primitives*, is a pipe dream. The first framework that cracks this particular nut, in a way that makes it easy and pleasurable for people to build web apps that perform, is going to gain a lot of traction. Code talks.
I have been working on this in the happstack framework. Specifically, the idea of making it possible for you to write a new application by combining several libraries written by different authors.
The first issue is that the libraries need to have unique URLs that cannot collide with each other. If two libraries expect to be able to handle "http://localhost/submit", that will never fly. So one aim of URLT was to ensure that when you combined different modules together, the URLs remained unique. http://src.seereason.com/~jeremy/SimpleSite1.html
Another issue is that each library has its own set of state that it needs to manage. happstack-state is pretty well situated in that regard. The happstack state is made up of an arbitrary number of state Components. Each library can have its own state Components which get added to the global state.
The next big problem I have run into is user management / security. Each of the libraries may have different requirements regarding permissions, etc. But I am not clear how much independence they can have from the user management system. I don't really want to take a one-size-fits-all user management approach though if I don't have to.
- jeremy
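P.S. To make the URL-uniqueness idea concrete, here is a toy version of the approach -- not URLT's actual API, just the shape of it:

-- Each component defines its own URL type...
data BlogURL  = BlogHome | BlogPost Int
data ForumURL = ForumHome | ForumSubmit

-- ...and the site nests them in a parent type, so every component's
-- URLs live under a distinct prefix.
data SiteURL = Blog BlogURL | Forum ForumURL

renderBlog :: BlogURL -> String
renderBlog BlogHome     = "/"
renderBlog (BlogPost n) = "/post/" ++ show n

renderForum :: ForumURL -> String
renderForum ForumHome   = "/"
renderForum ForumSubmit = "/submit"

-- "/blog/..." and "/forum/..." can never collide, because a component
-- only ever renders URLs under its own constructor's prefix.
renderSite :: SiteURL -> String
renderSite (Blog u)  = "/blog"  ++ renderBlog u
renderSite (Forum u) = "/forum" ++ renderForum u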

Jeremy Shaw writes:
On Mon, Mar 15, 2010 at 5:38 PM, Gregory Collins wrote:
We need pluggable *applications* (blog, CMS, RSS feed generation, administrative panels, forum, wiki, caching, user management, etc) that understand how to talk to each other --- expecting "plug and play" compatibility between frameworks on this level, when there's no consensus on the *primitives*, is a pipe dream. The first framework that cracks this particular nut, in a way that makes it easy and pleasurable for people to build web apps that perform, is going to gain a lot of traction. Code talks.
I have been working on this in the happstack framework. Specifically, the idea of making it possible for you to write a new application by combining several libraries written by different authors.
The first issue is that the libraries need to have unique URLs that cannot collide with each other. If two libraries expect to be able to handle "http://localhost/submit", that will never fly.
We're planning on handling this in Snap by doing what Java servlets do: you hang your component (our limp-pun code name for these is "Snaplets") somewhere on a portion of the URL tree. If you hang a servlet on "/foo" and give it request "/foo/frobnicate", the server tells you the servlet context path is "/foo" and the request path is "/frobnicate". To me this is a smart and sensible approach.
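In toy form the dispatch is something like this (untested, and the names are mine, not Snap's):

import Data.List (stripPrefix)

-- A component hung at a prefix sees the prefix as its context path and
-- the remainder as its request path.
data ComponentRequest = ComponentRequest
  { contextPath :: String  -- e.g. "/foo"
  , requestPath :: String  -- e.g. "/frobnicate"
  }

-- Dispatch an incoming path to a component mounted at a prefix;
-- Nothing means "not ours".
mount :: String -> (ComponentRequest -> IO a) -> String -> Maybe (IO a)
mount prefix component path = do
  rest <- stripPrefix prefix path
  return (component (ComponentRequest prefix rest))

So mount "/foo" servlet "/foo/frobnicate" hands the servlet a context path of "/foo" and a request path of "/frobnicate".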
So one aim of URLT was to ensure that when you combined different modules together, the URLs remained unique.
I'm not sure I'd want to use this; there's too much data being encoded
at the type level here. I don't understand the fetish for wanting to
verify links with a typechecking scheme anyhow; if you make the
simplifying assumption that component authors can keep it together
enough to keep their intra-component links organized, the only problem
you really need to worry about is linking from one component into
another. (I.e. "zazzle 1.7 broke all my internal links!")
To me it seems more natural to solve that issue by asking zazzle to give
you a link to one of its resources; i.e. take a pull vs. a push
approach. If the code to pull the link to zazzle goes missing it will be
obvious and you can fail loudly instead of silently. I don't know about
you guys, but I think that level of assistance is enough for me -- do I
have to run my website through a proof assistant?
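In code, the pull approach is just this sort of thing (zazzle being hypothetical, obviously):

-- Exported by the zazzle component: the sole authority on its URLs.
data ZazzleResource = ZazzleIndex | ZazzleItem Int

zazzleLink :: ZazzleResource -> String
zazzleLink ZazzleIndex    = "/zazzle/"
zazzleLink (ZazzleItem n) = "/zazzle/item/" ++ show n

-- Another component pulls the link at render time. If zazzle 1.7
-- renames or drops ZazzleItem, this fails loudly at compile time
-- instead of silently emitting a dead link.
profilePage :: Int -> String
profilePage n =
  "<a href=\"" ++ zazzleLink (ZazzleItem n) ++ "\">my zazzle stuff</a>"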
G
--
Gregory Collins

On Tue, Mar 16, 2010 at 9:38 AM, Gregory Collins wrote:
We need pluggable *applications* (blog, CMS, RSS feed generation, administrative panels, forum, wiki, caching, user management, etc) that understand how to talk to each other --- expecting "plug and play" compatibility between frameworks on this level, when there's no consensus on the *primitives*, is a pipe dream. The first framework that cracks this particular nut, in a way that makes it easy and pleasurable for people to build web apps that perform, is going to gain a lot of traction. Code talks.
Several years ago (mid-2007) I was looking for something like this. I found Bjorn Bringert's HOPE: http://hope.bringert.net/ (err, that seems broken now.) In essence, it is a framework built from a bunch of standard Haskell libs, for better or worse. It uses CGI/FCGI to talk to a web server.
I dropped the HaskellDB dependency (roughly because it was too much effort to get it to scale up to thousands of items - I really needed the database to help with pagination), added multi-lingual support, accessibility, a primitive forum, web polls, google services, etc. for this website: http://drdvietnam.com/
This is not a high-traffic site - perhaps up to a thousand hits a day. For this reason I am sure there are many memory leaks, etc. in my version of HOPE.
Why is it not on Hackage? Well, it only compiles with 6.8.x due to the capricious changes to the exception libraries in 6.10.x, and moreover I adjusted some of the base libraries (xhtml, cgi, HSQL, etc) in ways that I doubt everyone would like. (Am I the only person who forks almost every library they use?)
So if you're in the market for a CMS-alike and don't want to reinvent the wheel, HOPE might be a good starting point. Bjorn solved a lot of problems - modularity, extensibility, packaging with Cabal, routing URLs, MVC, user management with role-based capabilities, etc. - in highly pragmatic and occasionally beautiful ways. :-) The darcs repos are here: http://peteg.org/haskell/
BTW if you're in this business please take Unicode seriously! Haskell libraries interfacing with C were terrible on this front when I was doing this work.
cheers
peter
--
http://peteg.org/

On Mon, Mar 15, 2010 at 5:38 PM, Gregory Collins wrote:
I'm seeing a lot of energy being dedicated to standardizing interfaces, which I suppose is an admirable goal, but to me it's putting the cart before the horse: what's the point of having a standardized adapter when there's nothing worth plugging into it, besides some middleware that your framework should be providing for you anyways?
In the case of happstack, I have felt that the lazy I/O based server backend would ultimately be a liability, and that we ultimately need to be able to use something enumerator based. Additionally, the lazy I/O backend part of happstack-server is the least interesting part of the project. It just takes extra resources to maintain, but provides nothing especially unique.
Moving to WAI should put us in a much better position to use a fast enumerator based backend -- and without having to write one ourselves. Plus it should make it easy for people to use CGI, FastCGI, etc. We have some adapters for that already, but the WAI solution seems cleaner. If you want to use happstack-state, then you still need a process that runs full-time, but if you just want to use the ServerMonad stuff provided in happstack-server, then CGI should be fine.
So from the point of view of a framework developer, it's not 'the middleware that your framework should be providing', but rather 'not having to reimplement existing middleware when I could be doing something cooler'. And it also means that we can more easily share pieces of our framework with others. Many people are not interested in happstack-state, but might be interested in happstack-server. So, making it easier for them to use just that little piece is good for us.
There is certainly a point where we won't see much increased returns from standard interfaces though.
- jeremy

On Mon, Mar 15, 2010 at 2:38 PM, Gregory Collins wrote:
[...]
We need pluggable *applications* (blog, CMS, RSS feed generation, administrative panels, forum, wiki, caching, user management, etc) that understand how to talk to each other --- expecting "plug and play" compatibility between frameworks on this level, when there's no consensus on the *primitives*, is a pipe dream. The first framework that cracks this particular nut, in a way that makes it easy and pleasurable for people to build web apps that perform, is going to gain a lot of traction. Code talks.
Maybe I'm misunderstanding you, but your whole e-mail seems like a large contradiction. But firstly, I believe your premise is mistaken: we're not trying to "standardize interfaces" for the most part; it's true that WAI was such an approach, but that's not the focus of this thread.
Michael

Michael Snoyman writes:
[F]irstly, I believe your premise is mistaken: we're not trying to "standardize interfaces" for the most part; it's true that WAI was such an approach, but that's not the focus of this thread.
If it's not the focus, it's a recurring theme, especially in the context of multiple active efforts in that direction and postings on the list like this:
The Haskell language itself was created to consolidate fragmented market of different FP languages. Are we ready to follow our 'ancestors' and in their spirit do the same when it comes to web-framework(s)?
OK, obviously we're all on the same "team" here, we're all drinking the same kool-aid and there should be plenty of things we can agree upon: Haskell's not tremendously popular, there are only so many decent Haskell hackers, it would be good for us to pool our efforts and work towards common purposes. And there IS a lot of interest in and support of the idea of standardizing a web application interface for Haskell: I've seen lots of positive chatter, and off the top of my head I can think of at least three on Hackage: yours, Hack, and CGI.
CGI has a historical reason to exist (it understands the CGI protocol) but the other two are "just plumbing". Which is fine! Plumbing is great, life would be really horrible without it. But in our case, I don't think it's necessary right now: we don't have good toilets or sewers! The marginal utility of producing a WAI/Hack adapter for our toolkit at this point in time is basically zero: we're talking about spending time making it easier to interoperate with *applications that don't exist*. In fact, the only ones I can even think of at the moment are gitit, and MAYBE the Hackage server. Chicken/egg, right?
you claim that anyone could write an adapter between two interfaces; look at the code for hack-handler-hyena and you'll understand the folly of that sentiment.
Are you kidding? It's 164 lines of not-that-hairy code, a full third of which is a status code lookup table. (OK, I'm going to try to be less argumentative :) )
But more to the point: because of WAI, we don't *have* to write the adapter,
This is the "M*N" VS "M+N" argument that the Rack guys make, which would be a perfectly good argument except that for us (pretty much) M=N=2. Keep in mind that WSGI (and to a lesser extent, this applies to rack) was created in the context of a bunch of webserver backends and more than half a dozen huge Python web toolkits, any one of which probably had more person-hours invested in it than all of our stuff put together.
and I don't see any substantive critiques of the WAI approach. (If there are, we can address them by fixing the WAI.)
...and I just said I was going to try to be less argumentative. Your iteratee formulation is wrong, for starters. You picked the one the current Hyena is using, which Johan is currently in the process of abandoning because the other way is clearly better -- an explicit existential variable for the state is not as clean as the other formulation, which pushes that goop down under the lambda, into a continuation. The Iteratee type from the iteratee library also has a convenient Monad instance (a big win!). I don't see that one is possible here, especially since there's no way that I can see for your iteratee to partially consume an input; how can you compose them? (I sketch both formulations in code below.)
The Source thing is a little wacky, it should be an Enumerator instead. Again, the point of these things is that they *compose*; iteratees, which consume chunks of data, plug into enumerators, which produce them. So the request body should be an enumerator, and you process it by providing it an iteratee to read the data; the response body is an enumerator the user fills out, which plugs into an iteratee that splats it out the socket.
Speaking of which, you don't define an iteratee type; given your formulation it should be:
type Iteratee a = a -> ByteString -> IO (Either a a)
or more generally
type IterateeM m a = a -> ByteString -> m (Either a a)
which is exactly the Hyena definition.
The Status datatype is a mess. The requestHeader field should probably be a Map instead of an alist. RequestHeader and ResponseHeader don't need to have all of those header constructors in the datatype like that; as a result you have "ReqContentLength", which is awkward to say the least. Not to mention the fact that web application code should never have to get its hands dirty dealing with content-lengths for requests anyways; that should be handled on the enumerator level.
Let's say I use the generic "RequestHeader ByteString" constructor. HTTP header names are supposed to be case-insensitive, and your Eq instance does a case-sensitive comparison: 'RequestHeader "x-foo" /= RequestHeader "X-Foo"', which is wrong; now I can't use "lookup" to pick a header out of the alist.
Request has an entry for remoteHost but not remotePort. Request contains no cookies, query parameters, or POST parameters; I'm supposed to parse them myself? Users of Java servlets are snickering.
I'm really not trying to badger you here (honest), I'm being hyperbolic to make a point: for nearly all of the criticisms I just made, there's room for a legitimate disagreement about what is the "best" way to approach the solution; lots of people wouldn't want to use iteratees at all, for instance! I also wouldn't expect you to rearrange the whole thing to cater to my whims or anyone else's, either.
I think that if we're going to standardize, one of the following preconditions has to hold:
* the standard is superior on engineering merits, solves concrete problems/enables you to get work done, and is agreeable enough to enough people that it picks up community momentum by wide consensus.
* the standard is imposed by fiat, maybe because it comes from the language vendor.
* the standard comes out of a standardization committee effort which hammers out disagreements over a period of time to produce something which everyone can stomach --- I'd actually be pretty okay with this if it happened, but there's a reason "design by committee" is a pejorative.
* the standard comes about as a result of market forces, i.e. blub toolkit becomes really popular and a standard emerges because everyone wants to interoperate with blub.
Honestly, I just don't see any of these things happening, not right now.
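Here are the two formulations side by side, as promised -- an untested sketch, and the definitions are mine, not Hyena's or the iteratee library's actual exports:

import Data.ByteString (ByteString)

-- Formulation 1 (explicit state): the caller threads the accumulator
-- through every chunk. Left means "done", Right means "feed me more".
-- There's no leftover input, hence no partial consumption and no
-- obvious Monad instance.
type IterateeM m a = a -> ByteString -> m (Either a a)

-- Formulation 2 (state under the lambda, as a continuation): Done
-- carries the unconsumed remainder of the last chunk, which is exactly
-- what makes partial consumption -- and monadic sequencing of
-- iteratees -- possible.
data Step m a = Done a ByteString
              | Continue (ByteString -> m (Step m a))

newtype Iteratee m a = Iteratee { runIteratee :: ByteString -> m (Step m a) }

-- An enumerator is a chunk producer: it drives an iteratee to a step.
type Enumerator m a = Iteratee m a -> m (Step m a)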
Now, let me try to understand your point here: we need standardized interfaces for applications to communicate with each other, but we can't do that until we have the low level interfaces hammered out.
No, you're misrepresenting me here. What I said is that "we need pluggable applications that understand how to talk to each other", NOT that those interfaces should be subject to standardization. There's a critical distinction there; you don't expect Django components to plug into a Zope app.
On the other hand, you say that it's premature to try to standardize interfaces. And then you make it clear that the use case of many people (shared hosting) doesn't interest you at all; based on previous conversations, I believe this means that Snap will have *no* (Fast)CGI support.
I wouldn't say that's necessarily true in the long term, although it's probably true that I myself am not likely to write it. FastCGI support is definitely not important to me, but if someone wanted it badly enough there's nothing in Snap (as of this moment) which would preclude her from writing whichever backend she wanted: CGI, WAI, Hack, whatever. That said, I'd be 100% a-ok with sacrificing CGI support specifically, because spawning a fresh process every time you want to service a request is, frankly, dumb and absurdly limiting. It isn't 1996 anymore.
I think (correct me if I'm wrong) that everyone else here is in agreement that the goal should not be to create a great framework that will solve everyone's needs.
I do agree with this; my point all along has been that no one web library is going to satisfy everyone any more than one parsing combinator library or one regular expression library or one database could. I'm aiming to help create a great framework that satisfies my needs, and if it works for other people too: great.
The goal should be to create great libraries that can be used independently.
...but not necessarily this, at least not in the broader context of
"people writing web applications". Obviously I agree in general terms:
self-contained, single-purpose libraries are great. There's no reason to
have a bunch of competing libraries to parse and output RSS, for
instance.
A productive *web framework*, on the other hand, by its nature makes a
whole set of "convention over configuration" assumptions about how
requests are handled and routed, how data is accessed, how users are
managed, how to hook into an admin panel, how templates are laid out on
disk, which template system to ship out of the box, how components are
specified, how they talk to each other etc etc etc. I see the future
being a lot like the present, where a heterogeneous collection of
frameworks of varying comprehensiveness and quality provide their own
self-contained ecosystems.
G.
--
Gregory Collins

On 2010-03-16 01:04 -0400 (Tue), Gregory Collins wrote:
And there IS a lot of interest in and support of the idea of standardizing a web application interface for Haskell...
Speaking as someone who spent ten years doing more web applications than
he cares to think about, and having written a web application framework,
I'd like to make a couple of points about this.
I think that the ideal "interface" is something that gives you a web
request extremely close to exactly how it came in to the web server,
and lets you send what is essentially a raw response. This allows
developers to work with that directly, if they choose, or use a library
that interfaces with that, does whatever parsing and other things it
cares to, and presents some other interface on the other side of that.
If the world could work in such a way we'd probably have several
different libraries to provide various interfaces, including a CGI-like
interface, and the world would be a happy place.
Unfortunately, the world doesn't actually work that way. Interfaces (and
their associated protocols) such as CGI, SCGI, FastCGI and AJP don't
present the raw web request; instead they do a large amount of parsing
and modification of the request; exactly how this is done varies not
only with the implementation but with the web server configuration.
Some of the things passed via this protocol, such as SCRIPT_NAME and
PATH_TRANSLATED, often don't make sense in the context of a single
application running behind a web server. Others, such as PATH_INFO,
will have different values for the same URL depending on the web server
configuration. Other information is missing entirely. For example, most
of these protocols don't even give you the URL that was used; if you
want it you have to try to reconstruct it using your best guess about the
web server configuration.
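For example, the usual best guess looks something like this untested sketch, where every step is a guess about the web server's configuration:

import System.Environment (lookupEnv)
import Data.Maybe (fromMaybe)

guessURL :: IO String
guessURL = do
  https  <- lookupEnv "HTTPS"
  host   <- fmap (fromMaybe "localhost") (lookupEnv "SERVER_NAME")
  port   <- fmap (fromMaybe "80")        (lookupEnv "SERVER_PORT")
  script <- fmap (fromMaybe "")          (lookupEnv "SCRIPT_NAME")
  path   <- fmap (fromMaybe "")          (lookupEnv "PATH_INFO")
  query  <- fmap (fromMaybe "")          (lookupEnv "QUERY_STRING")
  let scheme = if https == Just "on" then "https" else "http"
      -- Guess: hide the port when it looks like a scheme default.
      authority | port `elem` ["80", "443"] = host
                | otherwise                 = host ++ ":" ++ port
      qs = if null query then "" else '?' : query
  return (scheme ++ "://" ++ authority ++ script ++ path ++ qs)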
Unfortunately, I don't really have a good solution to this. The best I
can come up with is to have a simple, standard interface that delivers
essentially the raw request for use with web servers written in Haskell
and suggest that anybody who wants to use a different web server just
use it as a proxy. It would be possible to provide libraries that try to
take, e.g., a FastCGI request and turn it back into the raw request, but
this can't be done very reliably.
cjs
--
Curt Sampson

Curt Sampson writes:
I think that the ideal "interface" is something that gives you a web request extremely close to exactly how it came in to the web server, and lets you send what is essentially a raw response. This allows developers to work with that directly, if they choose, or use a library that interfaces with that, does whatever parsing and other things it cares to, and presents some other interface on the other side of that.
If the world could work in such a way we'd probably have several different libraries to provide various interfaces, including a CGI-like interface, and the world would be a happy place.
Unfortunately, the world doesn't actually work that way. Interfaces (and their associated protocols) such as CGI, SCGI, FastCGI and AJP don't present the raw web request; instead they do a large amount of parsing and modification of the request; exactly how this is done varies not only with the implementation but with the web server configuration.
I think I agree with you completely. The question is: where do you draw
the lines? You probably want transfer-encodings handled for you, but
what about POST data? In Snap right now we detect requests with mime
type "application/x-www-form-urlencoded" and parse them for you -- IMO
why on earth would you want anything else? But I'm sure there are people
who wouldn't be comfortable with that behaviour.
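For the record, the parse in question is nothing deep; a toy version, ignoring %XX escapes and '+'-for-space, is just:

-- Split "a=1&b=2" into [("a","1"),("b","2")].
parseFormBody :: String -> [(String, String)]
parseFormBody = map pair . splitOn '&'
  where
    pair s = let (k, v) = break (== '=') s in (k, drop 1 v)
    splitOn c s = case break (== c) s of
      (chunk, [])       -> [chunk]
      (chunk, _ : rest) -> chunk : splitOn c rest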
Another example: we treat responses differently based on whether you
explicitly set a content length or not. If the connection is HTTP/1.0,
and you don't set a content length, we have to set "Connection: close"
in the response headers and close the socket when we're finished sending
the response, because otherwise the client has no way of knowing when
the stream ends. If the connection is HTTP/1.1 and there's no content
length, we set "Transfer-encoding: chunked" and encode the response body
accordingly. If there is a content-length, we use it, because it's a few
bytes cheaper.
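In decision-table form, the logic is (a sketch of the rule, not our actual code):

data Framing = ContentLength Int  -- use the length the handler set
             | Chunked            -- HTTP/1.1: Transfer-Encoding: chunked
             | CloseDelimited     -- HTTP/1.0: Connection: close, EOF ends it

chooseFraming :: (Int, Int)  -- HTTP version, e.g. (1,0) or (1,1)
              -> Maybe Int   -- content length, if the handler set one
              -> Framing
chooseFraming _      (Just n) = ContentLength n  -- a few bytes cheaper
chooseFraming (1, 0) Nothing  = CloseDelimited   -- client reads until close
chooseFraming _      Nothing  = Chunked          -- 1.1 can self-delimit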
How much of that sort of logic goes into the interface layer? What
guarantees does an application written on (for example) the WAI
interface receive about what the web server will or will not touch?
G.
--
Gregory Collins

On 2010-03-16 02:03 -0400 (Tue), Gregory Collins wrote:
I think I agree with you completely. The question is: where do you draw the lines?
I like to draw the line, as best as possible, at "do the minimal possible processing to the request" and let users choose a library to process the request and handle as much as possible. Let me expand on your examples, offering reasonable situations in which one wants the opposite behaviour of your (also reasonable!) situations.
You probably want transfer-encodings handled for you....
Let's say on my server I store a lot of static content in gzip'd form. In this case, I certainly want to be handling the transfer-encoding myself, because I don't want to uncompress my compressed copy to send it to the server, only so that the server can compress it again as it sends it.
Another example: we treat responses differently based on whether you explicitly set a content length or not. If the connection is HTTP/1.0, and you don't set a content length, we have to set "Connection: close" in the response headers and close the socket when we're finished sending the response....
Another approach would be to have the interface library buffer the response as it's being generated and, when it's complete, calculate the content length and pass it on to the web server. In the case of a web server doing a lot of relatively small requests, this would probably be considerably more efficient. But of course you still need the ability to send out the response as it's being generated for large responses that quickly produce some data, but take a relatively long time to produce all of the data.
How much of that sort of logic goes into the interface layer?
I'm not sure what you mean by "interface layer," but I feel that, as
much as possible, that sort of logic should be in standalone libraries
that go between the "standard" web server interface and the user
application.
cjs
--
Curt Sampson

Curt Sampson writes:
Let's say on my server I store a lot of static content in gzip'd form. In this case, I certainly want to be handling the transfer-encoding myself, because I don't want to uncompress my compressed copy to send it to the server, only so that the server can compress it again as it sends it.
I think you're confusing content-encoding and transfer-encoding here. Servers shouldn't be arbitrarily mucking around with your content-encoding.
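Concretely, in sketch form: a pre-gzipped file goes out with its content-encoding intact, and transfer-encoding is a separate hop-by-hop layer the server can add or strip without touching those bytes.

-- Headers for a file stored gzip'd on disk. Content-Encoding describes
-- the entity itself and travels end to end, untouched by the server.
preGzippedHeaders :: [(String, String)]
preGzippedHeaders =
  [ ("Content-Type",     "text/html")
  , ("Content-Encoding", "gzip")  -- stored gzip'd; sent as-is
  ]
-- The server may still frame delivery with "Transfer-Encoding:
-- chunked"; that layer comes and goes hop by hop and never alters the
-- gzip'd entity bytes.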
Another example: we treat responses differently based on whether you explicitly set a content length or not. If the connection is HTTP/1.0, and you don't set a content length, we have to set "Connection: close" in the response headers and close the socket when we're finished sending the response....
Another approach would be to have the interface library buffer the response as it's being generated and, when it's complete, calculate the content length and pass it on to the web server. In the case of a web server doing a lot of relatively small requests, this would probably be considerably more efficient. But of course you still need the ability to send out the response as it's being generated for large responses that quickly produce some data, but take a relatively long time to produce all of the data.
This breaks "being able to stream in O(1) space", so it's out. Concrete examples: comet, internet radio, etc.
How much of that sort of logic goes into the interface layer?
I'm not sure what you mean by "interface layer," but I feel that, as much as possible, that sort of logic should be in standalone libraries that go between the "standard" web server interface and the user application.
By "interface layer" for the purposes of this conversation I mean
roughly the level of abstraction that Hack/WAI/WSGI/Rack/Java Servlets
are inhabiting.
G
--
Gregory Collins

On 2010-03-16 02:59 -0400 (Tue), Gregory Collins wrote:
I think you're confusing content-encoding and transfer-encoding here.
Yup. Oops, sorry.
Another approach would be to have the interface library buffer the response as it's being generated and, when it's complete, calculate the content length and pass it on to the web server.
This breaks "being able to stream in O(1) space", so it's out. Concrete examples: comet, internet radio, etc.
Having *the option* to do this for efficiency does not break streaming. Only forcing everyone to use this would do that. My point is, if you provide only one way of doing things, you can end up in a situation where some applications will still work, but be a lot more inefficient than they would be otherwise.
By "interface layer" for the purposes of this conversation I mean roughly the level of abstraction that Hack/WAI/WSGI/Rack/Java Servlets are inhabiting.
Ah, well this is what I'm saying: that should ideally be part of a
standalone library that translates between the web application and the
protocol you use to talk to the server.
cjs
--
Curt Sampson

Michael Snoyman writes:
[F]irstly, I believe your premise is mistaken: we're not trying to "standardize interfaces" for the most part; it's true that WAI was such an approach, but that's not the focus of this thread.
If it's not the focus, it's a recurring theme, especially in the context of multiple active efforts in that direction and postings on the list like this:
The Haskell language itself was created to consolidate fragmented market of different FP languages. Are we ready to follow our 'ancestors' and in their spirit do the same when it comes to web-framework(s)?
OK, obviously we're all on the same "team" here, we're all drinking the same kool-aid and there should be plenty of things we can agree upon: Haskell's not tremendously popular, there are only so many decent Haskell hackers, it would be good for us to pool our efforts and work towards common purposes. And there IS a lot of interest in and support of the idea of standardizing a web application interface for Haskell: I've seen lots of positive chatter, and off the top of my head I can think of at least three on Hackage: yours, Hack, and CGI.
CGI has a historical reason to exist (it understands the CGI protocol) but the other two are "just plumbing". Which is fine! Plumbing is great, life would be really horrible without it. But in our case, I don't think it's necessary right now: we don't have good toilets or sewers! The marginal utility of producing a WAI/Hack adapter for our toolkit at this point in time is basically zero: we're talking about spending time making it easier to interoperate with *applications that don't exist*. In fact, the only ones I can even think of at the moment are gitit, and MAYBE the Hackage server. Chicken/egg, right?
I don't think it's really fair to put CGI in the same category, and the Hack/WAI issue is well documented: Hack is simple, WAI is performant. However, let me address the meat of your claim: even though you don't see [...]
On Mon, Mar 15, 2010 at 10:04 PM, Gregory Collins wrote:
you claim that anyone could write an adapter between two interfaces; look at the code for hack-handler-hyena and you'll understand the folly of that sentiment.
Are you kidding? It's 164 lines of not-that-hairy code, a full third of which is a status code lookup table. (OK, I'm going to try to be less argumentative :) )
It wasn't the line count I was talking about: it requires spawning a
separate thread to convert the Enumerator to a lazy bytestring.
But more to the point: because of WAI, we don't *have* to write the adapter,
This is the "M*N" VS "M+N" argument that the Rack guys make, which would be a perfectly good argument except that for us (pretty much) M=N=2. Keep in mind that WSGI (and to a lesser extent, this applies to rack) was created in the context of a bunch of webserver backends and more than half a dozen huge Python web toolkits, any one of which probably had more person-hours invested in it than all of our stuff put together.
I don't really see your point here.
and I don't see any substantive critiques of the WAI approach. (If there are, we can address them by fixing the WAI.)
...and I just said I was going to try to be less argumentative.
Your iteratee formulation is wrong, for starters. You picked the one the current Hyena is using, which Johan is currently in the process of abandoning because the other way is clearly better -- an explicit existential variable for the state is not as clean as the other formulation which pushes that goop down under the lambda, into a continuation. The Iteratee type from the iteratee library also has a convenient Monad instance (a big win!). I don't see that one is possible here, especially since there's no way that I can see for your iteratee to partially consume an input; how can you compose them?
The Source thing is a little wacky, it should be an Enumerator instead. Again, the point of these things is that they *compose*; iteratees, which consume chunks of data, plug into enumerators, which produce them. So the request body should be an enumerator, and you process it by providing it an iteratee to read the data; the response body is an enumerator the user fills out, which plugs into an iteratee that splats it out the socket.
So, instead of speaking rhetoric, can you give a concrete example where the Source is *bad*? It has serious benefits over Enumerator, such as the ability to convert (via unsafeInterleaveIO) into a lazy bytestring.
I'm happy to see a better approach, why not give me a datatype for the "obviously better" Enumerator? (Hint: releasing no code and using phrases like "clearly better" are not going to win any arguments.)
Speaking of which, you don't define an iteratee type, given your formulation it should be:
type Iteratee a = a -> ByteString -> IO (Either a a)
or more generally
type IterateeM m a = a -> ByteString -> m (Either a a)
which is exactly the Hyena definition.
The Status datatype is a mess. The requestHeader field should probably
be a Map instead of an alist. RequestHeader and ResponseHeader don't need to have all of those header constructors in the datatype like that; as a result you have "ReqContentLength" which is awkward to say the least. Not to mention the fact that web application code should never have to get its hands dirty dealing with content-lengths for requests anyways; that should be handled on the enumerator level.
That just shows a misunderstanding on your part: the content-lengths are *not* handled at the application level, though the information is available to the application. Why would I censor it? And would you care to elaborate on "Status datatype is a mess"?
Let's say I use the generic "RequestHeader ByteString" constructor. HTTP header names are supposed to be case-insensitive, and your Eq instance does a case-sensitive comparison: 'RequestHeader "x-foo" /= RequestHeader "X-Foo"', which is wrong; now I can't use "lookup" to pick a header out of the alist.
This is a valid concern. However, my goal in that decision is to keep things as close to the wire as possible. On the wire, x-foo and X-Foo are different. You also are conveniently ignoring all character encoding issues. True, HTTP has a character encoding defined, but do you trust every web client in the world to respect all this?
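One possible compromise -- a sketch, not current WAI code -- is to keep the bytes exactly as they came off the wire but compare names case-insensitively:

import Data.Char (toLower)

-- Store the header name as received...
newtype HeaderName = HeaderName String

-- ...but compare case-insensitively, so looking up (HeaderName "x-foo")
-- still finds an "X-Foo" entry in the alist.
instance Eq HeaderName where
  HeaderName a == HeaderName b = map toLower a == map toLower b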
Request has an entry for remoteHost but not remotePort. Request contains no cookies, query parameters, or POST parameters; I'm supposed to parse them myself? Users of Java servlets are snickering.
Yes, you are. You clearly have a misunderstanding about what WAI is about (or are just trying to attack baselessly): it's meant to represent the raw data.
(And if anyone wants remotePort for any reason, let me know. I have yet to see a need for it, and no one has asked.)
I'm really not trying to badger you here (honest), I'm being hyperbolic to make a point: for nearly all of the criticisms I just made, there's room for a legitimate disagreement about what is the "best" way to approach the solution; lots of people wouldn't want to use iteratees at all, for instance! I also wouldn't expect you to rearrange the whole thing to cater to my whims or anyone else's, either.
I think that if we're going to standardize, one of the following preconditions has to hold:
* the standard is superior on engineering merits, solves concrete problems/enables you to get work done, and is agreeable enough to enough people that it picks up community momentum by wide consensus.
* the standard is imposed by fiat, maybe because it comes from the language vendor.
* the standard comes out of a standardization committee effort which hammers out disagreements over a period of time to produce something which everyone can stomach --- I'd actually be pretty okay with this if it happened, but there's a reason "design by committee" is a pejorative.
* the standard comes about as a result of market forces, i.e. blub toolkit becomes really popular and a standard emerges because everyone wants to interoperate with blub.
Honestly, I just don't see any of these things happening, not right now.
Don't worry, I won't ask you to join the standardization committee ;).
Now, let me try to understand your point here: we need standardized interfaces for applications to communicate with each other, but we can't do that until we have the low level interfaces hammered out.
No, you're misrepresenting me here. What I said is that "we need pluggable applications that understand how to talk to each other", NOT that those interfaces should be subject to standardization. There's a critical distinction there; you don't expect Django components to plug into a Zope app.
On the other hand, you say that it's premature to try to standardize interfaces. And then you make it clear that the use case of many people (shared hosting) doesn't interest you at all; based on previous conversations, I believe this means that Snap will have *no* (Fast)CGI support.
I wouldn't say that's necessarily true in the long term, although it's probably true that I myself am not likely to write it. FastCGI support is definitely not important to me, but if someone wanted it badly enough there's nothing in Snap (as of this moment) which would preclude her from writing whichever backend she wanted: CGI, WAI, Hack, whatever. That said, I'd be 100% a-ok with sacrificing CGI support specifically, because spawning a fresh process every time you want to service a request is, frankly, dumb and absurdly limiting. It isn't 1996 anymore.
I think (correct me if I'm wrong) that everyone else here is in agreement that the goal should not be to create a great framework that will solve everyone's needs.
I do agree with this; my point all along has been that no one web library is going to satisfy everyone any more than one parsing combinator library or one regular expression library or one database could. I'm aiming to help create a great framework that satisfies my needs, and if it works for other people too: great.
The goal should be to create great libraries that can be used independently.
...but not necessarily this, at least not in the broader context of "people writing web applications". Obviously I agree in general terms: self-contained, single-purpose libraries are great. There's no reason to have a bunch of competing libraries to parse and output RSS, for instance.
A productive *web framework*, on the other hand, by its nature makes a whole set of "convention over configuration" assumptions about how requests are handled and routed, how data is accessed, how users are managed, how to hook into an admin panel, how templates are laid out on disk, which template system to ship out of the box, how components are specified, how they talk to each other etc etc etc. I see the future being a lot like the present, where a heterogeneous collection of frameworks of varying comprehensiveness and quality provide their own self-contained ecosystems.
Michael

Michael Snoyman writes:
But more to the point: because of WAI, we don't *have* to write the adapter,
On Mon, Mar 15, 2010 at 10:04 PM, Gregory Collins wrote:
This is the "M*N" VS "M+N" argument that the Rack guys make, which would be a perfectly good argument except that for us (pretty much) M=N=2. Keep in mind that WSGI (and to a lesser extent, this applies to rack) was created in the context of a bunch of webserver backends and more than half a dozen huge Python web toolkits, any one of which probably had more person-hours invested in it than all of our stuff put together.
I don't really see your point here.
I think I would characterize the core difference between Greg's and Michael's approaches to be this:
Michael = Waterfall
Greg = Agile
Although there may also be some minor differences in the scope that they're talking about regarding a WAI.
participants (6)
- Curt Sampson
- Gregory Collins
- Jeremy Shaw
- Michael Snoyman
- MightyByte
- Peter Gammie