
On Aug 29, 2011 9:39 PM, "Michael Snoyman" wrote:
On Mon, Aug 29, 2011 at 2:21 PM, Gregory Collins wrote:
On Mon, Aug 29, 2011 at 10:08 AM, Michael Snoyman wrote:
Hi all,
Erik just opened an issue on Github[1] that affected me very recently as well when writing some automated Hackage checking code[2]. The issue is that http-enumerator sees the content-encoding header and decompresses the tarball, returning an uncompressed tar file. I can avoid this with rawBody = True, but that's not a real solution, since that also disables chunked response handling.
A web server should not be setting "Content-encoding: gzip" on a .tar.gz file. I agree that http-enumerator is correctly following the spec by decompressing.
If you decide to implement a workaround for this, the only reasonable thing I can think of is adding an "ignoreContentEncoding" knob the user can twiddle to violate spec.
I'm wondering what the most appropriate way to handle this is. Maybe a dontDecompress record, looking like:
type ContentType = ByteString
dontDecompress :: ContentType -> Bool
Then browser behavior would be:
browserDecompress = (== "application/x-tar")
and current behavior would be:
defaultDecompress = const False
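To make the proposal concrete, here is a minimal self-contained sketch of how such a predicate might be consulted. The `DontDecompress` alias and the `shouldDecode` helper are hypothetical names introduced only for illustration; they are not part of http-enumerator, and the real check inside the library may look quite different.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.ByteString (ByteString)

type ContentType = ByteString

-- | Hypothetical alias for the proposed record field: given the response's
-- Content-Type, return True to leave the body compressed.
type DontDecompress = ContentType -> Bool

-- | Browser-like behavior: never decompress tarballs.
browserDecompress :: DontDecompress
browserDecompress = (== "application/x-tar")

-- | Current behavior: always decompress when the server says gzip.
defaultDecompress :: DontDecompress
defaultDecompress = const False

-- | Hypothetical helper (an assumption, not library code): decode only when
-- the Content-Encoding header is gzip and the policy does not veto it.
shouldDecode :: DontDecompress -> Maybe ByteString -> ContentType -> Bool
shouldDecode policy contentEncoding contentType =
    contentEncoding == Just "gzip" && not (policy contentType)
```

With this shape, chunked handling is untouched; only the gzip step is skipped, which is the part rawBody cannot do on its own.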
I don't have any strong opinions here...
I agree with Gregory's suggestion of an API that allows an application to see the data prior to decoding the Content-Encoding. It could be tagged with the name of the content-coding, and there could be a generic decode function (i.e. the library already knows what needs to be done to decode, so there's no need for the application to go looking up the decode function by name).
Conrad.
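A rough sketch of what a tagged body with a generic decode function could look like. Every name here (`EncodedBody`, `codingName`, `rawBytes`, `decoder`, `decodeBody`, `identityBody`) is hypothetical and made up for illustration. The assumption is that the library fills in the decoder when it builds the value, so the application never has to map a coding name to a function itself.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.ByteString (ByteString)
import qualified Data.ByteString as B

-- | A response body as received, before any Content-Encoding is undone.
-- Hypothetical type, not part of any existing library.
data EncodedBody = EncodedBody
    { codingName :: ByteString                -- content-coding tag, e.g. "gzip"
    , rawBytes   :: ByteString                -- bytes exactly as the server sent them
    , decoder    :: ByteString -> ByteString  -- supplied by the library, not the caller
    }

-- | Generic decode: apply whatever decoder the library attached.
decodeBody :: EncodedBody -> ByteString
decodeBody body = decoder body (rawBytes body)

-- | A body with no content-coding applied; decoding is a no-op.
identityBody :: ByteString -> EncodedBody
identityBody bytes = EncodedBody "identity" bytes id
```

An application that wants the original .tar.gz would write out `rawBytes` unchanged, while one that wants the decoded content would call `decodeBody`, and in neither case does it need to inspect `codingName` to pick a function.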