parsing exif file: binary, binary-strict, or cereal?

Hello,

I could not find a pure Haskell library to parse EXIF so I wrote one (pretty basic so far):
https://github.com/emmanueltouzery/hsexif

I wrote it with binary-strict. I also considered binary, but I would like the library not to throw an exception if the file that it's given is not a JPEG or is a JPEG without EXIF, but rather return an Either. I didn't manage to do that with binary (I mean I could offer an IO monad wrapper with try*, but having to do that even if you give me a lazy bytestring, when it could be done with pure code only, is annoying).

It annoys me to load the full JPEG in memory just to parse its EXIF though. Then again, with binary I'd have to be pretty careful with strictness annotations and so on, although parsing the EXIF is going to be fast and it won't contain that much data. Still, I want the lazy bytestring holding the JPEG file to be released ASAP. It'd be a shame to have it lying in memory, and to keep the JPEG file open, until the library caller actually makes use of the EXIF contents...

And then I realized that binary-strict was last updated in 2010; it seems cereal is recommended for strict binary file parsing nowadays.

So, what should I do? Leave it as it is? Port to cereal? Port to binary?

Otherwise, I wrote this because I'd like to make a program to parse EXIF files which will also have to run on Windows. The Haskell EXIF parsing library that I could find uses the C library libexif, which is going to be complicated to get running on Windows. Plus it was fun :-)

Thanks,

Emmanuel

Emmanuel Touzery wrote:
> I could not find a pure haskell library to parse EXIF so I wrote one (pretty basic so far): https://github.com/emmanueltouzery/hsexif

Very nice! Why not upload it to hackage to make it easier for others to share and collaborate?

> I wrote it with binary-strict. I also considered binary, but I would like the library not to throw an exception if the file that it's given is not a JPEG or is a JPEG without EXIF, but rather return an Either. I didn't manage to do that with binary

Perhaps the function Data.Binary.Get.runGetOrFail is what you are looking for? Or perhaps the incremental strict interface?

> And then I realized that binary-strict was last updated in 2010, it seems cereal is recommended for strict binary file parsing nowadays. So, what should I do? Leave it as it is? Port to cereal? Port to binary?

The binary-strict library is no longer maintained, since binary now also provides a strict interface. Both binary and cereal are good choices. Probably porting to binary would be easiest if you've already written it for binary-strict, assuming runGetOrFail or the incremental interface does what you want.

Regards,
Yitz
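
For illustration, a minimal sketch of how runGetOrFail can give a pure Either result instead of an exception. The names jpegHeader and parseJpegHeader are hypothetical and not taken from hsexif; the only check done here is for the JPEG start-of-image marker 0xFFD8.

```haskell
import Data.Binary.Get (Get, getWord16be, runGetOrFail)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word16)

-- Hypothetical parser: succeeds only if the input starts with the
-- JPEG start-of-image marker 0xFFD8.
jpegHeader :: Get Word16
jpegHeader = do
  marker <- getWord16be
  if marker == 0xFFD8
    then return marker
    else fail "not a JPEG file"

-- Pure wrapper: a parse failure (wrong marker or too little input)
-- becomes a Left carrying the error message, no IO or try needed.
parseJpegHeader :: BL.ByteString -> Either String Word16
parseJpegHeader input =
  case runGetOrFail jpegHeader input of
    Left (_, _, err)     -> Left err
    Right (_, _, marker) -> Right marker
```

The same wrapper shape should work for a full EXIF parser: only the Get action changes, and the caller pattern-matches on the Either.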

On Sun, Apr 13, 2014 at 4:30 PM, Yitzchak Gale wrote:
> Emmanuel Touzery wrote:
> > I could not find a pure haskell library to parse EXIF so I wrote one (pretty basic so far): https://github.com/emmanueltouzery/hsexif
>
> Very nice! Why not upload it to hackage to make it easier for others to share and collaborate?

Yes, that's the plan as soon as I get the basics right and write some short documentation. Hopefully very quickly.

> > I wrote it with binary-strict. I also considered binary, but I would like the library not to throw an exception if the file that it's given is not a JPEG or is a JPEG without EXIF, but rather return an Either. I didn't manage to do that with binary
>
> Perhaps the function Data.Binary.Get.runGetOrFail is what you are looking for? Or perhaps the incremental strict interface?

Ah yes... Actually I was a bit misled because I found a Haskell wiki page stating that with binary it was impossible to catch the exceptions except in the IO monad, and I took that at face value. I think runGetOrFail is what I want, I'll test it, thank you!

> > And then I realized that binary-strict was last updated in 2010, it seems cereal is recommended for strict binary file parsing nowadays. So, what should I do? Leave it as it is? Port to cereal? Port to binary?
>
> The binary-strict library is no longer maintained, since binary now also provides a strict interface. Both binary and cereal are good choices.
> Probably porting to binary would be easiest if you've already written it for binary-strict, assuming runGetOrFail or the incremental interface does what you want.

Ok, then I'll try the non-strict binary first. Most JPG files will be only a couple of megabytes and it wouldn't be THAT bad loading them entirely in memory, but it seems a bit of a shame. With lazy though I'll have to work a bit on my strictness annotations I think.

Thanks for the hints! Hopefully I can port it to lazy binary and publish it on hackage soon enough...

Emmanuel
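
As an aside on the memory concern, here is a rough sketch of the incremental interface mentioned above, assuming a binary version that exports runGetIncremental and the Decoder type. The helper decodeFromFile, the 4096-byte chunk size, and the firstMarker parser are all made up for illustration. The idea is that strict chunks are read only until the parser is done, so the rest of the JPEG is never loaded and the handle is closed as soon as the function returns.

```haskell
import Data.Binary.Get (Decoder (..), Get, getWord16be, runGetIncremental)
import qualified Data.ByteString as B
import Data.Word (Word16)
import System.IO (IOMode (ReadMode), withBinaryFile)

-- Hypothetical helper: run a Get parser over a file, feeding it
-- strict chunks until it finishes or fails. I/O errors (for example
-- a missing file) are still raised as exceptions by withBinaryFile.
decodeFromFile :: Get a -> FilePath -> IO (Either String a)
decodeFromFile parser path =
  withBinaryFile path ReadMode (\h -> go h (runGetIncremental parser))
  where
    go _ (Done _ _ value) = return (Right value)
    go _ (Fail _ _ err)   = return (Left err)
    go h (Partial k)      = do
      chunk <- B.hGetSome h 4096
      -- An empty chunk means end of file: signal that with Nothing.
      go h (k (if B.null chunk then Nothing else Just chunk))

-- Tiny example parser: the first two bytes as a big-endian word.
firstMarker :: Get Word16
firstMarker = getWord16be
```

With this shape, decodeFromFile firstMarker "photo.jpg" would read at most one chunk, since the parser is satisfied after two bytes.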

well, it went fast :-)
already ported.
regarding strictness, i hope just doing that is enough:
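    import Data.Word (Word16, Word32)

    -- The bang on each field makes it strict: building an IfEntry
    -- forces all four words.
    data IfEntry = IfEntry
        {
            entryTag :: !Word16,
            entryFormat :: !Word16,
            entryNoComponents :: !Word32,
            entryContents :: !Word32
        } deriving Show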
Though from past experience, I'm far from sure.
I'll add comments soon and then push it to hackage.
emmanuel
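
To make the strictness question concrete, a small sketch of a Get parser for such a record. It assumes big-endian byte order throughout (real EXIF data declares its byte order in the TIFF header, so this is a simplification), and getIfEntry and parseIfEntry are hypothetical names. Because every field is strict, forcing the parsed IfEntry to weak head normal form also forces its four words, so the record itself should not keep thunks that point back into the input.

```haskell
import Data.Binary.Get (Get, getWord16be, getWord32be, runGetOrFail)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word16, Word32)

data IfEntry = IfEntry
  { entryTag          :: !Word16
  , entryFormat       :: !Word16
  , entryNoComponents :: !Word32
  , entryContents     :: !Word32
  } deriving (Show, Eq)

-- Reads the four fields in order: 2 + 2 + 4 + 4 = 12 bytes.
-- Assumption: big-endian input.
getIfEntry :: Get IfEntry
getIfEntry =
  IfEntry <$> getWord16be <*> getWord16be <*> getWord32be <*> getWord32be

-- Pure entry point: Left on short input, Right with the entry otherwise.
parseIfEntry :: BL.ByteString -> Either String IfEntry
parseIfEntry input =
  case runGetOrFail getIfEntry input of
    Left (_, _, err)    -> Left err
    Right (_, _, entry) -> Right entry
```

Strict fields only cover the record itself; a lazy list or Map built from many entries would still need its own forcing if the goal is to release the input bytestring early.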