
On 7 apr 2008, at 14.46, Max Desyatov wrote:
Hi,
I'm interested in working on a library for stateful web browsing in Haskell during Google Summer of Code. The basic idea is described at http://hackage.haskell.org/trac/summer-of-code/ticket/1107. WWW::Mechanize is a ready-to-use library written in Perl, though when I wrote some simple scripts I used Python's mechanize (http://wwwsearch.sourceforge.net/mechanize/), which provides a much cleaner interface. Either way, it gives a simple and convenient way to retrieve web pages, to handle cookies and history, and to process retrieved content and forms. The basics are already there in the Network.Browser module from Haskell's HTTP library (http://hackage.haskell.org/packages/archive/HTTP/3001.0.4/doc/html/Network-Browser.html), but it's ugly (it uses unsafePerformIO for error reporting) and lacks most of the needed functionality.
My aim is to greatly improve the Network.Browser module and to make it possible to write small scripts with it in a more functional way. At the moment it uses the BrowserAction state monad. Although the deadline is approaching, I'm still looking for ways to improve my proposal. So here are my questions: besides a simple state monad, are there any other data structures that would make programming with this library more convenient? Should we devise a more sophisticated system with other, separate data structures? What other improvements would you like to see?
It doesn't have to be perfect. Make sure you know how to use monad transformers. Also take a look at tag soup and the various HTML/XML parsers. I'm sure there's plenty to work on. My suggestion would be to try writing non-trivial example applications and see what is needed. For example, you could write a script to download/upload a Haskell wiki page, logging in if necessary. Take a look at what other packages are used together with WWW::Mechanize. That kind of stuff. Also, for a GSoC proposal you should try to convince the mentors why your project is useful for Haskell in general. So maybe you have some more arguments there, too.
/ Thomas
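To make the monad-transformer suggestion concrete, here is a minimal sketch of a browsing monad that layers error reporting (ExceptT) over browser state (StateT) instead of relying on unsafePerformIO. Everything in it is hypothetical and for illustration only: the names Browser, BrowserState, BrowserError, Response, Fetch, visit, runBrowser and fakeFetch are not part of Network.Browser or the HTTP library, and the actual page retrieval is passed in as a parameter so the example needs no network access. It assumes only the transformers and containers packages that ship with GHC.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
import Control.Monad.Trans.State.Strict (StateT, gets, modify, runStateT)
import Data.Functor.Identity (Identity)
import qualified Data.Map.Strict as Map

-- All names below are hypothetical; none come from Network.Browser.
type Url = String

data BrowserError = NotFound Url
  deriving (Eq, Show)

data BrowserState = BrowserState
  { cookies :: Map.Map String String  -- cookie name -> value
  , history :: [Url]                  -- most recently visited first
  } deriving (Eq, Show)

-- What a fetch gives back: cookies to store and the page body.
data Response = Response
  { setCookies :: [(String, String)]
  , body       :: String
  } deriving (Eq, Show)

-- ExceptT sits on top of StateT, so cookies and history
-- survive even when an action fails.
type Browser m = ExceptT BrowserError (StateT BrowserState m)

-- The retrieval function is injected; Nothing means "no such page".
type Fetch m = Url -> Map.Map String String -> m (Maybe Response)

-- Visit a URL: send the current cookies, record the visit,
-- store any new cookies, and return the body or throw an error.
visit :: Monad m => Fetch m -> Url -> Browser m String
visit fetch url = do
  jar <- lift (gets cookies)
  lift (modify (\s -> s { history = url : history s }))
  result <- lift (lift (fetch url jar))
  case result of
    Nothing -> throwE (NotFound url)
    Just r  -> do
      let new = Map.fromList (setCookies r)
      -- Map.union is left-biased, so new cookies replace old ones.
      lift (modify (\s -> s { cookies = Map.union new (cookies s) }))
      pure (body r)

-- Run a browsing session from an empty state, returning both
-- the result (or error) and the final browser state.
runBrowser :: Browser m a -> m (Either BrowserError a, BrowserState)
runBrowser action = runStateT (runExceptT action) (BrowserState Map.empty [])

-- A stand-in "site" for experimenting: /wiki is only served
-- to clients that already hold a session cookie from /login.
fakeFetch :: Fetch Identity
fakeFetch "/login" _ = pure (Just (Response [("session", "abc")] "welcome"))
fakeFetch "/wiki" jar
  | Map.member "session" jar = pure (Just (Response [] "page source"))
fakeFetch _ _ = pure Nothing
```

Loaded in GHCi, `runIdentity (runBrowser (visit fakeFetch "/login" >> visit fakeFetch "/wiki"))` (with `runIdentity` from Data.Functor.Identity) runs the "log in, then fetch a wiki page" scenario against the stand-in site. Swapping the order of the two transformers would discard the state on failure, which is one of the design choices a proposal could discuss.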