
On 8/1/10 12:12 PM, austin seipp wrote:
Hi Jason,
I've had my eye on the 'Takusen' approach for a while. In particular I think it's a wonderful idea to use the left-fold based interface. Takusen is also well supported and pretty stable, having been around for a while.
I agree; in fact, I use it for all of my database needs.
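For anyone on the list who hasn't seen the left-fold style, here is a rough sketch of the general idea. It is not Takusen's actual API: the names `Next`, `foldRows`, `Row`, `sampleRows`, `firstTwoNames`, and `sumIds` are made up for illustration, and a plain list stands in for a database cursor. The point is only that the consumer supplies a step function and a seed, and can stop early without visiting the remaining rows.

```haskell
-- Hypothetical sketch of a left-fold interface over structured rows.
-- None of these names are taken from Takusen.

-- | What the step function tells the driver to do after each row.
data Next acc
  = Stop acc      -- ^ finish now with this accumulator
  | Continue acc  -- ^ keep going with this accumulator

-- | Fold over the rows from the left, honouring early termination.
foldRows :: (row -> acc -> Next acc) -> acc -> [row] -> acc
foldRows _    acc []       = acc
foldRows step acc (r : rs) =
  case step r acc of
    Stop done     -> done                    -- remaining rows are never visited
    Continue next -> foldRows step next rs

-- | Each row arrives as a complete, structured record (assumed shape).
type Row = (Int, String)

sampleRows :: [Row]
sampleRows = [(1, "alice"), (2, "bob"), (3, "carol"), (4, "dave")]

-- | Collect names in order and stop once two have been seen.
firstTwoNames :: Row -> [String] -> Next [String]
firstTwoNames (_, name) acc
  | length acc' >= 2 = Stop acc'
  | otherwise        = Continue acc'
  where
    acc' = acc ++ [name]

-- | Sum every id; this step never asks to stop.
sumIds :: Row -> Int -> Next Int
sumIds (i, _) acc = Continue (acc + i)
```

In GHCi, `foldRows firstTwoNames [] sampleRows` stops after the second row, while `foldRows sumIds 0 sampleRows` visits all four. Because the step function always receives a whole row, there is no chunk-boundary bookkeeping of the kind a raw-data iteratee has to handle.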
* It would be nice if we could make it depend on nicer libraries instead of rolling its own stuff - for example, we have Lato's excellent iteratee package, and Bas van Dijk has written a (IMO woefully underappreciated!) 'regions' package which encapsulates the lightweight monadic regions idea Oleg proposed. Of course, due to design, neither of these may work properly for Takusen's case, and if they did they would very likely necessitate API changes, but it's something I've been thinking about in the name of making the library smaller and more lightweight.
Making the library depend on more external libraries will actually make it more "heavyweight", since it will require the user to install more libraries just in order to use Takusen. I'm not saying that this is a bad thing, but I wouldn't automatically count it as a disadvantage that one can install Takusen without requiring lots of other libraries to be installed first.

It looks to me like the iteratee package wouldn't be a good fit for Takusen because (as far as I understand it) it is designed for the use case of reading raw data, where in particular the "chunks" might not be aligned with the records being processed, whereas Takusen is designed for reading in very structured data. The regions package might work, although there are problems with the way that it handles exceptions that have been discussed previously on this list.

Finally, a disadvantage of changing Takusen to use these kinds of external libraries is that it could actually make it *harder* to understand how to use it, since the user would first have to understand the concepts involved in the external libraries.

Again, I'm not saying that any of these issues automatically make it a bad idea to modify Takusen to use external libraries, just that the current approach works well and is (in my opinion) already relatively simple and straightforward, and it would be unfortunate if this were lost without a clear benefit.

Cheers,
Greg