Thinking about what's missing in our library coverage

Following Simon M's advice, I looked over the typical "batteries" categories, using Python as input: http://docs.python.org/library/index.html
The following things were missing from the current Platform. There are many. How would you identify the top, say, 5 libs to add?
-- Don
* String support
  o binary formatting [binary]: lazy binary parsing/serialising
  o pcre regexes [pcre-light] [regex-pcre]: what’s our best regex lib?
  o unicode text [text] [text-icu]: packed, unicode text
  o codecs/encodings: encodings?
* Data types
  o higher dimensional arrays [hmatrix]
  o bloomfilter: bloomfilters
  o bytestring-tries: IntMap for ByteStrings
  o dlist: difference lists
  o numbers: expanded number types
* text
  o attoparsec (simple, bytestring parsing)
  o polyparse
  o csv parsing
  o pandoc: markdown, reStructuredText, HTML, LaTeX, ConTeXt, Docbook, OpenDocument, ODT, RTF, MediaWiki, groff
* math and numerics
  o blas: BLAS
  o cmath: C math functions
  o dimensional: physical dimensions
  o fftw
  o mersenne-random: fast randoms
* persistence
  o anydbm?
  o sqlite3
  o hdbc
* compression
  o bzip2
  o zip
  o tar
* file formats
  o csv
  o config parser
* crypto
  o hmac, md5, sha, hashing
* systems
  o getopt
  o logging
  o termio
  o editline
  o mmap
* Internet
  o network-bytestring
  o ssl
  o json
  o feed (rss, atom)
  o mime
  o base64 et al
  o uuencode
  o cgi
  o fastcgi
  o urls
  o ftp, http, imap, smtp clients
  o uuid
  o url parsing
  o http server
  o xml-rpc
* Multimedia
  o colour
* Internationalization
  o gettext
  o locale
  o i18n
* GUIs
  o gtk2hs
* Development
  o hscolour

On Mon, Aug 03, 2009 at 04:44:32PM -0700, Donald Bruce Stewart wrote:
How would you identify the top, say, 5 libs to add?
I would not look for libs to add. I would wait for people to come and tell me that they think that particular libs are worthy of addition, and then decide whether or not I agree. Thanks Ian

igloo:
On Mon, Aug 03, 2009 at 04:44:32PM -0700, Donald Bruce Stewart wrote:
How would you identify the top, say, 5 libs to add?
I would not look for libs to add. I would wait for people to come and tell me that they think that particular libs are worthy of addition, and then decide whether or not I agree.
That's fine. I'm just trying to get a sense of what people will be proposing, and why. Is this an 'identify the champion' model? We await a champion to propose things? -- Don

On 04/08/2009 00:59, Ian Lynagh wrote:
On Mon, Aug 03, 2009 at 04:44:32PM -0700, Donald Bruce Stewart wrote:
How would you identify the top, say, 5 libs to add?
I would not look for libs to add. I would wait for people to come and tell me that they think that particular libs are worthy of addition, and then decide whether or not I agree.
Ok, to kick things off then, I propose the following:
Add
* binary
* getopt
* gtk2hs
Also
* keep an eye on text. We certainly want it, but it's a young package and there's no text I/O yet.
* decide which regex package(s) we want
* remove html? (we have xhtml)
* replace haskell-src with haskell-src-exts
* remove packedstring
Cheers, Simon

On Tue, Aug 4, 2009 at 7:02 AM, Simon Marlow wrote:
Add
* binary * getopt * gtk2hs
A definite yes to binary and getopt; but gtk2hs? I don't trust its longevity or maintenance, and as other people have pointed out, it will make the platform harder to support. As well, we risk holding back the platform - hasn't gtk2hs lagged GHC releases in the past? (I seem to remember some of the lags being quite lengthy.)
Also
* keep an eye on text. We certainly want it, but it's a young package and there's no text I/O yet.
* decide which regex package(s) we want
* remove html? (we have xhtml)
* replace haskell-src with haskell-src-exts
* remove packedstring
Absolutely. I thought we had already done this - didn't TH's unnecessary use of packedstring get removed a while ago?
Cheers, Simon
-- gwern

On 04/08/2009 12:07, Gwern Branwen wrote:
On Tue, Aug 4, 2009 at 7:02 AM, Simon Marlow wrote:
Add
* binary * getopt * gtk2hs
A definite yes to binary and getopt; but gtk2hs? I don't trust its longevity or maintenance, and as other people have pointed out, it will make the platform harder to support. As well, we risk holding back the platform - hasn't gtk2hs lagged GHC releases in the past? (I seem to remember some of the lags being quite lengthy.)
Adding gtk2hs would be a bold step, no doubt about it. By proposing it I'm hoping to force the issues to the surface: is gtk2hs the GUI lib we want to recommend, or standardise on? If it is, and it has maintenance issues, then those need to be addressed. As far as I'm aware, gtk2hs is the only plausible option for serious GUI development in Haskell at the moment. By bringing gtk2hs into the platform, we would be giving the gtk2hs maintainers a helpful boost; they'd get more testing for one thing.
Also
* keep an eye on text. We certainly want it, but it's a young package and there's no text I/O yet.
* decide which regex package(s) we want
* remove html? (we have xhtml)
* replace haskell-src with haskell-src-exts
* remove packedstring
Absolutely. I thought we had already done this - didn't TH's unnecessary use of packedstring get removed a while ago?
Not in the version of TH shipping with GHC 6.10.x, but it will be gone in GHC 6.12. Cheers, Simon

Simon Marlow wrote:
By bringing gtk2hs into the platform, we would be giving the gtk2hs maintainers a helpful boost; they'd get more testing for one thing.
I think that should explicitly not be a reason to bring things into the platform.
Ganesh

Hi Simon et al., On Aug 4, 2009, at 13:57, Sittampalam, Ganesh wrote:
Simon Marlow wrote:
By bringing gtk2hs into the platform, we would be giving the gtk2hs maintainers a helpful boost; they'd get more testing for one thing.
I think that should explicitly not be a reason to bring things into the platform.
Bringing Gtk2Hs into the platform is certainly desirable. During the last one or two years, the number of users has grown steadily, which is nice. However, Pete's and my time is rather limited and is often used up by installation issues and questions that could be answered with better documentation. Thus, it would be desirable to bring Gtk2Hs into the platform because it would force us to simplify installation and documentation.
For the former part, I wonder if cabalization is important for Gtk2Hs. A cabalized version of Gtk2Hs would allow people to use Cairo and Pango to create PDF documents without the need to install the GUI parts of the library. On the other hand, if Gtk2Hs is shipped with the platform, then all libraries are available anyway and cabalization might not be as important.
My question: how important is cabalization for a package that wants to be part of the platform?
Axel.

On Tue, Aug 04, 2009 at 02:21:11PM +0200, Axel Simon wrote:
My question: how important is cabalization for a package that wants to be part of the platform?
I think it's important. It means that:
* It can be built and installed in a regular way when creating the binary platform installers, without needing special case code
* It can be built and installed in a regular way in Linux distros etc, so it does not place an extra burden on distro maintainers trying to support the platform
* The platform can be installed with just cabal-install (although you will also need to install the C libraries etc)
* People who want to test newer versions that are intended to come with a future platform release can easily do so
Thanks
Ian

Axel Simon wrote:
For the former part, I wonder if cabalization is important for Gtk2Hs. A cabalized version of Gtk2Hs would allow people to use Cairo and Pango to create PDF documents without the need to install the GUI parts of the library. On the other hand, if Gtk2Hs is shipped with the platform, then all libraries are available anyway and cabalization might not be as important.
I'm a big fan of Gtk2Hs and would really like to see it cabalised (as several packages, most likely), though I can see how that's work. Right now it's a bit annoying to have to go through a separate installation process to install and upgrade it.
My question: how important is cabalization for a package that wants to be part of the platform?
I think it ought to be required. -- Ashley Yakeley

On Tue, 2009-08-04 at 14:21 +0200, Axel Simon wrote:
Hi Simon et al.,
On Aug 4, 2009, at 13:57, Sittampalam, Ganesh wrote:
Simon Marlow wrote:
By bringing gtk2hs into the platform, we would be giving the gtk2hs maintainers a helpful boost; they'd get more testing for one thing.
I think that should explicitly not be a reason to bring things into the platform.
Bringing Gtk2Hs into the platform is certainly desirable. During the last one or two years, the number of users has grown steadily, which is nice. However, Pete's and my time is rather limited and is often used up by installation issues and questions that could be answered with better documentation. Thus, it would be desirable to bring Gtk2Hs into the platform because it would force us to simplify installation and documentation.
For the former part, I wonder if cabalization is important for Gtk2Hs. A cabalized version of Gtk2Hs would allow people to use Cairo and Pango to create PDF documents without the need to install the GUI parts of the library. On the other hand, if Gtk2Hs is shipped with the platform, then all libraries are available anyway and cabalization might not be as important.
My question: how important is cabalization for a package that wants to be part of the platform?
Speaking with my distribution and HP release team hats on, I think it is essential. We cannot sensibly write automation tools for packages that are not cabalised. I fully expect it to become a required criterion for package inclusion.
Duncan

On 04/08/2009 12:57, Sittampalam, Ganesh wrote:
Simon Marlow wrote:
By bringing gtk2hs into the platform, we would be giving the gtk2hs maintainers a helpful boost; they'd get more testing for one thing.
I think that should explicitly not be a reason to bring things into the platform.
*grin* absolutely :) However, I don't think the platform should just swim around with its mouth open waiting for tasty packages to come along. Sometimes we need to make strategic decisions about what functionality is most important. Cheers, Simon

Simon Marlow wrote:
On 04/08/2009 12:57, Sittampalam, Ganesh wrote:
Simon Marlow wrote:
By bringing gtk2hs into the platform, we would be giving the gtk2hs maintainers a helpful boost; they'd get more testing for one thing.
I think that should explicitly not be a reason to bring things into the platform.
*grin* absolutely :)
However, I don't think the platform should just swim around with its mouth open waiting for tasty packages to come along. Sometimes we need to make strategic decisions about what functionality is most important.
Agreed, and with this in mind perhaps packages should be conditionally accepted, with a defined set of improvements leading to automatic entry. This would allow the platform to drive priorities without dumping things into it half-done.
Cheers, Ganesh

Sittampalam, Ganesh wrote:
Simon Marlow wrote:
Sittampalam, Ganesh wrote:
Simon Marlow wrote:
By bringing gtk2hs into the platform, we would be giving the gtk2hs maintainers a helpful boost; they'd get more testing for one thing.
I think that should explicitly not be a reason to bring things into the platform.
*grin* absolutely :)
However, I don't think the platform should just swim around with its mouth open waiting for tasty packages to come along. Sometimes we need to make strategic decisions about what functionality is most important.
Agreed, and with this in mind perhaps packages should be conditionally accepted, with a defined set of improvements leading to automatic entry. This would allow the platform to drive priorities without dumping things into it half-done.
+1. There are many packages which are almost good enough, but have one or two nagging details. I think the process for inclusion of any new package should have this sort of TODO list to ensure that the HP isn't "almost good enough", but actually lives up to its standards. -- Live well, ~wren

marlowsd:
On 04/08/2009 12:57, Sittampalam, Ganesh wrote:
Simon Marlow wrote:
By bringing gtk2hs into the platform, we would be giving the gtk2hs maintainers a helpful boost; they'd get more testing for one thing.
I think that should explicitly not be a reason to bring things into the platform.
*grin* absolutely :)
However, I don't think the platform should just swim around with its mouth open waiting for tasty packages to come along. Sometimes we need to make strategic decisions about what functionality is most important.
I agree with this. There are broader adoption goals we're trying to achieve in this process, after all, and having a better base lib than anyone else helps that. -- Don

As far as I'm aware, gtk2hs is the only plausible option for serious GUI development in Haskell at the moment.
I have used both wxHaskell and gtk2hs, and they are both perfectly easy to install, and both are entirely adequate for serious GUI development. Just wanted to clear up any fear, uncertainty, or doubt about wx. Regards, Malcolm

2009/8/4 Malcolm Wallace
As far as I'm aware, gtk2hs is the only plausible option for serious GUI development in Haskell at the moment.
I have used both wxHaskell and gtk2hs, and they are both perfectly easy to install, and both are entirely adequate for serious GUI development. Just wanted to clear up any fear, uncertainty, or doubt about wx.
Disclaimer: I'm a wxHaskell maintainer. Thanks for the vote, Malcolm. I use wxHaskell in some fairly extensive GUI developments, and it's stable and functional. On a more serious note (and one which applies equally to wxHaskell, BTW), I think the Haskell Platform should consistently use a single license, as to do otherwise heavily complicates matters for the user. I believe that most/all of the libraries in today's platform are BSD (or other very liberal license). Gtk2Hs brings LGPL into the mix, and wxHaskell would, similarly, bring the wxWidgets license (LGPL with binary exception, basically) into the mix. As a Haskell Platform user, I really need the assurance that the licensing situation is straightforward - especially if I'm to promote Haskell at work :-) My vote would be that non-BSD/MIT license automatically excludes a library from inclusion, even though it would exclude my own project. Regards Jeremy

The Haskell Platform will be a *lot* more compelling if it has a GUI story. If it's hard for experts to get it built on platform X, what are the poor users supposed to do? The whole point of the HP is to take the pain just once, and let our happy users enjoy the benefits.
If it's clear that Gtk2hs is the brand leader, I think Simon is right that we should very seriously consider putting it in the HP. But what about Wx? I'm reluctant to appear to sponsor one or t'other unless there are clear technical or support reasons to do so. Having both would be fine, if their support crews are willing to do the necessary polishing etc.
Concerning the "boost" from the HP, inclusion *will* increase the user-base of a package, and that *does* increase the incentive for the library's support crew to roll up their sleeves. And rightly so. That's one of the benefits of the HP!
Simon
| -----Original Message-----
| From: libraries-bounces@haskell.org [mailto:libraries-bounces@haskell.org] On
| Behalf Of Simon Marlow
| Sent: 04 August 2009 12:27
| To: Gwern Branwen
| Cc: libraries@haskell.org
| Subject: Re: Thinking about what's missing in our library coverage
|
| On 04/08/2009 12:07, Gwern Branwen wrote:
| > On Tue, Aug 4, 2009 at 7:02 AM, Simon Marlow

On Tue, 2009-08-04 at 12:02 +0100, Simon Marlow wrote:
On 04/08/2009 00:59, Ian Lynagh wrote:
On Mon, Aug 03, 2009 at 04:44:32PM -0700, Donald Bruce Stewart wrote:
How would you identify the top, say, 5 libs to add?
I would not look for libs to add. I would wait for people to come and tell me that they think that particular libs are worthy of addition, and then decide whether or not I agree.
Ok, to kick things off then, I propose the following:
Add
* binary * getopt * gtk2hs
Now that's just crazy-talk! :-) What/where is getopt? It's not on hackage. Elsewhere we've raised our concerns about binary. gtk2hs is of course not cabalised.
Also
* keep an eye on text. We certainly want it, but it's a young package and there's no text I/O yet.
I'd say go for it. If the current API is good then that's enough. It's not clear that there need to be separate I/O modules for it. I might suggest hiding all the fusion modules for starters though.
* decide which regex package(s) we want
I'd like input from the regex maintainer here. In particular which backend do we want in the platform and can we please avoid having more than one (if we can't choose how do we expect users to choose).
* remove html? (we have xhtml)
On the other hand xhtml seems to be going out of fashion.
* replace haskell-src with haskell-src-exts
Yes, if the maintainer thinks it's ready.
* remove packedstring
Yes! And editline. Duncan

* replace haskell-src with haskell-src-exts
Yes, if the maintainer thinks it's ready.
The maintainer says that the library itself is definitely ready in its current state. :-) The only problem is that it depends on cpphs, which is not in the platform. I personally think that cpphs warrants inclusion in the platform too, but it comes with that same LGPL (+linking exception) "burden" that is already discussed (haskell-src-exts itself uses BSD).
I see two possibilities:
* Allow LGPL (+linking exception) in the HP and include cpphs. Then haskell-src-exts can replace haskell-src immediately.
* I remove the rather small functionality in haskell-src-exts that depends on cpphs (namely deliterating literate source files). Then a slightly stumped haskell-src-exts could replace haskell-src after I release such a stumped version.
Ok, I actually see one more possibility, that Malcolm re-licences cpphs as BSD to have it included. I don't expect that to happen though. :-)
Cheers, /Niklas

Hi
Yes, if the maintainer thinks it's ready.
The maintainer says that the library itself is definitely ready in its current state. :-)
This user definitely agrees!
The only problem is that it depends on cpphs, which is not in the platform. I personally think that cpphs warrants inclusion in the platform too, but it comes with that same LGPL (+linking exception) "burden" that is already discussed (haskell-src-exts itself uses BSD).
I see two possibilities:
* Allow LGPL (+linking exception) in the HP and include cpphs. Then haskell-src-exts can replace haskell-src immediately.
I support this.
* I remove the rather small functionality in haskell-src-exts that depends on cpphs (namely deliterating literate source files). Then a slightly stumped haskell-src-exts could replace haskell-src after I release such a stumped version.
I seriously dislike the idea that packages must have useful features removed to end up in the HP. I also want you to include CPP support in HSE, and removing cpphs makes this more unlikely.
Thanks
Neil

On Fri, 2009-08-07 at 09:18 +0100, Neil Mitchell wrote:
I see two possibilities:
* Allow LGPL (+linking exception) in the HP and include cpphs. Then haskell-src-exts can replace haskell-src immediately.
I support this.
* I remove the rather small functionality in haskell-src-exts that depends on cpphs (namely deliterating literate source files). Then a slightly stumped haskell-src-exts could replace haskell-src after I release such a stumped version.
I seriously dislike the idea that packages must have useful features removed to end up in the HP. I also want you to include CPP support in HSE, and removing cpphs makes this more unlikely.
It's clear that the licensing issue is going to be controversial and will take some time. I'm not sure it is sensible to try and work it out before the next major release, given that the higher priority has to be agreeing the procedure for adding packages. If we do not get around to agreeing the licensing issue then the default position has to be no new licenses 'til we do work it out properly. As a personal opinion, I'd certainly like to see cpphs in the platform and to have Cabal use it in preference to gcc -E / cpp. Duncan

On Mon, Aug 10, 2009 at 01:45:22PM +0100, Duncan Coutts wrote:
It's clear that the licensing issue is going to be controversial and will take some time. I'm not sure it is sensible to try and work it out before the next major release, given that the higher priority has to be agreeing the procedure for adding packages. If we do not get around to agreeing the licensing issue then the default position has to be no new licenses 'til we do work it out properly.
Or to put it another way, decide that "Licence is BSD or MIT" (or whatever the list really is) is a requirement for the upcoming major release. I'd agree with that. Generally, I think that conservatism is the best answer for the platform. It will be a lot worse to put something in and then decide to take it out again, than to decide to leave it out for now and then put it in a few months later. Thanks Ian

On Mon, Aug 10, 2009 at 3:56 PM, Ian Lynagh wrote:
On Mon, Aug 10, 2009 at 01:45:22PM +0100, Duncan Coutts wrote:
It's clear that the licensing issue is going to be controversial and will take some time. I'm not sure it is sensible to try and work it out before the next major release, given that the higher priority has to be agreeing the procedure for adding packages. If we do not get around to agreeing the licensing issue then the default position has to be no new licenses 'til we do work it out properly.
Or to put it another way, decide that "Licence is BSD or MIT" (or whatever the list really is) is a requirement for the upcoming major release.
I'd agree with that. Generally, I think that conservatism is the best answer for the platform. It will be a lot worse to put something in and then decide to take it out again, than to decide to leave it out for now and then put it in a few months later.
Would it make sense to decide already now that when GHC has enough support for dynamic libs to make it easy to comply with LGPL (without linkage exceptions), then the set of "HP approved licenses" will be extended to include LGPL? Would the up-and-coming Haskell compilers (who might not have dyn lib support) be happy with that?
/M
-- Magnus Therning (OpenPGP: 0xAB4DFBA4)
magnus@therning.org Jabber: magnus@therning.org
http://therning.org/magnus identi.ca|twitter: magthe

On Mon, 2009-08-10 at 13:45 +0100, Duncan Coutts wrote:
It's clear that the licensing issue is going to be controversial and will take some time. I'm not sure it is sensible to try and work it out before the next major release,
If you're not going to work it out now, that means that any new packages should assume the strictest possible restriction (BSD or equivalent only), yes? It's a lot easier to add packages later than remove them. - Adam

On Mon, 2009-08-10 at 08:05 -0700, Adam Wick wrote:
On Mon, 2009-08-10 at 13:45 +0100, Duncan Coutts wrote:
It's clear that the licensing issue is going to be controversial and will take some time. I'm not sure it is sensible to try and work it out before the next major release,
If you're not going to work it out now, that means that any new packages should assume the strictest possible restriction (BSD or equivalent only), yes? It's a lot easier to add packages later than remove them.
Yes, I noted: If we do not get around to agreeing the licensing issue then the default position has to be no new licenses 'til we do work it out properly. Since the current licenses in platform packages are only BSD then the default position until we agree a license policy would have to be BSD-only. Duncan

On Mon, 2009-08-10 at 17:22 +0100, Duncan Coutts wrote:
On Mon, 2009-08-10 at 08:05 -0700, Adam Wick wrote:
On Mon, 2009-08-10 at 13:45 +0100, Duncan Coutts wrote:
It's clear that the licensing issue is going to be controversial and will take some time. I'm not sure it is sensible to try and work it out before the next major release,
If you're not going to work it out now, that means that any new packages should assume the strictest possible restriction (BSD or equivalent only), yes? It's a lot easier to add packages later than remove them.
Yes, I noted:
Great! -Adam

* I remove the rather small functionality in haskell-src-exts that depends on cpphs (namely deliterating literate source files).
If your only use of cpphs is to deliterate source files, then I recommend that you simply incorporate that functionality directly into haskell-src-exts. Although cpphs's module Language.Preprocessor.Unlit is under the LGPL, it is based on code that was published in the Haskell 1.2 Report, Appendix C, which I assume is freely copyable. Regards, Malcolm
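For readers unfamiliar with the term, "deliterating" means stripping the prose out of a literate Haskell file so that only the code remains. The following is a minimal sketch of the idea, not the code from cpphs or from the Report appendix. It assumes only the "bird track" style (code lines start with `>`) and ignores `\begin{code}` blocks and the Report's rule about blank lines around code; the names `unlitLine` and `unlit` are made up for illustration.

```haskell
-- Simplified deliteration of "bird track" literate Haskell.
-- Assumption: only the '>' style is handled; \begin{code} blocks are not.

-- A code line keeps its text, with the leading '>' replaced by a space
-- so that column positions stay the same. Any other line becomes empty,
-- which keeps line numbers aligned with the original file.
unlitLine :: String -> String
unlitLine ('>' : rest) = ' ' : rest
unlitLine _            = ""

-- Deliterate a whole source file, line by line.
unlit :: String -> String
unlit = unlines . map unlitLine . lines

main :: IO ()
main = putStr (unlit "A short introduction.\n\n> answer :: Int\n> answer = 42\n")
```

Something of this size could live inside haskell-src-exts without any dependency on cpphs, which is the trade-off being discussed.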

* I remove the rather small functionality in haskell-src-exts that depends on cpphs (namely deliterating literate source files).
If your only use of cpphs is to deliterate source files, then I recommend that you simply incorporate that functionality directly into haskell-src-exts. Although cpphs's module Language.Preprocessor.Unlit is under the LGPL, it is based on code that was published in the Haskell 1.2 Report, Appendix C, which I assume is freely copyable.
That's of course an option. Though as Neil pointed out, my preference would really be to *increase* my dependency on cpphs, by introducing support for CPP in source files. So if we can actually get cpphs in the HP, that would be the best solution by far. :-) Cheers, /Niklas

Niklas Broberg wrote:
* I remove the rather small functionality in haskell-src-exts that depends on cpphs (namely deliterating literate source files).
If your only use of cpphs is to deliterate source files, then I recommend that you simply incorporate that functionality directly into haskell-src-exts. Although cpphs's module Language.Preprocessor.Unlit is under the LGPL, it is based on code that was published in the Haskell 1.2 Report, Appendix C, which I assume is freely copyable.
That's of course an option. Though as Neil pointed out, my preference would really be to *increase* my dependency on cpphs, by introducing support for CPP in source files. So if we can actually get cpphs in the HP, that would be the best solution by far. :-)
But the CPP that GHC supports is different (albeit less good) than cpphs. So if you want to support the extension as it is currently defined/used, you should just shell out to cpp like GHC does.
Cheers, Ganesh

But the CPP that GHC supports is different (albeit less good) than cpphs. So if you want to support the extension as it is currently defined/used, you should just shell out to cpp like GHC does.
Shelling out to CPP is not much help if the executable that uses haskell-src-exts is running on a machine that does not have a C preprocessor installed (or if it is not configured to be found easily). Regards, Malcolm

On 06/08/2009 12:59, Duncan Coutts wrote:
On Tue, 2009-08-04 at 12:02 +0100, Simon Marlow wrote:
On 04/08/2009 00:59, Ian Lynagh wrote:
On Mon, Aug 03, 2009 at 04:44:32PM -0700, Donald Bruce Stewart wrote:
How would you identify the top, say, 5 libs to add?
I would not look for libs to add. I would wait for people to come and tell me that they think that particular libs are worthy of addition, and then decide whether or not I agree.
Ok, to kick things off then, I propose the following:
Add
* binary * getopt * gtk2hs
Now that's just crazy-talk! :-)
What/where is getopt? It's not on hackage. Elsewhere we've raised our concerns about binary. gtk2hs is of course not cabalised.
Hmm, I assumed getopt was on Hackage. Where is it then? Aargh! It's still in base :) We temporarily moved it out a while ago, and then moved it back in. Forget I mentioned it. What is stopping gtk2hs being cabalised at this stage? Is it just work, or does it need extensions to Cabal?
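For context, the getopt functionality that lives in base is the module `System.Console.GetOpt`. A minimal sketch of how it is typically used follows; the `Flag` type, the option names, and the usage string are hypothetical and chosen only for illustration.

```haskell
import System.Console.GetOpt

-- Hypothetical flags for illustration only.
data Flag = Verbose | Output FilePath
  deriving (Eq, Show)

-- Option table: short names, long names, argument kind, help text.
options :: [OptDescr Flag]
options =
  [ Option ['v'] ["verbose"] (NoArg Verbose)        "chatty output"
  , Option ['o'] ["output"]  (ReqArg Output "FILE") "write output to FILE"
  ]

-- Parse an argument list into flags plus leftover arguments,
-- or return the error messages followed by the usage text.
parseArgs :: [String] -> Either String ([Flag], [String])
parseArgs argv =
  case getOpt Permute options argv of
    (flags, rest, []) -> Right (flags, rest)
    (_, _, errs)      -> Left (concat errs ++ usageInfo header options)
  where
    header = "Usage: demo [OPTION...] files..."

main :: IO ()
main = print (parseArgs ["-v", "--output=out.txt", "input.hs"])
```

Since the module already ships with base, nothing needs to be added to the platform to get this.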
Also
* keep an eye on text. We certainly want it, but it's a young package and there's no text I/O yet.
I'd say go for it. If the current API is good then that's enough. It's not clear that there need to be separate I/O modules for it. I might suggest hiding all the fusion modules for starters though.
My concern is API consistency. The way to get Text from a Handle is very roundabout: you need to read as a bytestring and then convert to Text using an encoding (and if you want the locale encoding you need text-icu I presume). Whereas the way to get a String from a Handle in the locale encoding is just hGetContents. We need to make the API more streamlined here. I haven't put a lot of thought into it, admittedly, but the first step would be to think about how to unify the codec interface, so that both packages can use the same codecs. Perhaps this isn't a showstopper.
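To make the roundabout route concrete, here is a rough sketch of reading Text from a Handle via a strict ByteString, using the bytestring and text packages. It assumes the input is UTF-8, which sidesteps the locale question raised above; `decodeUtf8` throws an exception on invalid input. The helper name `hGetTextUtf8` is made up for this example.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import System.IO (Handle)

-- Read everything from a handle as raw bytes, then decode explicitly.
-- Assumption: the bytes are UTF-8; the locale encoding is NOT consulted.
hGetTextUtf8 :: Handle -> IO T.Text
hGetTextUtf8 h = fmap TE.decodeUtf8 (B.hGetContents h)

main :: IO ()
main = do
  -- The decoding step on its own, applied to the UTF-8 bytes of "café".
  let bytes = B.pack [0x63, 0x61, 0x66, 0xC3, 0xA9]
  print (T.length (TE.decodeUtf8 bytes))
```

Compare this with the String case, where `System.IO.hGetContents` does the reading and the decoding in one call.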
* decide which regex package(s) we want
I'd like input from the regex maintainer here. In particular which backend do we want in the platform and can we please avoid having more than one (if we can't choose how do we expect users to choose).
* remove html? (we have xhtml)
On the other hand xhtml seems to be going out of fashion.
Right, but I think xhtml has had a lot more attention over the years. Perhaps it should have the option to produce HTML - after all if you drop the XML header you're nearly there. Haddock has its own local copy of html. I wonder why that is...
* replace haskell-src with haskell-src-exts
Yes, if the maintainer thinks it's ready.
* remove packedstring
Yes! And editline.
Absolutely. Cheers, Simon

On Aug 7, 2009, at 10:58, Simon Marlow wrote:
On 06/08/2009 12:59, Duncan Coutts wrote:
On Tue, 2009-08-04 at 12:02 +0100, Simon Marlow wrote:
On 04/08/2009 00:59, Ian Lynagh wrote:
On Mon, Aug 03, 2009 at 04:44:32PM -0700, Donald Bruce Stewart wrote:
How would you identify the top, say, 5 libs to add?
I would not look for libs to add. I would wait for people to come and tell me that they think that particular libs are worthy of addition, and then decide whether or not I agree.
Ok, to kick things off then, I propose the following:
Add
* binary * getopt * gtk2hs
Now that's just crazy-talk! :-)
What/where is getopt? It's not on hackage. Elsewhere we've raised our concerns about binary. gtk2hs is of course not cabalised.
[..]
What is stopping gtk2hs being cabalised at this stage? Is it just work, or does it need extensions to Cabal?
I'm no Cabal expert, so Duncan might know more. What Gtk2Hs needs is the ability to depend on executables (tools like a modified c2hs) that are built by other Cabal packages. Furthermore, we need to generate .hs files using these tools. I don't know how difficult it is to use Cabal to generate the dependencies and invoke the right tools. For instance, a file like .chs.pp is translated to .chs using CPP or hscpp, then to .hs and .chi using our own c2hs and then it is compiled using ghc. Finally, it seems that we need file-specific options in order to compile certain files. I think Cabal has no mechanism for that. I'm sure it's all doable so it probably boils down to a lack of time :-) Cheers, Axel.
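The chain of preprocessing steps described above can be modelled as data, which is roughly what a build tool would need in order to work out which tool to run on which file. This is only an illustrative sketch and not how Cabal or the Gtk2Hs build system actually does it: the `Stage` type, the tool labels, and the example file name are assumptions, and the `.chi` side output of c2hs is left out.

```haskell
import Data.List (isSuffixOf)

-- One preprocessing step: input suffix, output suffix, and a label for the tool.
data Stage = Stage { fromExt :: String, toExt :: String, tool :: String }

-- Stages taken from the description above; the tool names are labels,
-- not real command lines.
stages :: [Stage]
stages =
  [ Stage ".chs.pp" ".chs" "cpp/hscpp"
  , Stage ".chs"    ".hs"  "c2hs"
  ]

-- Apply the first stage whose input suffix matches the file name, if any.
step :: FilePath -> Maybe (String, FilePath)
step file =
  case [ s | s <- stages, fromExt s `isSuffixOf` file ] of
    (s : _) -> Just (tool s, take (length file - length (fromExt s)) file ++ toExt s)
    []      -> Nothing

-- Follow the stages until none applies, recording each tool and its output file.
plan :: FilePath -> [(String, FilePath)]
plan file =
  case step file of
    Nothing       -> []
    Just (t, out) -> (t, out) : plan out

main :: IO ()
main = mapM_ print (plan "Gtk/Window.chs.pp")
```

The last file in the plan is what would finally be handed to ghc. File-specific compiler options, the other issue Axel mentions, are not modelled here.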

On Fri, 2009-08-07 at 09:58 +0100, Simon Marlow wrote:
* remove html? (we have xhtml)
On the other hand xhtml seems to be going out of fashion.
Right, but I think xhtml has had a lot more attention over the years. Perhaps it should have the option to produce HTML - after all if you drop the XML header you're nearly there.
That sounds like a good idea. It would also make it a lot easier for programs to switch between formats.
Haddock has its own local copy of html. I wonder why that is...
Purge it and find out :-) Duncan

On Thu, Aug 6, 2009 at 4:59 AM, Duncan Coutts wrote:
* keep an eye on text. We certainly want it, but it's a young package and there's no text I/O yet.
I'd say go for it. If the current API is good then that's enough. It's not clear that there need to be separate I/O modules for it. I might suggest hiding all the fusion modules for starters though.
It's really not ready for prime time yet. I'm making changes to the API at the moment, but they're not complete, and I want to see what the new localised I/O support in 6.12 looks like to figure out whether I can hook into that in an agreeable fashion to get localised I/O without jumping through too many hoops. I'm even contemplating trimming the naming from Data.Text to Text, as part of my little war on meaningless module prefixes :-) So please, don't go for it just yet. Pitch in with patches instead! Regards, Bryan.

On Mon, Aug 03, 2009 at 04:44:32PM -0700, Don Stewart wrote:
* compression o bzip2 o zip o tar
These seem nice.
* systems o getopt o editline
These, too.
* GUIs o gtk2hs
Now this looks like trouble :). I mean, I would love to know that Gtk2Hs is installed everywhere, but IMHO it would put a great burden on the platform. Would it just bundle the whole Gtk+ inside the installer? Hmmm... -- Felipe.

Hello Felipe, Tuesday, August 4, 2009, 5:42:21 AM, you wrote:
o gtk2hs
Now this looks like trouble :). I mean, I would love to know that Gtk2Hs is installed everywhere, but IMHO it would put a great burden on the platform. Would it just bundle the whole Gtk+ inside the installer? Hmmm...
One possible solution to this problem would be to split HP into two editions. The basic one should be kept small and simple, even smaller than old GHC distros (i.e. minus opengl-like stuff, with only "core" libraries remaining), while the batteries-included distro should grow, its main goal being to provide a one-step installer for the whole Haskell world. -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Don Stewart wrote:
The following things were missing from the current Platform. There are many. How would you identify the top, say, 5 libs to add?
XML support is missing from your list. That is definitely part of every modern "batteries included" platform, and it should be high on our list. We have a number of excellent and mature packages in this category. Regards, Yitz

Don Stewart wrote:
o bytestring-trie — IntMap for ByteStrings o dlist — difference lists
Well, if we're looking for a champion, I suggest these two should be added. Both serve demonstrated needs, both are easy to support, and both are reinvented over and over again. Because of that reinvention alone, it'd be nice to canonize a library in order to minimize wasted efforts. (If anyone has complaints about bytestring-trie I'd be more than happy to hear of and address them.) -- Live well, ~wren

wren:
Don Stewart wrote:
o bytestring-trie — IntMap for ByteStrings o dlist — difference lists
Well, if we're looking for a champion, I suggest these two should be added. Both serve demonstrated needs, both are easy to support, and both are reinvented over and over again. Because of that reinvention alone, it'd be nice to canonize a library in order to minimize wasted efforts.
(If anyone has complaints about bytestring-trie I'd be more than happy to hear of and address them.)
Could you do comparative benchmarks for insertion and lookup into * Data.Map String Int * Data.Map ByteString Int * bytestring-trie I don't have a sense for how much better bytestring-trie is. dlist I think is obvious. It should really be part of base (as it is used by Show/Read in ad hoc form). -- Don
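For readers who have not met the idea: a difference list represents a list as a function that prepends it, so appending becomes function composition, which is the same trick `ShowS` uses in base. Below is a minimal sketch using only base. The names `DList`, `fromList`, `toList`, `append`, `snoc` and `showPair` are chosen for illustration and are not claimed to match the dlist package's API.

```haskell
-- Hand-rolled difference list; illustrative only, not the dlist package.
newtype DList a = DList ([a] -> [a])

-- Wrap an ordinary list as "prepend this list".
fromList :: [a] -> DList a
fromList xs = DList (xs ++)

-- Recover the ordinary list by prepending to the empty list.
toList :: DList a -> [a]
toList (DList f) = f []

-- Append is just function composition, so it costs O(1).
append :: DList a -> DList a -> DList a
append (DList f) (DList g) = DList (f . g)

-- Adding one element at the end is also O(1).
snoc :: DList a -> a -> DList a
snoc (DList f) x = DList (f . (x :))

-- ShowS from the Prelude is the same idea specialised to String.
showPair :: Int -> Int -> ShowS
showPair a b = showChar '(' . shows a . showChar ',' . shows b . showChar ')'

main :: IO ()
main = do
  print (toList (fromList [1, 2] `append` fromList [3 :: Int] `snoc` 4))
  putStrLn (showPair 1 2 "")
```

The point of the representation is that left-nested appends, which are quadratic with plain `(++)`, stay linear because the list is only materialised once, in `toList`.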

Don Stewart wrote:
wren:
Don Stewart wrote:
o bytestring-trie — IntMap for ByteStrings o dlist — difference lists
Well, if we're looking for a champion, I suggest these two should be added. Both serve demonstrated needs, both are easy to support, and both are reinvented over and over again. Because of that reinvention alone, it'd be nice to canonize a library in order to minimize wasted efforts.
(If anyone has complaints about bytestring-trie I'd be more than happy to hear of and address them.)
Could you do comparative benchmarks for insertion and lookup into
* Data.Map String Int * Data.Map ByteString Int * bytestring-trie
I don't have a sense for how much better bytestring-trie is.
From Mark Wotton on 2009.03.01 using Microbench hacked to use Integers instead of Ints (to avoid overflow bugs): * Data.List.lookup [(ByteString, Int)]: 160.641ns per iteration / 6225.07 per second. * Data.Map.lookup (Map ByteString Int): 0.881ns per iteration / 1135623.22 per second. * Data.Trie.lookup (Trie Int): 0.243ns per iteration / 4116930.41 per second. I'll try to set up a benchmarking suite to test more recent versions and other functions in the interface. -- Live well, ~wren

wren:
Don Stewart wrote:
wren:
Don Stewart wrote:
o bytestring-trie — IntMap for ByteStrings o dlist — difference lists
Well, if we're looking for a champion, I suggest these two should be added. Both serve demonstrated needs, both are easy to support, and both are reinvented over and over again. Because of that reinvention alone, it'd be nice to canonize a library in order to minimize wasted efforts.
(If anyone has complaints about bytestring-trie I'd be more than happy to hear of and address them.)
Could you do comparative benchmarks for insertion and lookup into
* Data.Map String Int * Data.Map ByteString Int * bytestring-trie
I don't have a sense for how much better bytestring-trie is.
From Mark Wotton on 2009.03.01 using Microbench hacked to use Integers instead of Ints (to avoid overflow bugs):
* Data.List.lookup [(ByteString, Int)]: 160.641ns per iteration / 6225.07 per second.
* Data.Map.lookup (Map ByteString Int): 0.881ns per iteration / 1135623.22 per second.
* Data.Trie.lookup (Trie Int): 0.243ns per iteration / 4116930.41 per second.
I'll try to set up a benchmarking suite to test more recent versions and other functions in the interface.
Thanks! A maintainable testsuite (so we can check this again in the future) will be useful!

I'll try to set up a benchmarking suite to test more recent versions and other functions in the interface.
Thanks!
A maintainable testsuite (so we can check this again in the future) will be useful!
I've occasionally wished for a speed and memory test suite for maps. There are a lot of implementations for Haskell, with different tradeoffs, and it would be nice to quantify someone's assertion that "X is so much better than Y" or test a new implementation. This is one area where Haskell is much more complicated than an imperative language like Python, where you just use a built-in hashmap and performance is going to be basically good for just about all uses. As an aside, definitely +1 on including dlist. It's very useful for Writer, so much so that I have my own type aliases for it. However, the name is anything but intuitive to someone looking for a list with efficient appends. Given all the above, one nice contribution of HP could be to package together data structures with the promise that their use and performance are mostly orthogonal and that they represent the current best practices for a given access pattern, along with documentation describing the differences. I.e. we have list -> dlist -> finger tree sequence with roughly increasing capabilities but also increasing constant costs (I'm guessing), and the same sort of story with maps and arrays. Is documentation part of the goal for HP? Is there a place to put it? It would be nice if it could be integrated into the haddock in a discoverable way, i.e. attached to Data or something, or linked off the main TOC. And it would be nice in general if package haddocks could be linked (or maybe they already can?). I'd be willing to make a start on some documentation and some benchmarks for various access patterns with nice graphs and stuff, if there's interest.

* Data.Map String Int * Data.Map ByteString Int * bytestring-trie
What about a Unicode/text aware version of ByteString? I mean, I suspect people will want to use this map for strings of characters, and I want to make sure we don't hardcode a type that will cause us trouble in the future (and I have this idea in my head that ByteString is supposed to represent byte-sequences and not character-sequences). Just checking -Isaac

Isaac Dupree wrote:
* Data.Map String Int * Data.Map ByteString Int * bytestring-trie
What about a Unicode/text aware version of ByteString? I mean, I suspect people will want to use this map for strings of characters, and I want to make sure we don't hardcode a type that will cause us trouble in the future (and I have this idea in my head that ByteString is supposed to represent byte-sequences and not character-sequences).
The bytestring-trie package uses ByteStrings as a vector of bytes. That is, there's no built-in support for or against textual data (modulo byte==char traditions). Anything that can be rendered into a ByteString can be tried (with performance depending on the suitability of the encoding). I can't really see any way to make use of knowing that the data is textual, though. I've thought about adding a typeclass to automate the encoding/decoding between various "string" types and the ByteStrings used internally. Unfortunately such a class would have a very large number of methods, and is of dubious utility in the big picture. All in all, the problem of rendering an abstract string into a sequence of bytes belongs to another package. -- Live well, ~wren
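To illustrate the point that the trie only ever sees bytes, here is a toy byte-keyed trie using only base. It is a sketch under stated assumptions, not the bytestring-trie implementation: keys are plain `[Word8]` rather than ByteString, children are kept in an association list, and `asciiBytes` is a hypothetical encoder that is only meaningful for ASCII text.

```haskell
import Data.Char (ord)
import Data.Word (Word8)

-- Toy trie: an optional value at this node plus children indexed by byte.
-- Illustrative only; the real package uses a different representation.
data Trie a = Trie (Maybe a) [(Word8, Trie a)]

empty :: Trie a
empty = Trie Nothing []

-- Walk (and create) one node per key byte, storing the value at the end.
insert :: [Word8] -> a -> Trie a -> Trie a
insert []       v (Trie _ cs)  = Trie (Just v) cs
insert (b : bs) v (Trie mv cs) = Trie mv (go cs)
  where
    go [] = [(b, insert bs v empty)]
    go ((k, t) : rest)
      | k == b    = (k, insert bs v t) : rest
      | otherwise = (k, t) : go rest

-- Follow the key byte by byte; fail as soon as a child is missing.
lookupT :: [Word8] -> Trie a -> Maybe a
lookupT []       (Trie mv _) = mv
lookupT (b : bs) (Trie _ cs) = lookup b cs >>= lookupT bs

-- Hypothetical encoder (assumption): valid for ASCII input only.
asciiBytes :: String -> [Word8]
asciiBytes = map (fromIntegral . ord)

main :: IO ()
main = do
  let t = insert (asciiBytes "dlist") (1 :: Int)
            (insert (asciiBytes "dl") 2 empty)
  print (lookupT (asciiBytes "dlist") t)
  print (lookupT (asciiBytes "dli") t)
```

Nothing in `insert` or `lookupT` knows whether the bytes came from text; choosing the encoding is left to the caller, which matches the separation of concerns described above.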

On Tue, Aug 4, 2009 at 2:27 PM, Isaac Dupree wrote:
* Data.Map String Int * Data.Map ByteString Int * bytestring-trie
What about a Unicode/text aware version of ByteString?
I assume you're not asking in the context of map-like structures, but in general? If so, look at the text and text-icu packages on Hackage. Regards, Bryan.

wren ng thornton wrote:
Don Stewart wrote:
Could you do comparative benchmarks for insertion and lookup into
* Data.Map String Int * Data.Map ByteString Int * bytestring-trie
I don't have a sense for how much better bytestring-trie is.
<cut>
I'll try to set up a benchmarking suite to test more recent versions and other functions in the interface.
If you're going to look at Map String as well as Map ByteString, I hope you wouldn't mind tossing the list-tries package (http://hackage.haskell.org/package/list-tries) into the mix. A Patricia trie with Enum keys (since we're dealing with Chars) from Data.ListTrie.Patricia.Map.Enum should beat Data.Map, at least. I've been meaning to benchmark my library myself but I haven't found the time or energy to do so.

On Mon, Aug 3, 2009 at 4:44 PM, Don Stewart wrote:
Following Simon M's advice, I look over the typical "batteries" categories, using Python as input:
http://docs.python.org/library/index.html
The following things were missing from the current Platform. There are many. How would you identify the top, say, 5 libs to add?
-- Don
* String support
  o binary formatting [binary] - lazy binary parsing/serialising
  o pcre regexes [pcre-light] [regex-pcre] - what’s our best regex lib?
  o unicode text [text] [text-icu] - packed, unicode text
  o codecs/encodings - encodings?
* Data types
  o higher dimensional arrays [hmatrix]
  o bloomfilter - bloomfilters
  o bytestring-tries - IntMap for ByteStrings
  o dlist - difference lists
  o numbers - expanded number types
* text
  o attoparsec (simple, bytestring parsing)
  o polyparse
  o csv parsing
  o pandoc - markdown, reStructuredText, HTML, LaTeX, ConTeXt, Docbook, OpenDocument, ODT, RTF, MediaWiki, groff
* math and numerics
  o blas - BLAS
  o cmath - C math functions
  o dimensional - physical dimensions
  o fftw
  o mersenne-random - fast randoms
* persistence
  o anydbm?
  o sqlite3
  o hdbc
* compression
  o bzip2
  o zip
  o tar
* file formats
  o csv
  o config parser
* crypto
  o hmac, md5, sha, hashing
* systems
  o getopt
  o logging
  o termio
  o editline
  o mmap
* Internet
  o network-bytestring
  o ssl
  o json
  o feed (rss, atom)
  o mime
  o base64 et al
  o uuencode
  o cgi
  o fastcgi
  o urls
  o ftp, http, imap, smtp clients
  o uuid
  o url parsing
  o http server
  o xml-rpc
* Multimedia
  o colour
* Internationalization
  o gettext
  o locale
  o i18n
* GUIs
  o gtk2hs
* Development
  o hscolour
I would also highly support adding the excellent split library, which supports splitting strings. Split is one of the most asked-for functions in Haskell, and even though it's often easy to use a larger parsing library, the split functions can be very useful, especially for dealing with non-string types. Alex
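As a rough illustration of the kind of function being asked for, here is a minimal sketch of splitting on a single delimiter element, written with base only. The name `splitOnElem` is hypothetical and this is not the split library's API or implementation; it works on any list whose elements support equality, which is the "non-string types" case mentioned above.

```haskell
-- Split a list at every occurrence of a delimiter element.
-- The delimiter itself is dropped; empty chunks are kept.
splitOnElem :: Eq a => a -> [a] -> [[a]]
splitOnElem d xs = case break (== d) xs of
  (chunk, [])       -> [chunk]                      -- no delimiter left
  (chunk, _ : rest) -> chunk : splitOnElem d rest   -- skip the delimiter

main :: IO ()
main = do
  print (splitOnElem ',' "a,b,,c")
  print (splitOnElem 0 [1, 2, 0, 3 :: Int])
```

A real library has to make policy choices this sketch hard-codes, such as whether to keep delimiters or drop empty chunks, which is part of why a shared, canonical package is attractive.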
participants (25)
- Adam Wick
- Alexander Dunlap
- Ashley Yakeley
- Axel Simon
- Bryan O'Sullivan
- Bulat Ziganshin
- Don Stewart
- Duncan Coutts
- Evan Laforge
- Felipe Lessa
- Gwern Branwen
- Ian Lynagh
- Isaac Dupree
- Jeremy O'Donoghue
- Krasimir Angelov
- Magnus Therning
- Malcolm Wallace
- Matti Niemenmaa
- Neil Mitchell
- Niklas Broberg
- Simon Marlow
- Simon Peyton-Jones
- Sittampalam, Ganesh
- wren ng thornton
- Yitzchak Gale