The state of Hackage: what are we doing about it?

I see fairly regular complaints about too many Haskell libraries, bewildering choice of difficult-to-determine quality. I've tried to summarize the state of Hackage, and what projects are active to make it easier to find high quality libraries: http://tinyurl.com/2cqw9sb Thoughts? -- Don

Excerpts from Don Stewart's message of Tue Jun 01 01:13:20 +0200 2010:
I see fairly regular complaints about too many Haskell libraries, bewildering choice of difficult-to-determine quality.
I want to send a small reminder that there was the idea of adding a public wiki page for each project, which could react to users' wishes faster than anything else: http://haskell.org/haskellwiki/Hackage_wiki_page_per_project_discussion Marc Weber

Marc Weber wrote:
Excerpts from Don Stewart's message of Tue Jun 01 01:13:20 +0200 2010:
I see fairly regular complaints about too many Haskell libraries, bewildering choice of difficult-to-determine quality.
I want to send a small reminder that there was the idea of adding a public wiki page for each project, which could react to users' wishes faster than anything else: http://haskell.org/haskellwiki/Hackage_wiki_page_per_project_discussion
This seems to be part of current efforts. Quoting http://donsbot.wordpress.com/2010/05/31/there-are-a-hell-of-a-lot-of-haskell... "2. Google Summer of Code: Hackage 2.0 – we have Matt Gruen working this summer to finish the implementation of Hackage 2.0 – an improved Hackage that will allow for many new features to help sort out the wheat from the chaff in Haskell packages: build reports, wiki commenting, and social voting."

Forked to the Cafe... Hi all What's the procedure for marking one's own package(s) as deprecated on Hackage? Best wishes Stephen

On Tue, Jun 1, 2010 at 5:55 PM, Stephen Tetley
What's the procedure for marking one's own package(s) as deprecated on Hackage?
Ask Ross Paterson to deprecate your package. Once a package is deprecated it won't show up in the package list anymore but will still be available from the package URL. Regards, Bas

Thanks Bas I've just emailed Ross, so that should be one "zombie" down when he has the chance to update Hackage.

On May 31, 2010, at 19:13 , Don Stewart wrote:
I see fairly regular complaints about too many Haskell libraries, bewildering choice of difficult-to-determine quality.
One thing that might help is just a less cluttered/better organized interface. I always have to use browser find on the package list page. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

2010/6/1 Brandon S. Allbery KF8NH
On May 31, 2010, at 19:13 , Don Stewart wrote:
I see fairly regular complaints about too many Haskell libraries, bewildering choice of difficult-to-determine quality.
One thing that might help is just a less cluttered/better organized interface. I always have to use browser find on the package list page.
The browser find can be quite effective when the descriptions are good. It could also be less boring to use if each package was in a single category. Cheers, Thu

Don wrote
I see fairly regular complaints about too many Haskell libraries, bewildering choice of difficult-to-determine quality.
I've tried to summarize the state of Hackage, and what projects are active to make it easier to find high quality libraries:
Thoughts?
The Hayoo! group is currently extending the search engine so that it can search not only types and functions but also package descriptions. Building a search index for the cabal package descriptions is already implemented. At the moment the Hayoo! search interface is the missing part. Technically it's not a difficult extension, but currently we don't have much spare time. For ranking the results of a package search, download statistics could be very useful and could easily be integrated. If such statistics were available in a machine-readable format (csv, xml, plain text, ...), we could integrate them. Cheers, Uwe P.S.: Sorry for Hayoo! being down at the moment. There was a power failure at FH Wedel yesterday, and that destroyed the RAID controller of the server running Hayoo!

On Wed, Jun 2, 2010 at 2:20 PM, Uwe Schmidt
For ranking the results of a package search, download statistics could be very useful and could easily be integrated. If such statistics were available in a machine-readable format (csv, xml, plain text, ...), we could integrate them.
Ordering by nr of direct/indirect reverse dependencies might also be useful. We could add a .csv file version of the main reverse dependency page which you can index easily: http://bifunctor.homelinux.net/~roel/hackage/packages/archive/revdeps-list.h... We should probably compress the .csv file to save some network bandwidth, although that's not really a problem for us because we're on a high-bandwidth link. Note that .csv files are already provided for individual packages. Here's the one for the latest network, for example: http://bifunctor.homelinux.net/~roel/hackage/packages/archive/network/2.2.1.... Format: package name-version, nr of direct rev. deps., nr of indirect rev. deps., 'D' for a direct rev. dep. or 'I' for an indirect rev. dep. Regards, Bas
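As an illustration of the row format Bas describes, here is a minimal Haskell sketch that parses one line of such a file. It assumes plain comma-separated fields with no quoting or header row, which the thread does not confirm; the names `RevDep`, `DepKind` and `parseRevDep` are invented for this example and are not part of any existing tool.

```haskell
import Data.Char (isSpace)
import Text.Read (readMaybe)

-- Whether the listed package depends on the target directly or indirectly.
data DepKind = Direct | Indirect
  deriving (Eq, Show)

-- One row: package name-version, direct count, indirect count, D/I flag.
data RevDep = RevDep
  { rdPackage  :: String
  , rdDirect   :: Int
  , rdIndirect :: Int
  , rdKind     :: DepKind
  } deriving (Eq, Show)

-- Split on commas. Assumption: fields never contain quoted commas.
splitOnComma :: String -> [String]
splitOnComma s = case break (== ',') s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitOnComma rest

-- Strip leading and trailing whitespace from a field.
trim :: String -> String
trim = f . f
  where f = reverse . dropWhile isSpace

parseKind :: String -> Maybe DepKind
parseKind "D" = Just Direct
parseKind "I" = Just Indirect
parseKind _   = Nothing

-- Parse one row; returns Nothing for anything that does not match the format.
parseRevDep :: String -> Maybe RevDep
parseRevDep line = case map trim (splitOnComma line) of
  [pkg, d, i, k] | not (null pkg) ->
    RevDep pkg <$> readMaybe d <*> readMaybe i <*> parseKind k
  _ -> Nothing
```

Applied to a whole file, something like `mapM parseRevDep (lines contents)` would yield `Nothing` if any row is malformed, which makes format drift easy to notice.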

Bas wrote:
Ordering by nr of direct/indirect reverse dependencies might also be useful.
this is already done. After indexing the cabal pages, the dependencies are all known and a package rank is derived from that graph. That rank is also used in the (new) function search, so functions in base (and bytestring) get a really high rank.
We could add a .csv file version of the main reverse dependency page which you can index easily: http://bifunctor.homelinux.net/~roel/hackage/packages/archive/revdeps-list.html
not really needed, see above. More interesting for package search would be popularity statistics. Thanks for the hint, Uwe
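Uwe does not say how Hayoo! derives the rank from the dependency graph. One simple possibility, sketched here purely as an assumption, is to count how many packages depend on each package directly or indirectly and to sort by that count. The types and function names are hypothetical, and the sketch uses `Data.Map` and `Data.Set` from the containers package.

```haskell
import Data.List (sortBy)
import Data.Ord (Down (..), comparing)
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Package = String

-- Maps each package to the packages it depends on.
type DepGraph = Map.Map Package [Package]

-- Invert the graph: for each package, the set of packages that use it directly.
reverseDeps :: DepGraph -> Map.Map Package (Set.Set Package)
reverseDeps g = Map.fromListWith Set.union
  [ (dep, Set.singleton pkg) | (pkg, deps) <- Map.toList g, dep <- deps ]

-- All packages that depend on p directly or indirectly.
-- If the graph contains a cycle through p, p itself ends up in the result.
transitiveRevDeps :: Map.Map Package (Set.Set Package) -> Package -> Set.Set Package
transitiveRevDeps rev p = go Set.empty (directOf p)
  where
    directOf q = Set.toList (Map.findWithDefault Set.empty q rev)
    go seen [] = seen
    go seen (q : qs)
      | q `Set.member` seen = go seen qs
      | otherwise           = go (Set.insert q seen) (directOf q ++ qs)

-- Packages ordered by number of transitive reverse dependencies, highest first.
rankPackages :: DepGraph -> [(Package, Int)]
rankPackages g =
  sortBy (comparing (Down . snd))
    [ (p, Set.size (transitiveRevDeps rev p)) | p <- allPkgs ]
  where
    rev     = reverseDeps g
    allPkgs = Set.toList (Set.fromList (Map.keys g ++ concat (Map.elems g)))
```

With this kind of count, a package such as base that almost everything depends on naturally rises to the top, which matches the behaviour Uwe reports for functions in base and bytestring.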

On Wed, Jun 2, 2010 at 4:35 PM, Uwe Schmidt
Bas wrote:
Ordering by nr of direct/indirect reverse dependencies might also be useful.
this is already done.
Ok nice. For others who would like to have a .csv file version of the reverse dependency page I quickly hacked something together: http://bifunctor.homelinux.net/~roel/hackage/packages/archive/revdeps-list.c... This is also updated daily. Regards, Bas

si:
Don wrote
I see fairly regular complaints about too many Haskell libraries, bewildering choice of difficult-to-determine quality.
I've tried to summarize the state of Hackage, and what projects are active to make it easier to find high quality libraries:
Thoughts?
The Hayoo! group is currently extending the search engine so that it can search not only types and functions but also package descriptions. Building a search index for the cabal package descriptions is already implemented. At the moment the Hayoo! search interface is the missing part. Technically it's not a difficult extension, but currently we don't have much spare time.
For ranking the results of a package search, download statistics could be very useful and could easily be integrated. If such statistics were available in a machine-readable format (csv, xml, plain text, ...), we could integrate them.
I've been posting CSV files of the download statistics here: http://www.galois.com/~dons/hackage/hackage-downloads.csv The next quarter's aggregated downloads are due soon. The Arch Haskell site uses these stats to compute some popularity metrics: http://www.galois.com/~dons/arch-haskell-status.html -- Don

On Thursday 03 June 2010 06:27:43 am Don Stewart wrote:
I've been posting CSV files of the download statistics here:
http://www.galois.com/~dons/hackage/hackage-downloads.csv
The next quarter's aggregated downloads are due soon.
The Arch Haskell site uses these stats to compute some popularity metrics:
with that data, it should be rather easy to compute some kind of weights for the packages and to prefer the popular packages during the search. In the long run it would be nice to get these statistics more frequently, e.g. monthly. We see that within a quarter there are a lot of new packages. Uwe
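The thread does not specify a weighting formula. A minimal sketch, assuming the download CSV has already been parsed into (package, count) pairs and that the search engine produces a relevance score per hit, could log-scale the download counts into a weight between 0 and 1 and use it to boost relevance. The function names and the boost formula are illustrative assumptions, not Hayoo!'s implementation.

```haskell
import Data.List (sortBy)
import Data.Ord (Down (..), comparing)
import qualified Data.Map.Strict as Map

-- Log-scaled popularity in [0, 1]; the most downloaded package gets 1.
-- Log scaling keeps a few very popular packages from dwarfing everything else.
popularityWeights :: [(String, Int)] -> Map.Map String Double
popularityWeights stats = Map.fromList [ (p, scale d) | (p, d) <- stats ]
  where
    -- Use at least 1 so the denominator below is never zero.
    maxD :: Int
    maxD = maximum (1 : map snd stats)

    scale :: Int -> Double
    scale d = log (1 + fromIntegral (max 0 d)) / log (1 + fromIntegral maxD)

-- Boost each hit's relevance by its popularity and sort best first.
-- Packages without download data (e.g. brand-new uploads) get no boost.
rerank :: Map.Map String Double -> [(String, Double)] -> [(String, Double)]
rerank weights hits =
  sortBy (comparing (Down . snd))
    [ (p, rel * (1 + Map.findWithDefault 0 p weights)) | (p, rel) <- hits ]
```

Treating missing packages as weight 0 rather than dropping them matters for the concern raised above: with quarterly statistics, packages uploaded within the quarter would otherwise vanish from the results.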

si:
On Thursday 03 June 2010 06:27:43 am Don Stewart wrote:
I've been posting CSV files of the download statistics here:
http://www.galois.com/~dons/hackage/hackage-downloads.csv
The next quarter's aggregated downloads are due soon.
The Arch Haskell site uses these stats to compute some popularity metrics:
with that data, it should be rather easy to compute some kind of weights for the packages and to prefer the popular packages during the search. In the long run it would be nice to get these statistics more frequently, e.g. monthly. We see that within a quarter there are a lot of new packages.
I have scripts, and access to the apache logs -- so it's just a matter of free time to work on this. That's the goal though.

Don Stewart wrote:
Thoughts?
Firstly, it pleases me that somebody is taking this problem seriously and looking at it.
Currently all the information displayed on a package page comes from the Cabal file. I think it would be useful to be able to retrospectively alter certain things. (Information on stability, level of support, and platforms known to work / not work are obvious candidates here.) The package page really ought to be the central location for saying stuff about the package - and that includes information that becomes available after the package is released.
We don't seem to have a comprehensive story for bug-tracking. Each project or library has its own ad-hoc arrangements. Usually that means no bug reporting system at all, although some of the larger projects each have their own separate site. I think there is mileage in setting up some kind of centralised system. Small projects probably don't merit the effort of setting up a whole dedicated tracker, and besides, who wants to create user accounts on a dozen different trackers? I'm thinking some kind of tracker which tracks bugs for any package on Hackage. Like, as soon as you upload a package, people can file bugs against it in one central location. (And check whether it's already been filed, for that matter...) Of course, it's not much use if it doesn't also notify the package author; presumably you have to specify contact details when you create a user account or something.
It would also be nice to have a system in place for users to tell package authors about possible fixes / enhancements, etc. But given that everybody has their own favourite source control system (including NONE AT ALL) this isn't particularly easy. I guess we'll leave that for now.
Also, unless I'm missing something, Darcs has a feature to run automated tests on commit, but there's no way of including a test framework in such a way that Cabal / Hackage knows how to test your package. The most it can do is test whether it compiles; it might fail spectacularly when actually *run*. (Also, end-users may or may not want this test infrastructure when installing a library.)
Ooo, and change logs... There doesn't seem to be any coherent way to organise these. OK, so there's a new version of Foo out now. So... what's new? Is this a bugfix or does it have new features or just tweaked documentation or more QuickCheck properties or...? It looks like there ought to be a specific place to record this information.
Documentation is something else worth looking at. Currently the Haskell Way(tm) is to have documentation embedded within the source code itself, which I've never been fond of. For one thing it makes the source about 20x bigger and obscures its structure with a lot of comments. But for another thing, there's more than one kind of documentation. Haddock handles API documentation. It is less than helpful for writing example documentation, introductions and tutorials, and all the other kinds of library documentation one might want to write. Also, it seems unfortunate that you have to edit source code and upload a new version of a package just to improve the documentation.
What we do NOT want, of course, is documentation that's out of sync with the package it's supposed to document! Tying the documentation to the source code achieves this (mostly), but it seems like there should be a Better Way(tm)... I'm not sure what though.
OK, I'm going to stop typing now...

Tying the documentation to the source code achieves this (mostly), but it seems like there should be a Better Way(tm)... I'm not sure what though.
Does Haddock support Literate Haskell files? This might be a nice way to keep tutorials and source code together. Also perhaps provide for each package a listing of which packages depend on it. First it shows how popular it is and if docs are missing/inadequate I can at least see how it's used in some other project. -deech

aditya siram wrote:
Does Haddock support Literate Haskell files?
I believe it does.
This might be a nice way to keep tutorials and source code together.
Plausibly. You could write some literate files which aren't part of the library sources, and tell Cabal to include them as "extra files". That won't put the documentation on Hackage though... Presumably special infrastructure would be required for that. (Like, a Cabal field to say "this is extra documentation".)
participants (10): aditya siram, Andrew Coppin, Bas van Dijk, Brandon S. Allbery KF8NH, Don Stewart, Henning Thielemann, Marc Weber, Stephen Tetley, Uwe Schmidt, Vo Minh Thu