Feedback/Ideas for a Master Thesis about creating a Querying Tool for Hackage

Hi,

My name is Victor Nithander and I am currently doing my Master Thesis at the Computer Science and Engineering department at Chalmers in Gothenburg, and I would like to ask you for some help.

My Master Thesis is about creating a querying tool for Hackage, written in Haskell, and I would like to know what kind of features you would like the querying tool to have. My plan is to design and implement an embedded language in which queries are expressed, and then implement a backend which takes care of those queries. I then plan to package the tool using Cabal and upload it to Hackage.

Any feedback/ideas on features or other things about the tool would be greatly appreciated, as I want to make a tool which benefits the community as much as possible.

Examples of queries I'm thinking of implementing are:

- Which language features (such as typeclass instances) does a certain package use?
- How many packages use a certain language feature?
- Which packages does a certain package depend on?
- Which packages depend on a certain package?

Of course, this in no way limits what other types of queries could be made.

I will also post this at the Haskell section of reddit at http://www.reddit.com/r/haskell/, sorry for the redundancy. Feel free to respond to this either on this mailing list, on reddit, or by personal email at nithande@student.chalmers.se, though I think the best would be to have an open discussion about it.

Thanks in advance!

/ Victor
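
As a rough illustration of the two dependency queries in the list above, here is a minimal sketch of what a small embedded query language could look like. Everything in it is an assumption made for illustration: the `Index` type, the `Query` constructors, `runQuery`, and the sample data are hypothetical and are not taken from the thesis or from any existing Hackage tool. The language-feature queries are left out, since they would need access to package source code rather than dependency metadata.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical, simplified view of package metadata: each package
-- name mapped to the names of its direct dependencies.
type PackageName = String
type Index = Map.Map PackageName (Set.Set PackageName)

-- A tiny embedded query language (illustrative constructors only).
data Query
  = DependenciesOf PackageName   -- which packages does P depend on?
  | DependentsOf PackageName     -- which packages depend on P?
  | Union Query Query            -- results of either query
  | Intersect Query Query        -- results common to both queries

-- Evaluate a query against an index, yielding a set of package names.
runQuery :: Index -> Query -> Set.Set PackageName
runQuery idx (DependenciesOf p) = Map.findWithDefault Set.empty p idx
runQuery idx (DependentsOf p)   = Map.keysSet (Map.filter (Set.member p) idx)
runQuery idx (Union a b)        = Set.union (runQuery idx a) (runQuery idx b)
runQuery idx (Intersect a b)    = Set.intersection (runQuery idx a) (runQuery idx b)

-- Made-up sample data, for illustration only.
sampleIndex :: Index
sampleIndex = Map.fromList
  [ ("app",    Set.fromList ["base", "text", "parser"])
  , ("parser", Set.fromList ["base", "text"])
  , ("text",   Set.fromList ["base"])
  , ("base",   Set.empty)
  ]
```

With this sketch, `runQuery sampleIndex (DependentsOf "text")` evaluates to the set containing `"app"` and `"parser"`. Representing queries as a data type keeps the language separate from the backend, so the same `Query` value could in principle be run by different evaluators.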

On Thu, 2014-03-27 at 18:51 +0100, Victor Nithander wrote:
Hi,
My name is Victor Nithander and I am currently doing my Master Thesis at the Computer Science and Engineering department at Chalmers in Gothenburg and would like to ask you for some help.
My Master Thesis is about creating a querying tool for Hackage written in Haskell and I would like to know what kind of features you would like the querying tool to have. My plan is to design and implement an embedded language for queries to be expressed in and then implement a backend which accordingly takes care of those queries. I then plan to package the tool using Cabal and upload it to Hackage.

Hi Victor,

This sounds great. Some feedback:

I can't quite tell from your description if you're planning to query just the package metadata (i.e. the info available in the hackage index) or also the content / source code of packages. Just using the metadata would be a lot easier!

Something you may or may not wish to consider is full text search. I've actually been considering integrating full text search into the cabal-install client (based on the text search code that the hackage-server uses). I'm not quite sure if that'd fit nicely with the kind of structured querying that you've mentioned or not. Something to consider anyway.

Making it into its own package is certainly the right place to start. You may want to keep in mind, however, two contexts where one may ultimately wish to use the code: hackage-server and cabal-install. Assuming that you're only using the index metadata and not package contents, this could fit nicely with either of these tools.

In hackage-server we'd be somewhat concerned about performance, but would be able to keep data structures in memory. In cabal-install we obviously cannot keep a pre-computed structure in memory across multiple queries, but it would be possible to pre-compute some structure, keep it on disk, and load it into memory to do a single query (a bit like how hoogle works now).

Duncan
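
The pre-compute, store, and load idea described above can be sketched as follows for the reverse-dependency query. This is only a sketch under stated assumptions: the types and the functions `buildRevDeps`, `saveRevDeps`, `loadRevDeps`, and `dependentsOf` are hypothetical names, and `show`/`read` stands in for whatever on-disk format a real tool would use. It does not reflect how cabal-install, hackage-server, or hoogle actually store their data.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical types: forward and reverse dependency maps.
type PackageName = String
type Deps    = Map.Map PackageName (Set.Set PackageName)
type RevDeps = Map.Map PackageName (Set.Set PackageName)

-- Invert the dependency relation once, so that
-- "which packages depend on P?" becomes a single map lookup.
buildRevDeps :: Deps -> RevDeps
buildRevDeps deps = Map.fromListWith Set.union
  [ (dep, Set.singleton pkg)
  | (pkg, ds) <- Map.toList deps
  , dep <- Set.toList ds
  ]

-- Persist the pre-computed structure. show/read is a placeholder
-- (an assumption) for a compact binary format.
saveRevDeps :: FilePath -> RevDeps -> IO ()
saveRevDeps path = writeFile path . show

-- Load the structure back for a single query. The file contents are
-- forced in full before parsing so the lazily opened handle is closed.
loadRevDeps :: FilePath -> IO RevDeps
loadRevDeps path = do
  contents <- readFile path
  length contents `seq` pure (read contents)

-- Answer one reverse-dependency query from the loaded structure.
dependentsOf :: RevDeps -> PackageName -> Set.Set PackageName
dependentsOf rev p = Map.findWithDefault Set.empty p rev
```

In a long-running server the result of `buildRevDeps` could simply stay in memory, whereas a command-line client would call `saveRevDeps` when the index changes and `loadRevDeps` once per invocation.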

On 2014-03-27 17:51, Victor Nithander wrote:
Hi,
My name is Victor Nithander and I am currently doing my Master Thesis at the Computer Science and Engineering department at Chalmers in Gothenburg and would like to ask you for some help.
My Master Thesis is about creating a querying tool for Hackage written in Haskell and I would like to know what kind of features you would like the querying tool to have. My plan is to design and implement an embedded language for queries to be expressed in and then implement a backend which accordingly takes care of those queries. I then plan to package the tool using Cabal and upload it to Hackage.
Any feedback/ideas on features or other things about the tool would be greatly appreciated as I want to make a tool which benefits the community as much as possible.
Examples of queries I’m thinking of implementing are:
* Which language features (such as typeclass instances) does a certain package use?
* How many packages use a certain language feature?
* Which packages does a certain package depend on?
* Which packages depend on a certain package?

Hi Victor,

cabal-db [1] already has some basic cross-package functionality. It would be nice to have a more powerful query language to go with it.

[1] https://github.com/vincenthz/cabal-db

--
Vincent

participants (3)
- Duncan Coutts
- Victor Nithander
- Vincent Hanquez