Suggestions for an empirical master thesis

Hi, everyone!
I'm looking for a master thesis topic that is empirical in nature (statistics, hypothesis testing, etc.).
The work could involve analyzing package metadata (Cabal information), code (AST), and/or data from some other source, possibly comparing it with similar data from some non-Haskell domain.
As an example, one idea that was suggested to me was to look at usage aggregation, that is, how much of a given package another package is actually using (which could be relevant in any orphan-instance discussion).
From an academic standpoint, it would be good to pick a metric that can be validated in some way.
Thank you!
Best, Jon

Jon,
Why don't you do some graph analysis on the whole of Hackage using the
metadata? Build a graph and analyse its statistical properties.
Best,
-m
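
As a rough illustration of the suggestion above, the following sketch builds a tiny dependency graph and computes degree statistics for it. The package map is made up for the example; in practice it would presumably be extracted from the build-depends fields of the Cabal files on Hackage. The function names are hypothetical and not part of any existing tool.

```python
from collections import Counter

# Hypothetical package -> dependencies map (illustrative data only).
deps = {
    "app": ["base", "text", "containers"],
    "text": ["base"],
    "containers": ["base"],
    "base": [],
}

def degree_stats(deps):
    """Return (out_degree, in_degree) dicts for a dependency graph."""
    # Out-degree: number of distinct packages a package depends on.
    out_degree = {pkg: len(set(ds)) for pkg, ds in deps.items()}
    # In-degree: number of distinct packages that depend on a package.
    in_degree = Counter()
    for pkg, ds in deps.items():
        for d in set(ds):
            in_degree[d] += 1
    # Make sure packages without reverse dependencies appear with 0.
    for pkg in deps:
        in_degree.setdefault(pkg, 0)
    return out_degree, dict(in_degree)

def degree_histogram(degrees):
    """Count how many packages have each degree value."""
    return dict(Counter(degrees.values()))

out_deg, in_deg = degree_stats(deps)
print("in-degree:", in_deg)
print("in-degree histogram:", degree_histogram(in_deg))
```

On real data, the in-degree histogram is one of the simpler statistical properties to look at, for example to check how heavily reverse dependencies concentrate on a few packages.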
On 4 August 2016 at 21:15, Jon Kristensen wrote:
Hi, everyone!
I'm looking for a master thesis topic that is empirical in nature (like, statistics, hypothesis testing, etc.).
The work could involve analyzing either package metadata (Cabal information), code (AST), and/or data from some other source, possibly comparing similar data from some non-Haskell domain.
As an example, one idea that was suggested to me was to look at usage aggregation, or like, how much of a given package another package is actually using (which could be relevant in any orphan-instance discussion).
From an academic standpoint, it would be good to pick a metric that can be validated in some way.
Thank you!
Best, Jon
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

Hi, Mehmet!
Thank you for your suggestion!
I think that I will need to describe some concrete goals, and provide some evidence from the literature that the area includes, or is related to, particular scientific and engineering challenges.
I will take a look at related papers about this, but any thoughts on it would be welcome!
Best, Jon
On 08/05/2016 02:28 PM, Suzen, Mehmet wrote:
Jon,
Why don't you do some graph analysis on the entire hackage using the metadata? Build a graph and analyse its statistical properties.
Best, -m

Hi Jon,
Allow me to brainstorm with you. I don't know what has been done in the
context of code metrics for Haskell programs, but you could try to find
correlations (or the lack thereof) between metrics such as cyclomatic
complexity and fan-in/fan-out on the one hand, and bugs on the other.
For this you could use GitHub repositories. The number of bugs would have
to be normalized by the number of users of a project (maybe measured in
number of downloads).
Just an idea...
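
A minimal sketch of that idea, assuming the per-project numbers have already been collected: the figures below are invented purely for illustration, and both the choice of Pearson correlation and the "bugs per thousand downloads" normalization are assumptions rather than anything prescribed in the thread.

```python
from math import sqrt

def bugs_per_thousand_downloads(bugs, downloads):
    """Normalize a raw bug count by usage, measured in downloads."""
    return 1000 * bugs / downloads

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical projects: (mean cyclomatic complexity, bugs, downloads).
projects = [
    (2.1, 4, 12000),
    (3.4, 9, 8000),
    (5.0, 30, 15000),
    (1.5, 1, 5000),
]

complexity = [c for c, _, _ in projects]
bug_rate = [bugs_per_thousand_downloads(b, d) for _, b, d in projects]
print("correlation:", pearson(complexity, bug_rate))
```

With only a handful of projects the coefficient means very little; a real study would need many repositories and probably a rank-based measure as well, since bug counts tend to be skewed.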
participants (3)
- Damian Nadales
- Jon Kristensen
- Suzen, Mehmet