GSoC proposal: Data Visualization

Dear Haskell Community,
During the last months I have used Haskell for machine learning, particularly in the field of Echo State Neural Networks. The main drawback I encountered is that it's difficult to visualize and plot data in Haskell, in spite of the fact that there are a couple of plotting libraries. Data visualization is very important in machine learning research (not so much in machine learning implementation), since humans are very efficient at analyzing graphical input to figure out what is going on and determine possible adjustments. I was wondering if other members of the community have experienced this drawback and would be interested in improved data visualization for Haskell, especially if there is interest in using Haskell for machine learning research. I collected my ideas on the following page: https://github.com/netogallo/Visualizer . Please provide me with feedback: if the proposal is interesting for the community I would start working on it even if it doesn't make it into this GSoC, but a project like this will need a lot of collaboration to be successful.
Thank you very much,
Best Regards,
Ernesto
-- Ernesto Rodriguez
Bachelor of Computer Science - Class of 2013 Jacobs University Bremen

Hello Ernesto,
There are a number of efforts underway to provide better data vis libraries
for haskell. Likewise, there was some recent discussion on the Diagrams
mailing list about data vis tooling, and there should be a few interesting
tools surfacing over the coming few months.
My immediate concern is that this project is too broad and undefined in
scope to be a successful Haskell GSOC.
A successful GSOC project should have
a) a clear notion of what the project's goal is
b) clear evidence that the planned work can reasonably be done over the
summer
c) a result that would be valuable to the general
haskell community
It sounds like the core of what you want to do is write a small lib that
transforms a data set from some initial "schema" into the "schema" that's
suitable for some underlying choice of plotting tool. This is a useful
thing to do, but not large enough in scope for a GSOC project.
On the flip side, interactive data vis tools are *hard* to do well, and a
GSOC that proposed to work on that from scratch would be very very risky
unless you've spent a lot of time working on building such tools.
You're definitely pointing at a region of library space where more nice tools
for haskell would be very valuable, and which a number of folks are trying
to address. But, for GSOC, unless it's a very very clearly laid out
proposal, it will be deemed too risky.
I warmly recommend you look at prior years' Haskell GSOC projects to get a
feel for what strong successful projects/proposals look like.
cheers
-Carter
On Fri, Apr 12, 2013 at 5:10 PM, Ernesto Rodriguez wrote:
[...]
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
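The "schema" transformation Carter describes above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Epoch` record stands in for whatever the user's data looks like, and `Series` stands in for the input shape of some plotting tool; neither comes from the proposal or from an existing library.

```haskell
import Data.List (sortOn)

-- Hypothetical source schema: one record per training epoch.
data Epoch = Epoch
  { epochIndex :: Int
  , trainError :: Double
  , testError  :: Double
  } deriving (Show, Eq)

-- Hypothetical target schema: a named series of (x, y) points,
-- a shape that most 2D plotting tools accept in some form.
data Series = Series
  { seriesName   :: String
  , seriesPoints :: [(Double, Double)]
  } deriving (Show, Eq)

-- Pull one field out of the records as a named series, ordered by epoch.
column :: String -> (Epoch -> Double) -> [Epoch] -> Series
column name field epochs =
  Series name
    [ (fromIntegral (epochIndex e), field e)
    | e <- sortOn epochIndex epochs
    ]

-- Convert the whole data set into the plotting schema.
toSeries :: [Epoch] -> [Series]
toSeries epochs =
  [ column "train error" trainError epochs
  , column "test error"  testError  epochs
  ]
```

A small adapter like this is useful, but it also shows why the idea alone is thin for a summer project: the interesting work lies in the plotting backend it feeds, not in the conversion.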

Ernesto Rodriguez wrote:
[...]
Your project is very ambitious! In fact, too ambitious.
Essentially, you want to build an interactive environment for evaluating Haskell expressions. The use case you have in mind is data visualization for machine learning, but that is just a special case. If you can zoom in and out of plots of infinite time series, you can zoom in and out of audio data, and then why not add an interactive synthesizer widget to create that audio data in the first place.
Your idea decomposes into many parts, each of which would easily fill an entire GSoC project on its own.
* GUI. Actually, we currently don't have a GUI library that is easy to install for everyone. Choosing wxHaskell or gtk2hs immediately separates your user base into three disjoint parts. I think it's possible to use the web browser as GUI instead (https://github.com/HeinrichApfelmus/threepenny-gui).
* Displaying Haskell values in a UI. You mentioned that you want matrices to come with a contextual menu where you can select different transformations on them. It's just a minor step to allow any Haskell function operating on them. I have a couple of ideas on how to do this in a generic fashion. Unfortunately, the project from last year http://hackage.haskell.org/trac/summer-of-code/ticket/1609 did not succeed satisfactorily. There were some other efforts, but I haven't seen anything released.
* UI programming is hard. You could easily spend an entire project on implementing a single visualization, for instance an infinite time series with responsive zoom. It's not difficult to implement something, but adding the right level of polish so that people want to use it takes effort. There's a reason that Matlab costs money, and there's a reason that your mentor relies on it.
* Functionality specific to machine learning. Converting Vector to a format suitable for representation of matrices, etc. This is your primary interest.
Note that, unfortunately, the parts depend on each other from top to bottom. It's possible to write functionality specific to machine learning, but it would be of little impact if it doesn't come with a good UI.
Best regards, Heinrich Apfelmus
-- http://apfelmus.nfshost.com
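To make the "infinite time series with responsive zoom" example concrete, here is a minimal sketch of only the data side of such a widget, assuming the series is a lazy Haskell list. The names `window`, `downsample`, `everyNth` and `view` are hypothetical, and none of the hard part Heinrich describes (rendering, interaction, polish) is attempted.

```haskell
-- Take the samples at indices [start, start + len) from a possibly
-- infinite series. Thanks to laziness, only the prefix up to the end
-- of the window is ever evaluated.
window :: Int -> Int -> [a] -> [a]
window start len = take len . drop start

-- Keep every k-th sample of a finite list so that at most `maxPoints`
-- points remain, e.g. one per horizontal pixel of the plot.
downsample :: Int -> [a] -> [a]
downsample maxPoints xs
  | maxPoints <= 0 = []
  | otherwise      = everyNth step xs
  where
    n    = length xs
    step = max 1 ((n + maxPoints - 1) `div` maxPoints)  -- ceiling of n / maxPoints

-- Keep the first element, then every k-th element after it (k >= 1).
everyNth :: Int -> [a] -> [a]
everyNth _ []       = []
everyNth k (x : xs) = x : everyNth k (drop (k - 1) xs)

-- The points a plot widget would draw for a given zoom level.
view :: Int -> Int -> Int -> [a] -> [a]
view start len maxPoints = downsample maxPoints . window start len
```

For instance, `view 1000 10 4 [0 ..]` looks at ten samples starting at index 1000 and keeps at most four of them. Plain striding can hide spikes between the kept samples, so a real tool would more likely keep a minimum and maximum per bucket; that refinement is left out here.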

Hi Heinrich,
It is indeed a big scope, as you mentioned. Matlab has had years of work put into getting this functionality right. On the other hand, the project you linked is interesting. For me it would already be a huge advantage if I could edit and re-evaluate expressions interactively (in a comfortable GUI, not ghci). A plot widget with sliders would also help. Do you know of any reason the project has not been worked on for several months (as I see in the repo)? Is anyone working on this project who has a later version? After all, these are features that are available even in free math packages such as Sage.
Best regards,
Ernesto Rodriguez
On Sat, Apr 13, 2013 at 10:33 AM, Heinrich Apfelmus <apfelmus@quantentunnel.de> wrote:
[...]
-- Ernesto Rodriguez Bachelor of Computer Science - Class of 2013 Jacobs University Bremen

Ernesto Rodriguez wrote:
For me it would already be a huge advantage if I could edit and re-evaluate expressions interactively (in a comfortable GUI, not ghci). A plot widget with sliders would also help. Do you know of any reason the project has not been worked on for several months (as I see in the repo)? Is anyone working on this project who has a later version? After all, these are features that are available even in free math packages such as Sage.
I was actually the initial mentor for this project. I'm not particularly happy about the result. As you can see, it hasn't been picked up by anyone else, including me, and I think that's because it missed the modularity goals I had in mind.
Best regards, Heinrich Apfelmus
-- http://apfelmus.nfshost.com

Heinrich, you hit the nail on the head. For an interactive plotting story to work well, we wind up needing better tools in the ecosystem on the GUI / computational notebook side.
On the other hand, similar work was done last summer, as Heinrich mentions, in the form of ghclive (https://github.com/shapr/ghclive), by a very strong GSOC participant, and while it works, it's not really being used and is still quite immature. Additionally, the folks at FP Complete have their browser-based Haskell interaction tool that's seeing quite a lot of use by folks learning Haskell, which in turn raises the quality bar that any other effort must achieve to see serious usage.
It's also worth wondering if the Scala Notebook port of the IPython notebook could be used to guide writing a similar tool for Haskell.
Either way, this would require a very concrete plan of attack to be a tractable GSOC project.
-Carter
On Sat, Apr 13, 2013 at 4:33 AM, Heinrich Apfelmus <apfelmus@quantentunnel.de> wrote:
[...]
participants (3)
- Carter Schonwald
- Ernesto Rodriguez
- Heinrich Apfelmus