
Exercise: how does the approach in the code relate to the approaches to sharing explained in http://okmij.org/ftp/tagless-final/sharing/sharing.html
Chapter 3 introduces an implicit impure counter, and Chapter 4 uses a database that is passed around. The let_ form in Chapter 5 of sharing.pdf realizes sharing with a sort of continuation-passing style. The unsafe counter works across modules (cf. counter.zip) but is generally unpredictable...
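Here is a minimal sketch of such an unsafe counter (my own illustration, assuming an IORef-based gensym -- not the actual code in counter.zip). It does work across modules, but whether each call really yields a fresh number depends on GHC's inlining and common-subexpression elimination, which is exactly the source of the unpredictability.

import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

-- A single global counter, created on first use.
{-# NOINLINE counter #-}
counter :: IORef Int
counter = unsafePerformIO (newIORef 0)

-- Supposed to return a fresh label on every call; in practice the
-- result depends on optimizations the programmer does not control.
newLabel :: () -> Int
newLabel () = unsafePerformIO $
  atomicModifyIORef' counter (\n -> (n + 1, n))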
The reason sharing has the type m a -> m (m a) rather than m a -> m a is that new computations to share may be created dynamically. Therefore, we need a supply of keys (a gensym), and we count on the monad m to provide that supply. However, top-level computations (bound to top-level identifiers) are created only once, at initialization time. Therefore, a static assignment of identifiers suffices. The static assignment is similar to the manual label assignment technique -- the first technique described in Sec 3 of the sharing.pdf paper. John T. O'Donnell has automated this manual assignment using TH.
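A minimal sketch of such a dynamic key supply, assuming a plain State monad and a monomorphic (Int-valued) cache to keep the code short (my illustration, not the code from the paper):

import Control.Monad.State
import qualified Data.Map as Map

data S = S { nextKey :: Int, memo :: Map.Map Int Int }

type Sharing = State S

-- share m allocates a fresh key right away (shared computations may be
-- created dynamically) and returns a computation that runs m at most
-- once, caching the result under that key.
share :: Sharing Int -> Sharing (Sharing Int)
share m = do
  k <- gets nextKey
  modify $ \s -> s { nextKey = k + 1 }
  return $ do
    cached <- gets (Map.lookup k . memo)
    case cached of
      Just v  -> return v
      Nothing -> do
        v <- m
        modify $ \s -> s { memo = Map.insert k v (memo s) }
        return v

runSharing :: Sharing a -> a
runSharing m = evalState m (S 0 Map.empty)

The outer m in the type m a -> m (m a) is where the fresh key is drawn; the inner m a is the memoized computation handed back to the caller to be used as many times as needed.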
Now I'm on to the next task: how do we represent continuous probability distributions? The existing libraries seem to have restricted themselves to discrete distributions, or at best provide Random support for Monte Carlo simulations.
I must warn that although it is ridiculously easy to implement MonadProb in Haskell, the result is not usable. Try to implement an HMM with any of the available MonadProb implementations and see how well it scales. (Hint: we want linear time scaling as we evolve the model -- not exponential.) There is a *huge* gap between a naive MonadProb and something that could be used even for passingly interesting problems. We need support for so-called `variable elimination'. We need better sampling procedures: rejection sampling is better forgotten. Finally, GHC is actually not a very good language system for probabilistic programming of the generative-model variety. See http://okmij.org/ftp/Haskell/index.html#memo-off for details.

To give you a flavor of the difficulties, consider a passingly interesting target-tracking problem: looking at a radar screen and figuring out how many planes there are and where they are going: http://okmij.org/ftp/kakuritu/index.html#blip Since the equipment is imperfect, there can be a random blip on the radar that corresponds to no target. If we have a 10x10 screen and a 2% probability of a noise blip in each of the 100 `pixels', we get a search space of 2^100 -- even before we get to the planes and their stochastic equations of motion. Hansei can deal with the problem -- and even scale linearly with time.

I don't think you can make a monad out of Gaussian distributions. That is not to say that monads are useless in these problems -- monads are useful, but at a different level: to build code for, say, MCMC (Markov chain Monte Carlo). It is this meta-programming approach that underlies Infer.net http://research.microsoft.com/en-us/um/cambridge/projects/infernet/ and Avi Pfeffer's Figaro http://www.cs.tufts.edu/~nr/cs257/archive/avi-pfeffer/figaro.pdf

I should point out Professor Sato of Tokyo Tech, the pioneer of probabilistic programming: http://sato-www.cs.titech.ac.jp/sato/ http://sato-www.cs.titech.ac.jp/en/publication/ You can learn from him all there is to know about probabilistic programming.
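To make the warning concrete, here is the `ridiculously easy' MonadProb as a weighted list of outcomes (a generic textbook sketch, not any particular library). It is correct, but evolving even a two-state HMM with it enumerates a number of weighted paths exponential in the number of steps:

newtype Dist a = Dist { runDist :: [(a, Double)] }

instance Functor Dist where
  fmap f (Dist xs) = Dist [ (f x, p) | (x, p) <- xs ]

instance Applicative Dist where
  pure x = Dist [(x, 1)]
  Dist fs <*> Dist xs = Dist [ (f x, p * q) | (f, p) <- fs, (x, q) <- xs ]

instance Monad Dist where
  Dist xs >>= k = Dist [ (y, p * q) | (x, p) <- xs, (y, q) <- runDist (k x) ]

-- One transition of a two-state chain: each path splits in two.
step :: Bool -> Dist Bool
step s = Dist [ (s, 0.7), (not s, 0.3) ]

-- After n steps, runDist (evolve n) has 2^n entries, even though only
-- two marginal probabilities matter.
evolve :: Int -> Dist Bool
evolve 0 = pure True
evolve n = evolve (n - 1) >>= step

Collapsing equal outcomes (summing their weights) after every step restores linear scaling for this toy chain; doing that systematically, for real models, is the variable elimination mentioned above.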