representations of probability distributions
 
            Dear cafe, following up the announcement of bali-phy and the discussion that ensued, I'd like ask the cafe for an overview of representations of probability distributions as data types. It seems that all packages mentioned to far use some flavour of Markov chain. Why? What else is there? Can someone recommend a survey publication/book/talk? Explicitly I am interested in: - the data type, as in Haskell's type system - how an element of the data type relates to the distribution it describes - how the monad instance works (if there is one) - how to compute conditionals (if computable) - how to compute integrals (if computable) - what base types are supported (finite types, reals, products ...) Below is an incomplete, and possibly oversimplified (or even wrong), list of my own. Additions and corrections welcome. -- Olaf 1. Markov chains - uses a random number generator - type is something like Markov a = a -> IO a - repeated execution produces an infinite trace xs :: [a] where for each predicate p :: a -> Bool, the limit of (length (filter p (take n xs))) % n as n tends to infinity approaches the probability of p under the distribution modeled. - monad instance? - conditionals? - integrals? - supported base types a in Markov a? 2. Riesz functionals: possibly the most functional representation - representation as linear functionals Riesz a = (a -> R) -> R for a suitable real number type R - embed predicates in the type (a -> R) via characteristic functions characteristic p = \a -> if p a then 1 else 0 Then for a functional phi :: Riesz a and p :: a -> Bool, phi (characteristic p) is the probability of p. - Monad instance is the same as for the continuation monad - conditionals? - integration is trivial: evaluation - arbitrary base types - but how would you even declare a uniform distribution in this type? 3. Density functions - represent a distribution on the reals as density function w.r.t. the Lebesgue measure - limited to reals (and products thereof) - data type possibly holds only parameters of parametric families (gaussian, beta, gamma, ...) - computing conditionals only for matching pairs of prior and sampling function (conjugate priors), analytically - integration analytically - no monad instance because no arbitrary base type 4. Finite distributions - type Dist a = [(a,R)] for suitable real number type R - optimizations use search trees instead of flat lists - basically composition of list and writer monads, where R is regarded as monoid under multiplication - represents a distribution of finite support. If phi :: Dist a, then probability p = sum $ map snd $ filter (p.fst) phi - conditioning possible if base type is of Eq class and R supports division - gets unwieldy without Ord instance of base type to keep the list small 5. Valuations - type Val a = (a -> Bool) -> R for suitable real number type R - models mathematical definition closely - monad bind is integration: requires exact reals R due to limit process - conditioning possible if R supports division
 
            Hi, On 07/03/2017 05:14 PM, Olaf Klinke wrote:
Dear cafe,
following up the announcement of bali-phy and the discussion that ensued, I'd like ask the cafe for an overview of representations of probability distributions as data types. It seems that all packages mentioned to far use some flavour of Markov chain. Why? What else is there? Can someone recommend a survey publication/book/talk?
One other class of inference strategies is Sequential Monte Carlo (aka Particle filters). Unlike MCMC, these methods run multiple points ("particles") in parallel, and the points can have unequal weights. The point cloud is incrementally updated as information is added, so these methods work well for online algorithms where you sequentially add data points. One idea in the probabilistic programming community is to separate the inference strategy from the inference problem, so that one could apply "pluggable" inference strategies. (See https://arxiv.org/abs/1404.0099). There are also "variational" methods, which I think restrict the posterior distribution to a particular analytic form, and use an iterative algorithm to find the parameter values of the restricted distribution that minimize the KL distance to the true posterior distribution. So, this is not a Monte Carlo method.
Explicitly I am interested in: - the data type, as in Haskell's type system - how an element of the data type relates to the distribution it describes - how the monad instance works (if there is one) - how to compute conditionals (if computable) - how to compute integrals (if computable) - what base types are supported (finite types, reals, products ...)
Below is an incomplete, and possibly oversimplified (or even wrong), list of my own. Additions and corrections welcome. -- Olaf
1. Markov chains - uses a random number generator - type is something like Markov a = a -> IO a - repeated execution produces an infinite trace xs :: [a] where for each predicate p :: a -> Bool, the limit of (length (filter p (take n xs))) % n as n tends to infinity approaches the probability of p under the distribution modeled. - monad instance? - conditionals? - integrals? - supported base types a in Markov a?
2. Riesz functionals: possibly the most functional representation - representation as linear functionals Riesz a = (a -> R) -> R for a suitable real number type R - embed predicates in the type (a -> R) via characteristic functions characteristic p = \a -> if p a then 1 else 0 Then for a functional phi :: Riesz a and p :: a -> Bool, phi (characteristic p) is the probability of p. - Monad instance is the same as for the continuation monad - conditionals? - integration is trivial: evaluation - arbitrary base types - but how would you even declare a uniform distribution in this type?
3. Density functions - represent a distribution on the reals as density function w.r.t. the Lebesgue measure - limited to reals (and products thereof) - data type possibly holds only parameters of parametric families (gaussian, beta, gamma, ...) - computing conditionals only for matching pairs of prior and sampling function (conjugate priors), analytically - integration analytically - no monad instance because no arbitrary base type
4. Finite distributions - type Dist a = [(a,R)] for suitable real number type R - optimizations use search trees instead of flat lists - basically composition of list and writer monads, where R is regarded as monoid under multiplication - represents a distribution of finite support. If phi :: Dist a, then probability p = sum $ map snd $ filter (p.fst) phi - conditioning possible if base type is of Eq class and R supports division - gets unwieldy without Ord instance of base type to keep the list small
5. Valuations - type Val a = (a -> Bool) -> R for suitable real number type R - models mathematical definition closely - monad bind is integration: requires exact reals R due to limit process - conditioning possible if R supports division _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
participants (2)
- 
                 Benjamin Redelings Benjamin Redelings
- 
                 Olaf Klinke Olaf Klinke