Testing for statistical properties

Hey everyone! I have some computations that satisfy statistical properties which I would like to test --- that is, the result of the computation is non-deterministic, but I want to check that it is sampling the distribution that it should be sampling. Is anyone aware of a Haskell library out there that does anything like this? Cheers, Greg

Hi Greg,
Assuming this is a one-dimensional distribtution, you should use a
kolmogorov-smirnov test to test this:
http://en.wikipedia.org/wiki/Kolmogorov-Smirnov_test
I've implemented to the KS distribution from the CERN code linked in
the wikipedia article, here:
http://github.com/glutamate/samfun/blob/master/Math/Probably/KS.hs
(warning, i wasn't able to verify the numbers coming out against
anything so just check that it makes sense)
So all you have to do is to find the maximal distance between your
samples and the cumulative density function, multiply by the sqrt. of
of the number of samples, and calculate kprob on that.
I don't think you can do this in a Bayesian way because you can't
enumerate all the other distributions your samples could come from?
Tom
On Thu, Jan 7, 2010 at 9:31 PM, Gregory Crosswhite
Hey everyone! I have some computations that satisfy statistical properties which I would like to test --- that is, the result of the computation is non-deterministic, but I want to check that it is sampling the distribution that it should be sampling. Is anyone aware of a Haskell library out there that does anything like this?
Cheers, Greg
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Thanks! I had reached the same conclusion, so I am glad to see that you already wrote code to do this for me. :-) There is a bug in the version that you posted, though: you missed one of the terms in the u < 0.755 case, so the c2 constant goes completely unused. Here is my modified version of your function + bug fix, with some stylistic tweaks: computeKolmogorovProbability :: Double -> Double computeKolmogorovProbability z | u < 0.2 = 1 | u < 0.755 = 1 - w * (exp(c1/v)+exp(c2/v)+exp(c3/v))/u | u < 6.8116 = 2 * sum [ sign * exp(coef*v) | (sign,coef) <- take (1 `max` round (3/u)) coefs ] | otherwise = 0 where u = abs z v = u*u w = 2.50662827 c1 = -pi**2/8 c2 = 9*c1 c3 = 25*c1 coefs = [(1,-2),(-1,-8),(1,-18),(-1,-32)] Cheers, Greg On Jan 8, 2010, at 2:59 AM, Tom Nielsen wrote:
Hi Greg,
Assuming this is a one-dimensional distribtution, you should use a kolmogorov-smirnov test to test this:
http://en.wikipedia.org/wiki/Kolmogorov-Smirnov_test
I've implemented to the KS distribution from the CERN code linked in the wikipedia article, here:
http://github.com/glutamate/samfun/blob/master/Math/Probably/KS.hs
(warning, i wasn't able to verify the numbers coming out against anything so just check that it makes sense)
So all you have to do is to find the maximal distance between your samples and the cumulative density function, multiply by the sqrt. of of the number of samples, and calculate kprob on that.
I don't think you can do this in a Bayesian way because you can't enumerate all the other distributions your samples could come from?
Tom
On Thu, Jan 7, 2010 at 9:31 PM, Gregory Crosswhite
wrote: Hey everyone! I have some computations that satisfy statistical properties which I would like to test --- that is, the result of the computation is non-deterministic, but I want to check that it is sampling the distribution that it should be sampling. Is anyone aware of a Haskell library out there that does anything like this?
Cheers, Greg
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
participants (2)
-
Gregory Crosswhite
-
Tom Nielsen