Testing for statistical properties

older
telling ghc to run several jobs in...

Gregory Crosswhite

7 Jan 2010 7 Jan '10

9:31 p.m.

Hey everyone! I have some computations that satisfy statistical properties which I would like to test --- that is, the result of the computation is non-deterministic, but I want to check that it is sampling the distribution that it should be sampling. Is anyone aware of a Haskell library out there that does anything like this? Cheers, Greg

Show replies by date

Tom Nielsen

8 Jan 8 Jan

10:59 a.m.

Hi Greg, Assuming this is a one-dimensional distribtution, you should use a kolmogorov-smirnov test to test this: http://en.wikipedia.org/wiki/Kolmogorov-Smirnov_test I've implemented to the KS distribution from the CERN code linked in the wikipedia article, here: http://github.com/glutamate/samfun/blob/master/Math/Probably/KS.hs (warning, i wasn't able to verify the numbers coming out against anything so just check that it makes sense) So all you have to do is to find the maximal distance between your samples and the cumulative density function, multiply by the sqrt. of of the number of samples, and calculate kprob on that. I don't think you can do this in a Bayesian way because you can't enumerate all the other distributions your samples could come from? Tom On Thu, Jan 7, 2010 at 9:31 PM, Gregory Crosswhite wrote:

...

Hey everyone! I have some computations that satisfy statistical properties which I would like to test --- that is, the result of the computation is non-deterministic, but I want to check that it is sampling the distribution that it should be sampling. Is anyone aware of a Haskell library out there that does anything like this?

Cheers, Greg

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Gregory Crosswhite

7:47 p.m.

Thanks! I had reached the same conclusion, so I am glad to see that you already wrote code to do this for me. :-) There is a bug in the version that you posted, though: you missed one of the terms in the u < 0.755 case, so the c2 constant goes completely unused. Here is my modified version of your function + bug fix, with some stylistic tweaks: computeKolmogorovProbability :: Double -> Double computeKolmogorovProbability z | u < 0.2 = 1 | u < 0.755 = 1 - w * (exp(c1/v)+exp(c2/v)+exp(c3/v))/u | u < 6.8116 = 2 * sum [ sign * exp(coef*v) | (sign,coef) <- take (1 `max` round (3/u)) coefs ] | otherwise = 0 where u = abs z v = u*u w = 2.50662827 c1 = -pi**2/8 c2 = 9*c1 c3 = 25*c1 coefs = [(1,-2),(-1,-8),(1,-18),(-1,-32)] Cheers, Greg On Jan 8, 2010, at 2:59 AM, Tom Nielsen wrote:

...

Hi Greg,

Assuming this is a one-dimensional distribtution, you should use a kolmogorov-smirnov test to test this:

http://en.wikipedia.org/wiki/Kolmogorov-Smirnov_test

I've implemented to the KS distribution from the CERN code linked in the wikipedia article, here:

http://github.com/glutamate/samfun/blob/master/Math/Probably/KS.hs

(warning, i wasn't able to verify the numbers coming out against anything so just check that it makes sense)

So all you have to do is to find the maximal distance between your samples and the cumulative density function, multiply by the sqrt. of of the number of samples, and calculate kprob on that.

I don't think you can do this in a Bayesian way because you can't enumerate all the other distributions your samples could come from?

Tom

On Thu, Jan 7, 2010 at 9:31 PM, Gregory Crosswhite wrote:

...
Hey everyone! I have some computations that satisfy statistical properties which I would like to test --- that is, the result of the computation is non-deterministic, but I want to check that it is sampling the distribution that it should be sampling. Is anyone aware of a Haskell library out there that does anything like this?

Cheers, Greg

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

5667

Age (days ago)

5668

Last active (days ago)

List overview

Download

2 comments

2 participants

participants (2)

Gregory Crosswhite
Tom Nielsen