I'm pleased to announce a new Haskell statistics library, imaginatively named statistics: http://hackage.haskell.org/package/statistics
It provides:
- Support for common discrete and continuous probability
distributions (binomial, gamma, exponential, geometric, hypergeometric,
normal, and Poisson)
- Kernel density estimation
- Autocorrelation analysis
- Functions over sample data
- Quantile estimation
- Resampling techniques: jackknife and bootstrap estimation
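To give a flavour of the resampling item, here is a minimal sketch of a jackknife standard-error estimate written with plain lists. It does not use the statistics package's API; the names mean, leaveOneOut and jackknifeStdErr are hypothetical and chosen only for illustration.

```haskell
-- Illustrative sketch only: plain lists, not the statistics package's API.
-- All function names here are hypothetical.

-- | Arithmetic mean of a non-empty list.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- | Every subsample obtained by dropping exactly one element.
leaveOneOut :: [a] -> [[a]]
leaveOneOut xs = [ take i xs ++ drop (i + 1) xs | i <- [0 .. length xs - 1] ]

-- | Jackknife estimate of the standard error of an estimator:
--   sqrt ((n - 1) / n * sum over i of (theta_i - thetaBar)^2),
--   where theta_i is the estimator applied to the i-th leave-one-out sample.
jackknifeStdErr :: ([Double] -> Double) -> [Double] -> Double
jackknifeStdErr est xs =
    sqrt ((n - 1) / n * sum [ (t - tBar) ^ (2 :: Int) | t <- thetas ])
  where
    n      = fromIntegral (length xs)
    thetas = map est (leaveOneOut xs)
    tBar   = mean thetas
```

Loaded into GHCi, jackknifeStdErr mean [1, 2, 3, 4, 5] should agree with the usual standard error of the mean for that sample, up to floating-point rounding. A list-based version like this allocates freely, so treat it as a description of the technique rather than a fast implementation.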
The statistics library certainly isn't yet comprehensive, but it has
some features that I think make it very attractive as a base for
further work:
- It's very fast, building on some of the fantastic software that's
available on Hackage these days. I make heavy use of Don Stewart's
uvector library (itself a port of Roman Leshchinskiy's vector library),
which means that many functions allocate no memory and execute tight
loops using only machine registers. I use Dan Doel's uvector-algorithms
library to perform fast partial sorts. I also use Don's mersenne-random
library for fast random number generation when doing bootstrap analysis.
- I've put a fair amount of effort into finding and using algorithms
that are numerically stable (trying to avoid problems like catastrophic
cancellation). Whenever possible, I indicate which methods are used in
the documentation. (For more information on numerical stability, see
David Goldberg's "What Every Computer Scientist Should Know About
Floating-Point Arithmetic".)
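As an illustration of the catastrophic cancellation mentioned above, the sketch below contrasts a naive one-pass sample variance with Welford's single-pass update. This shows the general technique, not necessarily the method the library uses; the names naiveVariance and welfordVariance are hypothetical.

```haskell
-- Illustrative sketch only; not taken from the statistics package.
import Data.List (foldl')

-- | Naive sample variance from the sum and the sum of squares.
--   When the mean is large relative to the spread, the subtraction
--   cancels almost all significant digits.
naiveVariance :: [Double] -> Double
naiveVariance xs = (sumSq - s * s / n) / (n - 1)
  where
    n     = fromIntegral (length xs)
    s     = sum xs
    sumSq = sum (map (\x -> x * x) xs)

-- | Welford's algorithm: keep a running mean and a running sum of
--   squared deviations, so no two huge, nearly equal numbers are
--   ever subtracted. Returns 0 for fewer than two samples.
welfordVariance :: [Double] -> Double
welfordVariance xs
  | n < 2     = 0
  | otherwise = m2 / fromIntegral (n - 1)
  where
    (n, _, m2) = foldl' step (0 :: Int, 0, 0) xs
    step (k, mu, acc) x =
      let k'   = k + 1
          d    = x - mu
          mu'  = mu + d / fromIntegral k'
          acc' = acc + d * (x - mu')
      in k' `seq` mu' `seq` acc' `seq` (k', mu', acc')
```

For a sample such as [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16], whose true sample variance is 30, welfordVariance recovers the answer, while naiveVariance can be visibly off because the sum of squares is far too large to hold the low-order digits.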
If you want to contribute, please get the source code and hack away:
darcs get http://darcs.serpentine.com/statistics
For more details, see http://bit.ly/ykOeK