
--------------------------------------------
-- logfloat 0.13.3
--------------------------------------------

This package provides a type for storing numbers in the log-domain, primarily useful for preventing underflow when multiplying many probabilities as in HMMs and other probabilistic models. The package also provides modules for dealing with floating numbers correctly.

This version drops support for Hugs and GHC < 7.8. Nothing major has changed, so they should still work; it's just that they're no longer officially supported. Thus, this version of the library provides a transitional point between backwards compatibility and adding new features (see below).

--------------------------------------------
-- Changes since 0.12.1 (2010-03-19)
--------------------------------------------

* Monomorphized logFloat, logToLogFloat, fromLogFloat, and logFromLogFloat: that is, they all take/return Double now. The change was made to help reduce the need for explicit type signatures. It shouldn't really affect most users, since it seems no one was really making use of the polymorphism provided by previous versions. To get the previous behavior back, just explicitly add calls to realToFrac wherever necessary.

* Fixed some instances to get them to compile under the new role-based type system of GHC 7.10.

* Cleaned up various extraneous rewrite rules, specializations, etc.

* Added the functions sum, product, and pow. Both sum and product preserve more precision than the fold-based definitions in the Prelude. Moreover, sum is _much_ faster than the Prelude version, since it only requires crossing the log/exp boundary n+1 times, instead of 2*(n-1) times. The only downside is that sum requires two passes over the input and thus is not amenable to list fusion.
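To make the log-domain idea concrete, here is a minimal sketch in plain Haskell using only the Prelude. It is not the logfloat implementation: logProduct and logSumExp are hypothetical helpers invented for illustration, and values are assumed to be stored as natural logarithms in ordinary Doubles. It shows why a long product of probabilities survives as a sum of logarithms, and how a two-pass sum can manage with n calls to exp and one call to log.

```haskell
-- Illustrative sketch only; these helpers are hypothetical and are not
-- part of the logfloat API. Values are stored as natural logarithms.

-- | Product of log-domain values: multiplying probabilities becomes
-- adding their logarithms, so a tiny result stays representable.
logProduct :: [Double] -> Double
logProduct = sum

-- | Sum of log-domain values, returned as a logarithm.
-- Pass 1 finds the largest input; pass 2 calls exp once per element
-- and log once overall (n + 1 crossings of the log/exp boundary).
logSumExp :: [Double] -> Double
logSumExp [] = -(1 / 0)                 -- log 0, the empty sum
logSumExp ls
  | isInfinite m = m                    -- all inputs are log 0, or one is +Infinity
  | otherwise    = m + log (sum [exp (l - m) | l <- ls])
  where
    m = maximum ls
```

Loaded into GHCi, product (replicate 1000 1e-5) underflows to 0 in plain Double arithmetic, while logProduct (replicate 1000 (log 1e-5)) stays finite at about -11512.9. Subtracting the maximum before calling exp is what keeps logSumExp from underflowing on inputs such as [-1000, -1000].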

--------------------------------------------
-- Upcoming changes (0.14+)
--------------------------------------------

* Since the Data.Number.RealToFrac module is no longer required by any of the others, it will probably be forked off to a separate package in order to improve portability of the rest of the package by removing the need for MPTCs.

* There's long been clamoring for adding a vector:Data.Vector.Unboxed.Unbox instance. I've been reluctant to add such an instance due to wanting to retain backwards compatibility and portability. Having dropped support for Hugs and older versions of GHC, I'm now willing to add them in.

The logfloat library is conceptually quite simple, and thus to whatever extent possible I'd still like to retain portability to non-GHC compilers. So if you are interested in using logfloat with another compiler/interpreter but run into problems (e.g., due to the type families required by the vector library), please get in touch and I'll try to get things to work.

--------------------------------------------
-- Compatibility / Portability
--------------------------------------------

The package is compatible with GHC 7.8.3 and 7.10.1. It may still compile with older versions of GHC (or even Hugs!), however they are no longer officially supported.

The package is not compatible with nhc98 and Yhc because Data.Number.RealToFrac uses MPTCs. However, that module is no longer required by any others, and all the other modules should be compatible with these compilers. Thus, it should be fairly easy to port. If you do so, please let me know and I'll try to incorporate support for them.

--------------------------------------------
-- Links
--------------------------------------------

Homepage: http://code.haskell.org/~wren/
Hackage: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/logfloat
Darcs: http://code.haskell.org/~wren/logfloat/
Haddock (Darcs version): http://code.haskell.org/~wren/logfloat/dist/doc/html/logfloat/

--
Live well,
~wren

On Sun, 29 Mar 2015, wren romano wrote:
> --------------------------------------------
> -- logfloat 0.13.3
> --------------------------------------------
>
> This package provides a type for storing numbers in the log-domain, primarily useful for preventing underflow when multiplying many probabilities as in HMMs and other probabilistic models. The package also provides modules for dealing with floating numbers correctly.
I am currently working on
   http://hub.darcs.net/thielema/hmm-hmatrix

It does not need log-numbers because it normalizes all temporary results. This way I can use fast hmatrix operations. Would normalization also be a solution for other probabilistic models?
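For readers unfamiliar with the normalization approach, the following is a minimal sketch of a scaled forward pass. It is not the hmm-hmatrix code: plain lists stand in for hmatrix vectors and matrices, and normalize, propagate, and scaledLogLikelihood are hypothetical names chosen for illustration. The idea is that the state vector is rescaled to sum to 1 after every observation, so it never underflows, and the discarded scale factors are accumulated as logarithms.

```haskell
import Data.List (foldl', transpose)

-- Illustrative sketch only: plain lists stand in for hmatrix vectors and
-- matrices, and every name here is hypothetical (not the hmm-hmatrix API).

-- | Rescale a vector to sum to 1, returning the scaling factor as well.
-- Assumes the sum is non-zero; otherwise the result contains NaN.
normalize :: [Double] -> (Double, [Double])
normalize xs = (c, map (/ c) xs)
  where c = sum xs

-- | Push a state distribution through the transition matrix, where
-- trans !! i !! j is the probability of moving from state i to state j.
propagate :: [[Double]] -> [Double] -> [Double]
propagate trans alpha = [sum (zipWith (*) alpha col) | col <- transpose trans]

-- | Log-likelihood of an observation sequence via the scaled forward
-- pass. Each element of the third argument holds, for one time step,
-- the probability of the observed symbol in every state.
scaledLogLikelihood :: [Double] -> [[Double]] -> [[Double]] -> Double
scaledLogLikelihood _ _ [] = 0
scaledLogLikelihood initial trans (e0 : es) = fst (foldl' step (log c0, alpha0) es)
  where
    (c0, alpha0) = normalize (zipWith (*) initial e0)
    -- Only the normalized vector is carried forward; the discarded
    -- scale factors are accumulated as logarithms.
    step (ll, alpha) e =
      let (c, alpha') = normalize (zipWith (*) (propagate trans alpha) e)
      in (ll + log c, alpha')
```

With two states, initial = [0.5, 0.5], trans = [[0.9, 0.1], [0.2, 0.8]], and per-step emission probabilities [[0.5, 0.1], [0.4, 0.3]], the unscaled forward pass gives a likelihood of 0.1135, and the scaled version returns its logarithm.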

On Mon, Mar 30, 2015 at 4:38 AM, Henning Thielemann wrote:
> I am currently working on http://hub.darcs.net/thielema/hmm-hmatrix
>
> It does not need log-numbers because it normalizes all temporary results. This way I can use fast hmatrix operations. Would normalization also be a solution for other probabilistic models?
For many models, normalization isn't computationally feasible.

Even for HMMs, when the tag/state space is large, I shudder to think of the overhead. The best cost I can imagine for normalizeFactor is O(log S); thus, increasing the S-dependent factor of the forward/backward algorithm's complexity from O(S^2) to O(S^2*log S). And that's at best; a naive implementation would cost O(S), making the forward/backward algorithm cubic in the size of the tag/state space! For small models that may be allowable, but for the sorts of models I work with that's an unacceptable slowdown.

Even with normalization, your code would benefit from using logfloat (or some of the tricks included therein). E.g., your implementation of Math.HiddenMarkovModel.Normalized.logLikelihood will introduce a lot of unnecessary error, due to iterated use of binary (+) on Floating values. (The NC.sumElements function used in normalizeFactor may suffer the same implementation problem; but it's unclear.)

The 'product' function for LogFloats performs Kahan summation in order to reduce the loss of precision due to repeated addition. (And in future versions of the library, I'm working on alternative implementations which sacrifice the single-pass property of Kahan summation in order to *completely* eliminate error of summations.)

For this particular trick, you could also use Edward Kmett's "compensated" library.

--
Live well,
~wren
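As a rough illustration of the Kahan summation mentioned above, here is a generic textbook sketch in Haskell. It is not taken from logfloat or from the "compensated" library, and kahanSum is a hypothetical name. Alongside the running total it keeps a compensation term that records the low-order bits lost by each addition and feeds them back into the next one.

```haskell
import Data.List (foldl')

-- Illustrative sketch only: a generic Kahan (compensated) summation,
-- not the code used by logfloat. The name kahanSum is hypothetical.
kahanSum :: [Double] -> Double
kahanSum = fst . foldl' step (0, 0)
  where
    -- total: running sum; comp: rounding error carried from earlier adds
    step (total, comp) x =
      let y      = x - comp             -- correct the next term first
          total' = total + y            -- low-order bits of y may be lost here
          comp'  = (total' - total) - y -- recover what was lost
      in total' `seq` comp' `seq` (total', comp')
```

Summing 1 followed by 100000 copies of 1e-16 with foldl' (+) 0 returns exactly 1, because each small term is rounded away, whereas kahanSum recovers a result close to 1 + 1e-11. Since a product of log-domain values is a sum of their logarithms, the same routine applies directly to the stored logs.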

On Mon, 6 Apr 2015, wren romano wrote:
> For many models, normalization isn't computationally feasible.
>
> Even for HMMs, when the tag/state space is large, I shudder to think of the overhead. The best cost I can imagine for normalizeFactor is O(log S); thus, increasing the S-dependent factor of the forward/backward algorithm's complexity from O(S^2) to O(S^2*log S). And that's at best; a naive implementation would cost O(S), making the forward/backward algorithm cubic in the size of the tag/state space!
I apply normalization with complexity ~ S after a matrix multiplication with complexity ~ S^2, thus overall complexity remains quadratic per list element.
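The cost argument in this reply can be written out as a short count. The breakdown below is an assumed cost model, not a measurement: for S states, one observation costs a matrix-vector product, one pass to sum the entries, and one pass to divide each entry by that sum.

```latex
% Assumed per-observation cost model with S states.
\[
  T_{\text{step}}(S)
  \;=\; \underbrace{S^2}_{\text{matrix-vector product}}
  \;+\; \underbrace{S}_{\text{sum of entries}}
  \;+\; \underbrace{S}_{\text{rescale}}
  \;=\; O(S^2)
\]
```

Under this model the normalization is applied once per observation rather than once per matrix entry, so it adds only a lower-order term and the per-element cost stays quadratic.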