truncate results depend on strict/lazy

Deep in a WAI web app, I have a function that converts a String from a web form like "0.12" to an Int 12. Converting "0.12" to the Float 0.12 is working. However, converting the Float 0.12 to the Int 12 does not work as expected unless I use the trace function. In the following, f = 0.12::Float, gotten from a function that parses "0.12" into 0.12. In the following expression, the result is: Success (Just 11).
Success $ Just $ truncate (f * 100)
In the following expression, the result is: Success (Just 12)
let expanded = f * 100
    ans = truncate expanded
in trace (show expanded) $ Success $ Just $ ans
That made me think that "f * 100" had to be strictly evaluated before being given to truncate for some reason, so I tried using seq to get the same effect, but that didn't work. Am I correct in assuming that laziness has something to do with this problem?
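Before chasing laziness, it is worth seeing what value truncate actually receives. A minimal sketch (not from the original post, and not using the app's validateVal) showing that the Float nearest to 0.12 is slightly *below* 0.12, so `f * 100` starts life as "almost 12" rather than 12:

```haskell
main :: IO ()
main = do
  -- The exact rational value stored in the Float literal 0.12:
  print (toRational (0.12 :: Float))           -- 16106127 % 134217728
  -- ...and it is a hair less than the true decimal 0.12:
  print (toRational (0.12 :: Float) < 12 / 100) -- True
```

Whether the product then truncates to 11 or rounds up to 12 depends on the precision at which the multiplication is carried out, which is where the thread eventually lands.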

On Mon, Sep 9, 2013 at 1:26 PM, Bryan Vicknair wrote:
Deep in a WAI web app, I have a function that converts a String from a web form like "0.12" to an Int 12. Converting "0.12" to the Float 0.12 is working. However, converting the Float 0.12 to the Int 12 does not work as expected unless I use the trace function.
Not answering your question but can't you read . drop 1 . dropWhile (/= '.') $ "0.12" ?
In the following, f = 0.12::Float, gotten from a function that parses "0.12" into 0.12.
In the following expression, the result is: Success (Just 11).
Success $ Just $ truncate (f * 100)
In the following expression, the result is: Success (Just 12)
let expanded = f * 100
    ans = truncate expanded
in trace (show expanded) $ Success $ Just $ ans
That made me think that "f * 100" had to be strictly evaluated before being given to truncate for some reason, so I tried using seq to get the same effect, but that didn't work. Am I correct in assuming that laziness has something to do with this problem?
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners
-- MM "All we have to decide is what we do with the time that is given to us"

On Mon, Sep 09, 2013 at 01:34:24PM -0400, Mihai Maruseac wrote:
On Mon, Sep 9, 2013 at 1:26 PM, Bryan Vicknair wrote:
Deep in a WAI web app, I have a function that converts a String from a web form like "0.12" to an Int 12. Converting "0.12" to the Float 0.12 is working. However, converting the Float 0.12 to the Int 12 does not work as expected unless I use the trace function.
Not answering your question but can't you read . drop 1 . dropWhile (/= '.') $ "0.12" ?
That would work for the specific input of "0.xyz", but in general, the input string may encode a float with an arbitrary number of digits on both sides of the decimal. In reality, the range is probably from 0.10 to 1.99, so I considered writing a parser for just that range, and I may still, but this behavior is so surprising to me that I feel like I need to understand it.

On Mon, Sep 9, 2013 at 1:59 PM, Bryan Vicknair wrote:
That would work for the specific input of "0.xyz", but in general, the input string may encode a float with an arbitrary number of digits on both sides of the decimal.
So, you'd need to convert 10.34 to 1034? I'd start with span (/= '.') "10.32" and go from there :) (span is in the Prelude, re-exported from Data.List)
-- MM "All we have to decide is what we do with the time that is given to us"
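The span suggestion can be fleshed out into a complete parser that never touches Float at all. A sketch under my own naming (centsFromString is not in the thread's Lib.hs); it insists on exactly two fractional digits, matching the cents use case:

```haskell
import Data.Char (isDigit)

-- Convert "10.34" to 1034 using only Integer arithmetic:
-- split at the '.', check both halves are digits, then scale and add.
centsFromString :: String -> Maybe Int
centsFromString s =
  case span (/= '.') s of
    (whole@(_:_), '.':frac)
      | length frac == 2
      , all isDigit whole
      , all isDigit frac
      -> Just (read whole * 100 + read frac)
    _ -> Nothing

main :: IO ()
main = do
  print (centsFromString "10.34")  -- Just 1034
  print (centsFromString "0.12")   -- Just 12
  print (centsFromString "bogus")  -- Nothing
```

Because no floating-point value is ever constructed, the result cannot vary with optimization level or FPU behavior.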

On Mon, Sep 9, 2013 at 7:26 PM, Bryan Vicknair wrote:
Deep in a WAI web app, I have a function that converts a String from a web form like "0.12" to an Int 12. Converting "0.12" to the Float 0.12 is working. However, converting the Float 0.12 to the Int 12 does not work as expected unless I use the trace function.
In the following, f = 0.12::Float, gotten from a function that parses "0.12" into 0.12.
In the following expression, the result is: Success (Just 11).
Success $ Just $ truncate (f * 100)
In the following expression, the result is: Success (Just 12)
let expanded = f * 100
    ans = truncate expanded
in trace (show expanded) $ Success $ Just $ ans
That made me think that "f * 100" had to be strictly evaluated before being given to truncate for some reason, so I tried using seq to get the same effect, but that didn't work. Am I correct in assuming that laziness has something to do with this problem?
It would be a serious bug if that was true, since laziness shouldn't change the semantics of a program, except sometimes by allowing the program to terminate where the strict version wouldn't. On the other hand, I can't reproduce your bug; could you provide more details (including GHC version and parsing code)?
-- Jedaï

On Mon, Sep 09, 2013 at 07:59:28PM +0200, Chaddaï Fouché wrote:
In the following expression, the result is: Success (Just 11).
Success $ Just $ truncate (f * 100)
In the following expression, the result is: Success (Just 12)
let expanded = f * 100
    ans = truncate expanded
in trace (show expanded) $ Success $ Just $ ans
That made me think that "f * 100" had to be strictly evaluated before being given to truncate for some reason, so I tried using seq to get the same effect, but that didn't work. Am I correct in assuming that laziness has something to do with this problem?
It would be a serious bug if that was true, since laziness shouldn't change the semantics of a program, except sometimes by allowing the program to terminate where the strict version wouldn't.
On the other hand, I can't reproduce your bug; could you provide more details (including GHC version and parsing code)?
-- Jedaï
I put together a simple library and web app to demonstrate the behavior I'm seeing:

git clone git@bitbucket.org:bryanvick/truncate.git

The README has simple instructions to view the behavior in a repl or in a web app. The Lib.hs file is where the parsing code is.

I swear I saw different behaviour between separate cabal sandboxes while I was testing this. Sometimes the parsing would work as expected, sometimes it wouldn't. That made me think that maybe different versions of dependencies are being installed for different runs. I'll start paying attention to "cabal sandbox hc-pkg" to see if this is the case.

On Mon, Sep 9, 2013 at 7:26 PM, Bryan Vicknair wrote:
Deep in a WAI web app, I have a function that converts a String from a web form like "0.12" to an Int 12. Converting "0.12" to the Float 0.12 is working. However, converting the Float 0.12 to the Int 12 does not work as expected unless I use the trace function.
On Mon, Sep 9, 2013, Bryan Vicknair wrote:
I put together a simple library and web app to demonstrate the behavior I'm seeing: git clone git@bitbucket.org:bryanvick/truncate.git The README has simple instructions to view the behavior in a repl or in a web app. The Lib.hs file is where the parsing code is.
I swear I saw different behaviour between separate cabal sandboxes while I was testing this. Sometimes the parsing would work as expected, sometimes it wouldn't. That made me think that maybe different versions of dependencies are being installed for different runs. I'll start paying attention to "cabal sandbox hc-pkg" to see if this is the case.
I just witnessed the parsing code in question giving different results in different invocations of a repl in the same cabal sandbox. I started a cabal sandbox and installed dependencies:
cabal sandbox init
cabal install --only-dependencies
... <bunch of compiling>
I started a repl...
cabal repl
Package has never been configured. Configuring with default flags. If this fails, please run configure manually.
Resolving dependencies...
Configuring app-0...
Preprocessing library app-0...
GHCi, version 7.4.1: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package bytestring-0.9.2.1 ... linking ... done.
Loading package array-0.4.0.0 ... linking ... done.
Loading package deepseq-1.3.0.0 ... linking ... done.
Loading package containers-0.4.2.1 ... linking ... done.
Loading package transformers-0.3.0.0 ... linking ... done.
Loading package mtl-2.1.2 ... linking ... done.
Loading package text-0.11.3.1 ... linking ... done.
Loading package digestive-functors-0.6.1.0 ... linking ... done.
[1 of 1] Compiling Lib              ( Lib.hs, interpreted )
Ok, modules loaded: Lib.
*Lib> import Data.Text (pack)
And this time, the parsing works!

*Lib Data.Text> validateVal $ pack "0.12"
Success (Just 12)
*Lib Data.Text> :q
Leaving GHCi.

I did a dump of the libraries installed to compare to later sandboxes that don't work:
cabal sandbox hc-pkg list > /tmp/working-deps
I ran the web app:
cabal run
Building app-0...
Preprocessing library app-0...
[1 of 1] Compiling Lib              ( Lib.hs, dist/build/Lib.o )
In-place registering app-0...
Preprocessing executable 'app' for app-0...
[1 of 2] Compiling Lib              ( Lib.hs, dist/build/app/app-tmp/Lib.o )
[2 of 2] Compiling Main             ( app.hs, dist/build/app/app-tmp/Main.o )
app.hs:67:21: Warning: Defined but not used: `name'
app.hs:67:37: Warning: Defined but not used: `val'
Linking dist/build/app/app ...
running at port 8005
Thing {name = "name", val = Just 11}

The parsing didn't work there, as usual. So I went back into the repl:
cabal repl
Preprocessing library app-0...
GHCi, version 7.4.1: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package bytestring-0.9.2.1 ... linking ... done.
Loading package array-0.4.0.0 ... linking ... done.
Loading package deepseq-1.3.0.0 ... linking ... done.
Loading package containers-0.4.2.1 ... linking ... done.
Loading package transformers-0.3.0.0 ... linking ... done.
Loading package mtl-2.1.2 ... linking ... done.
Loading package text-0.11.3.1 ... linking ... done.
Loading package digestive-functors-0.6.1.0 ... linking ... done.
Ok, modules loaded: Lib.
Prelude Lib> import Data.Text (pack)
And all of a sudden, the parsing code doesn't work again!

Prelude Data.Text Lib> validateVal $ pack "0.12"
Success (Just 11)

I'm losing my mind!

On Tue, Sep 10, 2013 at 5:08 AM, Bryan Vicknair wrote:
And all of a sudden, the parsing code doesn't work again!:
Prelude Data.Text Lib> validateVal $ pack "0.12"
Success (Just 11)
This might be due to a floating-point roundoff error since 0.12 doesn't have a finite binary representation. The function truncate does exactly what it says, so truncate (100 * 0.1199999) evaluates to 11. Are you sure you don't want "round" instead of "truncate"? -- Kim-Ee
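The round-versus-truncate distinction is the crux of this suggestion. A small sketch (my own numbers, not the app's) of how the two behave on a value that is a hair under 12:

```haskell
main :: IO ()
main = do
  -- truncate drops the fractional part, moving toward zero.
  print (truncate (11.9999 :: Double) :: Int)  -- 11
  -- round goes to the nearest integer.
  print (round (11.9999 :: Double) :: Int)     -- 12
```

So if `f * 100` ever materializes as 11.9999..., truncate produces 11 while round still produces the intended 12.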

On Tue, Sep 10, 2013 at 05:26:08AM +0700, Kim-Ee Yeoh wrote:
On Tue, Sep 10, 2013 at 5:08 AM, Bryan Vicknair wrote:
And all of a sudden, the parsing code doesn't work again!
Prelude Data.Text Lib> validateVal $ pack "0.12"
Success (Just 11)
This might be due to a floating-point roundoff error since 0.12 doesn't have a finite binary representation.
The function truncate does exactly what it says, so truncate (100 * 0.1199999) evaluates to 11.
Are you sure you don't want "round" instead of "truncate"?
-- Kim-Ee
My first thought, as always with Floats, is that there was a binary representation problem. If it were just that I wouldn't mind, but it is the inconsistent evaluation of the following expression that is really bothering me:
truncate ((0.12::Float) * (100::Float))
let f = 0.12::Float
truncate (f * 100)
In GHCi, I get 12:

truncate ((0.12::Float) * (100::Float))

But in my web app and in this project it gives me 11: https://bitbucket.org/bryanvick/truncate/overview

Whatever the behavior of truncate is, given the same input it should give the same output. In the project referenced above, the input to validateVal and unsafeValidateVal is always "0.12", but the output is 11 for the first and 12 for the second. The only difference between the two functions is that the unsafe version performs IO.

On Mon, Sep 9, 2013 at 8:03 PM, Bryan Vicknair wrote:
Whatever the behavior of truncate is, given the same input it should give the same output.
An ideal, but unlikely when floating point is involved; optimization can result in altered evaluation order or removal (or even addition in some cases) of load/stores which can modify the internal representation. Note that ghci is completely unoptimized. -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On Mon, Sep 09, 2013 at 08:28:48PM -0400, Brandon Allbery wrote:
On Mon, Sep 9, 2013 at 8:03 PM, Bryan Vicknair wrote:
Whatever the behavior of truncate is, given the same input it should give the same output.
An ideal, but unlikely when floating point is involved; optimization can result in altered evaluation order or removal (or even addition in some cases) of load/stores which can modify the internal representation. Note that ghci is completely unoptimized.
Thanks everyone for the help. I think I'll write my own parsing code instead of using reads and truncate. I really don't like floats. The whole reason this parsing code exists is so that the DB can store a simple integer instead of a float, but I still need to show the value as a float to the users.

This is scary though. This is the first leak I've found in the referential transparency abstraction in Haskell (besides unsafePerformIO). And the problem is, if I don't have referential transparency for 100% of pure functions, then I have to be suspicious of every one. I can use some heuristics to narrow down the suspects, such as questioning floating point functions first, but I just had my bubble burst nonetheless.

Is there any resource that compiles all the potential exceptions to the Haskell abstract machine?

Whenever I have to deal with precision issues I try to remove floats and doubles from the equation entirely.
> toRational (0.12 :: Float)
16106127 % 134217728
> toRational (0.12 :: Double)
1080863910568919 % 9007199254740992
> truncate $ (toRational (0.12 :: Float) * 100)
11
> truncate $ (toRational (0.12 :: Double) * 100)
11

Wrong, but at least it is consistently wrong and should not be interpreted differently regardless of optimizations.
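The same Rational route can be packaged as a function. A hedged sketch (the name centsViaRational is mine, not from the thread): convert the Float to its exact Rational value first, then scale and truncate in exact arithmetic, so the answer no longer depends on FPU precision or optimization level.

```haskell
-- Exact scaling: toRational recovers the precise value the Float holds,
-- and Rational arithmetic has no rounding, so truncate sees the true product.
centsViaRational :: Float -> Int
centsViaRational f = truncate (toRational f * 100)

main :: IO ()
main = do
  print (centsViaRational 0.12)  -- 11: the Float really holds slightly less than 0.12
```

As noted above, 11 is "wrong" relative to the decimal string, but it is deterministically wrong, because the inaccuracy happened once, at the string-to-Float conversion, not in the arithmetic.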
On Tue, Sep 10, 2013 at 1:14 PM, Bryan Vicknair wrote:
On Mon, Sep 09, 2013 at 08:28:48PM -0400, Brandon Allbery wrote:
On Mon, Sep 9, 2013 at 8:03 PM, Bryan Vicknair wrote:
Whatever the behavior of truncate is, given the same input it should give the same output.
An ideal, but unlikely when floating point is involved; optimization can result in altered evaluation order or removal (or even addition in some cases) of load/stores which can modify the internal representation. Note that ghci is completely unoptimized.
Thanks everyone for the help. I think I'll write my own parsing code instead of using reads and truncate. I really don't like floats. The whole reason this parsing code exists is so that the DB can store a simple integer instead of a float, but I still need to show the value as a float to the users.
This is scary though. This is the first leak I've found in the referential transparency abstraction in Haskell (besides unsafePerformIO). And the problem is, if I don't have referential transparency for 100% of pure functions, then I have to be suspicious of every one. I can use some heuristics to narrow down the suspects, such as questioning floating point functions first, but I just had my bubble burst nonetheless.
Is there any resource that compiles all the potential exceptions to the Haskell abstract machine?

On Tue, Sep 10, 2013 at 1:14 PM, Bryan Vicknair wrote:
This is scary though. This is the first leak I've found in the referential transparency abstraction in Haskell (besides unsafePerformIO). And the problem is, if I don't have referential transparency for 100% of pure functions, then I have to be suspicious of every one.
It's not quite *that* bad. Floating point is a well known source of confusion, especially to beginners; there are "why did it do this why is your language broken waaa" posts from people discovering floating point in every single programming language that supports them.

You *cannot* 100% accurately store numbers that can have infinite precision in a limited address space. And the representation has its own issues, since computers work best in powers of 2 but we work in powers of 10; numbers that are finite in a decimal representation may be repeating infinite values in a normalized base 2 floating representation. (There are types such as CReal which attempt to keep arbitrary precision. This breaks down as soon as you try to use them with transcendentals. There is no finite *and* 100% accurate representation of pi.)

Aside from this, the only other "leaks" you're likely to ever encounter (absent unsafePerformIO and friends) are other cases of physical hardware being less perfect than mathematical abstraction, which is to say running out of memory (by far the most common, but also the most amenable to modification/correction), hardware failures, memory bit-flips, etc. At some point, physical constraints are going to win because we can't run programs on abstract mathematical theories, we have to run them on (however abstracted) physical systems.

But practically, floating point is the main place where things go south; and yes, it's generally hated in the functional programming community: no matter what you do, real numbers are going to be a pain, and the standard compromise known as floating point is especially frustrating because its behavior can't be captured in a simple functional description. But floating point is what CPUs use and have specific support for.

-- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On 10 September 2013 19:06, Brandon Allbery wrote:
But practically, floating point is the main place where things go south; and yes, it's generally hated in the functional programming community: no matter what you do, real numbers are going to be a pain, and the standard compromise known as floating point is especially frustrating because its behavior can't be captured in a simple functional description. But floating point is what CPUs use and have specific support for.
Initial disclaimer: I'm a total Haskell and functional programming noob. However I'm well versed in a number of other languages and in floating point, and expend a substantial amount of time focussing on numerical accuracy.

What do you mean when you say that floating point can't be captured in a simple functional description? Leaving aside FPU inconsistencies, hardware failure etc., and assuming some sensible arithmetic (e.g. IEEE-754), the result of floatadd(A, B) is fully defined for all inputs A and B. How is that inconsistent with functional programming? I'd like to understand more about this, as when I'm better at Haskell I expect to use it for at least some floating point computation.

From my perspective, getting floating point computation right is hard. Proving (or at least reasoning) that an algorithm based on floating point computation works or has some accuracy is non-trivial and requires knowledge about the sequence of elementary floating point operations. I sympathise with the OP in that I consider any compiler "optimisation" that changes the underlying FPU operations in such a way that the end result differs should be considered a bug rather than an optimisation (with the exception of constant folding). But as I said I'm a total Haskell noob, so I don't yet understand what compiler optimisations in Haskell really mean or, more importantly, how easily they can be controlled.

To the OP: if I understand what you're doing correctly, then float (or any binary radix floating point type) is absolutely the wrong thing to use. If you want to convert from an exact decimal-string representation of a number to an exact fixed-point integer representation you should use exact computation in all stages, or at least use a decimal floating point format (I don't know yet if Haskell has one). Any conversion to binary-float as an intermediary format risks a (very likely) inexact conversion, since binary floating point format can only represent rational numbers whose denominator (in lowest terms) is a power of 2. In your case 0.12 is the rational number 3/25, and 25 is not a power of 2.
Consequently when you attempt to store 0.12 in a binary float you will be getting a float whose exact value is not 0.12. I don't know how to conveniently do this in Haskell, but in Python we can easily find the exact decimal representation of the nearest 64-bit floating point value to 0.12:

$ py -3.3 -c 'from decimal import Decimal; print(Decimal(0.12))'
0.11999999999999999555910790149937383830547332763671875

Multiplying the binary float using the FPU gives us the exact value you wanted:

$ py -3.3 -c 'from decimal import Decimal; print(Decimal(0.12*100))'
12

However the true result of multiplying the nearest binary float to 0.12 by 100 is still slightly less than 12. So if the combined multiply-by-100-and-truncate operation is performed correctly then you should get 11.

The appropriate parsing algorithm is straight-forward, although I don't yet know how to do this in Haskell. Split the string on the '.' and parse both sides as integers. Reject the string if there are not two digits on the RHS. Then you can multiply the LHS by 100 and add the RHS. (It's slightly more complex if you need to do negative numbers as well.) It sounds like the Rational type already deals with all this parsing complexity for you, in which case you should use that.

Oscar
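The closing suggestion, parsing the decimal string exactly via Rational, can be done in Haskell with readFloat from Numeric, which works at any RealFrac type, including Rational itself. A hedged sketch (centsExact is my own name, not from the thread's code); because the string goes straight to an exact Rational, "0.12" becomes 3/25 and scaling by 100 yields exactly 12, with no binary float ever involved:

```haskell
import Numeric (readFloat)

-- Parse an unsigned decimal string exactly, then scale and truncate
-- in Rational arithmetic. No Float/Double intermediary, no roundoff.
centsExact :: String -> Maybe Integer
centsExact s =
  case readFloat s :: [(Rational, String)] of
    [(r, "")] -> Just (truncate (r * 100))
    _         -> Nothing

main :: IO ()
main = do
  print (centsExact "0.12")   -- Just 12 (exact: 3/25 * 100 == 12)
  print (centsExact "10.34")  -- Just 1034
  print (centsExact "oops")   -- Nothing
```

Note this gives 12 where the Float pipeline gives 11 or 12, because the only inexact step in the original code, decimal string to binary float, has been removed entirely.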

On Tue, Sep 10, 2013 at 5:11 PM, Oscar Benjamin wrote:
What do you mean when you say that floating point can't be captured in a simple functional description?
*You* try describing the truncation behavior of Intel FPUs (they use 80 bits internally but only store 64, for (double)). "Leaving aside" isn't an option; it's visible in the languages that use them. -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On 10 September 2013 22:49, Brandon Allbery wrote:
On Tue, Sep 10, 2013 at 5:11 PM, Oscar Benjamin wrote:
What do you mean when you say that floating point can't be captured in a simple functional description?
*You* try describing the truncation behavior of Intel FPUs (they use 80 bits internally but only store 64, for (double)). "Leaving aside" isn't an option; it's visible in the languages that use them.
However, for the same CPU and the same pair of inputs floatadd(A, B) returns the same result right? The result may differ from one CPU to another and it doesn't respect associativity etc. but it's still a well defined function (if no one changes the rounding mode etc. midway through computation). What is it about functional programming languages that makes this difficult as you implied earlier? Oscar

On Tue, Sep 10, 2013 at 6:14 PM, Oscar Benjamin wrote:
On 10 September 2013 22:49, Brandon Allbery wrote:
On Tue, Sep 10, 2013 at 5:11 PM, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
What do you mean when you say that floating point can't be captured in a simple functional description?
*You* try describing the truncation behavior of Intel FPUs (they use 80 bits internally but only store 64, for (double)). "Leaving aside" isn't an option; it's visible in the languages that use them.
However, for the same CPU and the same pair of inputs floatadd(A, B) returns the same result right? The result may differ from one CPU to another...

In isolation, probably. When combined with other operations, it depends on optimization and the other operations.

...midway through computation). What is it about functional programming languages that makes this difficult as you implied earlier?
Only the expectation differs; programmers in e.g. C generally ignore such things, although there are obscure compiler options that try to control what happens. And C doesn't promise much about the behavior anyway. In pure functional programming, people get used to things behaving in nice theoretically characterized ways... and then they run into the bit size limit on Int or the somewhat erratic behavior of Float and Double and suddenly the nice abstractions fall apart. -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On 11 September 2013 00:19, Brandon Allbery wrote:
On Tue, Sep 10, 2013 at 6:14 PM, Oscar Benjamin wrote:
On 10 September 2013 22:49, Brandon Allbery wrote:
On Tue, Sep 10, 2013 at 5:11 PM, Oscar Benjamin wrote:
What do you mean when you say that floating point can't be captured in a simple functional description?
*You* try describing the truncation behavior of Intel FPUs (they use 80 bits internally but only store 64, for (double)). "Leaving aside" isn't an option; it's visible in the languages that use them.
However, for the same CPU and the same pair of inputs floatadd(A, B) returns the same result right? The result may differ from one CPU to another...
In isolation, probably. When combined with other operations, it depends on optimization and the other operations.
Well that depends what you mean. floatadd(A, floatadd(B, C)) is also a well defined function. It just happens that it is not equivalent to floatadd(floatadd(A, B), C). But the same is true for most functions. If the compiler tries to optimise by assuming that e.g. (a+b)+c is equivalent to a+(b+c) for floats then this is not really an optimisation but rather a semantic change. If those kind of optimisations are occurring beneath your feet then it becomes pretty much impossible to reason about the accuracy of higher-level code.
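The non-associativity being described is easy to witness with Double. A small illustration (my own example values, using the classic 0.1/0.2/0.3 case): both groupings are perfectly well-defined functions of their inputs, they are simply not the same function.

```haskell
main :: IO ()
main = do
  -- Same three operands, different association, different Double results.
  let l = (0.1 + 0.2) + 0.3 :: Double
      r = 0.1 + (0.2 + 0.3) :: Double
  print l         -- 0.6000000000000001
  print r         -- 0.6
  print (l == r)  -- False
```

This is exactly why a compiler that silently reassociates floating-point expressions is changing semantics, not merely optimizing.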
through computation). What is it about functional programming languages that makes this difficult as you implied earlier?
Only the expectation differs; programmers in e.g. C generally ignore such things, although there are obscure compiler options that try to control what happens. And C doesn't promise much about the behavior anyway. In pure functional programming, people get used to things behaving in nice theoretically characterized ways... and then they run into the bit size limit on Int or the somewhat erratic behavior of Float and Double and suddenly the nice abstractions fall apart.
It's true that straight-forward C code is very much subject to problems with regard to compiler optimisation and that most C programmers don't care. However as of C99 the C standards are integrated with IEEE-754 meaning that if you do care about these things then it is possible to control them (without resorting to Fortran!). Does Haskell have language/compiler features that can protect FP operations from unsafe optimisation (or from any optimisation)? Oscar
participants (7)
- Brandon Allbery
- Bryan Vicknair
- Chaddaï Fouché
- David McBride
- Kim-Ee Yeoh
- Mihai Maruseac
- Oscar Benjamin