[#]: subject: "Why does 0.1 + 0.2 = 0.30000000000000004?"
[#]: via: "https://jvns.ca/blog/2023/02/08/why-does-0-1-plus-0-2-equal-0-30000000000000004/"
[#]: author: "Julia Evans https://jvns.ca/"
[#]: collector: "lkxed"
[#]: translator: " "
[#]: reviewer: " "
[#]: publisher: " "
[#]: url: " "

Why does 0.1 + 0.2 = 0.30000000000000004?
======

Hello! I was trying to write about floating point yesterday,
and I found myself wondering about this calculation, with 64-bit floats:

```
>>> 0.1 + 0.2
0.30000000000000004
```

I realized that I didn’t understand exactly how it worked. I mean, I know
floating point calculations are inexact, and I know that you can’t exactly
represent `0.1` in binary, but: there’s a floating point number that’s closer to
0.3 than `0.30000000000000004`! So why do we get the answer
`0.30000000000000004`?

If you don’t feel like reading this whole post with a bunch of calculations, the short answer is that
`0.1000000000000000055511151231257827021181583404541015625 + 0.200000000000000011102230246251565404236316680908203125` lies exactly between
2 floating point numbers,
`0.299999999999999988897769753748434595763683319091796875` (usually printed as `0.3`) and
`0.3000000000000000444089209850062616169452667236328125` (usually printed as `0.30000000000000004`). The answer is
`0.30000000000000004` (the second one) because its significand is even.

#### how floating point addition works

This is roughly how floating point addition works:

- Add together the numbers (with extra precision)
- Round the result to the nearest floating point number

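
Those two steps can be sketched in Python with `fractions.Fraction`, which does exact rational arithmetic. This is just an illustration: `add_then_round` is a name made up for this sketch, it is not how hardware does it, and it relies on CPython's `float()` rounding a `Fraction` to the nearest float.

```python
from fractions import Fraction

def add_then_round(a: float, b: float) -> float:
    # step 1: add the exact values of the two floats, with no rounding at all
    exact_sum = Fraction(a) + Fraction(b)
    # step 2: round the exact result to the nearest 64-bit float
    return float(exact_sum)

print(add_then_round(0.1, 0.2))
```

If the two-step description is right, this should print the same thing as `0.1 + 0.2`.
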
So let’s use these rules to calculate 0.1 + 0.2. I just learned how floating
point addition works yesterday so it’s possible I’ve made some mistakes in this
post, but I did get the answers I expected at the end.

#### step 1: find out what 0.1 and 0.2 are

First, let’s use Python to figure out what the exact values of `0.1` and `0.2` are, as 64-bit floats.

```
>>> f"{0.1:.80f}"
'0.10000000000000000555111512312578270211815834045410156250000000000000000000000000'
>>> f"{0.2:.80f}"
'0.20000000000000001110223024625156540423631668090820312500000000000000000000000000'
```

These really are the exact values: because floating point numbers are in base
2 (each one is an integer divided by a power of 2), you can represent them all
exactly in base 10. You just need a lot of digits sometimes :)

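
Another way to see the exact values, as a small sketch: converting a float to `decimal.Decimal` keeps its exact binary value, so you get all the digits without having to guess how many to ask for. The variable names here are just for illustration.

```python
from decimal import Decimal

# Decimal(float) converts the float's exact binary value, with no rounding
exact_tenth = Decimal(0.1)
exact_fifth = Decimal(0.2)

print(exact_tenth)
print(exact_fifth)
```
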
#### step 2: add the numbers together

Next, let’s add those numbers together. We can add the fractional parts together as integers to get the exact answer (the digits of `0.2` get one extra trailing zero so that both numbers have 55 digits):

```
>>> 1000000000000000055511151231257827021181583404541015625 + 2000000000000000111022302462515654042363166809082031250
3000000000000000166533453693773481063544750213623046875
```

So the exact sum of those two floating point numbers is `0.3000000000000000166533453693773481063544750213623046875`.

This isn’t our final answer though because `0.3000000000000000166533453693773481063544750213623046875` isn’t a 64-bit float.

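
Here is one way to check the "isn’t a 64-bit float" claim, as a sketch with exact fractions. The helper name `is_float` is made up for this example: it says a rational number is a 64-bit float when converting it to a float and back changes nothing.

```python
from fractions import Fraction

exact_sum = Fraction(0.1) + Fraction(0.2)

def is_float(q: Fraction) -> bool:
    # converting to the nearest float and back is lossless only for exact floats
    return Fraction(float(q)) == q

print(is_float(Fraction(0.1)))
print(is_float(exact_sum))
```

The exact sum needs 54 significant bits, one more than a 64-bit float has room for, which is why the second check fails.
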
#### step 3: look at the nearest floating point numbers

Now, let’s look at the floating point numbers around `0.3`. Here’s the closest floating point number to `0.3` (usually written as just `0.3`, even though that isn’t its exact value):

```
>>> f"{0.3:.80f}"
'0.29999999999999998889776975374843459576368331909179687500000000000000000000000000'
```

We can figure out the next floating point number after `0.3` by serializing
`0.3` to 8 bytes with `struct.pack`, adding 1 to the last byte, and then using `struct.unpack`:

```
>>> import struct
>>> struct.pack("!d", 0.3)
b'?\xd3333333'
>>> # manually add 1 to the last byte
>>> next_float = struct.unpack("!d", b'?\xd3333334')[0]
>>> next_float
0.30000000000000004
>>> f"{next_float:.80f}"
'0.30000000000000004440892098500626161694526672363281250000000000000000000000000000'
```

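
Editing the bytes by hand only works when the last byte doesn’t overflow, so here’s a slightly more general sketch of the same idea: treat the 8 bytes as one 64-bit integer and add 1 to it. `next_float_up` is a hypothetical helper name, and it assumes the input is positive and finite.

```python
import struct

def next_float_up(x: float) -> float:
    # reinterpret the 8 bytes of the float as a 64-bit unsigned integer
    (bits,) = struct.unpack("!Q", struct.pack("!d", x))
    # adding 1 to the bit pattern gives the next float up
    # (assumes x is positive and finite)
    return struct.unpack("!d", struct.pack("!Q", bits + 1))[0]

print(next_float_up(0.3))
```
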
Apparently you can also do this with `math.nextafter` (available in Python 3.9 and later):

```
>>> import math
>>> math.nextafter(0.3, math.inf)
0.30000000000000004
```

So the two 64-bit floats around
`0.3` are
`0.299999999999999988897769753748434595763683319091796875` and
`0.3000000000000000444089209850062616169452667236328125`.

#### step 4: find out which one is closest to our result

It turns out that `0.3000000000000000166533453693773481063544750213623046875`
is exactly in the middle of
`0.299999999999999988897769753748434595763683319091796875` and
`0.3000000000000000444089209850062616169452667236328125`.

You can see that with this calculation:

```
>>> (3000000000000000444089209850062616169452667236328125000 + 2999999999999999888977697537484345957636833190917968750) // 2 == 3000000000000000166533453693773481063544750213623046875
True
```

So neither of them is closest.

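
The same midpoint check can be done without typing out any long digit strings, by letting `fractions.Fraction` hold the exact values. This is only a sketch to double check the arithmetic; the variable names are mine.

```python
from fractions import Fraction

below = Fraction(0.3)                  # exact value of the float usually printed as 0.3
above = Fraction(0.30000000000000004)  # exact value of the next float up
exact_sum = Fraction(0.1) + Fraction(0.2)

# the exact sum is the same distance from both neighbours
print(exact_sum - below == above - exact_sum)
```
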
#### how does it know which one to round to?

In the binary representation of a floating point number, there’s a number
called the “significand”. In cases like this (where the result is exactly in
between 2 successive floating point numbers), it’ll round to the one with the
even significand.

In this case that’s `0.300000000000000044408920985006261616945266723632812500`

We actually saw the significand of this number a bit earlier:

- 0.30000000000000004 is `struct.unpack('!d', b'?\xd3333334')[0]`
- 0.3 is `struct.unpack('!d', b'?\xd3333333')[0]`

The last digit of the big endian hex representation of `0.30000000000000004` is
`4`, so that’s the one with the even significand (because the significand is at
the end).

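
To check the parity directly instead of eyeballing the bytes, here’s a small sketch that pulls out the low 52 bits of each float. `significand_bits` is a made-up helper name.

```python
import struct

def significand_bits(x: float) -> int:
    # the low 52 bits of the 64-bit pattern are the significand
    (bits,) = struct.unpack("!Q", struct.pack("!d", x))
    return bits & (2**52 - 1)

for x in (0.3, 0.30000000000000004):
    s = significand_bits(x)
    print(x, hex(s), "even" if s % 2 == 0 else "odd")
```
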
#### let’s also work out the whole calculation in binary

Above we did the calculation in decimal, because that’s a little more intuitive
to read. But of course computers don’t do these calculations in decimal –
they’re done in a base 2 representation. So I wanted to get an idea of how that
worked too.

I don’t think this binary calculation part of the post is particularly clear
but it was helpful for me to write out. There are really a lot of numbers and
it might be terrible to read.

#### how 64-bit floats work: exponent and significand

64-bit floating point numbers are represented with 2 integers, an **exponent** and a **significand**, plus a 1-bit **sign**.

Here’s the equation of how the exponent and significand correspond to an actual number:

$$\text{sign} \times 2^{\text{exponent}} \left(1 + \frac{\text{significand}}{2^{52}}\right)$$

For example, if the exponent was `1`, the significand was `2**51`, and the sign was positive, we’d get

$$2^{1} \left(1 + \frac{2^{51}}{2^{52}}\right)$$

which is equal to `2 * (1 + 0.5)`, or 3.

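
Here’s the equation above written as a tiny Python function, as a sketch. `float_from_parts` is a hypothetical name, `sign` is assumed to be `+1` or `-1`, and it only covers “normal” floats (not zero, subnormals, infinities or NaN).

```python
def float_from_parts(sign: int, exponent: int, significand: int) -> float:
    # sign is +1 or -1, exponent is the actual (unbiased) exponent,
    # significand is the 52-bit integer from the equation
    return sign * 2.0**exponent * (1 + significand / 2**52)

print(float_from_parts(+1, 1, 2**51))
```
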
#### step 1: get the exponent and significand for 0.1 and 0.2

I wrote some inefficient functions to get the exponent and significand of a positive float in Python:

```
import struct

def get_exponent(f):
    # get the first 12 bits (the sign bit, then the 11 exponent bits)
    bytestring = struct.pack('!d', f)
    return int.from_bytes(bytestring, byteorder='big') >> 52

def get_significand(f):
    # get the last 52 bits
    bytestring = struct.pack('!d', f)
    x = int.from_bytes(bytestring, byteorder='big')
    exponent = get_exponent(f)
    return x ^ (exponent << 52)
```

I’m ignoring the sign bit (the first bit) because we only need these functions
to work on two numbers (0.1 and 0.2) and those two numbers are both positive.

First, let’s get the exponent and significand of 0.1. We need to subtract 1023
to get the actual exponent because the exponent is stored with an offset (a “bias”) of 1023.

```
>>> get_exponent(0.1) - 1023
-4
>>> get_significand(0.1)
2702159776422298
```

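
As a cross-check on that exponent, here’s a sketch that reads the stored exponent bits directly and compares them with the standard library’s `math.frexp`. The variable names are just for this example. `math.frexp` uses a mantissa in `[0.5, 1)` instead of `[1, 2)`, so its exponent comes out one higher.

```python
import math
import struct

(bits,) = struct.unpack("!Q", struct.pack("!d", 0.1))
stored_exponent = (bits >> 52) & 0x7FF  # the 11 exponent bits, as stored
print(stored_exponent, stored_exponent - 1023)

# math.frexp writes 0.1 as mantissa * 2**frexp_exponent with 0.5 <= mantissa < 1
mantissa, frexp_exponent = math.frexp(0.1)
print(mantissa, frexp_exponent)
```
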
The way these numbers work together to get `0.1` is `2**exponent + significand / 2**(52 - exponent)`.

Here’s that calculation in Python:

```
>>> 2**-4 + 2702159776422298 / 2**(52 + 4)
0.1
```

(you might legitimately be worried about floating point accuracy issues with
this calculation, but in this case I’m pretty sure it’s fine because these
numbers by definition don’t have accuracy issues – the floating point numbers starting at `2**-4` go up in steps of `1/2**(52 + 4)`)

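
If you’d rather not trust float arithmetic for that check at all, the same calculation can be redone with exact fractions. A small sketch:

```python
from fractions import Fraction

# redo the calculation with exact rational arithmetic instead of floats
exact = Fraction(2) ** -4 + Fraction(2702159776422298, 2 ** (52 + 4))

# Fraction(0.1) is the exact value of the 64-bit float 0.1
print(exact == Fraction(0.1))
```
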
We can do the same thing for `0.2`:

```
>>> get_exponent(0.2) - 1023
-3
>>> get_significand(0.2)
2702159776422298
```

And here’s how that exponent and significand work together to get `0.2`:

```
>>> 2**-3 + 2702159776422298 / 2**(52 + 3)
0.2
```

(by the way, it’s not a coincidence that 0.1 and 0.2 have the same significand – it’s because `x` and `2*x` always have the same significand, as long as neither is subnormal and `2*x` doesn’t overflow)

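
Here’s a quick sketch that tries the “`x` and `2*x` have the same significand” claim on a few ordinary values. `significand_bits` is a made-up helper that just masks off the low 52 bits.

```python
import struct

def significand_bits(x: float) -> int:
    # the low 52 bits of the 64-bit pattern are the significand
    (bits,) = struct.unpack("!Q", struct.pack("!d", x))
    return bits & (2**52 - 1)

for x in (0.1, 0.3, 1.5, 123.456):
    print(x, significand_bits(x) == significand_bits(2 * x))
```
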
#### step 2: rewrite 0.1 to have a bigger exponent

`0.2` has a bigger exponent than `0.1` – -3 instead of -4.

So we need to rewrite

```
2**-4 + 2702159776422298 / 2**(52 + 4)
```

to be `X / 2**(52 + 3)`

If we solve for X in `2**-4 + 2702159776422298 / 2**(52 + 4) = X / 2**(52 + 3)`, we get:

`X = 2**51 + 2702159776422298 / 2`

We can calculate that in Python pretty easily:

```
>>> 2**51 + 2702159776422298 // 2
3602879701896397
```

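
A quick sanity check on that value of `X`, as a sketch with exact fractions (so there’s no rounding to worry about): both ways of writing `0.1` should be the same rational number.

```python
from fractions import Fraction

significand = 2702159776422298
X = 2**51 + significand // 2  # exact, because this significand is even

# 0.1 written with exponent -4, and rewritten over 2**(52 + 3)
original = Fraction(2) ** -4 + Fraction(significand, 2 ** (52 + 4))
rewritten = Fraction(X, 2 ** (52 + 3))
print(X, original == rewritten)
```
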
#### step 3: add the significands

Now we’re trying to do this addition

```
2**-3 + 2702159776422298 / 2**(52 + 3) + 3602879701896397 / 2**(52 + 3)
```

So we need to add together `2702159776422298` and `3602879701896397`

```
>>> 2702159776422298 + 3602879701896397
6305039478318695
```

Cool. But `6305039478318695` is more than `2**52 - 1` (the maximum value for a significand), so we have a problem:

```
>>> 6305039478318695 > 2**52
True
```

#### step 4: increase the exponent

Right now our answer is

```
2**-3 + 6305039478318695 / 2**(52 + 3)
```

First, let’s subtract `2**52` from the significand and move it into the first term (`2**52 / 2**(52 + 3)` is `2**-3`, and `2**-3 + 2**-3` is `2**-2`) to get

```
2**-2 + 1801439850948199 / 2**(52 + 3)
```

This is almost perfect, but the `2**(52 + 3)` at the end there needs to be a `2**(52 + 2)`.

So we need to divide 1801439850948199 by 2. This is where we run into inaccuracies – `1801439850948199` is odd!

```
>>> 1801439850948199 / 2
900719925474099.5
```

It’s exactly in between two integers, so we round to the nearest even number (which is what the floating point specification says to do). Our final floating point result is:

```
>>> 2**-2 + 900719925474100 / 2**(52 + 2)
0.30000000000000004
```

That’s the answer we expected:

```
>>> 0.1 + 0.2
0.30000000000000004
```

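
The “divide by 2 and round ties to even” step can be written with integers only, so no float rounding sneaks into the check. This is a sketch; `halve_round_half_even` is a name I made up.

```python
def halve_round_half_even(n: int) -> int:
    # divide by 2; an odd n lands exactly between two integers,
    # so in that case pick whichever neighbour is even
    quotient, remainder = divmod(n, 2)
    if remainder == 1 and quotient % 2 == 1:
        quotient += 1
    return quotient

print(halve_round_half_even(1801439850948199))
```
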
#### this probably isn’t exactly how it works in hardware

The way I’ve described the operations here isn’t literally exactly
what happens when you do floating point addition (it’s not “solving for X”, for
example). I’m sure there are a lot of efficient tricks. But I think it’s about
the same idea.

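
To put the whole thing in one place, here’s a sketch of the same idea using only integer arithmetic: line up the two numbers on the smaller exponent, add them, then drop the extra low bits with round-half-to-even. This is definitely not how hardware does it, and the names `decode` and `add_positive_floats` are made up. It assumes both inputs are positive normal floats and that the result doesn’t overflow.

```python
import math
import struct

def decode(x: float) -> tuple:
    # split a positive normal float into (actual exponent, 53-bit significand)
    (bits,) = struct.unpack("!Q", struct.pack("!d", x))
    exponent = ((bits >> 52) & 0x7FF) - 1023
    significand = (bits & (2**52 - 1)) | (1 << 52)  # put the implicit leading 1 back
    return exponent, significand

def add_positive_floats(a: float, b: float) -> float:
    (ea, ma), (eb, mb) = decode(a), decode(b)
    # write both numbers as integer * 2**(e - 52), using the smaller exponent
    e = min(ea, eb)
    total = (ma << (ea - e)) + (mb << (eb - e))
    # the sum always has at least 54 bits: drop the lowest `extra` bits so 53 remain
    extra = total.bit_length() - 53
    kept = total >> extra
    dropped = total & ((1 << extra) - 1)
    half = 1 << (extra - 1)
    # round to nearest; on an exact tie, round to the even significand
    if dropped > half or (dropped == half and kept % 2 == 1):
        kept += 1
    return math.ldexp(kept, e - 52 + extra)

print(add_positive_floats(0.1, 0.2))
```

For `0.1` and `0.2` the dropped bits are exactly half, and the kept significand is odd, so it gets bumped up to the even one.
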
#### printing out floating point numbers is pretty weird

We said earlier that the floating point number 0.3 isn’t equal to 0.3. It’s actually this number:

```
>>> f"{0.3:.80f}"
'0.29999999999999998889776975374843459576368331909179687500000000000000000000000000'
```

So when you print out that number, why does it display `0.3`?

The computer isn’t actually printing out the exact value of the number, instead
it’s printing out the _shortest_ decimal number `d` which has the property that
our floating point number `f` is the closest floating point number to `d`.

It turns out that doing this efficiently isn’t trivial at all, and there are a bunch of academic papers about it like [Printing Floating-Point Numbers Quickly and Accurately][1] or [How to print floating point numbers accurately][2].

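
Here’s a naive sketch of the “shortest decimal that converts back to the same float” idea. It is not the algorithm from those papers (it just tries more and more digits, which is slow and may miss the very shortest string in rare edge cases), and `shortest_round_trip` is a made-up name.

```python
def shortest_round_trip(f: float) -> str:
    # try 1, 2, 3, ... significant digits until the decimal string
    # converts back to exactly the same float (17 digits always suffice)
    for digits in range(1, 18):
        candidate = f"{f:.{digits}g}"
        if float(candidate) == f:
            return candidate
    raise ValueError("expected a finite float")

print(shortest_round_trip(0.3))
print(shortest_round_trip(0.1 + 0.2))
```
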
#### would it be more intuitive if computers printed out the exact value of a float?

Rounding to a nice clean decimal value is nice, but in a way I feel like it
might be more intuitive if computers just printed out the exact value of a
floating point number – it might make it seem a lot less surprising when you
get weird results.

To me,
0.1000000000000000055511151231257827021181583404541015625 +
0.200000000000000011102230246251565404236316680908203125
= 0.3000000000000000444089209850062616169452667236328125 feels less surprising than 0.1 + 0.2 = 0.30000000000000004.

Probably this is a bad idea though: it would definitely use a lot of screen space.

#### a quick note on PHP

Someone in the comments somewhere pointed out that `<?php echo (0.1 + 0.2);?>`
prints out `0.3`. Does that mean that floating point math is different in PHP?

I think the answer is no – if I run:

`<?php echo ((0.1 + 0.2) - 0.3);?>` on [this
page][3], I get the exact same answer as in
Python: `5.5511151231258E-17`. So it seems like the underlying floating point
math is the same.

I think the reason that `0.1 + 0.2` prints out `0.3` in PHP is that PHP’s
algorithm for displaying floating point numbers is less precise than Python’s
– it’ll display `0.3` even if that number isn’t the closest floating point
number to 0.3.

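
You can imitate this in Python, as a sketch. The assumption here (not something tested in this post) is that PHP’s `echo` formats floats with 14 significant digits, which as far as I know is the default value of its `precision` setting.

```python
total = 0.1 + 0.2

# 14 significant digits (assumed to match PHP's default `precision` setting)
php_style_sum = f"{total:.14g}"
php_style_diff = f"{total - 0.3:.14g}"
# 17 significant digits is always enough to tell two different 64-bit floats apart
full_sum = f"{total:.17g}"

print(php_style_sum, php_style_diff, full_sum)
```

With only 14 digits, `0.30000000000000004` and `0.3` look identical, but the difference between them still shows up once you subtract.
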
#### that’s all!

I kind of doubt that anyone had the patience to follow all of that arithmetic,
but it was helpful for me to write down, so I’m publishing this post anyway.
Hopefully some of this makes sense.

--------------------------------------------------------------------------------

via: https://jvns.ca/blog/2023/02/08/why-does-0-1-plus-0-2-equal-0-30000000000000004/

Author: [Julia Evans][a]
Topic selection: [lkxed][b]
Translator: [Translator ID](https://github.com/译者ID)
Proofreader: [Proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)

[a]: https://jvns.ca/
[b]: https://github.com/lkxed/
[1]: https://legacy.cs.indiana.edu/~dyb/pubs/FP-Printing-PLDI96.pdf
[2]: https://lists.nongnu.org/archive/html/gcl-devel/2012-10/pdfkieTlklRzN.pdf
[3]: https://replit.com/languages/php_cli