mirror of
https://github.com/LCTT/TranslateProject.git
synced 2025-01-04 22:00:34 +08:00
472 lines
18 KiB
Markdown
472 lines
18 KiB
Markdown
|
[#]: subject: "Examples of problems with integers"
|
|||
|
[#]: via: "https://jvns.ca/blog/2023/01/18/examples-of-problems-with-integers/"
|
|||
|
[#]: author: "Julia Evans https://jvns.ca/"
|
|||
|
[#]: collector: "lkxed"
|
|||
|
[#]: translator: " "
|
|||
|
[#]: reviewer: " "
|
|||
|
[#]: publisher: " "
|
|||
|
[#]: url: " "
|
|||
|
|
|||
|
Examples of problems with integers
|
|||
|
======
|
|||
|
|
|||
|
Hello! A few days back we talked about [problems with floating point numbers][1].
|
|||
|
|
|||
|
This got me thinking – but what about integers? Of course integers have all
|
|||
|
kinds of problems too – anytime you represent a number in a small fixed amount of
|
|||
|
space (like 8/16/32/64 bits), you’re going to run into problems.
|
|||
|
|
|||
|
So I [asked on Mastodon again][2] for examples of integer problems and got all kinds of great responses again. Here’s a table of contents.
|
|||
|
|
|||
|
[example 1: the small database primary key][3][example 2: integer overflow/underflow][4][aside: how do computers represent negative integers?][5][example 3: decoding a binary format in Java][6][example 4: misinterpreting an IP address or string as an integer][7][example 5: security problems because of integer overflow][8][example 6: the case of the mystery byte order][9][example 7: modulo of negative numbers][10][example 8: compilers removing integer overflow checks][11][example 9: the && typo][12]
|
|||
|
|
|||
|
Like last time, I’ve written some example programs to demonstrate these
|
|||
|
problems. I’ve tried to use a variety of languages in the examples (Go,
|
|||
|
Javascript, Java, and C) to show that these problems don’t just show up in
|
|||
|
super low level C programs – integers are everywhere!
|
|||
|
|
|||
|
Also I’ve probably made some mistakes in here, I learned several things while writing this.
|
|||
|
|
|||
|
#### example 1: the small database primary key
|
|||
|
|
|||
|
One of the most classic (and most painful!) integer problems is:
|
|||
|
|
|||
|
- You create a database table where the primary key is a 32-bit unsigned integer, thinking “4 billion rows should be enough for anyone!”
|
|||
|
- You are massively successful and eventually, your table gets close to 4 billion rows
|
|||
|
- oh no!
|
|||
|
- You need to do a database migration to switch your primary key to be a 64-bit integer instead
|
|||
|
|
|||
|
If the primary key actually reaches its maximum value I’m not sure exactly what
|
|||
|
happens, I’d imagine you wouldn’t be able to create any new database rows and
|
|||
|
it would be a very bad day for your massively successful service.
|
|||
|
|
|||
|
#### example 2: integer overflow/underflow
|
|||
|
|
|||
|
Here’s a Go program:
|
|||
|
|
|||
|
```
|
|||
|
package main
|
|||
|
|
|||
|
import "fmt"
|
|||
|
|
|||
|
func main() {
|
|||
|
var x uint32 = 5
|
|||
|
var length uint32 = 0
|
|||
|
if x < length-1 {
|
|||
|
fmt.Printf("%d is less than %d\n", x, length-1)
|
|||
|
}
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
This slightly mysteriously prints out:
|
|||
|
|
|||
|
```
|
|||
|
5 is less than 4294967295
|
|||
|
```
|
|||
|
|
|||
|
That true, but it’s not what you might have expected.
|
|||
|
|
|||
|
#### what’s going on?
|
|||
|
|
|||
|
`0 - 1` is equal to the 4 bytes `0xFFFFFFFF`.
|
|||
|
|
|||
|
There are 2 ways to interpret those 4 bytes:
|
|||
|
|
|||
|
- As a _signed_ integer (-1)
|
|||
|
- As an _unsigned_ integer (4294967295)
|
|||
|
|
|||
|
Go here is treating `length - 1` as a **unsigned** integer, because we defined `x` and `length` as uint32s (the “u” is for “unsigned”). So it’s testing if 5 is less than 4294967295, which it is!
|
|||
|
|
|||
|
#### what do we do about it?
|
|||
|
|
|||
|
I’m not actually sure if there’s any way to automatically detect integer overflow errors in Go. (though it looks like there’s a [github issue from 2019 with some discussion][13])
|
|||
|
|
|||
|
Some brief notes about other languages:
|
|||
|
|
|||
|
- Lots of languages (Python, Java, Ruby) don’t have unsigned integers at all, so this specific problem doesn’t come up
|
|||
|
- In C, you can compile with `clang -fsanitize=unsigned-integer-overflow`. Then if your code has an overflow/underflow like this, the program will crash.
|
|||
|
- Similarly in Rust, if you compile your program in debug mode it’ll crash if there’s an integer overflow. But in release mode it won’t crash, it’ll just happily decide that 0 - 1 = 4294967295.
|
|||
|
|
|||
|
The reason Rust doesn’t check for overflows if you compile your program in
|
|||
|
release mode (and the reason C and Go don’t check) is that – these checks are
|
|||
|
expensive! Integer arithmetic is a very big part of many computations, and
|
|||
|
making sure that every single addition isn’t overflowing makes it slower.
|
|||
|
|
|||
|
#### aside: how do computers represent negative integers?
|
|||
|
|
|||
|
I mentioned in the last section that `0xFFFFFFFF` can mean either `-1` or
|
|||
|
`4294967295`. You might be thinking – what??? Why would `0xFFFFFFFF` mean `-1`?
|
|||
|
|
|||
|
So let’s talk about how computers represent negative integers for a second.
|
|||
|
|
|||
|
I’m going to simplify and talk about 8-bit integers instead of 32-bit integers,
|
|||
|
because there are less of them and it works basically the same way.
|
|||
|
|
|||
|
You can represent 256 different numbers with an 8-bit integer: 0 to 255
|
|||
|
|
|||
|
```
|
|||
|
00000000 -> 0
|
|||
|
00000001 -> 1
|
|||
|
00000010 -> 2
|
|||
|
...
|
|||
|
11111111 -> 255
|
|||
|
```
|
|||
|
|
|||
|
But what if you want to represent _negative_ integers? We still only have 8
|
|||
|
bits! So we need to reassign some of these and treat them as negative numbers
|
|||
|
instead.
|
|||
|
|
|||
|
Here’s the way most modern computers do it:
|
|||
|
|
|||
|
- Every number that’s 128 or more becomes a negative number instead
|
|||
|
- How to know _which_ negative number it is: take the positive integer you’d expect it to be, and then subtract 256
|
|||
|
|
|||
|
So 255 becomes -1, 128 becomes -128, and 200 becomes -56.
|
|||
|
|
|||
|
Here are some maps of bits to numbers:
|
|||
|
|
|||
|
```
|
|||
|
00000000 -> 0
|
|||
|
00000001 -> 1
|
|||
|
00000010 -> 2
|
|||
|
01111111 -> 127
|
|||
|
10000000 -> -128 (previously 128)
|
|||
|
10000001 -> -127 (previously 129)
|
|||
|
10000010 -> -126 (previously 130)
|
|||
|
...
|
|||
|
11111111 -> -1 (previously 255)
|
|||
|
```
|
|||
|
|
|||
|
This gives us 256 numbers, from -128 to 127.
|
|||
|
|
|||
|
And `11111111` (or `0xFF`, or 255) is -1.
|
|||
|
|
|||
|
For 32 bit integers, it’s the same story, except it’s “every number larger than 2^31 becomes negative” and “subtract 2^32”. And similarly for other integer sizes.
|
|||
|
|
|||
|
That’s how we end up with `0xFFFFFFFF` meaning -1.
|
|||
|
|
|||
|
#### there are multiple ways to represent negative integers
|
|||
|
|
|||
|
The way we just talked about of representing negative integers (“it’s the equivalent positive integer, but you subtract 2^n”) is called
|
|||
|
**two’s complement**, and it’s the most common on modern computers. There are several other ways
|
|||
|
though, the [wikipedia article has a list][14].
|
|||
|
|
|||
|
#### weird thing: the absolute value of -128 is negative
|
|||
|
|
|||
|
This [Go program][15] has a pretty simple `abs()` function that computes the absolute value of an integer:
|
|||
|
|
|||
|
```
|
|||
|
package main
|
|||
|
|
|||
|
import (
|
|||
|
"fmt"
|
|||
|
)
|
|||
|
|
|||
|
func abs(x int8) int8 {
|
|||
|
if x < 0 {
|
|||
|
return -x
|
|||
|
}
|
|||
|
return x
|
|||
|
}
|
|||
|
|
|||
|
func main() {
|
|||
|
fmt.Println(abs(-127))
|
|||
|
fmt.Println(abs(-128))
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
This prints out:
|
|||
|
|
|||
|
```
|
|||
|
127
|
|||
|
-128
|
|||
|
```
|
|||
|
|
|||
|
This is because the signed 8-bit integers go from -128 to 127 – there **is** no +128!
|
|||
|
Some programs might crash when you try to do this (it’s an overflow), but Go
|
|||
|
doesn’t.
|
|||
|
|
|||
|
Now that we’ve talked about signed integers a bunch, let’s dig into another example of how they can cause problems.
|
|||
|
|
|||
|
#### example 3: decoding a binary format in Java
|
|||
|
|
|||
|
Let’s say you’re parsing a binary format in Java, and you want to get the first
|
|||
|
4 bits of the byte `0x90`. The correct answer is 9.
|
|||
|
|
|||
|
```
|
|||
|
public class Main {
|
|||
|
public static void main(String[] args) {
|
|||
|
byte b = (byte) 0x90;
|
|||
|
System.out.println(b >> 4);
|
|||
|
}
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
This prints out “-7”. That’s not right!
|
|||
|
|
|||
|
#### what’s going on?
|
|||
|
|
|||
|
There are two things we need to know about Java to make sense of this:
|
|||
|
|
|||
|
- Java doesn’t have unsigned integers.
|
|||
|
- Java can’t right shift bytes, it can only shift integers. So anytime you shift a byte, it has to be promoted into an integer.
|
|||
|
|
|||
|
Let’s break down what those two facts mean for our little calculation `b >> 4`:
|
|||
|
|
|||
|
- In bits, `0x90` is `10010000`. This starts with a 1, which means that it’s more than 128, which means it’s a negative number
|
|||
|
- Java sees the `>>` and decides to promote `0x90` to an integer, so that it can shift it
|
|||
|
- The way you convert a negative byte to an 32-bit integer is to add a bunch of `1`s at the beginning. So now our 32-bit integer is `0xFFFFFF90` (`F` being 15, or `1111`)
|
|||
|
- Now we right shift (`b >> 4`). By default, Java does a **signed shift**, which means that it adds 0s to the beginning if it’s positive, and 1s to the beginning if it’s negative. (`>>>` is an unsigned shift in Java)
|
|||
|
- We end up with `0xFFFFFFF9` (having cut off the last 4 bits and added more 1s at the beginning)
|
|||
|
- As a signed integer, that’s -7!
|
|||
|
|
|||
|
#### what can you do about it?
|
|||
|
|
|||
|
I don’t the actual idiomatic way to do this in Java is, but the way I’d naively
|
|||
|
approach fixing this is to put in a bit mask before doing the right shift. So
|
|||
|
instead of:
|
|||
|
|
|||
|
```
|
|||
|
b >> 4
|
|||
|
```
|
|||
|
|
|||
|
we’d write
|
|||
|
|
|||
|
```
|
|||
|
(b & 0xFF) >> 4
|
|||
|
```
|
|||
|
|
|||
|
`b & 0xFF` seems redundant (`b` is already a byte!), but it’s actually not because `b` is being promoted to an integer.
|
|||
|
|
|||
|
Now instead of `0x90 -> 0xFFFFFF90 -> 0xFFFFFFF9`, we end up calculating `0x90 -> 0xFFFFFF90 -> 0x00000090 -> 0x00000009`, which is the result we wanted: 9.
|
|||
|
|
|||
|
And when we actually try it, it prints out “9”.
|
|||
|
|
|||
|
Also, if we were using a language with unsigned integers, the natural way to
|
|||
|
deal with this would be to treat the value as an unsigned integer in the first
|
|||
|
place. But that’s not possible in Java.
|
|||
|
|
|||
|
#### example 4: misinterpreting an IP address or string as an integer
|
|||
|
|
|||
|
I don’t know if this is technically a “problem with integers” but it’s funny
|
|||
|
so I’ll mention it: [Rachel by the bay][16] has a bunch of great
|
|||
|
examples of things that are not integers being interpreted as integers. For
|
|||
|
example, “HTTP” is `0x48545450` and `2130706433` is `127.0.0.1`.
|
|||
|
|
|||
|
She points out that you can actually ping any integer, and it’ll convert that integer into an IP address, for example:
|
|||
|
|
|||
|
```
|
|||
|
$ ping 2130706433
|
|||
|
PING 2130706433 (127.0.0.1): 56 data bytes
|
|||
|
$ ping 132848123841239999988888888888234234234234234234
|
|||
|
PING 132848123841239999988888888888234234234234234234 (251.164.101.122): 56 data bytes
|
|||
|
```
|
|||
|
|
|||
|
(I’m not actually sure how ping is parsing that second integer or why ping accepts these giant larger-than-2^64-integers as valid inputs, but it’s a fun weird thing)
|
|||
|
|
|||
|
#### example 5: security problems because of integer overflow
|
|||
|
|
|||
|
Another integer overflow example: here’s a [search for CVEs involving integer overflows][17].
|
|||
|
There are a lot! I’m not a security person, but here’s one random example: this [json parsing library bug][18]
|
|||
|
|
|||
|
My understanding of that json parsing bug is roughly:
|
|||
|
|
|||
|
- you load a JSON file that’s 3GB or something, or 3,000,000,000
|
|||
|
- due to an integer overflow, the code allocates close to 0 bytes of memory instead of ~3GB amount of memory
|
|||
|
- but the JSON file is still 3GB, so it gets copied into the tiny buffer with almost 0 bytes of memory
|
|||
|
- this overwrites all kinds of other memory that it’s not supposed to
|
|||
|
|
|||
|
The CVE says “This vulnerability mostly impacts process availability”, which I
|
|||
|
think means “the program crashes”, but sometimes this kind of thing is much
|
|||
|
worse and can result in arbitrary code execution.
|
|||
|
|
|||
|
My impression is that there are a large variety of different flavours of
|
|||
|
security vulnerabilities caused by integer overflows.
|
|||
|
|
|||
|
#### example 6: the case of the mystery byte order
|
|||
|
|
|||
|
One person said that they’re do scientific computing and sometimes they need to
|
|||
|
read files which contain data with an unknown byte order.
|
|||
|
|
|||
|
Let’s invent a small example of this: say you’re reading a file which contains 4
|
|||
|
bytes - `00`, `00`, `12`, and `81` (in that order), that you happen to know
|
|||
|
represent a 4-byte integer. There are 2 ways to interpret that integer:
|
|||
|
|
|||
|
- `0x00001281` (which translates to 4737). This order is called “big endian”
|
|||
|
- `0x81120000` (which translates to 2165440512). This order is called “little endian”.
|
|||
|
|
|||
|
Which one is it? Well, maybe the file contains some metadata that specifies the
|
|||
|
endianness. Or maybe you happen to know what machine it was generated on and
|
|||
|
what byte order that machine uses. Or maybe you just read a bunch of values,
|
|||
|
try both orders, and figure out which makes more sense. Maybe 2165440512 is too
|
|||
|
big to make sense in the context of whatever your data is supposed to mean, or
|
|||
|
maybe `4737` is too small.
|
|||
|
|
|||
|
A couple more notes on this:
|
|||
|
|
|||
|
- this isn’t just a problem with integers, floating point numbers have byte
|
|||
|
order too
|
|||
|
- this also comes up when reading data from a network, but in that case the
|
|||
|
byte order isn’t a “mystery”, it’s just going to be big endian. But x86
|
|||
|
machines (and many others) are little endian, so you have to swap the byte
|
|||
|
order of all your numbers.
|
|||
|
|
|||
|
#### example 7: modulo of negative numbers
|
|||
|
|
|||
|
This is more of a design decision about how different programming languages design their math libraries, but it’s still a little weird and lots of people mentioned it.
|
|||
|
|
|||
|
Let’s say you write `-13 % 3` in your program, or `13 % -3`. What’s the result?
|
|||
|
|
|||
|
It turns out that different programming languages do it differently, for
|
|||
|
example in Python `-13 % 3 = 2` but in Javascript `-13 % 3 = -1`.
|
|||
|
|
|||
|
There’s a table in [this blog post][19] that
|
|||
|
describes a bunch of different programming languages’ choices.
|
|||
|
|
|||
|
#### example 8: compilers removing integer overflow checks
|
|||
|
|
|||
|
We’ve been hearing a lot about integer overflow and why it’s bad. So let’s
|
|||
|
imagine you try to be safe and include some checks in your programs – after
|
|||
|
each addition, you make sure that the calculation didn’t overflow. Like this:
|
|||
|
|
|||
|
```
|
|||
|
#include <stdio.h>
|
|||
|
|
|||
|
#define INT_MAX 2147483647
|
|||
|
|
|||
|
int check_overflow(int n) {
|
|||
|
n = n + 100;
|
|||
|
if (n + 100 < 0)
|
|||
|
return -1;
|
|||
|
return 0;
|
|||
|
}
|
|||
|
|
|||
|
int main() {
|
|||
|
int result = check_overflow(INT_MAX);
|
|||
|
printf("%d\n", result);
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
`check_overflow` here should return `-1` (failure), because `INT_MAX + 100` is more than the maximum integer size.
|
|||
|
|
|||
|
```
|
|||
|
$ gcc check_overflow.c -o check_overflow && ./check_overflow
|
|||
|
-1
|
|||
|
$ gcc -O3 check_overflow.c -o check_overflow && ./check_overflow
|
|||
|
0
|
|||
|
```
|
|||
|
|
|||
|
That’s weird – when we compile with `gcc`, we get the answer we expected, but
|
|||
|
with `gcc -O3`, we get a different answer. Why?
|
|||
|
|
|||
|
#### what’s going on?
|
|||
|
|
|||
|
My understanding (which might be wrong) is:
|
|||
|
|
|||
|
- Signed integer overflow in C is **undefined behavior**. I think that’s
|
|||
|
because different C implementations might be using different representations
|
|||
|
of signed integers (maybe they’re using one’s complement instead of two’s
|
|||
|
complement or something)
|
|||
|
- “undefined behaviour” in C means “the compiler is free to do literally whatever it wants after that point” (see this post [With undefined behaviour, anything is possible][20] by Raph Levine for a lot more)
|
|||
|
- Some compiler optimizations assume that undefined behaviour will never
|
|||
|
happen. They’re free to do this, because – if that undefined behaviour
|
|||
|
_did_ happen, then they’re allowed to do whatever they want, so “run the
|
|||
|
code that I optimized assuming that this would never happen” is fine.
|
|||
|
- So this `if (n + 100 < 0)` check is irrelevant – if that did
|
|||
|
happen, it would be undefined behaviour, so there’s no need to execute the
|
|||
|
contents of that if statement.
|
|||
|
|
|||
|
So, that’s weird. I’m not going to write a “what can you do about it?” section here because I’m pretty out of my depth already.
|
|||
|
|
|||
|
I certainly would not have expected that though.
|
|||
|
|
|||
|
My impression is that “undefined behaviour” is really a C/C++ concept, and
|
|||
|
doesn’t exist in other languages in the same way except in the case of “your
|
|||
|
program called some C code in an incorrect way and that C code did something
|
|||
|
weird because of undefined behaviour”. Which of course happens all the time.
|
|||
|
|
|||
|
#### example 9: the && typo
|
|||
|
|
|||
|
This one was mentioned as a very upsetting bug. Let’s say you have two integers
|
|||
|
and you want to check that they’re both nonzero.
|
|||
|
|
|||
|
In Javascript, you might write:
|
|||
|
|
|||
|
```
|
|||
|
if a && b {
|
|||
|
/* some code */
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
But you could also make a typo and type:
|
|||
|
|
|||
|
```
|
|||
|
if a & b {
|
|||
|
/* some code */
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
This is still perfectly valid code, but it means something completely different
|
|||
|
– it’s a bitwise and instead of a boolean and. Let’s go into a Javascript
|
|||
|
console and look at bitwise vs boolean and for `9` and `4`:
|
|||
|
|
|||
|
```
|
|||
|
> 9 && 4
|
|||
|
4
|
|||
|
> 9 & 4
|
|||
|
0
|
|||
|
> 4 && 5
|
|||
|
5
|
|||
|
> 4 & 5
|
|||
|
4
|
|||
|
```
|
|||
|
|
|||
|
It’s easy to imagine this turning into a REALLY annoying bug since it would be
|
|||
|
intermittent – often `x & y` does turn out to be truthy if `x && y` is truthy.
|
|||
|
|
|||
|
#### what to do about it?
|
|||
|
|
|||
|
For Javascript, ESLint has a [no-bitwise check][21] check), which
|
|||
|
requires you manually flag “no, I actually know what I’m doing, I want to do
|
|||
|
bitwise and” if you use a bitwise and in your code. I’m sure many other linters
|
|||
|
have a similar check.
|
|||
|
|
|||
|
#### that’s all for now!
|
|||
|
|
|||
|
There are definitely more problems with integers than this, but this got pretty
|
|||
|
long again and I’m tired of writing again so I’m going to stop :)
|
|||
|
|
|||
|
--------------------------------------------------------------------------------
|
|||
|
|
|||
|
via: https://jvns.ca/blog/2023/01/18/examples-of-problems-with-integers/
|
|||
|
|
|||
|
作者:[Julia Evans][a]
|
|||
|
选题:[lkxed][b]
|
|||
|
译者:[译者ID](https://github.com/译者ID)
|
|||
|
校对:[校对者ID](https://github.com/校对者ID)
|
|||
|
|
|||
|
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
|
|||
|
|
|||
|
[a]: https://jvns.ca/
|
|||
|
[b]: https://github.com/lkxed/
|
|||
|
[1]: https://jvns.ca/blog/2023/01/13/examples-of-floating-point-problems/
|
|||
|
[2]: https://social.jvns.ca/@b0rk/109700446576896509
|
|||
|
[3]: https://jvns.ca#example-1-the-small-database-primary-key
|
|||
|
[4]: https://jvns.ca#example-2-integer-overflow-underflow
|
|||
|
[5]: https://jvns.ca#aside-how-do-computers-represent-negative-integers
|
|||
|
[6]: https://jvns.ca#example-3-decoding-a-binary-format-in-java
|
|||
|
[7]: https://jvns.ca#example-4-misinterpreting-an-ip-address-or-string-as-an-integer
|
|||
|
[8]: https://jvns.ca#example-5-security-problems-because-of-integer-overflow
|
|||
|
[9]: https://jvns.ca#example-6-the-case-of-the-mystery-byte-order
|
|||
|
[10]: https://jvns.ca#example-7-modulo-of-negative-numbers
|
|||
|
[11]: https://jvns.ca#example-8-compilers-removing-integer-overflow-checks
|
|||
|
[12]: https://jvns.ca#example-9-the-typo
|
|||
|
[13]: https://github.com/golang/go/issues/30613
|
|||
|
[14]: https://en.wikipedia.org/wiki/Signed_number_representations
|
|||
|
[15]: https://go.dev/play/p/iSFxbFAe75M
|
|||
|
[16]: https://rachelbythebay.com/w/2020/11/26/magic/
|
|||
|
[17]: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=integer+overflow
|
|||
|
[18]: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-24795
|
|||
|
[19]: https://torstencurdt.com/tech/posts/modulo-of-negative-numbers/
|
|||
|
[20]: https://raphlinus.github.io/programming/rust/2018/08/17/undefined-behavior.html
|
|||
|
[21]: https://eslint.org/docs/latest/rules/no-bitwise
|