diff --git a/sources/tech/20230118.0 ⭐️⭐️⭐️ Examples of problems with integers.md b/sources/tech/20230118.0 ⭐️⭐️⭐️ Examples of problems with integers.md index 6751db4114..de9fb18280 100644 --- a/sources/tech/20230118.0 ⭐️⭐️⭐️ Examples of problems with integers.md +++ b/sources/tech/20230118.0 ⭐️⭐️⭐️ Examples of problems with integers.md @@ -12,18 +12,22 @@ Examples of problems with integers Hello! A few days back we talked about [problems with floating point numbers][1]. -This got me thinking – but what about integers? Of course integers have all -kinds of problems too – anytime you represent a number in a small fixed amount of -space (like 8/16/32/64 bits), you’re going to run into problems. +This got me thinking – but what about integers? Of course integers have all kinds of problems too – anytime you represent a number in a small fixed amount of space (like 8/16/32/64 bits), you’re going to run into problems. So I [asked on Mastodon again][2] for examples of integer problems and got all kinds of great responses again. Here’s a table of contents. 
-[example 1: the small database primary key][3][example 2: integer overflow/underflow][4][aside: how do computers represent negative integers?][5][example 3: decoding a binary format in Java][6][example 4: misinterpreting an IP address or string as an integer][7][example 5: security problems because of integer overflow][8][example 6: the case of the mystery byte order][9][example 7: modulo of negative numbers][10][example 8: compilers removing integer overflow checks][11][example 9: the && typo][12] +- [example 1: the small database primary key][3] +- [example 2: integer overflow/underflow][4] +- [aside: how do computers represent negative integers?][5] +- [example 3: decoding a binary format in Java][6] +- [example 4: misinterpreting an IP address or string as an integer][7] +- [example 5: security problems because of integer overflow][8] +- [example 6: the case of the mystery byte order][9] +- [example 7: modulo of negative numbers][10] +- [example 8: compilers removing integer overflow checks][11] +- [example 9: the && typo][12] -Like last time, I’ve written some example programs to demonstrate these -problems. I’ve tried to use a variety of languages in the examples (Go, -Javascript, Java, and C) to show that these problems don’t just show up in -super low level C programs – integers are everywhere! +Like last time, I’ve written some example programs to demonstrate these problems. I’ve tried to use a variety of languages in the examples (Go, Javascript, Java, and C) to show that these problems don’t just show up in super low level C programs – integers are everywhere! Also I’ve probably made some mistakes in here, I learned several things while writing this. @@ -36,9 +40,7 @@ One of the most classic (and most painful!) integer problems is: - oh no! 
- You need to do a database migration to switch your primary key to be a 64-bit integer instead -If the primary key actually reaches its maximum value I’m not sure exactly what -happens, I’d imagine you wouldn’t be able to create any new database rows and -it would be a very bad day for your massively successful service. +If the primary key actually reaches its maximum value I’m not sure exactly what happens, I’d imagine you wouldn’t be able to create any new database rows and it would be a very bad day for your massively successful service. #### example 2: integer overflow/underflow @@ -87,20 +89,15 @@ Some brief notes about other languages: - In C, you can compile with `clang -fsanitize=unsigned-integer-overflow`. Then if your code has an overflow/underflow like this, the program will crash. - Similarly in Rust, if you compile your program in debug mode it’ll crash if there’s an integer overflow. But in release mode it won’t crash, it’ll just happily decide that 0 - 1 = 4294967295. -The reason Rust doesn’t check for overflows if you compile your program in -release mode (and the reason C and Go don’t check) is that – these checks are -expensive! Integer arithmetic is a very big part of many computations, and -making sure that every single addition isn’t overflowing makes it slower. +The reason Rust doesn’t check for overflows if you compile your program in release mode (and the reason C and Go don’t check) is that – these checks are expensive! Integer arithmetic is a very big part of many computations, and making sure that every single addition isn’t overflowing makes it slower. #### aside: how do computers represent negative integers? -I mentioned in the last section that `0xFFFFFFFF` can mean either `-1` or -`4294967295`. You might be thinking – what??? Why would `0xFFFFFFFF` mean `-1`? +I mentioned in the last section that `0xFFFFFFFF` can mean either `-1` or `4294967295`. You might be thinking – what??? Why would `0xFFFFFFFF` mean `-1`? 
So let’s talk about how computers represent negative integers for a second. -I’m going to simplify and talk about 8-bit integers instead of 32-bit integers, -because there are less of them and it works basically the same way. +I’m going to simplify and talk about 8-bit integers instead of 32-bit integers, because there are less of them and it works basically the same way. You can represent 256 different numbers with an 8-bit integer: 0 to 255 @@ -112,9 +109,7 @@ You can represent 256 different numbers with an 8-bit integer: 0 to 255 11111111 -> 255 ``` -But what if you want to represent _negative_ integers? We still only have 8 -bits! So we need to reassign some of these and treat them as negative numbers -instead. +But what if you want to represent _negative_ integers? We still only have 8 bits! So we need to reassign some of these and treat them as negative numbers instead. Here’s the way most modern computers do it: @@ -147,9 +142,7 @@ That’s how we end up with `0xFFFFFFFF` meaning -1. #### there are multiple ways to represent negative integers -The way we just talked about of representing negative integers (“it’s the equivalent positive integer, but you subtract 2^n”) is called -**two’s complement**, and it’s the most common on modern computers. There are several other ways -though, the [wikipedia article has a list][14]. +The way we just talked about of representing negative integers (“it’s the equivalent positive integer, but you subtract 2^n”) is called **two’s complement**, and it’s the most common on modern computers. There are several other ways though, the [wikipedia article has a list][14]. #### weird thing: the absolute value of -128 is negative @@ -182,16 +175,13 @@ This prints out: -128 ``` -This is because the signed 8-bit integers go from -128 to 127 – there **is** no +128! -Some programs might crash when you try to do this (it’s an overflow), but Go -doesn’t. +This is because the signed 8-bit integers go from -128 to 127 – there **is** no +128! 
Some programs might crash when you try to do this (it’s an overflow), but Go doesn’t. Now that we’ve talked about signed integers a bunch, let’s dig into another example of how they can cause problems. #### example 3: decoding a binary format in Java -Let’s say you’re parsing a binary format in Java, and you want to get the first -4 bits of the byte `0x90`. The correct answer is 9. +Let’s say you’re parsing a binary format in Java, and you want to get the first 4 bits of the byte `0x90`. The correct answer is 9. ``` public class Main { @@ -222,9 +212,7 @@ Let’s break down what those two facts mean for our little calculation `b >> 4` #### what can you do about it? -I don’t the actual idiomatic way to do this in Java is, but the way I’d naively -approach fixing this is to put in a bit mask before doing the right shift. So -instead of: +I don’t know what the actual idiomatic way to do this in Java is, but the way I’d naively approach fixing this is to put in a bit mask before doing the right shift. So instead of: ``` b >> 4 @@ -238,20 +226,15 @@ we’d write `b & 0xFF` seems redundant (`b` is already a byte!), but it’s actually not because `b` is being promoted to an integer. -Now instead of `0x90 -> 0xFFFFFF90 -> 0xFFFFFFF9`, we end up calculating `0x90 -> 0xFFFFFF90 -> 0x00000090 -> 0x00000009`, which is the result we wanted: 9. +Now instead of `0x90 -> 0xFFFFFF90 -> 0xFFFFFFF9`, we end up calculating `0x90 -> 0xFFFFFF90 -> 0x00000090 -> 0x00000009`, which is the result we wanted: 9. And when we actually try it, it prints out “9”. -Also, if we were using a language with unsigned integers, the natural way to -deal with this would be to treat the value as an unsigned integer in the first -place. But that’s not possible in Java. +Also, if we were using a language with unsigned integers, the natural way to deal with this would be to treat the value as an unsigned integer in the first place. But that’s not possible in Java. 
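Here’s a small sketch that puts the naive shift and the masked shift side by side, to make the sign extension concrete. The class name `NibbleDemo` and the method names are made up for this illustration. `Byte.toUnsignedInt` (in the standard library since Java 8) is shown as one more way to spell the same mask; I’m not claiming it’s the idiomatic choice.

```java
public class NibbleDemo {
    // Naive version: b is promoted to a signed int first, so 0x90 becomes
    // 0xFFFFFF90, and the arithmetic shift keeps the sign bits.
    static int highNibbleNaive(byte b) {
        return b >> 4;
    }

    // Masked version: & 0xFF clears the sign-extended upper 24 bits
    // before the shift, leaving 0x00000090.
    static int highNibbleMasked(byte b) {
        return (b & 0xFF) >> 4;
    }

    // Same idea using the standard library helper instead of a manual mask.
    static int highNibbleUnsigned(byte b) {
        return Byte.toUnsignedInt(b) >> 4;
    }

    public static void main(String[] args) {
        byte b = (byte) 0x90;
        System.out.println(highNibbleNaive(b));    // -7 (0xFFFFFFF9)
        System.out.println(highNibbleMasked(b));   // 9
        System.out.println(highNibbleUnsigned(b)); // 9
    }
}
```

The only difference between the first two methods is the `& 0xFF`, which is why this bug is so easy to miss when reading code.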
#### example 4: misinterpreting an IP address or string as an integer -I don’t know if this is technically a “problem with integers” but it’s funny -so I’ll mention it: [Rachel by the bay][16] has a bunch of great -examples of things that are not integers being interpreted as integers. For -example, “HTTP” is `0x48545450` and `2130706433` is `127.0.0.1`. +I don’t know if this is technically a “problem with integers” but it’s funny so I’ll mention it: [Rachel by the bay][16] has a bunch of great examples of things that are not integers being interpreted as integers. For example, “HTTP” is `0x48545450` and `2130706433` is `127.0.0.1`. She points out that you can actually ping any integer, and it’ll convert that integer into an IP address, for example: @@ -266,8 +249,7 @@ PING 132848123841239999988888888888234234234234234234 (251.164.101.122): 56 data #### example 5: security problems because of integer overflow -Another integer overflow example: here’s a [search for CVEs involving integer overflows][17]. -There are a lot! I’m not a security person, but here’s one random example: this [json parsing library bug][18] +Another integer overflow example: here’s a [search for CVEs involving integer overflows][17]. There are a lot! I’m not a security person, but here’s one random example: this [json parsing library bug][18] My understanding of that json parsing bug is roughly: @@ -276,40 +258,25 @@ My understanding of that json parsing bug is roughly: - but the JSON file is still 3GB, so it gets copied into the tiny buffer with almost 0 bytes of memory - this overwrites all kinds of other memory that it’s not supposed to -The CVE says “This vulnerability mostly impacts process availability”, which I -think means “the program crashes”, but sometimes this kind of thing is much -worse and can result in arbitrary code execution. 
+The CVE says “This vulnerability mostly impacts process availability”, which I think means “the program crashes”, but sometimes this kind of thing is much worse and can result in arbitrary code execution. -My impression is that there are a large variety of different flavours of -security vulnerabilities caused by integer overflows. +My impression is that there are a large variety of different flavours of security vulnerabilities caused by integer overflows. #### example 6: the case of the mystery byte order -One person said that they’re do scientific computing and sometimes they need to -read files which contain data with an unknown byte order. +One person said that they do scientific computing and sometimes they need to read files which contain data with an unknown byte order. -Let’s invent a small example of this: say you’re reading a file which contains 4 -bytes - `00`, `00`, `12`, and `81` (in that order), that you happen to know -represent a 4-byte integer. There are 2 ways to interpret that integer: +Let’s invent a small example of this: say you’re reading a file which contains 4 bytes - `00`, `00`, `12`, and `81` (in that order), that you happen to know represent a 4-byte integer. There are 2 ways to interpret that integer: - `0x00001281` (which translates to 4737). This order is called “big endian” - `0x81120000` (which translates to 2165440512). This order is called “little endian”. -Which one is it? Well, maybe the file contains some metadata that specifies the -endianness. Or maybe you happen to know what machine it was generated on and -what byte order that machine uses. Or maybe you just read a bunch of values, -try both orders, and figure out which makes more sense. Maybe 2165440512 is too -big to make sense in the context of whatever your data is supposed to mean, or -maybe `4737` is too small. +Which one is it? Well, maybe the file contains some metadata that specifies the endianness. 
Or maybe you happen to know what machine it was generated on and what byte order that machine uses. Or maybe you just read a bunch of values, try both orders, and figure out which makes more sense. Maybe 2165440512 is too big to make sense in the context of whatever your data is supposed to mean, or maybe `4737` is too small. A couple more notes on this: -- this isn’t just a problem with integers, floating point numbers have byte -order too -- this also comes up when reading data from a network, but in that case the -byte order isn’t a “mystery”, it’s just going to be big endian. But x86 -machines (and many others) are little endian, so you have to swap the byte -order of all your numbers. +- this isn’t just a problem with integers, floating point numbers have byte order too +- this also comes up when reading data from a network, but in that case the byte order isn’t a “mystery”, it’s just going to be big endian. But x86 machines (and many others) are little endian, so you have to swap the byte order of all your numbers. #### example 7: modulo of negative numbers @@ -317,17 +284,13 @@ This is more of a design decision about how different programming languages desi Let’s say you write `-13 % 3` in your program, or `13 % -3`. What’s the result? -It turns out that different programming languages do it differently, for -example in Python `-13 % 3 = 2` but in Javascript `-13 % 3 = -1`. +It turns out that different programming languages do it differently, for example in Python `-13 % 3 = 2` but in Javascript `-13 % 3 = -1`. -There’s a table in [this blog post][19] that -describes a bunch of different programming languages’ choices. +There’s a table in [this blog post][19] that describes a bunch of different programming languages’ choices. #### example 8: compilers removing integer overflow checks -We’ve been hearing a lot about integer overflow and why it’s bad. 
So let’s -imagine you try to be safe and include some checks in your programs – after -each addition, you make sure that the calculation didn’t overflow. Like this: +We’ve been hearing a lot about integer overflow and why it’s bad. So let’s imagine you try to be safe and include some checks in your programs – after each addition, you make sure that the calculation didn’t overflow. Like this: ``` #include @@ -356,39 +319,26 @@ $ gcc -O3 check_overflow.c -o check_overflow && ./check_overflow 0 ``` -That’s weird – when we compile with `gcc`, we get the answer we expected, but -with `gcc -O3`, we get a different answer. Why? +That’s weird – when we compile with `gcc`, we get the answer we expected, but with `gcc -O3`, we get a different answer. Why? #### what’s going on? My understanding (which might be wrong) is: -- Signed integer overflow in C is **undefined behavior**. I think that’s -because different C implementations might be using different representations -of signed integers (maybe they’re using one’s complement instead of two’s -complement or something) +- Signed integer overflow in C is **undefined behavior**. I think that’s because different C implementations might be using different representations of signed integers (maybe they’re using one’s complement instead of two’s complement or something) - “undefined behaviour” in C means “the compiler is free to do literally whatever it wants after that point” (see this post [With undefined behaviour, anything is possible][20] by Raph Levien for a lot more) -- Some compiler optimizations assume that undefined behaviour will never -happen. They’re free to do this, because – if that undefined behaviour -_did_ happen, then they’re allowed to do whatever they want, so “run the -code that I optimized assuming that this would never happen” is fine. -- So this `if (n + 100 < 0)` check is irrelevant – if that did -happen, it would be undefined behaviour, so there’s no need to execute the -contents of that if statement. 
+- Some compiler optimizations assume that undefined behaviour will never happen. They’re free to do this, because – if that undefined behaviour _did_ happen, then they’re allowed to do whatever they want, so “run the code that I optimized assuming that this would never happen” is fine. +- So this `if (n + 100 < 0)` check is irrelevant – if that did happen, it would be undefined behaviour, so there’s no need to execute the contents of that if statement. So, that’s weird. I’m not going to write a “what can you do about it?” section here because I’m pretty out of my depth already. I certainly would not have expected that though. -My impression is that “undefined behaviour” is really a C/C++ concept, and -doesn’t exist in other languages in the same way except in the case of “your -program called some C code in an incorrect way and that C code did something -weird because of undefined behaviour”. Which of course happens all the time. +My impression is that “undefined behaviour” is really a C/C++ concept, and doesn’t exist in other languages in the same way except in the case of “your program called some C code in an incorrect way and that C code did something weird because of undefined behaviour”. Which of course happens all the time. #### example 9: the && typo -This one was mentioned as a very upsetting bug. Let’s say you have two integers -and you want to check that they’re both nonzero. +This one was mentioned as a very upsetting bug. Let’s say you have two integers and you want to check that they’re both nonzero. In Javascript, you might write: @@ -406,9 +356,7 @@ if a & b { } ``` -This is still perfectly valid code, but it means something completely different -– it’s a bitwise and instead of a boolean and. Let’s go into a Javascript -console and look at bitwise vs boolean and for `9` and `4`: +This is still perfectly valid code, but it means something completely different – it’s a bitwise and instead of a boolean and. 
Let’s go into a Javascript console and look at bitwise vs boolean and for `9` and `4`: ``` > 9 && 4 @@ -421,20 +369,15 @@ console and look at bitwise vs boolean and for `9` and `4`: 4 ``` -It’s easy to imagine this turning into a REALLY annoying bug since it would be -intermittent – often `x & y` does turn out to be truthy if `x && y` is truthy. +It’s easy to imagine this turning into a REALLY annoying bug since it would be intermittent – often `x & y` does turn out to be truthy if `x && y` is truthy. #### what to do about it? -For Javascript, ESLint has a [no-bitwise check][21] check), which -requires you manually flag “no, I actually know what I’m doing, I want to do -bitwise and” if you use a bitwise and in your code. I’m sure many other linters -have a similar check. +For Javascript, ESLint has a [no-bitwise check][21], which requires you manually flag “no, I actually know what I’m doing, I want to do bitwise and” if you use a bitwise and in your code. I’m sure many other linters have a similar check. #### that’s all for now! -There are definitely more problems with integers than this, but this got pretty -long again and I’m tired of writing again so I’m going to stop :) +There are definitely more problems with integers than this, but this got pretty long again and I’m tired of writing again so I’m going to stop :) --------------------------------------------------------------------------------