[#]: subject: "Some possible reasons for 8-bit bytes"
[#]: via: "https://jvns.ca/blog/2023/03/06/possible-reasons-8-bit-bytes/"
[#]: author: "Julia Evans https://jvns.ca/"
[#]: collector: "lkxed"
[#]: translator: " "
[#]: reviewer: " "
[#]: publisher: " "
[#]: url: " "

Some possible reasons for 8-bit bytes
======

I’ve been working on a zine about how computers represent things in binary, and
one question I’ve gotten a few times is – why does the x86 architecture use 8-bit bytes? Why not
some other size?

With any question like this, I think there are a few options:

- It’s a historical accident: another size (like 4 or 6 or 16 bits) would work just as well
- 8 bits is objectively the Best Option for some reason: even if history had played out differently, we would still use 8-bit bytes
- some mix of the two

I’m not super into computer history (I like to use computers a lot more than I
like reading about them), but I am always curious if there’s an essential
reason for why a computer thing is the way it is today, or whether it’s mostly
a historical accident. So we’re going to talk about some computer history.

As an example of a historical accident: DNS has a `class` field which has 5
possible values (“internet”, “chaos”, “hesiod”, “none”, and “any”). To me that’s
a clear example of a historical accident – I can’t imagine that we’d define
the class field the same way if we could redesign DNS today without worrying about backwards compatibility. I’m
not sure if we’d use a class field at all!

There aren’t any definitive answers in this post, but I asked [on Mastodon][1] and
here are some potential reasons I found for the 8-bit byte. I think the answer
is some combination of these reasons.

#### what’s the difference between a byte and a word?

First, this post talks about “bytes” and “words” a lot. What’s the difference between a byte and a word? My understanding is:

- The **byte size** is the smallest unit you can address. For example, in a program on my machine `0x20aa87c68` might be the address of one byte, and then `0x20aa87c69` is the address of the next byte.
- The **word size** is some multiple of the byte size. I’ve been confused about
  this for years, and the Wikipedia definition is incredibly vague (“a word is
  the natural unit of data used by a particular processor design”). I
  originally thought that the word size was the same as your register size (64
  bits on x86-64). But according to section 4.1 (“Fundamental Data Types”) of the [Intel architecture manual][2],
  on x86 a word is 16 bits even though the registers are 64 bits. So I’m
  confused – is a word on x86 16 bits or 64 bits? Can it mean both, depending
  on the context? What’s the deal?

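To make the “smallest unit you can address” idea concrete, here is a minimal sketch that asks CPython’s `ctypes` module where each byte of a small buffer lives. The buffer contents and variable names are made up for illustration, and the actual addresses will differ on every run; the only thing the sketch relies on is that neighbouring bytes sit at addresses exactly 1 apart.

```python
import ctypes

# Hypothetical 4-byte buffer, used purely for illustration.
buf = bytearray(b"\x01\x02\x03\x04")

# Ask ctypes for the memory address of each individual byte in the buffer.
addresses = [
    ctypes.addressof(ctypes.c_ubyte.from_buffer(buf, offset))
    for offset in range(len(buf))
]

# Differences between neighbouring addresses: each step is one byte.
gaps = [b - a for a, b in zip(addresses, addresses[1:])]

print([hex(a) for a in addresses])
print(gaps)
```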
Now let’s talk about some possible reasons that we use 8-bit bytes!

#### reason 1: to fit the English alphabet in 1 byte

[This Wikipedia article][3] says that the IBM System/360 introduced the 8-bit byte in 1964.

Here’s a [video interview with Fred Brooks (who managed the project)][4] talking about why. I’ve transcribed some of it here:

> … the six bit bytes [are] really better for scientific computing and the 8-bit byte ones are really better for commercial computing and each one can be made to work for the other.
> So it came down to an executive decision and I decided for the 8-bit byte, Jerry’s proposal.
>
> ...
>
> My most important technical decision in my IBM career was to go with the 8-bit byte for the 360.
> And on the basis of I believe character processing was going to become important as opposed to decimal digits.

It makes sense that an 8-bit byte would be better for text processing: 2^6 is
64, so 6 bits wouldn’t be enough for lowercase letters, uppercase letters, and symbols.

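The counting argument can be checked with a quick sketch. It assumes that “letters and symbols” means roughly the printable ASCII set as exposed by Python’s `string` module (letters, digits, punctuation), which is only a stand-in for whatever character set a 1960s designer had in mind.

```python
import string

# Count the characters we would like to fit into a single byte.
letters = len(string.ascii_lowercase) + len(string.ascii_uppercase)
digits = len(string.digits)
symbols = len(string.punctuation)   # ASCII punctuation, used as a stand-in
needed = letters + digits + symbols

# Compare against how many distinct codes a 6-, 7- or 8-bit byte offers.
for bits in (6, 7, 8):
    capacity = 2 ** bits
    verdict = "enough" if capacity >= needed else "not enough"
    print(f"{bits} bits -> {capacity} codes: {verdict} for {needed} characters")
```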
To go with the 8-bit byte, System/360 also introduced the [EBCDIC][5] encoding, which is an 8-bit character encoding.

It looks like the next important machine in 8-bit-byte history was the
[Intel 8008][6], which was built to be
used in a computer terminal (the Datapoint 2200). Terminals need to be able to
represent letters as well as terminal control codes, so it makes sense for them
to use an 8-bit byte.
[This Datapoint 2200 manual from the Computer History Museum][7]
says on page 7 that the Datapoint 2200 supported ASCII (7 bit) and EBCDIC (8 bit).

#### why was the 6-bit byte better for scientific computing?

I was curious about this comment that the 6-bit byte would be better for scientific computing. Here’s a quote from [this interview with Gene Amdahl][8]:

> I wanted to make it 24 and 48 instead of 32 and 64, on the basis that this
> would have given me a more rational floating point system, because in floating
> point, with the 32-bit word, you had to keep the exponent to just 8 bits for
> exponent sign, and to make that reasonable in terms of numeric range it could
> span, you had to adjust by 4 bits instead of by a single bit. And so it caused
> you to lose some of the information more rapidly than you would with binary
> shifting

I don’t understand this comment at all – why does the exponent have to be 8 bits
if you use a 32-bit word size? Why couldn’t you use 9 bits or 10 bits if you
wanted? But it’s all I could find in a quick search.

#### why did mainframes use 36 bits?

Also related to the 6-bit byte: a lot of mainframes used a 36-bit word size. Why? Someone pointed out
that there’s a great explanation in the Wikipedia article on [36-bit computing][9]:

> Prior to the introduction of computers, the state of the art in precision
> scientific and engineering calculation was the ten-digit, electrically powered,
> mechanical calculator… These calculators had a column of keys for each digit,
> and operators were trained to use all their fingers when entering numbers, so
> while some specialized calculators had more columns, ten was a practical limit.
>
> Early binary computers aimed at the same market therefore often used a 36-bit
> word length. This was long enough to represent positive and negative integers
> to an accuracy of ten decimal digits (35 bits would have been the minimum)

So this 36-bit thing seems to be based on the fact that log_2(20000000000) is 34.2. Huh.

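The arithmetic behind that number can be reproduced in a few lines. This is only a sketch of the counting argument: it assumes a sign-and-magnitude layout (one sign bit plus enough bits for the largest ten-digit number), which is one plausible reading of the “35 bits” in the Wikipedia quote.

```python
import math

max_magnitude = 10**10 - 1   # largest ten-digit decimal number
values = 2 * 10**10          # roughly: every ten-digit number, positive or negative

# log2 of the number of values to represent, as quoted in the text.
print(round(math.log2(values), 1))

# Assumed sign-and-magnitude layout: bits for the magnitude plus one sign bit.
magnitude_bits = max_magnitude.bit_length()
total_bits = magnitude_bits + 1
print(magnitude_bits, total_bits)
```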
My guess is that the reason for this is that in the 50s, computers were
extremely expensive. So if you wanted your computer to support ten decimal
digits, you’d design it so that it had exactly enough bits to do that, and no
more.

Today computers are way faster and cheaper, so if you want to represent ten
decimal digits for some reason you can just use 64 bits – wasting a little bit
of space is usually no big deal.

Someone else mentioned that some of these machines with 36-bit word sizes let
you choose a byte size – you could use 5 or 6 or 7 or 8-bit bytes, depending
on the context.

#### reason 2: to work well with binary-coded decimal

In the 60s, there was a popular integer encoding called binary-coded decimal (or [BCD][10] for short) that
encoded every decimal digit in 4 bits.

For example, if you wanted to encode the number 1234, in BCD that would be something like:

```
0001 0010 0011 0100
```

So if you want to be able to easily work with binary-coded decimal, your byte
size should be a multiple of 4 bits, like 8 bits!

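The encoding described above can be sketched in a few lines. The function names `to_bcd` and `from_bcd` are hypothetical helpers written for this illustration, and the sketch only covers non-negative integers with one digit per 4-bit group; real BCD formats also deal with signs and packing, which are left out here.

```python
def to_bcd(n: int) -> str:
    """Encode a non-negative integer as space-separated 4-bit decimal digits."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    return " ".join(format(int(digit), "04b") for digit in str(n))


def from_bcd(groups: str) -> int:
    """Decode space-separated 4-bit groups back into an integer."""
    return int("".join(str(int(group, 2)) for group in groups.split()))


print(to_bcd(1234))        # 0001 0010 0011 0100
print(format(1234, "b"))   # plain binary for comparison: 10011010010
```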
#### why was BCD popular?

This integer representation seemed really weird to me – why not just use
binary, which is a much more efficient way to store integers? Efficiency was really important in early computers!

My best guess about why is that early computers didn’t have displays the same way we do
now, so the contents of a byte were mapped directly to on/off lights.

Here’s a [picture from Wikipedia of an IBM 650 with some lights on its display][11] ([CC BY-SA 3.0][12]):

![][13]

So if you want people to be able to read off a decimal number relatively easily
from its binary representation, this makes a lot more sense. I think today BCD
is obsolete because we have displays and our computers can convert numbers
represented in binary to decimal for us and display them.

Also, I wonder if BCD is where the term “nibble” for 4 bits comes from – in
the context of BCD, you end up referring to half bytes a lot (because every
digit is 4 bits). So it makes sense to have a word for “4 bits”, and people
called 4 bits a nibble. Today “nibble” feels to me like an archaic term though –
I’ve definitely never used it except as a fun fact (it’s such a fun word!). The Wikipedia article on [nibbles][14] supports this theory:

> The nibble is used to describe the amount of memory used to store a digit of
> a number stored in packed decimal format (BCD) within an IBM mainframe.

Another reason someone mentioned for BCD was **financial calculations**. Today
if you want to store a dollar amount, you’ll typically just use an integer
amount of cents, and then divide by 100 if you want the dollar part. This is no
big deal, division is fast. But apparently in the 70s dividing an integer
represented in binary by 100 was very slow, so it was worth it to redesign how
you represent your integers to avoid having to divide by 100.

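Here is a rough sketch of the difference between the two representations. The amount is made up, and the list of digits is only a stand-in for data that is already stored digit by digit (as it would be in BCD); Python’s own `str()` conversion is used to build it, which is not how a 1970s machine would have done it.

```python
cents = 123456   # hypothetical amount: $1234.56 stored as an integer number of cents

# Binary integer: getting the dollar part requires a division by 100.
dollars, remainder = divmod(cents, 100)
print(dollars, remainder)

# Digit-by-digit storage (stand-in for BCD): the dollar part is simply
# "all but the last two digits", so no division is involved.
digits = [int(d) for d in str(cents)]
bcd_dollars, bcd_cents = digits[:-2], digits[-2:]
print(bcd_dollars, bcd_cents)
```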
Okay, enough about BCD.

#### reason 3: 8 is a power of 2?

A bunch of people said it’s important for a CPU’s byte size to be a power of 2.
I can’t figure out whether this is true or not though, and I wasn’t satisfied with the explanation that “computers use binary so powers of 2 are good”. That seems very plausible but I wanted to dig deeper.
And historically there have definitely been lots of machines that used byte sizes that weren’t powers of 2, for example (from [this retro computing stack exchange thread][15]):

- Cyber 180 mainframes used 6-bit bytes
- the Univac 1100 / 2200 series used a 36-bit word size
- the PDP-8 was a 12-bit machine

Some reasons I heard for why powers of 2 are good that I haven’t understood yet:

- every bit in a word needs a bus, and you want the number of buses to be a power of 2 (why?)
- a lot of circuit logic is susceptible to divide-and-conquer techniques (I think I need an example to understand this)

Reasons that made more sense to me:

- it makes it easier to design **clock dividers** that can measure “8 bits were
  sent on this wire” that work based on halving – you can put 3 halving clock
  dividers in series. [Graham Sutherland][16] told me about this and made this really cool
  [simulator of clock dividers][17] showing what these clock dividers look like. That site (Falstad) also has a bunch of other example circuits and it seems like a really cool way to make circuit simulators.
- if you have an instruction that zeroes out a specific bit in a byte, then if
  your byte size is 8 (2^3), you can use just 3 bits of your instruction to
  indicate which bit. x86 doesn’t seem to do this, but the [Z80’s bit testing instructions][18] do.
- someone mentioned that some processors use [carry-lookahead adders][19], and they work
  in groups of 4 bits. From some quick Googling it seems like there are a wide
  variety of adder circuits out there though.
- **bitmaps**: Your computer’s memory is organized into pages (usually of size 2^n). It
  needs to keep track of whether every page is free or not. Operating systems
  use a bitmap to do this, where each bit corresponds to a page and is 0 or 1
  depending on whether the page is free. If you had a 9-bit byte, you would
  need to divide by 9 to find the page you’re looking for in the bitmap.
  Dividing by 9 is slower than dividing by 8, because dividing by a power of 2
  can be done with a simple bit shift.

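The bitmap argument can be sketched like this. The function `locate_bit` and the convention that 1 means “in use” are assumptions made for this illustration, not how any particular operating system does it. With 8-bit bytes, finding a page’s bit takes a shift and a mask, and the 3-bit mask is the same idea as the 3-bit “which bit” field mentioned above; a hypothetical 9-bit byte needs a real division.

```python
def locate_bit(page: int, bits_per_byte: int = 8) -> tuple[int, int]:
    """Return (byte index, bit index) of a page's flag in a bitmap."""
    if bits_per_byte == 8:
        # Power of 2: a shift and a 3-bit mask, no division needed.
        return page >> 3, page & 0b111
    # Any other size (e.g. a hypothetical 9-bit byte) needs a real division.
    return divmod(page, bits_per_byte)


bitmap = bytearray(4)                   # tracks 32 pages, all free (0)
byte_index, bit_index = locate_bit(21)
bitmap[byte_index] |= 1 << bit_index    # assumed convention: 1 means "in use"

print(byte_index, bit_index)            # 2 5
print(locate_bit(21, bits_per_byte=9))  # (2, 3)
```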
I probably mangled some of those explanations pretty badly: I’m pretty far out
of my comfort zone here. Let’s move on.

#### reason 4: small byte sizes are good

You might be wondering – well, if 8-bit bytes were better than 4-bit bytes,
why not keep increasing the byte size? We could have 16-bit bytes!

A couple of reasons to keep byte sizes small:

- It’s a waste of space – a byte is the minimum unit you can address, and if
  your computer is storing a lot of ASCII text (which only needs 7 bits), it
  would be a pretty big waste to dedicate 12 or 16 bits to each character when
  you could use 8 bits instead.
- As bytes get bigger, your CPU needs to get more complex. For example you need one bus line per bit. So I guess simpler is better.

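A quick way to see the space cost is to store the same ASCII text with one 8-bit unit per character and with one 16-bit unit per character. In this sketch UTF-16 is only a stand-in for a hypothetical machine with 16-bit bytes, and the sample text is arbitrary.

```python
text = "Some possible reasons for 8-bit bytes"

eight_bit = text.encode("ascii")          # one 8-bit unit per character
sixteen_bit = text.encode("utf-16-le")    # one 16-bit unit per character for this text

print(len(eight_bit) * 8, "bits with 8-bit units")
print(len(sixteen_bit) * 8, "bits with 16-bit units")
```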
My understanding of CPU architecture is extremely shaky so I’ll leave it at
that. The “it’s a waste of space” reason feels pretty compelling to me though.

#### reason 5: compatibility

The Intel 8008 (from 1972) was the precursor to the 8080 (from 1974), which was the precursor to the
8086 (from 1978) – the first x86 processor. It seems like the 8080 and the
8086 were really popular and that’s where we get our modern x86 computers.

I think there’s an “if it ain’t broke don’t fix it” thing going on here – I
assume that 8-bit bytes were working well, so Intel saw no need to change the
design. If you keep the same 8-bit byte, then you can reuse more of your
instruction set.

Also around the 80s we start getting network protocols like TCP
which use 8-bit bytes (usually called “octets”), and if you’re going to be
implementing network protocols, you probably want to be using an 8-bit byte.

#### that’s all!

It seems to me like the main reasons for the 8-bit byte are:

- a lot of early computer companies were American, and the most commonly used language in the US is English
- those people wanted computers to be good at text processing
- smaller byte sizes are in general better
- 7 bits is the smallest size you can fit all English characters + punctuation in
- 8 is a better number than 7 (because it’s a power of 2)
- once you have popular 8-bit computers that are working well, you want to keep the same design for compatibility

Someone pointed out that [page 65 of this book from 1962][20]
talking about IBM’s reasons to choose an 8-bit byte basically says the same thing:

- Its full capacity of 256 characters was considered to be sufficient for the great majority of applications.
- Within the limits of this capacity, a single character is represented by a
  single byte, so that the length of any particular record is not dependent on
  the coincidence of characters in that record.
- 8-bit bytes are reasonably economical of storage space
- For purely numerical work, a decimal digit can be represented by only 4
  bits, and two such 4-bit bytes can be packed in an 8-bit byte. Although such
  packing of numerical data is not essential, it is a common practice in
  order to increase speed and storage efficiency. Strictly speaking, 4-bit
  bytes belong to a different code, but the simplicity of the 4-and-8-bit
  scheme, as compared with a combination 4-and-6-bit scheme, for example,
  leads to simpler machine design and cleaner addressing logic.
- Byte sizes of 4 and 8 bits, being powers of 2, permit the computer designer
  to take advantage of powerful features of binary addressing and indexing to
  the bit level (see Chaps. 4 and 5).

Overall this makes me feel like an 8-bit byte is a pretty natural choice if
you’re designing a binary computer in an English-speaking country.

--------------------------------------------------------------------------------

via: https://jvns.ca/blog/2023/03/06/possible-reasons-8-bit-bytes/

Author: [Julia Evans][a]
Topic selection: [lkxed][b]
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)

[a]: https://jvns.ca/
[b]: https://github.com/lkxed/
[1]: https://social.jvns.ca/@b0rk/109976810279702728
[2]: https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
[3]: https://en.wikipedia.org/wiki/IBM_System/360
[4]: https://www.youtube.com/watch?v=9oOCrAePJMs&t=140s
[5]: https://en.wikipedia.org/wiki/EBCDIC
[6]: https://en.wikipedia.org/wiki/Intel_8008
[7]: https://archive.computerhistory.org/resources/text/2009/102683240.05.02.acc.pdf
[8]: https://archive.computerhistory.org/resources/access/text/2013/05/102702492-05-01-acc.pdf
[9]: https://en.wikipedia.org/wiki/36-bit_computing
[10]: https://en.wikipedia.org/wiki/Binary-coded_decimal
[11]: https://commons.wikimedia.org/wiki/File:IBM-650-panel.jpg
[12]: http://creativecommons.org/licenses/by-sa/3.0/
[13]: https://upload.wikimedia.org/wikipedia/commons/a/ad/IBM-650-panel.jpg
[14]: https://en.wikipedia.org/wiki/Nibble
[15]: https://retrocomputing.stackexchange.com/questions/7937/last-computer-not-to-use-octets-8-bit-bytes
[16]: https://poly.nomial.co.uk/
[17]: https://www.falstad.com/circuit/circuitjs.html?ctz=CQAgjCAMB0l3BWcMBMcUHYMGZIA4UA2ATmIxAUgpABZsKBTAWjDACgwEknsUQ08tQQKgU2AdxA8+I6eAyEoEqb3mK8VMAqWSNakHsx9Iywxj6Ea-c0oBKUy-xpUWYGc-D9kcftCQo-URgEZRQERSMnKkiTSTDFLQjw62NlMBorRP5krNjwDP58fMztE04kdKsRFBQqoqoQyUcRVhl6tLdCwVaonXBO2s0Cwb6UPGEPXmiPPLHhIrne2Y9q8a6lcpAp9edo+r7tkW3c5WPtOj4TyQv9G5jlO5saMAibPOeIoppm9oAPEEU2C0-EBaFoThAAHoUGx-mA8FYgfNESgIFUrNDYVtCBBttg8LiUPR0VCYWhyD0Wp0slYACIASQAamTIORFqtuucQAzGTQ2OTaD9BN8Soo6Uy8PzWQ46oImI4aSB6QA5ZTy9EuVQjPLq3q6kQmAD21Beome0qQMHgkDIhHCYVEfCQ9BVbGNRHAiio5vIltg8Ft9stXg99B5MPdFK9tDAFqg-rggcIDui1i23KZfPd3WjPuoVoDCiDjv4gjDErYQA
[18]: http://www.chebucto.ns.ca/~af380/z-80-h.htm
[19]: https://en.wikipedia.org/wiki/Carry-lookahead_adder
[20]: https://web.archive.org/web/20170403014651/http://archive.computerhistory.org/resources/text/IBM/Stretch/pdfs/Buchholz_102636426.pdf