diff --git a/sources/tech/20161106 Myths about -dev-urandom.md b/sources/tech/20161106 Myths about -dev-urandom.md deleted file mode 100644 index f88a439e31..0000000000 --- a/sources/tech/20161106 Myths about -dev-urandom.md +++ /dev/null @@ -1,290 +0,0 @@ -Moelf translating -Myths about /dev/urandom -====== - -There are a few things about /dev/urandom and /dev/random that are repeated again and again. Still they are false. - -I'm mostly talking about reasonably recent Linux systems, not other UNIX-like systems. - -### /dev/urandom is insecure. Always use /dev/random for cryptographic purposes. - -Fact: /dev/urandom is the preferred source of cryptographic randomness on UNIX-like systems. - -### /dev/urandom is a pseudo random number generator, a PRNG, while /dev/random is a “true” random number generator. - -Fact: Both /dev/urandom and /dev/random are using the exact same CSPRNG (a cryptographically secure pseudorandom number generator). They only differ in very few ways that have nothing to do with “true” randomness. - -### /dev/random is unambiguously the better choice for cryptography. Even if /dev/urandom were comparably secure, there's no reason to choose the latter. - -Fact: /dev/random has a very nasty problem: it blocks. - -### But that's good! /dev/random gives out exactly as much randomness as it has entropy in its pool. /dev/urandom will give you insecure random numbers, even though it has long run out of entropy. - -Fact: No. Even disregarding issues like availability and subsequent manipulation by users, the issue of entropy “running low” is a straw man. About 256 bits of entropy are enough to get computationally secure numbers for a long, long time. - -And the fun only starts here: how does /dev/random know how much entropy there is available to give out? Stay tuned! - -### But cryptographers always talk about constant re-seeding. Doesn't that contradict your last point? - -Fact: You got me! Kind of. It is true, the random number generator is constantly re-seeded using whatever entropy the system can lay its hands on. But that has (partly) other reasons. - -Look, I don't claim that injecting entropy is bad. It's good. I just claim that it's bad to block when the entropy estimate is low. - -### That's all good and nice, but even the man page for /dev/(u)random contradicts you! Does anyone who knows about this stuff actually agree with you? - -Fact: No, it really doesn't. It seems to imply that /dev/urandom is insecure for cryptographic use, unless you really understand all that cryptographic jargon. - -The man page does recommend the use of /dev/random in some cases (it doesn't hurt, in my opinion, but is not strictly necessary), but it also recommends /dev/urandom as the device to use for “normal” cryptographic use. - -And while appeal to authority is usually nothing to be proud of, in cryptographic issues you're generally right to be careful and try to get the opinion of a domain expert. - -And yes, quite a few experts share my view that /dev/urandom is the go-to solution for your random number needs in a cryptography context on UNIX-like systems. Obviously, their opinions influenced mine, not the other way around. - -Hard to believe, right? I must certainly be wrong! Well, read on and let me try to convince you. - -I tried to keep it out, but I fear there are two preliminaries to be taken care of, before we can really tackle all those points. - -Namely, what is randomness, or better: what kind of randomness am I talking about here? - -And, even more important, I'm really not being condescending. I have written this document to have a thing to point to, when this discussion comes up again. More than 140 characters. Without repeating myself again and again. Being able to hone the writing and the arguments itself, benefitting many discussions in many venues. - -And I'm certainly willing to hear differing opinions. I'm just saying that it won't be enough to state that /dev/urandom is bad. You need to identify the points you're disagreeing with and engage them. - -### You're saying I'm stupid! - -Emphatically no! - -Actually, I used to believe that /dev/urandom was insecure myself, a few years ago. And it's something you and me almost had to believe, because all those highly respected people on Usenet, in web forums and today on Twitter told us. Even the man page seems to say so. Who were we to dismiss their convincing argument about “entropy running low”? - -This misconception isn't so rampant because people are stupid, it is because with a little knowledge about cryptography (namely some vague idea what entropy is) it's very easy to be convinced of it. Intuition almost forces us there. Unfortunately intuition is often wrong in cryptography. So it is here. - -### True randomness - -What does it mean for random numbers to be “truly random”? - -I don't want to dive into that issue too deep, because it quickly gets philosophical. Discussions have been known to unravel fast, because everyone can wax about their favorite model of randomness, without paying attention to anyone else. Or even making himself understood. - -I believe that the “gold standard” for “true randomness” are quantum effects. Observe a photon pass through a semi-transparent mirror. Or not. Observe some radioactive material emit alpha particles. It's the best idea we have when it comes to randomness in the world. Other people might reasonably believe that those effects aren't truly random. Or even that there is no randomness in the world at all. Let a million flowers bloom. - -Cryptographers often circumvent this philosophical debate by disregarding what it means for randomness to be “true”. They care about unpredictability. As long as nobody can get any information about the next random number, we're fine. And when you're talking about random numbers as a prerequisite in using cryptography, that's what you should aim for, in my opinion. - -Anyway, I don't care much about those “philosophically secure” random numbers, as I like to think of your “true” random numbers. - -### Two kinds of security, one that matters - -But let's assume you've obtained those “true” random numbers. What are you going to do with them? - -You print them out, frame them and hang them on your living-room wall, to revel in the beauty of a quantum universe? That's great, and I certainly understand. - -Wait, what? You're using them? For cryptographic purposes? Well, that spoils everything, because now things get a bit ugly. - -You see, your truly-random, quantum effect blessed random numbers are put into some less respectable, real-world tarnished algorithms. - -Because almost all of the cryptographic algorithms we use do not hold up to ### information-theoretic security**. They can “only” offer **computational security. The two exceptions that come to my mind are Shamir's Secret Sharing and the One-time pad. And while the first one may be a valid counterpoint (if you actually intend to use it), the latter is utterly impractical. - -But all those algorithms you know about, AES, RSA, Diffie-Hellman, Elliptic curves, and all those crypto packages you're using, OpenSSL, GnuTLS, Keyczar, your operating system's crypto API, these are only computationally secure. - -What's the difference? While information-theoretically secure algorithms are secure, period, those other algorithms cannot guarantee security against an adversary with unlimited computational power who's trying all possibilities for keys. We still use them because it would take all the computers in the world taken together longer than the universe has existed, so far. That's the level of “insecurity” we're talking about here. - -Unless some clever guy breaks the algorithm itself, using much less computational power. Even computational power achievable today. That's the big prize every cryptanalyst dreams about: breaking AES itself, breaking RSA itself and so on. - -So now we're at the point where you don't trust the inner building blocks of the random number generator, insisting on “true randomness” instead of “pseudo randomness”. But then you're using those “true” random numbers in algorithms that you so despise that you didn't want them near your random number generator in the first place! - -Truth is, when state-of-the-art hash algorithms are broken, or when state-of-the-art block ciphers are broken, it doesn't matter that you get “philosophically insecure” random numbers because of them. You've got nothing left to securely use them for anyway. - -So just use those computationally-secure random numbers for your computationally-secure algorithms. In other words: use /dev/urandom. - -### Structure of Linux's random number generator - -#### An incorrect view - -Chances are, your idea of the kernel's random number generator is something similar to this: - -![image: mythical structure of the kernel's random number generator][1] - -“True randomness”, albeit possibly skewed and biased, enters the system and its entropy is precisely counted and immediately added to an internal entropy counter. After de-biasing and whitening it's entering the kernel's entropy pool, where both /dev/random and /dev/urandom get their random numbers from. - -The “true” random number generator, /dev/random, takes those random numbers straight out of the pool, if the entropy count is sufficient for the number of requested numbers, decreasing the entropy counter, of course. If not, it blocks until new entropy has entered the system. - -The important thing in this narrative is that /dev/random basically yields the numbers that have been input by those randomness sources outside, after only the necessary whitening. Nothing more, just pure randomness. - -/dev/urandom, so the story goes, is doing the same thing. Except when there isn't sufficient entropy in the system. In contrast to /dev/random, it does not block, but gets “low quality random” numbers from a pseudorandom number generator (conceded, a cryptographically secure one) that is running alongside the rest of the random number machinery. This CSPRNG is just seeded once (or maybe every now and then, it doesn't matter) with “true randomness” from the randomness pool, but you can't really trust it. - -In this view, that seems to be in a lot of people's minds when they're talking about random numbers on Linux, avoiding /dev/urandom is plausible. - -Because either there is enough entropy left, then you get the same you'd have gotten from /dev/random. Or there isn't, then you get those low-quality random numbers from a CSPRNG that almost never saw high-entropy input. - -Devilish, right? Unfortunately, also utterly wrong. In reality, the internal structure of the random number generator looks like this. - -#### A better simplification - -##### Before Linux 4.8 - -![image: actual structure of the kernel's random number generator before Linux 4.8][2] This is a pretty rough simplification. In fact, there isn't just one, but three pools filled with entropy. One primary pool, and one for /dev/random and /dev/urandom each, feeding off the primary pool. Those three pools all have their own entropy counts, but the counts of the secondary pools (for /dev/random and /dev/urandom) are mostly close to zero, and “fresh” entropy flows from the primary pool when needed, decreasing its entropy count. Also there is a lot of mixing and re-injecting outputs back into the system going on. All of this is far more detail than is necessary for this document. - -See the big difference? The CSPRNG is not running alongside the random number generator, filling in for those times when /dev/urandom wants to output something, but has nothing good to output. The CSPRNG is an integral part of the random number generation process. There is no /dev/random handing out “good and pure” random numbers straight from the whitener. Every randomness source's input is thoroughly mixed and hashed inside the CSPRNG, before it emerges as random numbers, either via /dev/urandom or /dev/random. - -Another important difference is that there is no entropy counting going on here, but estimation. The amount of entropy some source is giving you isn't something obvious that you just get, along with the data. It has to be estimated. Please note that when your estimate is too optimistic, the dearly held property of /dev/random, that it's only giving out as many random numbers as available entropy allows, is gone. Unfortunately, it's hard to estimate the amount of entropy. - -The Linux kernel uses only the arrival times of events to estimate their entropy. It does that by interpolating polynomials of those arrival times, to calculate “how surprising” the actual arrival time was, according to the model. Whether this polynomial interpolation model is the best way to estimate entropy is an interesting question. There is also the problem that internal hardware restrictions might influence those arrival times. The sampling rates of all kinds of hardware components may also play a role, because it directly influences the values and the granularity of those event arrival times. - -In the end, to the best of our knowledge, the kernel's entropy estimate is pretty good. Which means it's conservative. People argue about how good it really is, but that issue is far above my head. Still, if you insist on never handing out random numbers that are not “backed” by sufficient entropy, you might be nervous here. I'm sleeping sound because I don't care about the entropy estimate. - -So to make one thing crystal clear: both /dev/random and /dev/urandom are fed by the same CSPRNG. Only the behavior when their respective pool runs out of entropy, according to some estimate, differs: /dev/random blocks, while /dev/urandom does not. - -##### From Linux 4.8 onward - -In Linux 4.8 the equivalency between /dev/urandom and /dev/random was given up. Now /dev/urandom output does not come from an entropy pool, but directly from a CSPRNG. - -![image: actual structure of the kernel's random number generator from Linux 4.8 onward][3] - -We will see shortly why that is not a security problem. - -### What's wrong with blocking? - -Have you ever waited for /dev/random to give you more random numbers? Generating a PGP key inside a virtual machine maybe? Connecting to a web server that's waiting for more random numbers to create an ephemeral session key? - -That's the problem. It inherently runs counter to availability. So your system is not working. It's not doing what you built it to do. Obviously, that's bad. You wouldn't have built it if you didn't need it. - -I'm working on safety-related systems in factory automation. Can you guess what the main reason for failures of safety systems is? Manipulation. Simple as that. Something about the safety measure bugged the worker. It took too much time, was too inconvenient, whatever. People are very resourceful when it comes to finding “inofficial solutions”. - -But the problem runs even deeper: people don't like to be stopped in their ways. They will devise workarounds, concoct bizarre machinations to just get it running. People who don't know anything about cryptography. Normal people. - -Why not patching out the call to `random()`? Why not having some guy in a web forum tell you how to use some strange ioctl to increase the entropy counter? Why not switch off SSL altogether? - -In the end you just educate your users to do foolish things that compromise your system's security without you ever knowing about it. - -It's easy to disregard availability, usability or other nice properties. Security trumps everything, right? So better be inconvenient, unavailable or unusable than feign security. - -But that's a false dichotomy. Blocking is not necessary for security. As we saw, /dev/urandom gives you the same kind of random numbers as /dev/random, straight out of a CSPRNG. Use it! - -### The CSPRNGs are alright - -But now everything sounds really bleak. If even the high-quality random numbers from /dev/random are coming out of a CSPRNG, how can we use them for high-security purposes? - -It turns out, that “looking random” is the basic requirement for a lot of our cryptographic building blocks. If you take the output of a cryptographic hash, it has to be indistinguishable from a random string so that cryptographers will accept it. If you take a block cipher, its output (without knowing the key) must also be indistinguishable from random data. - -If anyone could gain an advantage over brute force breaking of cryptographic building blocks, using some perceived weakness of those CSPRNGs over “true” randomness, then it's the same old story: you don't have anything left. Block ciphers, hashes, everything is based on the same mathematical fundament as CSPRNGs. So don't be afraid. - -### What about entropy running low? - -It doesn't matter. - -The underlying cryptographic building blocks are designed such that an attacker cannot predict the outcome, as long as there was enough randomness (a.k.a. entropy) in the beginning. A usual lower limit for “enough” may be 256 bits. No more. - -Considering that we were pretty hand-wavey about the term “entropy” in the first place, it feels right. As we saw, the kernel's random number generator cannot even precisely know the amount of entropy entering the system. Only an estimate. And whether the model that's the basis for the estimate is good enough is pretty unclear, too. - -### Re-seeding - -But if entropy is so unimportant, why is fresh entropy constantly being injected into the random number generator? - -djb [remarked][4] that more entropy actually can hurt. - -First, it cannot hurt. If you've got more randomness just lying around, by all means use it! - -There is another reason why re-seeding the random number generator every now and then is important: - -Imagine an attacker knows everything about your random number generator's internal state. That's the most severe security compromise you can imagine, the attacker has full access to the system. - -You've totally lost now, because the attacker can compute all future outputs from this point on. - -But over time, with more and more fresh entropy being mixed into it, the internal state gets more and more random again. So that such a random number generator's design is kind of self-healing. - -But this is injecting entropy into the generator's internal state, it has nothing to do with blocking its output. - -### The random and urandom man page - -The man page for /dev/random and /dev/urandom is pretty effective when it comes to instilling fear into the gullible programmer's mind: - -> A read from the /dev/urandom device will not block waiting for more entropy. As a result, if there is not sufficient entropy in the entropy pool, the returned values are theoretically vulnerable to a cryptographic attack on the algorithms used by the driver. Knowledge of how to do this is not available in the current unclassified literature, but it is theoretically possible that such an attack may exist. If this is a concern in your application, use /dev/random instead. - -Such an attack is not known in “unclassified literature”, but the NSA certainly has one in store, right? And if you're really concerned about this (you should!), please use /dev/random, and all your problems are solved. - -The truth is, while there may be such an attack available to secret services, evil hackers or the Bogeyman, it's just not rational to just take it as a given. - -And even if you need that peace of mind, let me tell you a secret: no practical attacks on AES, SHA-3 or other solid ciphers and hashes are known in the “unclassified” literature, either. Are you going to stop using those, as well? Of course not! - -Now the fun part: “use /dev/random instead”. While /dev/urandom does not block, its random number output comes from the very same CSPRNG as /dev/random's. - -If you really need information-theoretically secure random numbers (you don't!), and that's about the only reason why the entropy of the CSPRNGs input matters, you can't use /dev/random, either! - -The man page is silly, that's all. At least it tries to redeem itself with this: - -> If you are unsure about whether you should use /dev/random or /dev/urandom, then probably you want to use the latter. As a general rule, /dev/urandom should be used for everything except long-lived GPG/SSL/SSH keys. - -Fine. I think it's unnecessary, but if you want to use /dev/random for your “long-lived keys”, by all means, do so! You'll be waiting a few seconds typing stuff on your keyboard, that's no problem. - -But please don't make connections to a mail server hang forever, just because you “wanted to be safe”. - -### Orthodoxy - -The view espoused here is certainly a tiny minority's opinions on the Internet. But ask a real cryptographer, you'll be hard pressed to find someone who sympathizes much with that blocking /dev/random. - -Let's take [Daniel Bernstein][5], better known as djb: - -> Cryptographers are certainly not responsible for this superstitious nonsense. Think about this for a moment: whoever wrote the /dev/random manual page seems to simultaneously believe that -> -> * (1) we can't figure out how to deterministically expand one 256-bit /dev/random output into an endless stream of unpredictable keys (this is what we need from urandom), but -> -> * (2) we _can_ figure out how to use a single key to safely encrypt many messages (this is what we need from SSL, PGP, etc.). -> -> - -> -> For a cryptographer this doesn't even pass the laugh test. - -Or [Thomas Pornin][6], who is probably one of the most helpful persons I've ever encountered on the Stackexchange sites: - -> The short answer is yes. The long answer is also yes. /dev/urandom yields data which is indistinguishable from true randomness, given existing technology. Getting "better" randomness than what /dev/urandom provides is meaningless, unless you are using one of the few "information theoretic" cryptographic algorithm, which is not your case (you would know it). -> -> The man page for urandom is somewhat misleading, arguably downright wrong, when it suggests that /dev/urandom may "run out of entropy" and /dev/random should be preferred; - -Or maybe [Thomas Ptacek][7], who is not a real cryptographer in the sense of designing cryptographic algorithms or building cryptographic systems, but still the founder of a well-reputed security consultancy that's doing a lot of penetration testing and breaking bad cryptography: - -> Use urandom. Use urandom. Use urandom. Use urandom. Use urandom. Use urandom. - -### Not everything is perfect - -/dev/urandom isn't perfect. The problems are twofold: - -On Linux, unlike FreeBSD, /dev/urandom never blocks. Remember that the whole security rested on some starting randomness, a seed? - -Linux's /dev/urandom happily gives you not-so-random numbers before the kernel even had the chance to gather entropy. When is that? At system start, booting the computer. - -FreeBSD does the right thing: they don't have the distinction between /dev/random and /dev/urandom, both are the same device. At startup /dev/random blocks once until enough starting entropy has been gathered. Then it won't block ever again. - -In the meantime, Linux has implemented a new syscall, originally introduced by OpenBSD as getentropy(2): getrandom(2). This syscall does the right thing: blocking until it has gathered enough initial entropy, and never blocking after that point. Of course, it is a syscall, not a character device, so it isn't as easily accessible from shell or script languages. It is available from Linux 3.17 onward. - -On Linux it isn't too bad, because Linux distributions save some random numbers when booting up the system (but after they have gathered some entropy, since the startup script doesn't run immediately after switching on the machine) into a seed file that is read next time the machine is booting. So you carry over the randomness from the last running of the machine. - -Obviously that isn't as good as if you let the shutdown scripts write out the seed, because in that case there would have been much more time to gather entropy. The advantage is obviously that this does not depend on a proper shutdown with execution of the shutdown scripts (in case the computer crashes, for example). - -And it doesn't help you the very first time a machine is running, but the Linux distributions usually do the same saving into a seed file when running the installer. So that's mostly okay. - -Virtual machines are the other problem. Because people like to clone them, or rewind them to a previously saved check point, this seed file doesn't help you. - -But the solution still isn't using /dev/random everywhere, but properly seeding each and every virtual machine after cloning, restoring a checkpoint, whatever. - -### tldr; - - Just use /dev/urandom! - - --------------------------------------------------------------------------------- - -via: https://www.2uo.de/myths-about-urandom/ - -作者:[Thomas Hühn][a] -译者:[译者ID](https://github.com/译者ID) -校对:[校对者ID](https://github.com/校对者ID) - -本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出 - -[a]:https://www.2uo.de/ -[1]:https://www.2uo.de/myths-about-urandom/structure-no.png -[2]:https://www.2uo.de/myths-about-urandom/structure-yes.png -[3]:https://www.2uo.de/myths-about-urandom/structure-new.png -[4]:http://blog.cr.yp.to/20140205-entropy.html -[5]:http://www.mail-archive.com/cryptography@randombit.net/msg04763.html -[6]:http://security.stackexchange.com/questions/3936/is-a-rand-from-dev-urandom-secure-for-a-login-key/3939#3939 -[7]:http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/ diff --git a/translated/tech/20161106 Myths about -dev-urandom.md b/translated/tech/20161106 Myths about -dev-urandom.md new file mode 100644 index 0000000000..aa83b07b3a --- /dev/null +++ b/translated/tech/20161106 Myths about -dev-urandom.md @@ -0,0 +1,296 @@ +关于 /dev/urandom 的流言终结 +====== + +有很多关于 /dev/urandom 和 /dev/random 的流言在坊间不断流传。流言终究是流言。 +本篇文章里针对的都是今年的 Linux 操作系统,其他类 Unix 操作系统不在讨论范围内。 + +### /dev/urandom 不安全。加密用途必须使用 /dev/random。 + +事实:/dev/urandom 才是类 Unix 操作系统下推荐的加密种子。 + +### /dev/urandom 是伪随机数生成器(PRND),而 /dev/random 是“真”随机数生成器。 + +事实:他们两者本质上用的是同一种 CSPRNG (一种密码学伪随机数生成器)。他们之间细微的差别和“真”不“真”随机完全无关 + +### /dev/random 在任何情况下都是密码学应用更好地选择。即便 /dev/urandom 也同样安全,我们还是不应该用 urandom。 + +事实:/dev/random 有个很恶心人的问题:它是阻塞的。(译者:意味着请求都得逐个执行,等待前一个事件完成) + +### 但阻塞不是好事吗!/dev/random 只会给出电脑收集的信息熵足以支持的随机量。/dev/urandom 在用完了所有熵的情况下还会不断吐不安全的随机数给你。 + +事实:这是误解。就算我们不去考虑应用层面后续对随机种子的用法,“用完信息熵池”这个概念本身就不存在。仅仅 256 bits 的熵就足以生成计算上安全的随机数很长,很长一段时间了。 + +问题的关键还在后头:/dev/random 怎么知道有系统会多少可用的信息熵?接着看! + +### 但密码学家老是讨论重新选种子(re-seeding)。这难道不和上一条冲突吗? + +事实:你说的也没错!某种程度上吧。确实,随机数生成器一直在使用系统信息熵的状态重新选种。但这么做(一部分)是因为别的原因。 + +这样说吧,我没有说引入新的信息熵是坏的。更多的熵肯定更好。我只是说在熵池低的时候阻塞是没必要的。 + +### 好,就算你说的都对,但是 /dev/(u)random 的 man 页面和你说的也不一样啊!到底有没有专家同意你说的这堆啊? + +事实:其实 man 页面和我说的不冲突。它看似好像在说 /dev/urandom 对密码学用途来说不安全,但如果你真的理解这堆密码学术语你就知道他说的并不是这个意思。 + +man 页面确实说在一些情况下推荐使用 /dev/random (我觉得也没问题,但绝对不是说必要的),但它也推荐在大多数“一般”的密码学应用下使用 /dev/urandom 。 + +虽然诉诸权威一般来说不是好事,但在密码学这么严肃的事情上,和专家统一意见是很有必要的。 + +所以说呢,还确实有一些专家和我的一件事一致的:/dev/urandom 就应该是类 UNIX 操作系统下密码学应用的首选。显然的,是他们的观点说服了我而不是反过来的。 + +难以相信吗?觉得我肯定错了?读下去看我能不能说服你。 + +我尝试不讲太高深的东西,但是有两点内容必须先提一下才能让我们接着论证观点。 + +首当其冲的,什么是随机性,或者更准确地:我们在探讨什么样的随机性? + +另外一点很重要的是,我没有尝试以说教的态度对你们写这段话。我写这篇文章是为了日后可以在讨论起的时候指给别人看。比 140 字长(译者:推特长度)。这样我就不用一遍遍重复我的观点了。能把论点磨炼成一篇文章本身就很有助于将来的讨论。 + +并且我非常乐意听到不一样的观点。但我只是认为单单地说 /dev/urandom 坏是不够的。你得能指出到底有什么问题,并且剖析他们。 + +### 你是在说我笨?! + +绝对没有! + +事实上我自己也相信了 “/dev/urandom 不安全的” 好些年。这几乎不是我们的错,因为那么德高望重的人在 Usenet,论坛,推特上根我们重复这个观点。甚至连 man page 都似是而非地说着。我们当年怎么可能打发诸如“信息熵太低了”这种看上去就很让人信服的观点呢? + +整个流言之所以如此广为流传不是因为人们太蠢,而是因为但凡有点关于信息熵和密码学概念的人都会觉得这个说法很有道理。直觉似乎都在告诉我们这流言讲的很有道理。很不幸直觉在密码学里通常不管用,这次也一样。 + +### 真随机 + +什么叫一个随机变量是“真随机的”? + +我不想搞的太复杂以至于变成哲学范畴的东西。这种讨论很容易走偏因为随机模型大家见仁见智,讨论很快变得毫无意义。 + +在我看来真随机的“试金石”是量子效应。一个光子穿过或不穿过一个50%的半透镜。或者观察一个放射性粒子衰变。这类东西是现实世界最接近真随机的东西。当然,有些人也不相信这类过程是真随机的,或者这个世界根本不存在任何随机性。这个就百家争鸣我也不好多说什么了。 + +密码学家一般都会通过不去讨论什么是“真随机”来避免这种争论。他们更关心的是不可预测性。只要没有任何方法能猜出下一个随机数就可以了。所以当你以密码学应用为前提讨论一个随机数好不好的时候,在我看来这才是最重要的。 + +无论如何,我不怎么关心“哲学上安全”的随机数,这也包括别人嘴里的“真”随机数。 + +## 两种安全,一种有用 + +但就让我们退一步说,你有了一个“真”随机变量。你下一步做什么呢? + +你把他们打印出来然后挂在墙上来战士量子宇宙的美与和谐?牛逼!我很理解你。 + +但是等等,你说你要用他们?做密码学用途?额,那这就废了,因为这事情就有点复杂了。 + +事情是这样的,你的真随机,量子力学加护的随机数即将被用进不理想的现实世界程序里。 + +因为我们使用的大多数算法并不是 ### 理论信息学上安全的。**他们只能提供** 计算意义上的安全。我能想到为数不多的例外就只有 Shamir 密钥分享 和 One-time pad 算法。并且就算前者是名副其实的(如果你实际打算用的话),后者则毫无可行性可言。 + +但所有那些大名鼎鼎的密码学算法,AES,RSA,Diffie-Hellman, 椭圆曲线,还有所有那些加密软件包,OpenSSL,GnuTLS,Keyczar,你的操作系统的加密 API,都仅仅是计算意义上的安全的。 + +那区别是什么呢?理论信息学上的安全肯定是安全的,句号。其他那些的算法都可能在理论上被拥有无限计算力的穷举破解。我们依然愉快地使用他们因为全世界的计算机加起来都不可能在宇宙年龄的时间里破解,至少现在是这样。而这就是我们文章里说的“不安全”。 + +除非哪个聪明的家伙破解了算法本身——在只需要极少量计算力的情况下。这也是每个密码学家梦寐以求的圣杯:破解 AES 本身,破解 RSA 算法本身。 + +所以现在我们来到了更底层的东西:随机数生成器,你坚持要“真随机”而不是“伪随机”。但是没过一会儿你的真随机数就被喂进了你极为鄙视的伪随机算法里了! + +真相是,如果我们最先进的 hash 算法被破解了,或者最先进的块加密被破解了,你得到这些那些“哲学上不安全的”甚至无所谓了,因为反正你也没有安全的应用方法了。 + +所以喂计算性上安全的随机数给你仅仅是计算性上安全的算法就可以了,换而言之,用 /dev/urandom。 + +### Linux 随机数生成器的构架 + +#### 一种错误的看法 + +你对内核的随机数生成器的理解很可能是像这样的: + +![image: mythical structure of the kernel's random number generator][1] + +“真随机数”,尽管可能有点瑕疵,进入操作系统然后它的熵立刻被加入内部熵计数器。然后经过去 bias 和“漂白”之后它进入内核的熵池,然后 /dev/random 和 /dev/urandom 从里面生成随机数。 + +“真”随机数生成器,/dev/random,直接从池里选出随机数,如果熵计数器表示能满足需要的数字大小,那就吐出数字并且减少熵计数。如果不够的话,他会阻塞程序直至有足够的熵进入和系统。 + +这里很重要一环是 /dev/random 几乎直接把那些进入系统的随机性吐了出来,不经扭曲。 + +而对 /dev/urandom 来说,事情是一样的。除了当没有足够的熵的时候,它不会阻塞,而会从一直在运行的伪随机数生成器里吐出“底质量”的随机数。这个 CSPRNG 只会用“真随机数”生成种子一次(或者好几次,这不重要),但你不能特别相信它。 + +在这种对随机数生成的理解下,很多人会觉得在 Linux 下尽量避免 /dev/urandom 看上去有那么点道理。 + +因为要么你有足够多的熵,你会相当于用了 /dev/random。要么没有,那你就会从几乎没有高熵输入的 CSPRNG 那里得到一个低质量的随机数。 + +看上去很邪恶是吧?很不幸的是这种看法是完全错误的。实际上,随机数生成器的构架更像是这样的。 + +#### 更好地简化 + +##### Linux 4.8 之前 + +![image: actual structure of the kernel's random number generator before Linux 4.8][2] + +这是个很粗糙的简化。实际上不仅有一个,而是三个熵池。一个主池,另一个给 /dev/random,还有一个给 /dev/urandom,后两者依靠从主池里获取熵。这三个池都有各自的熵计数器,但二级池(后两个)的计数器基本都在0附近,而“新鲜”的熵总在需要的时候从主池流过来。同时还有好多混合和回流进系统在同时进行。整个过程对于这篇文档来说都过于复杂了我们跳过。 + +但你看到最大的区别了吗? CSPRNG 并不是和随机数生成器一起跑用来填充 /dev/urandom 需要输出但熵不够的时候。CSPRNG 是整个随机数生成过程的内部组件之一。从来就没有什么 /dev/random 直接从池里输出纯纯的随机性。每个随机源的输入都在 CSPRNG 里充分混合和 hash 过了,这一切都发生在实际变成一个随机数,被/dev/urandom 或者 /dev/random 吐出去之前。 + +另外一个重要的区别是是这里没有熵计数器的任何事情,只有预估。一个源给你的熵的量并不是什么很明确能直接得到的数字。你得预估它。注意,如果你太乐观地预估了它,那 /dev/random 最重要的特性——只给出熵允许的随机量——就荡然无存了。很不幸的,预估熵的量是很困难的。 + +Linux 内核只使用事件的到达时间来预估熵的量。它通过多项式插值,某种模型,来预估实际的到达时间有多“出乎意料”。这种多项式插值的方法到底是不是好的预估熵量的方法本身就是个问题。同时硬件情况会不会以某种特定的方式影响到达时间也是个问题。而所有硬件的取样率也是个问题,因为这基本上就直接决定了随机数到达时间的颗粒度。 + +说到最后,至少现在看来,内核的熵预估还是不错的。这也意味着它比较保守。有些人会具体地讨论它有多好,这都超出我的脑容量了。就算这样,如果你坚持不想在没有足够多的熵的情况下吐出随机数,那你看到这里可能还会有一丝紧张。我睡的就很香了,因为我不关心熵预估什么的。 + +最后强调一下终点:/dev/random 和 /dev/urandom 都是被同一个 CSPRNG 喂的输入。只有他们在用完各自熵池(根据某种预估标准)的时候,他们的行为会不同:/dev/random 阻塞,/dev/urandom 不阻塞。 + +##### Linux 4.8 以后 + +在 Linux 4.8 里,/dev/random 和 /dev/urandom 的等价性被放弃了。现在 /dev/urandom 的输出不来自于熵池,而是直接从 CSPRNG 来。 + +![image: actual structure of the kernel's random number generator from Linux 4.8 onward][3] + +我们很快会理解为什么这不是一个安全问题。 + +### 阻塞有什么问题? + +你有没有需要等着 /dev/random 来吐随机数?比如在虚拟机里生成一个 PGP 密钥?或者访问一个在生成会话密钥的网站? + +这些都是问题。阻塞本质上会降低可用性。换而言之你的系统不干你让它干的事情。不用我说,这是不好的。要是它不 work 你干嘛搭建它呢? + +我在工厂自动化里做过和安全相关的系统。猜猜看安全系统失效的主要原因是什么?被错误操作。就这么简单。很多安全措施的流程让工人恼火了。比如时间太长,或者太不方便。你要知道人很会找捷径来“解决”问题。 + +但其实有个更深刻的问题:人们不喜欢被打断。他们会找一些绕过的方法,把一些诡异的东西接在一起仅仅因为这样能用。一般人根本不知道什么密码学什么乱七八糟的,至少正常的人是这样吧。 + +为什么不禁止调用 `random()`?为什么不随便在论坛上找个人告诉你用写奇异的 ioctl 来增加熵计数器呢?为什么不干脆就把 SSL 加密给关了算了呢? + +到头来如果东西太难用的话,你的用户就会被迫开始做一些降低系统安全性的事情——你甚至不知道他们会做些什么。 + +我们很容易会忽视可用性之类的重要性。毕竟安全第一对吧?所以比起牺牲安全,不可用,难用,不方便都是次要的? + +这种二元对立的想法是错的。阻塞不一定就安全了。正如我们看到的,/dev/urandom 直接从 CSPRNG 里给你一样好的随机数。用它不好吗! + +### CSPRNG 没问题 + +现在情况听上去很沧桑。如果连高质量的 /dev/random 都是从一个 CSPRNG 里来的,我们怎么敢在高安全性的需求上使用它呢? + +实际上,“看上去随机”是现存大多数密码学算法的更集。如果你观察一个密码学 hash 的输出,它得和随机的字符串不可区分,密码学家才会认可这个算法。如果你生成一个块加密,它的输出(在你不知道密钥的情况下)也必须和随机数据不可区分才行。 + +如果任何人能比暴力穷举要更有效地破解一个加密,比如它利用了某些 CSPRNG 伪随机的弱点,那这就又是老一套了:一切都废了,也别谈后面的了。块加密,hash,一切都是基于某个数学算法,比如 CSPRNG。所以别害怕,到头来都一样。 + +### 那熵池快空了的情况呢? + +毫无影响。 + +加密算法的根基建立在攻击者不能预测输出上,只要最一开始有足够的随机性(熵)就行了。一般的下限是 256 bits,不需要更多了。 + +介于我们一直在很随意的使用“熵”这个概念,我用 bits 来量化随机性希望读者不要太在意细节。像我们之前讨论的那样,内核的随机数生成器甚至没法精确地知道进入系统的熵的量。只有一个预估。而且这个预估的准确性到底怎么样也没人知道。 +It doesn't matter. + +### 重新选种 + +但如果熵这么不重要,为什么还要有新的熵一直被收进随机数生成器里呢? + +djb [提到][4] 太多的熵甚至可能会起到反效果。 + +首先,一般不会这样。如果你有很多随机性可以拿来用,用就对了! + +但随机数生成器时不时要重新选种还有别的原因: + +想象一下如果有个攻击者获取了你随机数生成器的所有内部状态。这是最坏的情况了,本质上你的一切都暴露给攻击者了。 + +你已经凉了,因为攻击者可以计算出所有未来会被输出的随机数了。 + +但是,如果不断有新的熵被混进系统,那内部状态会在一次变得随机起来。所以随机数生成器被设计成这样有些“自愈”能力。 + +但这是在给内部状态引入新的熵,这和阻塞输出没有任何关系。 + + +### random 和 urandom 的 man 页面 + +这两个 man 页面在吓唬程序员方面很有建树: + +> 从 /dev/urandom 读取数据不会因为需要更多熵而阻塞。这样的结果是,如果熵池里没有足够多的熵,取决于驱动使用的算法,返回的数值在理论上有被密码学攻击的可能性。发动这样攻击的步骤并没有出现在任何公开文献当中,但这样的攻击从理论上讲是可能存在的。如果你的应用担心这类情况,你应该使用 /dev/random。 + +没有“公开的文献”描述,但是 NSA 的小卖部里肯定卖这种攻击手段是吧?如果你真的真的很担心(你应该很担心),那就用 /dev/random 然后所有问题都没了? + +然而事实是,可能什么情报局有这种攻击,或者什么邪恶黑客组织找到了方法。但如果我们就直接假设这种攻击一定存在也是不合理的。 + +而且就算你想给自己一个安心,我要给你泼个冷水:AES,SHA-3 或者其他什么常见的加密算法也没有“公开文献记述”的攻击手段。难道你也不用这几个加密算法了?这显然是可笑的。 + +我们在回到 man 页面说:“使用 /dev/random”。我们已经知道了,虽然 /dev/urandom 不阻塞,但是它的随机数和 /dev/random 都是从同一个 CSPRNG 里来的。 + +如果你真的需要信息论理论上安全的随机数(你不需要的相信我),那才有可能成为唯一一个你需要等足够熵进入 CSPRNG 的理由。而且你也不能用 /dev/random。 + +man 页面有毒,就这样。但至少它还稍稍挽回了一下自己: +> 如果你不确定该用 /dev/random 还是 /dev/urandom ,那你可能应该用后者。通常来说,除了需要长期使用的 GPG/SSL/SSH 密钥以外,你总该使用/dev/urandom 。 + +行。我觉得没必要,但如果你真的要用 /dev/random 来生成 “长期使用的密钥”,用就是了也没人拦着!你可能需要等几秒钟或者敲几下键盘来增加熵,但没什么问题。 + +但求求你们,不要就因为“你想更安全点”就让连个邮件服务器要挂起半天。 + +### 正道 + +本篇文章里的观点显然在互联网上是“小众”的。但如果问问一个真正的密码学家,你很难找到一个认同阻塞 /dev/random 的人。 + +比如我们看看 [Daniel Bernstein][5] djb: + +> 我们密码学家对这种胡乱迷信行为表示不负责。你想想,写 /dev/random man 页面的人好像同时相信: +> +> * (1) 我们不知道如何用一个 256-bit 长的 /dev/random 的输出来生成一个无限长的随机密钥串流(这是我们需要 /dev/urandom 吐出来的),但与此同时 +> * (2) 我们却知道怎么用单个密钥来加密一条消息(这是 SSL,PGP 之类干的事情) +> +> + +> +> 对密码学家来说这甚至都不好笑了 + + + +或早 [Thomas Pornin][6],他也是我在 stackexchange 上见过最乐于助人的一位: + +> 简单来说,是的。展开说,答案还是一样。/dev/urandom 生成的数据可以说和真随机完全无法区分,至少在现有科技水平下。使用比 /dev/urandom “更好的“随机性毫无意义,除非你在使用极为罕见的“信息论安全”的加密算法。这肯定不是你的情况,不然你早就说了。 +> +> urandom 的 man 页面多多少少有些误导人,或者干脆可以说是错的——特别是当它说 /dev/urandom 会“用完熵”以及 “/dev/random 是更好的”那几句话; + + + +或者 [Thomas Ptacek][7],他不设计密码算法或者密码学系统,但他是一家名声在外的安全咨询公司的创始人,这家公司负责很多渗透和破解烂密码学算法的测试: + +> 用 urandom。用 urandom。用 urandom。用 urandom。用 urandom。 + + + +### 没有完美 + +/dev/urandom 不是完美的,问题分两层: + +在 Linux 上,不像 FreeBSD,/dev/urandom 永远不阻塞。记得安全性取决于某个最一开始决定的随机性?种子? + +Linux 的 /dev/urandom 会很乐意给你吐点不怎么随机的随机数,甚至在内核有机会收集一丁点熵之前。什么时候有这种情况?当你系统刚刚启动的时候。 + +FreeBSD 的行为更正确点:/dev/random 和 /dev/urandom 是一样的,在系统启动的时候 /dev/random 会阻塞到有足够的熵为止,然后他们都再也不阻塞了。 + +与此同时 Linux 实行了一个新的 syscall,最早由 OpenBSD 引入叫 getentrypy(2),在 Linux 下这个叫 getrandom(2)。这个 syscall 有着上述正确的行为:阻塞到有足够的熵为止,然后再也不阻塞了。当然,这是个 syscall,而不是一个字节设备(译者:指不在 /dev/ 下),所以它在 shell 或者别的脚本语言里没那么容易获取。这个 syscall 自 Linux 3.17 起存在。 + +在 Linux 上其实这个问题不太大,因为 Linux 发行版会在启动的过程中储蓄一点随机数(这发生在已经有一些熵之后,因为启动程序不会在按下电源的一瞬间就开始运行)到一个种子文件,以便系统下次启动的时候读取。所以每次启动的时候系统都会从上一次会话里带一点随机性过来。 + +显然这比不上在关机脚本里写入一些随机种子,因为这样的显然就有更多熵可以操作了。但这样做显而易见的好处就是它不关心系统是不是正确关机了,比如可能你系统崩溃了。 + +而且这种做法在你真正第一次启动系统的时候也没法帮你随机,不过好在系统安装器一般会写一个种子文件,所以基本上问题不大。 + +虚拟机是另外一层问题。因为用户喜欢克隆他们,或者恢复到某个之前的状态。这种情况下那个种子文件就帮不到你了。 + +但解决方案依然和用 /dev/random 没关系,而是你应该正确的给每个克隆或者恢复的的镜像重新生成种子文件,之类的。 + +### 太长不看; + + 别问,问就是用 /dev/urandom ! + + +-------------------------------------------------------------------------------- + +via: https://www.2uo.de/myths-about-urandom/ + +作者:[Thomas Hühn][a] +译者:[Moelf](https://github.com/Moelf) +校对:[校对者ID](https://github.com/校对者ID) + +本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出 + +[a]:https://www.2uo.de/ +[1]:https://www.2uo.de/myths-about-urandom/structure-no.png +[2]:https://www.2uo.de/myths-about-urandom/structure-yes.png +[3]:https://www.2uo.de/myths-about-urandom/structure-new.png +[4]:http://blog.cr.yp.to/20140205-entropy.html +[5]:http://www.mail-archive.com/cryptography@randombit.net/msg04763.html +[6]:http://security.stackexchange.com/questions/3936/is-a-rand-from-dev-urandom-secure-for-a-login-key/3939#3939 +[7]:http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/