mirror of https://github.com/LCTT/TranslateProject.git (synced 2025-01-13 22:30:37 +08:00)
commit bd82f5dbe5
@ -1,149 +0,0 @
Translating by qhwdw

Reasons Kubernetes is cool
============================================================

When I first learned about Kubernetes (a year and a half ago?) I really didn’t understand why I should care about it.

I’ve been working full time with Kubernetes for 3 months or so and now have some thoughts about why I think it’s useful. (I’m still very far from being a Kubernetes expert!) Hopefully this will help a little in your journey to understand what even is going on with Kubernetes!

I will try to explain some reasons I think Kubernetes is interesting without using the words “cloud native”, “orchestration”, “container”, or any Kubernetes-specific terminology :). I’m going to explain this mostly from the perspective of a Kubernetes operator / infrastructure engineer, since my job right now is to set up Kubernetes and make it work well.

I’m not going to try to address the question of “should you use Kubernetes for your production systems?” at all. That is a very complicated question (not least because “in production” has totally different requirements depending on what you’re doing).

### Kubernetes lets you run code in production without setting up new servers

The first pitch I got for Kubernetes was the following conversation with my partner Kamal.

Here’s an approximate transcript:

* Kamal: With Kubernetes you can set up a new service with a single command.

* Julia: I don’t understand how that’s possible.

* Kamal: Like, you just write 1 configuration file, apply it, and then you have an HTTP service running in production.

* Julia: But today I need to create new AWS instances, write a Puppet manifest, set up service discovery, configure my load balancers, configure our deployment software, and make sure DNS is working. It takes at least 4 hours if nothing goes wrong.

* Kamal: Yeah. With Kubernetes you don’t have to do any of that. You can set up a new HTTP service in 5 minutes and it’ll just automatically run. As long as you have spare capacity in your cluster it just works!

* Julia: There must be a trap.

There kind of is a trap: setting up a production Kubernetes cluster is (in my experience) definitely not easy. (See [Kubernetes The Hard Way][3] for what’s involved to get started.) But we’re not going to go into that right now!

So the first cool thing about Kubernetes is that it has the potential to make life way easier for developers who want to deploy new software into production. That’s cool, and it’s actually true: once you have a working Kubernetes cluster you really can set up a production HTTP service (“run 5 of this application, set up a load balancer, give it this DNS name, done”) with just one configuration file. It’s really fun to see.

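
To make “one configuration file” a bit more concrete, here is a minimal sketch in Python that builds such a file as JSON (which `kubectl apply -f` accepts alongside YAML). The application name `hello-web`, the image name, and the helper function are made up for illustration. The `apps/v1` API version is the current one rather than what existed when this post was written, and whether `type: LoadBalancer` really provisions a load balancer depends on where the cluster runs.

```python
import json


def http_service_manifests(name, image, replicas=5, port=80):
    """Build a Deployment and a Service for one HTTP application (hypothetical helper)."""
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,  # "run 5 of this application"
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",  # "set up a load balancer"
            "selector": labels,  # route traffic to the pods created above
            "ports": [{"port": 80, "targetPort": port}],
        },
    }
    # A v1 List lets both objects live in a single file.
    return {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


if __name__ == "__main__":
    manifest = http_service_manifests("hello-web", "example.com/hello-web:1.0")
    print(json.dumps(manifest, indent=2))
```

Saving the output to a file and running `kubectl apply -f` on it would be the “apply it” step from the conversation. With cluster DNS enabled, the Service is normally reachable inside the cluster under a name like `hello-web.<namespace>.svc.cluster.local`; a public DNS name depends on how your cluster is set up.
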

### Kubernetes gives you easy visibility & control of what code you have running in production

IMO you can’t understand Kubernetes without understanding etcd. So let’s talk about etcd!

Imagine that I asked you today “hey, tell me every application you have running in production, what host it’s running on, whether it’s healthy or not, and whether or not it has a DNS name attached to it”. I don’t know about you but I would need to go look in a bunch of different places to answer this question and it would take me quite a while to figure out. I definitely can’t query just one API.

In Kubernetes, all the state in your cluster – applications running (“pods”), nodes, DNS names, cron jobs, and more – is stored in a single database (etcd). Every Kubernetes component is stateless, and basically works by (going through the API server, which is the component that actually talks to etcd):

* Reading state from etcd (eg “the list of pods assigned to node 1”)

* Making changes (eg “actually start running pod A on node 1”)

* Updating the state in etcd (eg “set the state of pod A to ‘running’”)

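
The read / change / update cycle above can be sketched in a few lines of Python. This is a toy model, not Kubernetes code: `StateStore` is an in-memory stand-in for etcd, and `sync_node` and the record fields are hypothetical names chosen for this example.

```python
class StateStore:
    """In-memory stand-in for etcd: maps pod name -> pod record."""

    def __init__(self):
        self.pods = {}

    def pods_assigned_to(self, node):
        return [p for p in self.pods.values() if p["node"] == node]

    def set_phase(self, name, phase):
        self.pods[name]["phase"] = phase


def sync_node(store, node, start_pod):
    """One pass of a node agent: read state, act on it, write the result back."""
    started = []
    for pod in store.pods_assigned_to(node):          # 1. read state
        if pod["phase"] == "Pending":
            start_pod(pod["name"])                    # 2. make the change
            store.set_phase(pod["name"], "Running")   # 3. update the state
            started.append(pod["name"])
    return started


if __name__ == "__main__":
    store = StateStore()
    store.pods["pod-a"] = {"name": "pod-a", "node": "node-1", "phase": "Pending"}
    store.pods["pod-b"] = {"name": "pod-b", "node": "node-2", "phase": "Pending"}
    launched = []
    print(sync_node(store, "node-1", launched.append))
```

Because the function keeps nothing between calls, running it a second time against the same store finds nothing left to do.
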

This means that if you want to answer a question like “hey, how many nginx pods do I have running right now in that availability zone?” you can answer it by querying a single unified API (the Kubernetes API!). And you have exactly the same access to that API that every other Kubernetes component does.


This also means that you have easy control of everything running in Kubernetes. If you want to, say,

* Implement a complicated custom rollout strategy for deployments (deploy 1 thing, wait 2 minutes, deploy 5 more, wait 3.7 minutes, etc)

* Automatically [start a new webserver][1] every time a branch is pushed to GitHub

* Monitor all your running applications to make sure all of them have a reasonable cgroups memory limit

all you need to do is write a program that talks to the Kubernetes API (a “controller”).

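
The memory-limit check from the list above could start out as something like the following sketch. It inspects pod records shaped like the API's JSON and only reports containers that have no memory limit at all; deciding what counts as “reasonable” is left out. The function name is made up for this example, and a real controller would fetch the pods from the API instead of taking a list.

```python
def containers_missing_memory_limit(pods):
    """Return (pod name, container name) pairs that have no memory limit set."""
    missing = []
    for pod in pods:
        for container in pod.get("spec", {}).get("containers", []):
            limits = container.get("resources", {}).get("limits", {})
            if "memory" not in limits:
                missing.append((pod["metadata"]["name"], container["name"]))
    return missing


if __name__ == "__main__":
    pods = [
        {
            "metadata": {"name": "web-1"},
            "spec": {
                "containers": [
                    {"name": "nginx", "resources": {"limits": {"memory": "256Mi"}}},
                    {"name": "sidecar", "resources": {}},
                ]
            },
        },
        {"metadata": {"name": "batch-1"}, "spec": {"containers": [{"name": "worker"}]}},
    ]
    for pod_name, container_name in containers_missing_memory_limit(pods):
        print(f"{pod_name}/{container_name} has no memory limit")
```
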

Another very exciting thing about the Kubernetes API is that you’re not limited to just the functionality that Kubernetes provides! If you decide that you have your own opinions about how your software should be deployed / created / monitored, then you can write code that uses the Kubernetes API to do it! It lets you do everything you need.

### If every Kubernetes component dies, your code will still keep running

One thing I was originally promised (by various blog posts :)) about Kubernetes was “hey, if the Kubernetes apiserver and everything else dies, it’s ok, your code will just keep running”. I thought this sounded cool in theory but I wasn’t sure if it was actually true.

So far it seems to be actually true!

I’ve been through some etcd outages now, and what happens is:

1. All the code that was running keeps running

2. Nothing _new_ happens (you can’t deploy new code or make changes, cron jobs will stop working)

3. When everything comes back, the cluster will catch up on whatever it missed

This does mean that if etcd goes down and one of your applications crashes or something, it can’t come back up until etcd returns.

### Kubernetes’ design is pretty resilient to bugs

Like any piece of software, Kubernetes has bugs. For example right now in our cluster the controller manager has a memory leak, and the scheduler crashes pretty regularly. Bugs obviously aren’t good but so far I’ve found that Kubernetes’ design helps mitigate a lot of the bugs in its core components really well.

If you restart any component, what happens is:

* It reads all its relevant state from etcd

* It starts doing the necessary things it’s supposed to be doing based on that state (scheduling pods, garbage collecting completed pods, scheduling cronjobs, deploying daemonsets, whatever)

Because none of the components keep any state in memory, you can just restart them at any time and that can help mitigate a variety of bugs.

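
Here is a small toy model of why restarts are harmless under that design. `ToyScheduler` is not the real scheduler: it is a hypothetical class whose only input is a shared store (a plain dict standing in for etcd), so a fresh instance picks up exactly where the old one stopped.

```python
class ToyScheduler:
    """Assigns unscheduled pods to the least loaded node, keeping no state of its own."""

    def __init__(self, store):
        self.store = store  # the only state; nothing is cached on the instance

    def run_once(self):
        # Rebuild the picture of the world from the store on every pass.
        load = {node: 0 for node in self.store["nodes"]}
        for pod in self.store["pods"].values():
            if pod["node"] in load:
                load[pod["node"]] += 1
        for name in sorted(self.store["pods"]):
            pod = self.store["pods"][name]
            if pod["node"] is None:
                target = min(sorted(load), key=load.get)  # ties go to the first name
                pod["node"] = target
                load[target] += 1


if __name__ == "__main__":
    store = {"nodes": ["node-1", "node-2"], "pods": {"a": {"node": None}, "b": {"node": None}}}
    ToyScheduler(store).run_once()   # first "process"
    store["pods"]["c"] = {"node": None}
    ToyScheduler(store).run_once()   # "restarted" process, brand new instance
    print(store["pods"])
```

The second instance never saw pods `a` and `b` being scheduled, yet it leaves them alone and places `c` correctly, because everything it needs is read back from the store.
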

For example! Let’s say you have a memory leak in your controller manager. Because the controller manager is stateless, you can just restart it every hour or so and feel confident that you won’t cause any consistency issues. Or we ran into a bug in the scheduler where it would sometimes just forget about pods and never schedule them. You can sort of mitigate this just by restarting the scheduler every 10 minutes. (We didn’t do that, we fixed the bug instead, but you _could_ :) )

So I feel like I can trust Kubernetes’ design to help make sure the state in the cluster is consistent even when there are bugs in its core components. And in general I think the software is improving over time. The only stateful thing you have to operate is etcd.

Not to harp on this “state” thing too much but – I think it’s cool that in Kubernetes the only thing you have to come up with backup/restore plans for is etcd (unless you use persistent volumes for your pods). I think it makes Kubernetes operations a lot easier to think about.

### Implementing new distributed systems on top of Kubernetes is relatively easy

Suppose you want to implement a distributed cron job scheduling system! Doing that from scratch is a ton of work. But implementing a distributed cron job scheduling system inside Kubernetes is much easier! (Still not trivial, it’s still a distributed system.)

The first time I read the code for the Kubernetes cronjob controller I was really delighted by how simple it was. Here, go read it! The main logic is like 400 lines of Go. Go ahead, read it! => [cronjob_controller.go][4] <=

Basically what the cronjob controller does is:

* Every 10 seconds:

    * Lists all the cronjobs that exist

    * Checks if any of them need to run right now

    * If so, creates a new Job object to be scheduled & actually run by other Kubernetes controllers

    * Cleans up finished jobs

    * Repeats

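
The loop above can be sketched in Python as a single `sync_cronjobs` pass that a driver would call every 10 seconds. This is a simplification under stated assumptions, not the linked Go code: schedules are plain intervals in seconds instead of cron expressions, the cronjobs and jobs are in-memory lists instead of API objects, and all field names are invented for the example.

```python
def sync_cronjobs(cronjobs, jobs, now, keep_finished=3):
    """One pass of a simplified cron controller. Returns the names of jobs created."""
    created = []
    for cj in cronjobs:  # list all the cronjobs that exist
        due = cj["last_run"] is None or now - cj["last_run"] >= cj["every_seconds"]
        if due:  # does it need to run right now?
            job_name = f'{cj["name"]}-{int(now)}'
            # Only record the Job; something else is responsible for running it.
            jobs.append({"name": job_name, "owner": cj["name"], "finished": False})
            cj["last_run"] = now
            created.append(job_name)
        # Clean up: keep only the newest `keep_finished` finished jobs per cronjob.
        finished = [j for j in jobs if j["owner"] == cj["name"] and j["finished"]]
        stale = finished[:-keep_finished] if keep_finished else finished
        for old in stale:
            jobs.remove(old)
    return created


if __name__ == "__main__":
    cronjobs = [{"name": "backup", "every_seconds": 60, "last_run": None}]
    jobs = []
    for now in (0, 10, 60):  # a real driver would loop forever with time.sleep(10)
        print(now, sync_cronjobs(cronjobs, jobs, now))
```

Like the earlier sketches, the pass is driven entirely by the stored state (`last_run` and the job list), which is what lets the real controller be restarted at any point.
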

The Kubernetes model is pretty constrained (it has this pattern where resources are defined in etcd, and controllers read those resources and update etcd), and I think having this relatively opinionated/constrained model makes it easier to develop your own distributed systems inside the Kubernetes framework.

Kamal introduced me to this idea of “Kubernetes is a good platform for writing your own distributed systems” instead of just “Kubernetes is a distributed system you can use” and I think it’s really interesting. He has a prototype of a [system to run an HTTP service for every branch you push to GitHub][5]. It took him a weekend and is like 800 lines of Go, which I thought was impressive!

### Kubernetes lets you do some amazing things (but isn’t easy)

I started out by saying “Kubernetes lets you do these magical things, you can just spin up so much infrastructure with a single configuration file, it’s amazing”. And that’s true!

What I mean by “Kubernetes isn’t easy” is that Kubernetes has a lot of moving parts, and learning how to successfully operate a highly available Kubernetes cluster is a lot of work. Like I find that with a lot of the abstractions it gives me, I need to understand what is underneath those abstractions in order to debug issues and configure things properly. I love learning new things so this doesn’t make me angry or anything, I just think it’s important to know :)

One specific example of “I can’t just rely on the abstractions” that I’ve struggled with is that I needed to learn a LOT [about how networking works on Linux][6] to feel confident with setting up Kubernetes networking, way more than I’d ever had to learn about networking before. This was very fun but pretty time consuming. I might write more about what is hard/interesting about setting up Kubernetes networking at some point.

Or I wrote a [2000 word blog post][7] about everything I had to learn about Kubernetes’ different options for certificate authorities to be able to set up my Kubernetes CAs successfully.

I think some of these managed Kubernetes systems like GKE (Google’s Kubernetes product) may be simpler since they make a lot of decisions for you but I haven’t tried any of them.

--------------------------------------------------------------------------------

via: https://jvns.ca/blog/2017/10/05/reasons-kubernetes-is-cool/

Author: [Julia Evans][a]

Translator: [Translator ID](https://github.com/译者ID)

Proofreader: [Proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)

[a]:https://jvns.ca/about
[1]:https://github.com/kamalmarhubi/kubereview
[2]:https://jvns.ca/categories/kubernetes
[3]:https://github.com/kelseyhightower/kubernetes-the-hard-way
[4]:https://github.com/kubernetes/kubernetes/blob/e4551d50e57c089aab6f67333412d3ca64bc09ae/pkg/controller/cronjob/cronjob_controller.go
[5]:https://github.com/kamalmarhubi/kubereview
[6]:https://jvns.ca/blog/2016/12/22/container-networking/
[7]:https://jvns.ca/blog/2017/08/05/how-kubernetes-certificates-work/

translated/tech/20171005 Reasons Kubernetes is cool.md (150 lines, normal file)
@ -0,0 +1,150 @

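Why Kubernetes is cool
============================================================

When I first started learning Kubernetes (about a year and a half ago?), I really didn’t understand why I should care about it.

After working with Kubernetes full time for a little over three months, I now have some thoughts on why I think it is useful. (I am still a long way from being a Kubernetes expert!) I hope this post helps you understand what Kubernetes can do!

I will try to explain some of the reasons I find Kubernetes interesting without using the words “cloud native”, “orchestration”, “container”, or any Kubernetes-specific terminology :). I explain this mainly from the point of view of a Kubernetes operator / infrastructure engineer, because my current job is to set up Kubernetes and make it work well.

I am not going to try to answer questions like “should you use Kubernetes in your production systems?” at all. That is a very complicated question (not least because “production” has different requirements depending on what you are doing).
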
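### Kubernetes lets you run code in production without setting up new servers

The first time Kubernetes was pitched to me was in the following conversation with my partner Kamal.

It went roughly like this:

* Kamal: With Kubernetes you can set up a new service with a single command.

* Julia: I don’t see how that is possible.

* Kamal: Like, you write one configuration file, apply it, and then you have an HTTP service running in production.

* Julia: But right now I have to create new AWS instances, write a Puppet manifest, set up service discovery, configure the load balancers, configure the deployment software, and make sure DNS works. If nothing goes wrong, it takes at least 4 hours.

* Kamal: Right. With Kubernetes you don’t have to do any of that. You can set up a new HTTP service in 5 minutes and it will run automatically. As long as your cluster has spare capacity, it just works!

* Julia: There must be a catch.

There is a catch of sorts: setting up a production Kubernetes cluster is (in my experience) definitely not easy. (See [Kubernetes The Hard Way][3] for what is involved in getting started.) But we are not going to go into that now.

So the first cool thing about Kubernetes is that it can make life much easier for developers who want to deploy new software to production. That is cool, and it is actually true: once you have a working Kubernetes cluster, you really can set up a production HTTP service (“run 5 copies of this application, set up a load balancer, give it this DNS name, done”) with just one configuration file. It is really fun to watch.
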
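### Kubernetes gives you better visibility into, and control over, the code you run in production

In my opinion, you cannot understand Kubernetes without understanding etcd. So let’s talk about etcd first!

Imagine I asked you right now: “tell me every application you run in production, which host it runs on, whether it is healthy, and whether it has a DNS name assigned.” I don’t know about you, but I would have to look in a lot of different places to answer, and it would take me a long time. I certainly could not get the answer from a single API.

In Kubernetes, all the state of your cluster – running applications (“pods”), nodes, DNS names, cron jobs, and so on – is stored in a single database (etcd). Every Kubernetes component is stateless and basically works by:

* Reading state from etcd (for example, “the list of pods assigned to node 1”)

* Making changes (for example, “actually start pod A on node 1”)

* Updating the state in etcd (for example, “set the state of pod A to ‘running’”)

This means that if you want to answer a question like “how many nginx pods are running in that availability zone right now?”, you can answer it by querying one unified API (the Kubernetes API). And you have exactly the same access to that API as every other Kubernetes component.

It also means that you can easily control everything running in Kubernetes. If you want to, you can:

* Implement a complicated custom rollout strategy for deployments (deploy one thing, wait 2 minutes, deploy 5 more, wait 3.7 minutes, and so on)

* Automatically [start a new web server][1] every time a branch is pushed to GitHub

* Monitor all your running applications to make sure they have a reasonable memory limit

All you need to do for any of these is write a program that talks to the Kubernetes API (a “controller”).

Another exciting thing about the Kubernetes API is that you are not limited to the functionality Kubernetes already provides! If you have your own ideas about how your software should be deployed / created / monitored, you can write code that uses the Kubernetes API to do it! It lets you do whatever you need.
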
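### If every Kubernetes component dies, your code keeps running

One thing I was promised about Kubernetes (by various blog posts :)) was that “if the Kubernetes API server and the other components die, your code will keep running”. I thought that sounded cool in theory, but I was not sure it was actually true.

So far, it seems to be true!

I have been through a few etcd outages now, and what happens is:

1. All the code that was running keeps running

2. Nothing _new_ happens (you cannot deploy new code or make changes, and cron jobs stop working)

3. When everything comes back, the cluster catches up on whatever it missed

This does mean that if etcd goes down and one of your applications crashes or something else happens to it, the application cannot come back up until etcd is restored.
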
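### Kubernetes’ design is quite resilient to bugs

Like any software, Kubernetes has bugs. For example, right now the controller manager in our cluster has a memory leak, and the scheduler crashes regularly. Bugs are obviously not good, but I have found that Kubernetes’ design helps mitigate many of the bugs in its core components.

If you restart any component, what happens is:

* It reads all the state relevant to it from etcd

* Based on that state, it starts doing whatever it is supposed to do (scheduling pods, garbage collecting completed pods, scheduling cronjobs, deploying daemonsets, and so on)

Because none of the components keep state in memory, you can restart them at any time, and that helps mitigate all sorts of bugs.

For example, say there is a memory leak in your controller manager. Because the controller manager is stateless, you can simply restart it every hour and be confident that this will not cause any consistency problems. Or take the bug we hit in the scheduler, where it would sometimes just forget about pods and never schedule them. You could mitigate that by restarting the scheduler every 10 minutes. (We did not do that, we fixed the bug instead, but you _could_ :))

So I feel I can trust Kubernetes’ design to help keep the cluster state consistent even when there are bugs in its core components. And in general, the software is improving over time. The only stateful thing you have to operate is etcd.

Not to go on about “state” too much, but I think it is cool that in Kubernetes the only thing you need a backup/restore plan for is etcd (unless you use persistent volumes for your pods). I think that makes Kubernetes operations a lot easier to reason about.
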
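### Implementing new distributed systems on top of Kubernetes is relatively easy

Suppose you want to implement a distributed cron job scheduling system! Doing that from scratch is a huge amount of work. But implementing a distributed cron job scheduling system inside Kubernetes is much easier! (It is still not trivial; it is still a distributed system.)

The first time I read the code of the Kubernetes cronjob controller, I was delighted by how simple it is. Here it is, go read it; the main logic is about 400 lines of Go. Go read it! => [cronjob_controller.go][4] <=

In essence, what the cronjob controller does is:

* Every 10 seconds:

    * List all the cronjobs that exist

    * Check whether any of them need to run right now

    * If so, create a new Job object to be scheduled and actually run by other Kubernetes controllers

    * Clean up finished jobs

    * Repeat

The Kubernetes model is quite constrained (it follows a pattern where resources are defined in etcd, and controllers read those resources and update etcd). I think this relatively opinionated, constrained model makes it easier to develop your own distributed systems inside the Kubernetes framework.

Kamal introduced me to the idea that “Kubernetes is a good platform for writing your own distributed systems”, rather than just “Kubernetes is a distributed system you can use”, and I find it really interesting. He has a prototype of a [system to run an HTTP service for every branch you push to GitHub][5]. It took him a weekend and is about 800 lines of Go, which I thought was impressive!
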
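### Kubernetes lets you do some amazing things (but it isn’t easy)

I started out by saying “Kubernetes lets you do these magical things, you can spin up so much infrastructure with a single configuration file, it’s amazing”. And that is true!

What I mean by “Kubernetes isn’t easy” is that Kubernetes has a lot of moving parts, and learning how to successfully operate a highly available Kubernetes cluster is a lot of work. I find that for many of the abstractions it gives me, I need to understand what lies underneath them in order to debug problems and configure things correctly. I like learning new things, so this does not make me angry or anything; I just think it is important to know :)

One specific example of “I can’t just rely on the abstractions” is that I had to learn a LOT [about how networking works on Linux][6] before I felt confident setting up Kubernetes networking, far more than I had ever learned about networking before. That was fun but very time consuming. At some point I may write more about what is hard/interesting about setting up Kubernetes networking.

Or, I wrote a [2000 word blog post][7] about everything I had to learn about Kubernetes’ different options for certificate authorities before I could set up my Kubernetes CAs successfully.

I think managed Kubernetes systems like GKE (Google’s Kubernetes product) may be simpler, because they make a lot of decisions for you, but I have not tried any of them.

--------------------------------------------------------------------------------
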
via: https://jvns.ca/blog/2017/10/05/reasons-kubernetes-is-cool/

Author: [Julia Evans][a]

Translator: [qhwdw](https://github.com/qhwdw)

Proofreader: [Proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)

[a]:https://jvns.ca/about
[1]:https://github.com/kamalmarhubi/kubereview
[2]:https://jvns.ca/categories/kubernetes
[3]:https://github.com/kelseyhightower/kubernetes-the-hard-way
[4]:https://github.com/kubernetes/kubernetes/blob/e4551d50e57c089aab6f67333412d3ca64bc09ae/pkg/controller/cronjob/cronjob_controller.go
[5]:https://github.com/kamalmarhubi/kubereview
[6]:https://jvns.ca/blog/2016/12/22/container-networking/
[7]:https://jvns.ca/blog/2017/08/05/how-kubernetes-certificates-work/