mirror of
https://github.com/LCTT/TranslateProject.git
synced 2025-01-13 22:30:37 +08:00
[Translation complete] How to get into DevOps, first draft (#10300)
parent
61e375f04f
commit
c9bb7fd030
@ -1,144 +0,0 @@
|
|||||||
belitex translating
|
|
||||||
How to get into DevOps
|
|
||||||
======
|
|
||||||
![](https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/rh_003784_02_os.comcareers_resume_rh1x.png?itok=S3HGxi6E)
|
|
||||||
|
|
||||||
I've observed a sharp uptick of developers and systems administrators interested in "getting into DevOps" within the past year or so. This pattern makes sense: In an age in which a single developer can spin up a globally distributed infrastructure for an application with a few dollars and a few API calls, the gap between development and systems administration is narrower than ever. Although I've seen plenty of blog posts and articles about cool DevOps tools and ideas to think about, I've seen far less content offering pointers and suggestions for people looking to get into this work.
|
|
||||||
|
|
||||||
My goal with this article is to draw what that path looks like. My thoughts are based upon several interviews, chats, late-night discussions on [reddit.com/r/devops][1], and random conversations, likely over beer and delicious food. I'm also interested in hearing feedback from those who have made the jump; if you have, please reach out through [my blog][2], [Twitter][3], or in the comments below. I'd love to hear your thoughts and stories.
|
|
||||||
|
|
||||||
### Olde world IT
|
|
||||||
|
|
||||||
Understanding history is key to understanding the future, and DevOps is no exception. To understand the pervasiveness and popularity of the DevOps movement, it helps to understand what IT was like in the late '90s and most of the '00s. This was my experience.
|
|
||||||
|
|
||||||
I started my career in late 2006 as a Windows systems administrator in a large, multi-national financial services firm. In those days, adding new compute involved calling Dell (or, in our case, CDW) and placing a multi-hundred-thousand-dollar order of servers, networking equipment, cables, and software, all destined for your on- and offsite datacenters. Although VMware was still convincing companies that using virtual machines was, indeed, a cost-effective way of hosting their "performance-sensitive" applications, many companies, including mine, pledged allegiance to running applications on their physical hardware.
|
|
||||||
|
|
||||||
Our technology department had an entire group dedicated to datacenter engineering and operations, and its job was to negotiate our leasing rates down to some slightly less absurd monthly rate and ensure that our systems were being cooled properly (an exponentially difficult problem if you have enough equipment). If the group was lucky/wealthy enough, the offshore datacenter crew knew enough about all of our server models to not accidentally pull the wrong thing during after-hours trading. Amazon Web Services and Rackspace were slowly beginning to pick up steam, but were far from critical mass.
|
|
||||||
|
|
||||||
In those days, we also had teams dedicated to ensuring that the operating systems and software running on top of that hardware worked when they were supposed to. The engineers were responsible for designing reliable architectures for patching, monitoring, and alerting on these systems, as well as defining what the "gold image" looked like. Most of this work was done with a lot of manual experimentation. The extent of most tests was writing a runbook describing what you did, then confirming that following said runbook actually produced the result you expected. This was important in a large organization like ours, since most of the level 1 and 2 support was offshore, and the extent of their training ended with those runbooks.
|
|
||||||
|
|
||||||
(This is the world that your author lived in for the first three years of his career. My dream back then was to be the one who made the gold standard!)
|
|
||||||
|
|
||||||
Software releases were another beast altogether. Admittedly, I didn't gain a lot of experience working on this side of the fence. However, from stories that I've gathered (and recent experience), much of the daily grind for software development during this time went something like this:
|
|
||||||
|
|
||||||
* Developers wrote code as specified by the technical and functional requirements laid out by business analysts from meetings they weren't invited to.
|
|
||||||
* Optionally, developers wrote unit tests for their code to ensure that it didn't do anything obviously crazy, like try to divide by zero without throwing an exception.
|
|
||||||
* When done, developers would mark their code as "Ready for QA." A quality assurance person would pick up the code and run it in their own environment, which might or might not be like production or even the environment the developer used to test their own code against.
|
|
||||||
* Failures would get sent back to the developers within "a few days or weeks" depending on other business activities and where priorities fell.
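The unit-test step in the list above can be sketched in a few lines of Python. This is only a minimal illustration, not code from any real project: `average_latency` and its test class are hypothetical names I made up for the example. It shows the kind of check a developer might write so that a division by zero fails loudly instead of slipping through.

```python
import unittest


def average_latency(total_ms, request_count):
    """Hypothetical helper: mean latency per request, in milliseconds."""
    if request_count == 0:
        # Fail loudly here rather than letting a ZeroDivisionError
        # surface somewhere far away from the cause.
        raise ValueError("request_count must be greater than zero")
    return total_ms / request_count


class AverageLatencyTest(unittest.TestCase):
    def test_typical_input(self):
        self.assertEqual(average_latency(300, 3), 100)

    def test_zero_requests_raises(self):
        with self.assertRaises(ValueError):
            average_latency(300, 0)


def run_tests():
    """Run the test case and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageLatencyTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    run_tests()
```

In a pipeline, a failing run of tests like these would stop the release before the code ever reached QA.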
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Although sysadmins and developers didn't often see eye to eye, the one thing they shared a common hatred for was "change management." This was a composition of highly regulated (and, in the case of my employer at the time, highly necessary) rules and procedures governing when and how technical changes happened in a company. Most companies followed [ITIL][4] practices, which, in a nutshell, asked a lot of questions around why, when, where, and how things happened and provided a process for establishing an audit trail of the decisions that led up to those answers.
|
|
||||||
|
|
||||||
As you could probably gather from my short history lesson, many, many things were done manually in IT. This led to a lot of mistakes. Lots of mistakes led to lots of lost revenue. Change management's job was to minimize those lost revenues; this usually came in the form of releases only every two weeks and changes to servers, regardless of their impact or size, queued up to occur between Friday at 4 p.m. and Monday at 5:59 a.m. (Ironically, this batching of work led to even more mistakes, usually more serious ones.)
|
|
||||||
|
|
||||||
### DevOps isn't a Tiger Team
|
|
||||||
|
|
||||||
You might be thinking "What is Carlos going on about, and when is he going to talk about Ansible playbooks?" I love Ansible tons, but hang on; this is important.
|
|
||||||
|
|
||||||
Have you ever been assigned to a project where you had to interact with the "DevOps" team? Or did you have to rely on a "configuration management" or "CI/CD" team to ensure your pipeline was set up properly? Have you had to attend meetings about your release and what it pertains to, weeks after the work was marked "code complete"?
|
|
||||||
|
|
||||||
If so, then you're reliving history. All of that stems from everything described above.
|
|
||||||
|
|
||||||
[Silos form][5] out of an instinctual draw to working with people like ourselves. Naturally, it's no surprise that this human trait also manifests in the workplace. I even saw this play out at a 250-person startup where I used to work. When I started, developers all worked in common pods and collaborated heavily with each other. As the codebase grew in complexity, developers who worked on common features naturally aligned with each other to try and tackle the complexity within their own feature. Soon afterwards, feature teams were officially formed.
|
|
||||||
|
|
||||||
Sysadmins and developers at many of the companies I worked at not only formed natural silos like this, but also fiercely competed with each other. Developers were mad at sysadmins when their environments were broken. Developers were mad at sysadmins when their environments were too locked down. Sysadmins were mad that developers were breaking their environments in arbitrary ways all of the time. Sysadmins were mad at developers for asking for way more computing power than they needed. Neither side understood each other, and worse yet, neither side wanted to.
|
|
||||||
|
|
||||||
Most developers were uninterested in the basics of operating systems, kernels, or, in some cases, computer hardware. Likewise, most sysadmins, even Linux sysadmins, took a 10-foot pole approach to learning how to code. They tried a bit of C in college, hated it, and never wanted to touch an IDE again. Consequently, developers threw their environment problems over the wall to sysadmins, sysadmins prioritized them with the hundreds of other things that were thrown over the wall to them, and everyone busy-waited angrily while hating each other. The purpose of DevOps was to put an end to this.
|
|
||||||
|
|
||||||
DevOps isn't a team. CI/CD isn't a group in Jira. DevOps is a way of thinking. According to the movement, in an ideal world, developers, sysadmins, and business stakeholders would be working as one team. While they might not know everything about each other's worlds, not only do they all know enough to understand each other and their backlogs, but they can, for the most part, speak the same language.
|
|
||||||
|
|
||||||
This is the basis behind having all infrastructure and business logic be in code and subject to the same deployment pipelines as the software that sits on top of it. Everybody is winning because everyone understands each other. This is also the basis behind the rise of other tools like chatbots and easily accessible monitoring and graphing.
|
|
||||||
|
|
||||||
[Adam Jacob said][6] it best: "DevOps is the word we will use to describe the operational side of the transition to enterprises being software led."
|
|
||||||
|
|
||||||
### What do I need to know to get into DevOps?
|
|
||||||
|
|
||||||
I'm commonly asked this question, and the answer, like most open-ended questions like this, is: It depends.
|
|
||||||
|
|
||||||
At the moment, the "DevOps engineer" role varies from company to company. Smaller companies that have plenty of software developers but fewer folks who understand infrastructure will likely look for people with more experience administering systems. Other, usually larger and/or older, companies that have a solid sysadmin organization will likely optimize for something closer to a [Google site reliability engineer][7], i.e., "a software engineer to design an operations function." This isn't written in stone, however; as with any technology job, the decision largely depends on the hiring manager sponsoring it.
|
|
||||||
|
|
||||||
That said, we typically look for engineers who are interested in learning more about:
|
|
||||||
|
|
||||||
* How to administer and architect secure and scalable cloud platforms (usually on AWS, but Azure, Google Cloud Platform, and PaaS providers like DigitalOcean and Heroku are popular too);
|
|
||||||
* How to build and optimize deployment pipelines and deployment strategies on popular [CI/CD][8] tools like Jenkins and GoCD, and cloud-based ones like Travis CI or CircleCI;
|
|
||||||
* How to monitor, log, and alert on changes in your system with time-series-based tools like Kibana, Grafana, Splunk, Loggly, or Logstash; and
|
|
||||||
* How to maintain infrastructure as code with configuration management tools like Chef, Puppet, or Ansible, as well as deploy said infrastructure with tools like Terraform or CloudFormation.
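To make the monitoring-and-alerting item in the list above a little more concrete, here is a toy sketch in Python. It is not how Grafana, Splunk, or any of the tools named above work internally; `find_alerts`, the window size, and the threshold are all assumptions I picked for illustration. It shows the basic idea of alerting on a time series: smooth the samples with a rolling mean and flag the points where that mean crosses a limit.

```python
from collections import deque
from statistics import mean


def find_alerts(samples, window=3, threshold=500.0):
    """Return the indices where the rolling mean of the last `window`
    samples exceeds `threshold`. All names here are hypothetical."""
    recent = deque(maxlen=window)  # keeps only the newest `window` values
    alerts = []
    for index, value in enumerate(samples):
        recent.append(value)
        # Only evaluate once a full window of samples is available.
        if len(recent) == window and mean(recent) > threshold:
            alerts.append(index)
    return alerts


# Made-up request latencies in milliseconds, with a spike in the middle.
latencies_ms = [120, 130, 125, 900, 950, 980, 140, 135]
print(find_alerts(latencies_ms))  # → [4, 5, 6]
```

Averaging over a window means a single slow request does not page anyone, while a sustained spike does. Real tools let you tune that trade-off.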
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Containers are becoming increasingly popular as well. Despite the [beef against the status quo][9] surrounding Docker at scale, containers are quickly becoming a great way of achieving an extremely high density of services and applications running on fewer systems while increasing their reliability. (Orchestration tools like Kubernetes or Mesos can spin up new containers in seconds if the host they're being served by fails.) Given this, having knowledge of Docker or rkt and an orchestration platform will go a long way.
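The rescheduling behavior mentioned in the parenthetical above can be sketched as a toy Python function. To be clear, this is not how Kubernetes or Mesos are implemented; `reschedule`, the host names, and the "least-loaded host" rule are assumptions made up for this example. It only illustrates the idea that an orchestrator compares where containers are running against which hosts are healthy and moves anything that was stranded.

```python
def reschedule(placements, healthy_hosts):
    """Return new container -> host placements, moving containers whose
    host has failed onto the least-loaded healthy host (hypothetical rule)."""
    if not healthy_hosts:
        raise RuntimeError("no healthy hosts available")

    # Count how many containers each healthy host already runs.
    load = {host: 0 for host in healthy_hosts}
    for host in placements.values():
        if host in load:
            load[host] += 1

    new_placements = {}
    for container, host in sorted(placements.items()):
        if host not in load:
            # The host failed: pick the healthy host with the fewest
            # containers, breaking ties by name for determinism.
            host = min(load, key=lambda h: (load[h], h))
            load[host] += 1
        new_placements[container] = host
    return new_placements


current = {"web-1": "host-a", "web-2": "host-b", "api-1": "host-b"}
# Pretend host-b just died and host-c is a healthy spare.
print(reschedule(current, ["host-a", "host-c"]))
```

A real orchestrator runs a check like this continuously and also weighs CPU, memory, and placement constraints, which this sketch ignores.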
|
|
||||||
|
|
||||||
If you're a systems administrator who's looking to get into DevOps, you will also need to know how to write code. Python and Ruby are popular languages for this purpose, as they are portable (i.e., can be used on any operating system), fast, and easy to read and learn. They also form the underpinnings of the industry's most popular configuration management tools (Python for Ansible, Ruby for Chef and Puppet) and cloud API clients (Python and Ruby are commonly used for AWS, Azure, and Google Cloud Platform clients).
|
|
||||||
|
|
||||||
If you're a developer looking to make this change, I highly recommend learning more about Unix, Windows, and networking fundamentals. Even though the cloud abstracts away many of the complications of administering a system, debugging slow application performance is aided greatly by knowing how these things work. I've included a few books on this topic in the next section.
|
|
||||||
|
|
||||||
If this sounds overwhelming, you aren't alone. Fortunately, there are plenty of small projects to dip your feet into. One such toy project is Gary Stafford's Voter Service, a simple Java-based voting platform. We ask our candidates to take the service from GitHub to production infrastructure through a pipeline. One can combine that with Rob Mile's awesome DevOps Tutorial repository to learn about ways of doing this.
|
|
||||||
|
|
||||||
Another great way of becoming familiar with these tools is taking popular services and setting up an infrastructure for them using nothing but AWS and configuration management. Set it up manually first to get a good idea of what to do, then replicate what you just did using nothing but CloudFormation (or Terraform) and Ansible. Surprisingly, this is a large part of the work that we infrastructure devs do for our clients on a daily basis. Our clients find this work to be highly valuable!
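As a small taste of the "replicate it in CloudFormation" step described above, here is a hedged Python sketch that generates a minimal CloudFormation template as JSON. The function name `build_template` and the logical ID `ArtifactBucket` are hypothetical; the template itself uses only the standard `AWSTemplateFormatVersion`, `Description`, and `Resources` sections and a single `AWS::S3::Bucket` resource. It is far smaller than the infrastructure a real service needs, and it is meant only to show what "infrastructure as code" looks like on the page.

```python
import json


def build_template(bucket_logical_id="ArtifactBucket"):
    """Build a minimal CloudFormation template describing one versioned
    S3 bucket. The logical ID is a hypothetical name for illustration."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Toy example: a single versioned S3 bucket",
        "Resources": {
            bucket_logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "VersioningConfiguration": {"Status": "Enabled"}
                },
            }
        },
    }


if __name__ == "__main__":
    # Print the template; in practice you would save it to a file, commit
    # it to version control, and hand it to CloudFormation.
    print(json.dumps(build_template(), indent=2))
```

Because the template is plain data in version control, the same change-review and pipeline steps that apply to application code can apply to it.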
|
|
||||||
|
|
||||||
### Books to read
|
|
||||||
|
|
||||||
If you're looking for other resources on DevOps, here are some theory and technical books that are worth a read.
|
|
||||||
|
|
||||||
#### Theory books
|
|
||||||
|
|
||||||
* [The Phoenix Project][10] by Gene Kim. This is a great book that covers much of the history I explained earlier (with much more color) and describes the journey to a lean company running on agile and DevOps.
|
|
||||||
* [Driving Technical Change][11] by Terrance Ryan. Awesome little book on common personalities within most technology organizations and how to deal with them. This helped me out more than I expected.
|
|
||||||
* [Peopleware][12] by Tom DeMarco and Tim Lister. A classic on managing engineering organizations. A bit dated, but still relevant.
|
|
||||||
* [Time Management for System Administrators][13] by Tom Limoncelli. While this is heavily geared towards sysadmins, it provides great insight into the life of a systems administrator at most large organizations. If you want to learn more about the war between sysadmins and developers, this book might explain more.
|
|
||||||
* [The Lean Startup][14] by Eric Ries. Describes how Eric's 3D avatar company, IMVU, discovered how to work lean, fail fast, and find profit faster.
|
|
||||||
* [Lean Enterprise][15] by Jez Humble and friends. This book is an adaptation of The Lean Startup for the enterprise. Both are great reads and do a good job of explaining the business motivation behind DevOps.
|
|
||||||
* [Infrastructure As Code][16] by Kief Morris. Awesome primer on, well, infrastructure as code! It does a great job of describing why it's essential for any business to adopt this for their infrastructure.
|
|
||||||
* [Site Reliability Engineering][17] by Betsy Beyer, Chris Jones, Jennifer Petoff, and Niall Richard Murphy. A book explaining how Google does SRE, also known as "DevOps before DevOps was a thing." Provides interesting opinions on how to handle uptime, latency, and keeping engineers happy.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Technical books
|
|
||||||
|
|
||||||
If you're looking for books that'll take you straight to code, you've come to the right section.
|
|
||||||
|
|
||||||
* [TCP/IP Illustrated][18] by the late W. Richard Stevens. This is the classic (and, arguably, complete) tome on the fundamental networking protocols, with special emphasis on TCP/IP. If you've heard of Layers 1, 2, 3, and 4 and are interested in learning more, you'll need this book.
|
|
||||||
* [UNIX and Linux System Administration Handbook][19] by Evi Nemeth, Trent Hein, and Ben Whaley. A great primer into how Linux and Unix work and how to navigate around them.
|
|
||||||
* [Learn Windows Powershell In A Month of Lunches][20] by Don Jones and Jeffrey Hicks. If you're doing anything automated with Windows, you will need to learn how to use PowerShell. This is the book that will help you do that. Don Jones is a well-known MVP in this space.
|
|
||||||
* Practically anything by [James Turnbull][21]. He puts out great technical primers on popular DevOps-related tools.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
From companies deploying everything to bare metal (there are plenty that still do, for good reasons) to trailblazers doing everything serverless, DevOps is likely here to stay for a while. The work is interesting, the results are impactful, and, most important, it helps bridge the gap between technology and business. It's a wonderful thing to see.
|
|
||||||
|
|
||||||
Originally published at [Neurons Firing on a Keyboard][22], CC-BY-SA.
|
|
||||||
|
|
||||||
--------------------------------------------------------------------------------
|
|
||||||
|
|
||||||
via: https://opensource.com/article/18/1/getting-devops
|
|
||||||
|
|
||||||
Author: [Carlos Nunez][a]
|
|
||||||
Translator: [translator ID](https://github.com/译者ID)
|
|
||||||
Proofreader: [proofreader ID](https://github.com/校对者ID)
|
|
||||||
|
|
||||||
This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/) (Linux China)
|
|
||||||
|
|
||||||
[a]:https://opensource.com/users/carlosonunez
|
|
||||||
[1]:https://www.reddit.com/r/devops/
|
|
||||||
[2]:https://carlosonunez.wordpress.com/
|
|
||||||
[3]:https://twitter.com/easiestnameever
|
|
||||||
[4]:https://en.wikipedia.org/wiki/ITIL
|
|
||||||
[5]:https://www.psychologytoday.com/blog/time-out/201401/getting-out-your-silo
|
|
||||||
[6]:https://twitter.com/adamhjk/status/572832185461428224
|
|
||||||
[7]:https://landing.google.com/sre/interview/ben-treynor.html
|
|
||||||
[8]:https://en.wikipedia.org/wiki/CI/CD
|
|
||||||
[9]:https://thehftguy.com/2016/11/01/docker-in-production-an-history-of-failure/
|
|
||||||
[10]:https://itrevolution.com/book/the-phoenix-project/
|
|
||||||
[11]:https://pragprog.com/book/trevan/driving-technical-change
|
|
||||||
[12]:https://en.wikipedia.org/wiki/Peopleware:_Productive_Projects_and_Teams
|
|
||||||
[13]:http://shop.oreilly.com/product/9780596007836.do
|
|
||||||
[14]:http://theleanstartup.com/
|
|
||||||
[15]:https://info.thoughtworks.com/lean-enterprise-book.html
|
|
||||||
[16]:http://infrastructure-as-code.com/book/
|
|
||||||
[17]:https://landing.google.com/sre/book.html
|
|
||||||
[18]:https://en.wikipedia.org/wiki/TCP/IP_Illustrated
|
|
||||||
[19]:http://www.admin.com/
|
|
||||||
[20]:https://www.manning.com/books/learn-windows-powershell-in-a-month-of-lunches-third-edition
|
|
||||||
[21]:https://jamesturnbull.net/
|
|
||||||
[22]:https://carlosonunez.wordpress.com/2017/03/02/getting-into-devops/
|
|
145
translated/talk/20180117 How to get into DevOps.md
Normal file
@ -0,0 +1,145 @@
|
|||||||
|
|
||||||
|
A Practical Guide to DevOps
|
||||||
|
======
|
||||||
|
|
||||||
|
![](https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/rh_003784_02_os.comcareers_resume_rh1x.png?itok=S3HGxi6E)
|
||||||
|
|
||||||
|
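Over roughly the past year, I've noticed a sharp increase in developers and systems administrators interested in "practicing DevOps." This change makes sense: today a single developer can run an application on a full distributed infrastructure for very little money and a few API calls, so development and operations are closer than they have ever been. I've read many blog posts and articles introducing cool DevOps tools and related ideas, but I've rarely seen content that offers guidance and advice to people who want to practice DevOps.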
|
||||||
|
|
||||||
|
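The goal of this article is to describe how to get there. My thoughts are based on interviews, chats, and late-night discussions on Reddit's [devops][1] community, plus some random conversations, usually over beer and good food. If you've already made this move, I'm interested in your feedback; please reach out through [my blog][2] or [Twitter][3], or comment below. I'd love to hear your thoughts and stories.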
|
||||||
|
|
||||||
|
### IT in the old days
|
||||||
|
|
||||||
|
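Understanding history is key to understanding the future, and DevOps is no exception. To understand why the DevOps movement became so widespread and popular, it helps to look at what IT was like in the late 1990s and the first decade of the 2000s. This was my experience.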
|
||||||
|
|
||||||
|
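My first job was as a Windows systems administrator at a large multinational financial services firm. Back then, adding compute capacity meant calling Dell (or, as at our company, CDW) and placing an order worth hundreds of thousands of dollars for servers, networking equipment, cables, and software, all of it shipped to onsite or offsite datacenters. Although VMware was still trying to convince companies that running their "performance-sensitive" applications on virtual machines was more cost-effective, many companies, including ours, stayed loyal to running applications on physical machines.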
|
||||||
|
|
||||||
|
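Our technology department had an entire team dedicated to datacenter engineering and operations. Their job included negotiating prices so that absurd monthly leasing fees came down a little, and making sure our systems were cooled properly (a problem whose difficulty grows exponentially with enough equipment). If the team was lucky and wealthy enough, the offshore datacenter staff knew our server models well enough to avoid accidentally pulling the wrong thing during after-hours trading. Amazon Web Services and Rackspace were gradually picking up speed at the time, but were still far from critical mass.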
|
||||||
|
|
||||||
|
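Back then we also had dedicated teams to make sure the operating systems and software running on that hardware worked as expected. These engineers designed reliable architectures for patching, monitoring, and alerting on systems, and defined what went into the base image (gold image). Most of this was done through a lot of manual experimentation, much of it aimed at writing a runbook that described what to do and confirming that following it really produced the expected result. In an organization as large as ours this mattered, because level 1 and level 2 support were both offshore, and their training covered only those runbooks.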
|
||||||
|
|
||||||
|
(This was the world of the first three years of my career. My dream back then was to be the one who set the gold standard!)
|
||||||
|
|
||||||
|
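Software releases were a different beast entirely. Admittedly, I didn't build up much experience on this side. But from the stories I've gathered (and recent experience), the daily routine of most software development at the time went something like this: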
|
||||||
|
|
||||||
|
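* Developers wrote code according to technical and functional requirements that came out of business analysts' meetings, meetings the developers weren't invited to.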
|
||||||
|
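* Developers could optionally write unit tests for their code to make sure it didn't do anything obviously crazy, such as dividing by zero without throwing an exception.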
|
||||||
|
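* Developers then marked their code "Ready for QA." A quality assurance person would deploy that version of the code to their own environment, which might or might not resemble production, or even the development environment.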
|
||||||
|
* Failures were reported back to the developers within days or weeks, depending on other business activities and priorities.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
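Although systems administrators and developers often disagreed, they were united in their hatred of "change management." Change management consisted of highly regulated (and, in the case of my employer at the time, highly necessary) rules and procedures governing when and how a company made technical changes. Many companies followed [ITIL][4]. In short, ITIL asked many questions about why, when, where, and how things happened, and provided a process for keeping an audit trail of the decisions that led to the final answers.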
|
||||||
|
|
||||||
|
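As you may have gathered from my short history lesson, a great many things in IT were done by hand back then. That led to a lot of mistakes, and the mistakes led to a lot of lost money. Change management's job was to minimize those losses, and it usually took this form: regardless of a change's impact or size, releases happened only once every two weeks, and changes had to queue for a release window between Friday at 4 p.m. and Monday at 5:59 a.m. (Ironically, this process led to more mistakes, and usually more serious ones.)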
|
||||||
|
|
||||||
|
### DevOps isn't a team of experts
|
||||||
|
|
||||||
|
You might be thinking, "What is Carlos talking about, and when will he get to Ansible playbooks?" I love Ansible, but please hang on a little longer; what follows is important.
|
||||||
|
|
||||||
|
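Have you ever been assigned to a project where you had to deal with the "DevOps" team? Have you ever relied on a "configuration management" or "continuous integration/continuous delivery" team to make sure your pipeline was set up correctly? Have you ever attended release meetings weeks after the code was finished?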
|
||||||
|
|
||||||
|
If so, you're reliving history, and that history comes from everything described above.
|
||||||
|
|
||||||
|
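We instinctively like working with people similar to ourselves, and that leads to [silos][5] forming. Naturally, it's no surprise that this human trait shows up in the workplace too. I even saw it at a 250-person startup where I worked. At first, the developers all worked together and collaborated closely. As the code grew more complex, people working on the same features naturally sat together to tackle their own complexity. Before long, feature teams had formally taken shape.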
|
||||||
|
|
||||||
|
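At many of the companies I've worked for, systems administrators and developers not only formed natural silos like this, but also clashed fiercely. Developers got angry at systems administrators when their environments broke or when their permissions were too restricted. Systems administrators blamed developers for constantly breaking their environments in all sorts of ways, and for requesting far more computing resources than they needed. Neither side understood the other and, worse, neither side wanted to.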
|
||||||
|
|
||||||
|
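Most developers had no interest in operating systems, kernels, or computer hardware. Likewise, most systems administrators, even Linux ones, were unwilling to learn to write code. They had tried some C in college, hated it, and never wanted to touch an IDE again. So developers threw their environment problems over the wall to the systems administrators, who prioritized them alongside the hundreds of other problems thrown their way. Everyone was busy, and everyone waited resentfully. The purpose of DevOps is to resolve this conflict.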
|
||||||
|
|
||||||
|
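DevOps isn't a team, and CI/CD isn't a user group in Jira. DevOps is a way of thinking. According to the movement, in an ideal world developers, systems administrators, and business stakeholders work as one team. They may not know everything about each other's worlds, but they know enough to understand each other and each other's backlogs, and for the most part they can speak the same language.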
|
||||||
|
|
||||||
|
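This is the idea behind putting all infrastructure and business logic in code and running it through the same deployment pipeline as the applications that sit on top of it. Because everyone understands each other, everyone wins. It is also the basis for the rise of chatbots and of easy-to-use monitoring and visualization tools.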
|
||||||
|
|
||||||
|
[Adam Jacob][6] said it best: "DevOps is the word we will use to describe the operational side of the transition to enterprises being software led."
|
||||||
|
|
||||||
|
### What do I need to know to practice DevOps?
|
||||||
|
|
||||||
|
I'm asked this question often, and the answer, as with most open-ended questions, is: it depends.
|
||||||
|
|
||||||
|
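Right now, "DevOps engineer" means different things at different companies. Smaller companies with plenty of software developers but few people who understand infrastructure are probably looking for someone with more systems administration experience. Other companies, usually larger or older ones or both, already have a solid systems administration team and are optimizing for something closer to a Google [SRE][7], that is, "a software engineer to design an operations function." This isn't set in stone, however; as with any technology job, the decision depends largely on the hiring manager.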
|
||||||
|
|
||||||
|
That said, we typically look for engineers who are interested in learning more about:
|
||||||
|
|
||||||
|
* How to administer and architect secure, scalable cloud platforms (usually on AWS, though Microsoft Azure, Google Cloud Platform, and PaaS providers such as DigitalOcean and Heroku are popular too);
|
||||||
|
* How to build and optimize deployment pipelines and deployment strategies with popular [CI/CD][8] tools such as Jenkins and GoCD, and cloud-based ones such as Travis CI or CircleCI;
|
||||||
|
* How to monitor, log, and alert on changes in your systems with time-series-based tools such as Kibana, Grafana, Splunk, Loggly, or Logstash; and
|
||||||
|
* How to maintain infrastructure as code with configuration management tools such as Chef, Puppet, or Ansible, and how to deploy that infrastructure with tools such as Terraform or CloudFormation.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
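Containers are becoming increasingly popular too. Although some people have [voiced complaints][9] about the state of Docker at scale, containers are quickly becoming a good way to run a very high density of services and applications on fewer systems while improving their reliability. (Orchestration tools such as Kubernetes or Mesos can start new containers within seconds when a host fails.) Given this, knowledge of Docker or rkt and of a container orchestration platform will help you a great deal.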
|
||||||
|
|
||||||
|
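If you're a systems administrator who wants to practice DevOps, you also need to know how to write code. Python and Ruby are popular languages in the DevOps field because they are portable (that is, they run on any operating system), fast, and easy to read and learn. They also underpin the industry's most popular configuration management tools (Ansible is written in Python; Chef and Puppet are written in Ruby) and cloud API clients (clients for AWS, Microsoft Azure, and Google Cloud Platform are commonly available in Python and Ruby).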
|
||||||
|
|
||||||
|
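If you're a developer who wants to practice DevOps, I strongly recommend learning Unix, Windows, and networking fundamentals. Although the cloud abstracts away many of the difficulties of systems administration, knowing how the operating system works helps a great deal when you debug a slow application. I've listed some books on this topic below.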
|
||||||
|
|
||||||
|
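If this sounds like a lot, you aren't alone. Fortunately, there are plenty of small projects to start exploring with. One such starter project is Gary Stafford's [Voter Service](https://github.com/garystafford/voter-service), a simple Java-based voting platform. We ask interview candidates to take the service from GitHub to production infrastructure through a pipeline. You can combine it with Rob Mile's excellent DevOps [tutorial](https://github.com/maxamg/cd-office-hours) to learn how to build that pipeline.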
|
||||||
|
|
||||||
|
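Another good way to get familiar with these tools is to pick a popular service and build the infrastructure it needs using only AWS and configuration management. Set it up by hand first to understand what needs to be done, then redo that manual work using only CloudFormation (or Terraform) and Ansible. Surprisingly, this is most of the day-to-day work that we infrastructure developers do for our clients, and our clients find it highly valuable!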
|
||||||
|
|
||||||
|
### Books to read
|
||||||
|
|
||||||
|
If you're looking for other resources on DevOps, the theory and technical books below are worth reading.
|
||||||
|
|
||||||
|
#### Theory books
|
||||||
|
|
||||||
|
* [The Phoenix Project][10] by Gene Kim. A very good book that covers the history I described above (far more vividly) and describes a company's journey toward lean, running on agile and DevOps.
|
||||||
|
* [Driving Technical Change][11] by Terrance Ryan. An excellent little book on the common personalities found in most technology organizations and how to deal with them. It helped me more than I expected.
|
||||||
|
* [Peopleware][12] by Tom DeMarco and Tim Lister. A classic on managing engineering teams. A little dated, but still valuable.
|
||||||
|
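* [Time Management for System Administrators][13] by Tom Limoncelli. Aimed mainly at systems administrators, this book offers a close look at the life of a systems administrator in many large organizations. If you want to learn more about the conflict between systems administrators and developers, it may explain more.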
|
||||||
|
* [The Lean Startup][14] by Eric Ries. Describes how Eric's own 3D avatar company, IMVU, discovered how to work lean, fail fast, and reach profit faster.
|
||||||
|
* [Lean Enterprise][15] by Jez Humble and friends. This book adapts The Lean Startup for the enterprise. Both are great reads and explain the business motivation behind DevOps well.
|
||||||
|
* [Infrastructure As Code][16] by Kief Morris. A very good primer on infrastructure as code! It explains well why every company needs to adopt the practice.
|
||||||
|
* [Site Reliability Engineering][17] by Betsy Beyer, Chris Jones, Jennifer Petoff, and Niall Richard Murphy. A book explaining how Google does SRE, also known as "DevOps before DevOps was a thing." It offers interesting views on handling uptime, latency, and keeping engineers happy.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
#### Technical books
|
||||||
|
|
||||||
|
If you're looking for books that take you straight to code, you've come to the right place.
|
||||||
|
|
||||||
|
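* [TCP/IP Illustrated][18] by W. Richard Stevens. The classic (and arguably most complete) work on the fundamental networking protocols, with an emphasis on the TCP/IP family. If you've heard of network layers 1, 2, 3, and 4 and want to learn more about them, you need this book.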
|
||||||
|
* [UNIX and Linux System Administration Handbook][19] by Evi Nemeth, Trent Hein, and Ben Whaley. A good primer on how Linux and Unix work and how to use them.
|
||||||
|
* [Learn Windows Powershell In A Month of Lunches][20] by Don Jones and Jeffrey Hicks. If you automate anything on Windows, you need to learn PowerShell, and this book will help. Don Jones is a well-known MVP in this area.
|
||||||
|
* Almost anything by [James Turnbull][21]. He publishes great technical primers on popular DevOps tools.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
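From companies that deploy everything directly on physical machines (many still do, for good reasons) to pioneers going fully serverless, DevOps is likely here to stay. The work is interesting, the results have impact, and, most important, it bridges the gap between technology and business. It's a wonderful thing to see.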
|
||||||
|
|
||||||
|
Originally published at [Neurons Firing on a Keyboard][22], under the CC-BY-SA license.
|
||||||
|
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
via: https://opensource.com/article/18/1/getting-devops
|
||||||
|
|
||||||
|
Author: [Carlos Nunez][a]
|
||||||
|
Translator: [belitex](https://github.com/belitex)
|
||||||
|
Proofreader: [proofreader ID](https://github.com/校对者ID)
|
||||||
|
|
||||||
|
This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/) (Linux China)
|
||||||
|
|
||||||
|
[a]:https://opensource.com/users/carlosonunez
|
||||||
|
[1]:https://www.reddit.com/r/devops/
|
||||||
|
[2]:https://carlosonunez.wordpress.com/
|
||||||
|
[3]:https://twitter.com/easiestnameever
|
||||||
|
[4]:https://en.wikipedia.org/wiki/ITIL
|
||||||
|
[5]:https://www.psychologytoday.com/blog/time-out/201401/getting-out-your-silo
|
||||||
|
[6]:https://twitter.com/adamhjk/status/572832185461428224
|
||||||
|
[7]:https://landing.google.com/sre/interview/ben-treynor.html
|
||||||
|
[8]:https://en.wikipedia.org/wiki/CI/CD
|
||||||
|
[9]:https://thehftguy.com/2016/11/01/docker-in-production-an-history-of-failure/
|
||||||
|
[10]:https://itrevolution.com/book/the-phoenix-project/
|
||||||
|
[11]:https://pragprog.com/book/trevan/driving-technical-change
|
||||||
|
[12]:https://en.wikipedia.org/wiki/Peopleware:_Productive_Projects_and_Teams
|
||||||
|
[13]:http://shop.oreilly.com/product/9780596007836.do
|
||||||
|
[14]:http://theleanstartup.com/
|
||||||
|
[15]:https://info.thoughtworks.com/lean-enterprise-book.html
|
||||||
|
[16]:http://infrastructure-as-code.com/book/
|
||||||
|
[17]:https://landing.google.com/sre/book.html
|
||||||
|
[18]:https://en.wikipedia.org/wiki/TCP/IP_Illustrated
|
||||||
|
[19]:http://www.admin.com/
|
||||||
|
[20]:https://www.manning.com/books/learn-windows-powershell-in-a-month-of-lunches-third-edition
|
||||||
|
[21]:https://jamesturnbull.net/
|
||||||
|
[22]:https://carlosonunez.wordpress.com/2017/03/02/getting-into-devops/
|