From d8f628c98bde3a9c50129dac06cf8ece3c83f3a4 Mon Sep 17 00:00:00 2001
From: "Xingyu.Wang"
Date: Tue, 16 Apr 2019 00:52:02 +0800
Subject: [PATCH] TSD:20190413 The Fargate Illusion.md

---
 sources/tech/20190413 The Fargate Illusion.md | 448 ------------------
 .../tech/20190413 The Fargate Illusion.md     | 447 +++++++++++++++++
 2 files changed, 447 insertions(+), 448 deletions(-)
 delete mode 100644 sources/tech/20190413 The Fargate Illusion.md
 create mode 100644 translated/tech/20190413 The Fargate Illusion.md

diff --git a/sources/tech/20190413 The Fargate Illusion.md b/sources/tech/20190413 The Fargate Illusion.md
deleted file mode 100644
index 008fe36aa0..0000000000
--- a/sources/tech/20190413 The Fargate Illusion.md
+++ /dev/null
@@ -1,448 +0,0 @@
-[#]: collector: (lujun9972)
-[#]: translator: (wxy)
-[#]: reviewer: ( )
-[#]: publisher: ( )
-[#]: url: ( )
-[#]: subject: (The Fargate Illusion)
-[#]: via: (https://leebriggs.co.uk/blog/2019/04/13/the-fargate-illusion.html)
-[#]: author: (Lee Briggs https://leebriggs.co.uk/)
-
-The Fargate Illusion
-======
-
-I’ve been building a Kubernetes based platform at $work now for almost a year, and I’ve become a bit of a Kubernetes apologist. It’s true, I think the technology is fantastic. I am however under no illusions about how difficult it is to operate and maintain. I read posts like [this][1] one earlier in the year and found myself nodding along to certain aspects of the opinion. If I was in a smaller company, with 10/15 engineers, I’d be horrified if someone suggested managing and maintaining a fleet of Kubernetes clusters. The operational overhead is just too high.
-
-Despite my love for all things Kubernetes at this point, I do remain curious about the notion that “serverless” computing will kill the ops engineer. The main source of intrigue here is the desire to stay gainfully employed in the future - if we aren’t going to need OPS engineers in our glorious future, I’d like to see what all the fuss is about.
I’ve done some experimentation in Lambda and Google Cloud Functions and been impressed by what I saw, but I still firmly believe that serverless solutions only solve a percentage of the problem.
-
-I’ve had my eye on [AWS Fargate][2] for some time now and it’s something that developers at $work have been gleefully pointing at as “serverless computing” - mainly because with Fargate, you can run your Docker container without having to manage the underlying nodes. I wanted to see what that actually meant - so I set about trying to get an app running on Fargate from scratch. I defined the success criteria here as something close-ish to a “production ready” application, so I wanted to have the following:
-
-  * A running container on Fargate
-  * With configuration pushed down in the form of environment variables
-  * “Secrets” should not be in plaintext
-  * Behind a load balancer
-  * TLS enabled with a valid SSL certificate
-
-
-
-I approached this whole task from an infrastructure as code mentality, and instead of following the default AWS console wizards, I used terraform to define the infrastructure. It’s very possible this overcomplicated things, but I wanted to make sure any deployment was repeatable and discoverable to anyone else wanting to follow along.
-
-All of the above criteria are generally achievable with a Kubernetes based platform using a few external add-ons and plugins, so I’m admittedly approaching this whole task with a comparative mentality - because I’m comparing it with my common workflow. My main goal was to see how easy this was with Fargate, especially when compared with Kubernetes. I was pretty surprised with the outcome.
-
-### AWS has overhead
-
-I had a clean AWS account and was determined to go from zero to a deployed webapp. Like any other infrastructure in AWS, I had to get the baseline infrastructure working - so I first had to define a VPC.
-
-I wanted to follow the best practices, so I carved the VPC up into subnets across availability zones, with a public and a private subnet. It occurred to me at this point that as long as this need was always there, I’d probably be able to find a job of some description. The notion that AWS is operationally “free” is something that has irked me for quite some time now. Many people in the developer community take for granted how much work and effort there is in setting up and defining a well designed AWS account and infrastructure. This is _before_ we even start talking about a multi-account architecture - I’m still in a single account here and I’m already having to define infrastructure and traditional network items.
-
-It’s also worth remembering here, I’ve done this quite a few times now, so I _knew_ exactly what to do. I could have used the default VPC in my account, and the pre-provided subnets, which I expect many people who are getting started might do. This took me about half an hour to get running, but I couldn’t help but think here that even if I want to run Lambda functions, I still need some kind of connectivity and networking. Defining NAT gateways and routing in a VPC doesn’t feel very serverless at all, but it has to be done to get things moving.
-
-### Run my damn container
-
-Once I had the base infrastructure up and running, I now wanted to get my docker container running. I started examining the Fargate docs and browsed through the [Getting Started][3] docs and something immediately popped out at me:
-
-> [][4]
-
-Hold on a minute, there’s at least THREE steps here just to get my container up and running? This isn’t quite how this whole thing was sold to me, but let’s get started.
-
-#### Task Definitions
-
-A task definition defines the actual container you want to run. The problem I ran into immediately here is that this thing is insanely complicated.
Lots of the options here are very straightforward, like specifying the docker image and memory limits, but I also had to define a networking model and a variety of other options that I wasn’t really familiar with. Really? If I had come into this process with absolutely no AWS knowledge I’d be incredibly overwhelmed at this stage. A full list of the [parameters][5] can be found on the AWS page, and the list is long. I knew my container needed to have some environment variables, and it needed to expose a port. So I defined that first, with the help of a fantastic [terraform module][6] which really made this easier. If I didn’t have this, I’d be hand writing JSON to define my container definition.
-
-First, I defined some environment variables:
-
-```
-container_environment_variables = [
-  {
-    name = "USER"
-    value = "${var.user}"
-  },
-  {
-    name = "PASSWORD"
-    value = "${var.password}"
-  }
-]
-```
-
-Then I compiled the task definition using the module I mentioned above:
-
-```
-module "container_definition_app" {
-  source = "cloudposse/ecs-container-definition/aws"
-  version = "v0.7.0"
-
-  container_name = "${var.name}"
-  container_image = "${var.image}"
-
-  container_cpu = "${var.ecs_task_cpu}"
-  container_memory = "${var.ecs_task_memory}"
-  container_memory_reservation = "${var.container_memory_reservation}"
-
-  port_mappings = [
-    {
-      containerPort = "${var.app_port}"
-      hostPort = "${var.app_port}"
-      protocol = "tcp"
-    },
-  ]
-
-  environment = "${local.container_environment_variables}"
-
-}
-```
-
-I was pretty confused at this point - I need to define a lot of configuration here to get this running and I’ve barely even started, but it made a little sense - anything running a docker container needs to have _some_ idea of the configuration values of the docker container. I’ve [previously written][7] about the problems with Kubernetes and configuration management and the same problem seemed to be rearing its ugly head again here.
-
-Next, I defined the task definition from the module above (which thankfully abstracted the required JSON away from me - if I had to hand write JSON at this point I’d have probably given up).
-
-I realised immediately I was missing something as I was defining the module parameters. I need an IAM role as well! Okay, let me define that:
-
-```
-resource "aws_iam_role" "ecs_task_execution" {
-  name = "${var.name}-ecs_task_execution"
-
-  assume_role_policy = < [][16]
-
---------------------------------------------------------------------------------
-
-via: https://leebriggs.co.uk/blog/2019/04/13/the-fargate-illusion.html
-
-Author: [Lee Briggs][a]
-Topic selection: [lujun9972][b]
-Translator: [Translator ID](https://github.com/译者ID)
-Proofreader: [Proofreader ID](https://github.com/校对者ID)
-
-This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)
-
-[a]: https://leebriggs.co.uk/
-[b]: https://github.com/lujun9972
-[1]: https://matthias-endler.de/2019/maybe-you-dont-need-kubernetes/
-[2]: https://aws.amazon.com/fargate/
-[3]: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ECS_GetStarted.html
-[4]: https://imgur.com/FpU0lds
-[5]: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html
-[6]: https://github.com/cloudposse/terraform-aws-ecs-container-definition
-[7]: https://leebriggs.co.uk/blog/2018/05/08/kubernetes-config-mgmt.html
-[8]: https://github.com/kubernetes-incubator/external-dns
-[9]: https://github.com/jetstack/cert-manager
-[10]: https://github.com/terraform-aws-modules/terraform-aws-ecs
-[11]: https://kubernetes.io/docs/concepts/configuration/secret/
-[12]: https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-paramstore.html
-[13]: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/specifying-sensitive-data.html
-[14]: https://twitter.com/briggsl/status/1116870900719030272
-[15]: https://cloud.google.com/run/
-[16]: https://imgur.com/QfFg225
diff --git a/translated/tech/20190413 The Fargate
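Illusion.md b/translated/tech/20190413 The Fargate Illusion.md
new file mode 100644
index 0000000000..07c2bcf3f6
--- /dev/null
+++ b/translated/tech/20190413 The Fargate Illusion.md
@@ -0,0 +1,447 @@
+[#]: collector: (lujun9972)
+[#]: translator: (wxy)
+[#]: reviewer: ( )
+[#]: publisher: ( )
+[#]: url: ( )
+[#]: subject: (The Fargate Illusion)
+[#]: via: (https://leebriggs.co.uk/blog/2019/04/13/the-fargate-illusion.html)
+[#]: author: (Lee Briggs https://leebriggs.co.uk/)
+
+The Fargate Illusion
+======
+
+I’ve spent almost a year at $work building a Kubernetes-based platform, and I’ve become a bit of a Kubernetes apologist. It’s true, I think the technology is fantastic. I am, however, under no illusions about how difficult it is to operate and maintain. I read a post like [this one][1] earlier in the year and found myself agreeing with some of its opinions. If I were at a smaller company, with 10 to 15 engineers, I’d be horrified if someone suggested managing and maintaining a fleet of Kubernetes clusters. The operational overhead is just too high!
+
+Despite my current interest in all things Kubernetes, I remain curious about the claim that “serverless” computing will kill the ops engineer. The main source of that curiosity is a desire to stay gainfully employed in the future - if our glorious future doesn’t need ops engineers, I’d like to see what all the fuss is about. I’ve done some experiments with Lambda and Google Cloud Functions and was impressed by the results, but I still firmly believe that serverless solutions only solve part of the problem.
+
+I’ve had my eye on [AWS Fargate][2] for some time now, and it’s something the developers at $work regard as “serverless computing” - mainly because with Fargate you can run your Docker container without having to manage the underlying nodes. I wanted to see what that actually meant - so I set about trying to get an app running on Fargate from scratch. I defined the success criteria as something close to a “production-grade” application, so I wanted the following:
+
+* A container running on Fargate
+* Configuration pushed down in the form of environment variables
+* “Secrets” must not be in plaintext
+* Behind a load balancer
+* TLS enabled with a valid SSL certificate
+
+I approached the whole task with an infrastructure-as-code mindset, and instead of following the default AWS console wizards, I used terraform to define the infrastructure. This may well have overcomplicated things, but I wanted to make sure any deployment was repeatable and discoverable for anyone else wanting to follow along.
+
+All of the above criteria are generally achievable on a Kubernetes-based platform using a few external add-ons and plugins, so I admittedly approached the whole task with a comparative mindset - because I was comparing it with my usual workflow. My main goal was to see how easy this was with Fargate, especially when compared with Kubernetes. I was pretty surprised by the outcome.
+
+### AWS has overhead
+
+I had a clean AWS account and was determined to go from zero to a deployed webapp. Like any other infrastructure in AWS, I had to get the baseline infrastructure working first - so I first had to define a VPC.
+
+I wanted to follow best practices, so I carved the VPC up into subnets across availability zones, with a public and a private subnet. It occurred to me at this point that as long as this need exists, I’ll probably be able to find a job of this kind. The notion that AWS is operationally “free” has irked me for quite some time. Many people in the developer community take for granted how much work and effort goes into setting up and defining a well-designed AWS account and infrastructure. This is *before* we even start talking about a multi-account architecture - I’m still in a single account here and I’m already having to define infrastructure and traditional network items.
+
+It’s also worth remembering that I’ve done this many times before, so I *knew* exactly what to do. I could have used the default VPC in my account and the pre-provided subnets, which I expect many people getting started would do. It took me about half an hour to get running, but I couldn’t help thinking that even if I wanted to run Lambda functions, I would still need some kind of connectivity and networking. Defining NAT gateways and routing in a VPC doesn’t feel very “serverless” at all, but it has to be done to get things moving.
+
+
+### Run a simple container
+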
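+Once I had the basic infrastructure up and running, I wanted to get my Docker container running. I started digging through the Fargate docs and browsed the [Getting Started][3] docs, and something immediately jumped out at me:
+
+![][4]
+
+Hold on a minute, there are at least **three** steps just to get my container running? This isn’t quite what I had in mind, but let’s get started.
+
+#### Task Definitions
+
+A task definition defines the actual container you want to run. The problem I ran into here is that the task definition is very complicated. Many of the options are straightforward, like specifying the Docker image and memory limits, but I also had to define a networking model and a variety of other options I wasn’t familiar with. Really? If I had come into this process with no AWS knowledge at all, I’d be completely overwhelmed at this stage. The full list of [parameters][5] can be found on the AWS page, and the list is long. I knew my container needed some environment variables, and that it needed to expose a port. So I defined that first, with the help of a fantastic [terraform module][6], which really made this easier. Without this module, I’d be hand-writing JSON to define my container definition.
+
+
+First, I defined some environment variables:
+
+```
+container_environment_variables = [
+  {
+    name = "USER"
+    value = "${var.user}"
+  },
+  {
+    name = "PASSWORD"
+    value = "${var.password}"
+  }
+]
+```
+
+Then I put together the task definition using the module mentioned above:
+
+```
+module "container_definition_app" {
+  source = "cloudposse/ecs-container-definition/aws"
+  version = "v0.7.0"
+
+  container_name = "${var.name}"
+  container_image = "${var.image}"
+
+  container_cpu = "${var.ecs_task_cpu}"
+  container_memory = "${var.ecs_task_memory}"
+  container_memory_reservation = "${var.container_memory_reservation}"
+
+  port_mappings = [
+    {
+      containerPort = "${var.app_port}"
+      hostPort = "${var.app_port}"
+      protocol = "tcp"
+    },
+  ]
+
+  environment = "${local.container_environment_variables}"
+
+}
+```
+
+I was pretty confused at this point - I needed to define a lot of configuration here to get this running and I had barely even started, but it made some sense - anything running a Docker container needs to know something about the container’s configuration. I’ve [previously written][7] about the problems with Kubernetes and configuration management, and the same problem seemed to be rearing its head again here.
+
+Next, I defined the task definition from the module above (which thankfully abstracted the required JSON away from me - if I’d had to hand-write JSON, I’d probably have given up).
+
+As I was defining the module parameters, I suddenly realised I was missing something. I need an IAM role as well! Okay, let me define that:
+
+```
+resource "aws_iam_role" "ecs_task_execution" {
+  name = "${var.name}-ecs_task_execution"
+
+  assume_role_policy = <the secret management part is done using [AWS SSM][12] (the full name of this service is AWS Systems Manager Parameter Store, but I don’t want to use that name because, frankly, it’s silly).
+
+The AWS documentation [covers this][13] pretty well, so I set about converting it to terraform.
+
+
+##### Specifying a Secret
+
+First, you have to define a parameter and give it a name. In terraform, it looks like this:
+
+
+```
+resource "aws_ssm_parameter" "app_password" {
+  name = "${var.app_password_param_name}" # The name of the value in AWS SSM
+  type = "SecureString"
+  value = "${var.app_password}"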
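# The actual value of the password, like correct-horse-battery-stable
+}
+```
+
+Obviously, the key component here is the “SecureString” type. This uses the default AWS KMS key to encrypt the data, something that was not immediately obvious to me. This has a huge advantage over Kubernetes secrets, which by default are not encrypted in etcd.
+
+Then I specified another local value map for ECS and passed it in as a secrets parameter:
+
+```
+container_secrets = [
+  {
+    name = "PASSWORD"
+    valueFrom = "${var.app_password_param_name}"
+  },
+]
+
+module "container_definition_app" {
+  source = "cloudposse/ecs-container-definition/aws"
+  version = "v0.7.0"
+
+  container_name = "${var.name}"
+  container_image = "${var.image}"
+
+  container_cpu = "${var.ecs_task_cpu}"
+  container_memory = "${var.ecs_task_memory}"
+  container_memory_reservation = "${var.container_memory_reservation}"
+
+  port_mappings = [
+    {
+      containerPort = "${var.app_port}"
+      hostPort = "${var.app_port}"
+      protocol = "tcp"
+    },
+  ]
+
+  environment = "${local.container_environment_variables}"
+  secrets = "${local.container_secrets}"
+```
+
+##### Got a problem
+
+At this point, I redeployed my task definition and was very confused. Why wasn’t the task rolling out properly? I kept seeing in the console that the running app was still using the previous task definition (version 7) when the new task definition (version 8) was available. It took me longer than I expected to figure this out, but on the events screen in the console I noticed an IAM error. I had missed a step: the container couldn’t read the secret from AWS SSM because it didn’t have the correct IAM permissions. This was the first time I got genuinely frustrated with this whole thing. The feedback here was *terrible* from a user experience perspective. If I hadn’t noticed, I would have thought everything was fine, because there was still a task running and my app was still available via the correct URL - I was just getting the old config.
+
+In Kubernetes, I would have clearly seen an error in the pod definition. It’s absolutely fantastic that Fargate makes sure my app doesn’t go down, but as an operator I need some actual feedback about what’s happening. This really isn’t good enough. I genuinely hope someone from the Fargate team reads this and improves the experience.
+
+### So that’s it
+
+And that was it - my app was running and met all my criteria. I do realise there are some improvements to make, which include:
+
+* Defining a cloudwatch log group, so I can write logs properly
+* Adding a route53 hosted zone, to make the whole thing easier to automate from a DNS perspective
+* Fixing and rescoping the IAM permissions, which are very broad right now
+
+But honestly, at this point I wanted to reflect on the experience. I wrote up a [Twitter thread][14] about it and then spent the rest of my time thinking about how I really felt here.
+
+### The Cost
+
+After a night of reflection, I realised that whether you’re using Fargate or Kubernetes, the process is roughly the same. What surprised me most was that, despite the regular claims that Fargate is “easier”, I really didn’t see any benefit over a Kubernetes platform. Now, if you’re building Kubernetes clusters yourself, I can absolutely see the value here - managing nodes and the control plane is just unnecessary overhead. The thing is - most consumers of a Kubernetes-based platform *don’t* do this. If you’re lucky enough to be using GKE, you barely need to think about managing the cluster, and you can bring up a cluster with a single gcloud command. I regularly use Digital Ocean’s managed Kubernetes service, and I can safely say it was as easy as spinning up a Fargate cluster - in fact, in some ways it was easier.
+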
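+Having to define some infrastructure to run your container is the cost of entry at this point. Google may have just changed the game this week with their [Google Cloud Run][15] product, but they are far ahead of everyone else in this field.
+
+What I think can safely be said from this whole experience is this: *running containers at scale is still hard.* It requires thought, it requires domain knowledge, and it requires collaboration between operations and developers. It also requires a foundation to build on - anything on AWS needs some infrastructure defined and running up front. I’m very intrigued by the “NoOps” concept that some companies seem to be aiming for. I guess if you’re running a stateless application and can fit it all inside a Lambda function and an API gateway, you’re probably in a good position, but are we really anywhere near that in any kind of enterprise environment? I really don’t think so.
+
+#### Fair Comparisons
+
+Another reality that struck me is that comparisons between technology A and technology B are often not very fair, and I see this a lot with AWS. The reality of the situation is often very different from a Jeff Barr blog post. If you’re a small enough company that you can deploy your application in AWS using the AWS console and accept all the defaults, this is absolutely easier. However, I didn’t want to use the defaults, because the defaults are almost never production-ready. Once you start peeling back the layers of a cloud provider’s services, you begin to realise that at the end of the day you’re still running software. It still needs to be designed well, deployed well and operated well. I believe the value-added services of AWS, Kubernetes and all the other cloud providers make it much easier to run, design and operate things, but it is definitely not free.
+
+#### The Kubernetes Argument
+
+Finally: if you look at Kubernetes purely as a container orchestration tool, you’re probably going to love Fargate. However, as I’ve become more familiar with Kubernetes, I’ve come to appreciate just how important it is as a technology - not just because it’s a great container orchestration tool, but because of its design patterns - it’s a declarative, API-driven platform. A simple thing that stood out during the *whole* Fargate process was that if I deleted something here, Fargate wouldn’t necessarily recreate it for me. Autoscaling is nice, and not having to manage servers and OS patches and updates is great, but I felt I had lost a lot by not being able to use Kubernetes’ self-healing and API-driven model. Sure, Kubernetes has a learning curve - but from this experience, so does Fargate.
+
+### Summary
+
+Despite my confusion during some of this process, I really did enjoy the experience. I still believe Fargate is a fantastic technology, and what the AWS team has done with ECS/Fargate really is excellent. My view, however, is that this is definitely not “easier” than Kubernetes, it’s just... difficult in different ways.
+
+The problems that arise when running containers in production are largely the same. If you take anything away from this post, it should be this: *whichever way you choose will have operational overhead*. Don’t believe that picking something will make your world easier. My personal opinion is this: if you have an operations team and your company will be deploying containers across multiple application teams - pick a technology and build processes and tooling around it to make it easier.
+
+One thing people say is certainly true: the technology will definitely get easier than it is now. At this stage, when it comes to Fargate, the comic below sums up how I feel:
+
+![][16]
+
+--------------------------------------------------------------------------------
+
+via: https://leebriggs.co.uk/blog/2019/04/13/the-fargate-illusion.html
+
+Author: [Lee Briggs][a]
+Topic selection: [lujun9972][b]
+Translator: [wxy](https://github.com/wxy)
+Proofreader: [Proofreader ID](https://github.com/校对者ID)
+
+This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)
+
+[a]: https://leebriggs.co.uk/
+[b]: https://github.com/lujun9972
+[1]: https://matthias-endler.de/2019/maybe-you-dont-need-kubernetes/
+[2]: https://aws.amazon.com/fargate/
+[3]: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ECS_GetStarted.html
+[4]: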
https://i.imgur.com/YfMyXBdl.png
+[5]: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html
+[6]: https://github.com/cloudposse/terraform-aws-ecs-container-definition
+[7]: https://leebriggs.co.uk/blog/2018/05/08/kubernetes-config-mgmt.html
+[8]: https://github.com/kubernetes-incubator/external-dns
+[9]: https://github.com/jetstack/cert-manager
+[10]: https://github.com/terraform-aws-modules/terraform-aws-ecs
+[11]: https://kubernetes.io/docs/concepts/configuration/secret/
+[12]: https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-paramstore.html
+[13]: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/specifying-sensitive-data.html
+[14]: https://twitter.com/briggsl/status/1116870900719030272
+[15]: https://cloud.google.com/run/
+[16]: https://i.imgur.com/Bx7Q50Jl.jpg