Merge pull request #10130 from DavidChenLiang/master

20180827 An introduction to diffs and patches.md
This commit is contained in:
Xingyu.Wang 2018-09-10 16:16:14 +08:00 committed by GitHub
commit d5d0ab3976
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 124 additions and 114 deletions

View File

@ -1,114 +0,0 @@
Translating by DavidChenLiang
An introduction to diffs and patches
======
![](https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/find-file-linux-code_magnifying_glass_zero.png?itok=E2HoPDg0)
If youve ever worked on a large codebase with a distributed development model, youve probably heard people say things like “Sue just sent a patch,” or “Rajiv is checking out the diff.” Maybe those terms were new to you and you wondered what they meant. Open source has had an impact here, as the main development model of large projects from Apache web server to the Linux kernel have been “patch-based” development projects throughout their lifetime. In fact, did you know that Apaches name originated from the set of patches that were collected and collated against the original [NCSA HTTPd server source code][1]?
You might think this is folklore, but an early [capture of the Apache website][2] claims that the name was derived from this original “patch” collection; hence **APA** t **CH** y server, which was then simplified to Apache.
But enough history trivia. What exactly are these patches and diffs that developers talk about?
First, for the sake of this article, lets assume that these two terms reference one and the same thing. “Diff” is simply short for “difference;” a Unix utility by the same name reveals the difference between one or more files. We will look at a diff utility example below.
A “patch” refers to a specific collection of differences between files that can be applied to a source code tree using the Unix diff utility. So we can create diffs (or patches) using the diff tool and apply them to an unpatched version of that same source code using the patch tool. As an aside (and breaking my rule of no more history trivia), the word “patch” comes from the physical covering of punchcard holes to make software changes in the early computing days, when punchcards represented the program executed by the computers processor. The image below, found on this [Wikipedia page][3] describing software patches, shows this original “patching” concept:
![](https://opensource.com/sites/default/files/uploads/360px-harvard_mark_i_program_tape.agr_.jpg)
Now that you have a basic understanding of patches and diffs, lets explore how software developers use these tools. If you havent used a source code control system like [Git][4] or [Subversion][5], I will set the stage for how most non-trivial software projects are developed. If you think of the life of a software project as a set of actions along a timeline, you might visualize changes to the software—such as adding a feature or a function to a source code file or fixing a bug—appearing at different points on the timeline, with each discrete point representing the state of all the source code files at that time. We will call these points of change “commits,” using the same nomenclature that todays most popular source code control tool, Git, uses. When you want to see the difference between the source code before and after a certain commit, or between many commits, you can use a tool to show us diffs, or differences.
If you are developing software using this same source code control tool, Git, you may have changes in your local system that you want to provide for others to potentially add as commits to their own tree. One way to provide local changes to others is to create a diff of your local tree's changes and send this “patch” to others who are working on the same source code. This lets others patch their tree and see the source code tree with your changes applied.
### Linux, Git, and GitHub
This model of sharing patch files is how the Linux kernel community operates regarding proposed changes today. If you look at the archives for any of the popular Linux kernel mailing lists—[LKML][6] is the primary one, but others include [linux-containers][7], [fs-devel][8], [Netdev][9], to name a few—youll find many developers posting patches that they wish to have others review, test, and possibly bring into the official Linux kernel Git tree at some point. It is outside of the scope of this article to discuss Git, the source code control system written by Linus Torvalds, in more detail, but it's worth noting that Git enables this distributed development model, allowing patches to live separately from a main repository, pushing and pulling into different trees and following their specific development flow.
Before moving on, we cant ignore the most popular service in which patches and diffs are relevant: [GitHub][10]. Given its name, you can probably guess that GitHub is based on Git, but it offers a web- and API-based workflow around the Git tool for distributed open source project development. One of the main ways that patches are shared in GitHub is not via email, like the Linux kernel, but by creating a **pull request**. When you commit changes on your own copy of a source code tree, you can share those changes by creating a pull request against a commonly shared repository for that software project. GitHub is used by many active and popular open source projects today, such as [Kubernetes][11], [Docker][12], [the Container Network Interface (CNI)][13], [Istio][14], and many others. In the GitHub world, users tend to use the web-based interface to review the diffs or patches that comprise a pull request, but you can still access the raw patch files and use them at the command line with the patch utility.
### Getting down to business
Now that weve covered patches and diffs and how they are used in popular open source communities or tools, let's look at a few examples.
The first example includes two copies of a source tree, and one has changes that we want to visualize using the diff utility. In our examples, we will look at “unified” diffs because that is the expected view for patches in most of the modern software development world. Check the diff manual page for more information on options and ways to produce differences. The original source code is located in sources-orig and our second, modified codebase is located in a directory named sources-fixed. To show the differences in a unified diff format in your terminal, use the following command:
```
$ diff -Naur sources-orig/ sources-fixed/
```
...which then shows the following diff command output:
```
diff -Naur sources-orig/officespace/interest.go sources-fixed/officespace/interest.go
--- sources-orig/officespace/interest.go        2018-08-10 16:39:11.000000000 -0400
+++ sources-fixed/officespace/interest.go       2018-08-10 16:39:40.000000000 -0400
@@ -11,15 +11,13 @@
   InterestRate float64
 }
+// compute the rounded interest for a transaction
 func computeInterest(acct *Account, t Transaction) float64 {
   interest := t.Amount 选题模板.txt 中文排版指北.md comic core.md Dict.md lctt2014.md lctt2016.md LCTT翻译规范.md LICENSE Makefile published README.md sign.md sources translated t.InterestRate
   roundedInterest := math.Floor(interest*100) / 100.0
   remainingInterest := interest - roundedInterest
-  // a little extra..
-  remainingInterest *= 1000
-
   // Save the remaining interest into an account we control:
   acct.Balance = acct.Balance + remainingInterest
```
The first few lines of the diff command output could use some explanation: The three `---` signs show the original filename; any lines that exist in the original file but not in the compared new file will be prefixed with a single `-` to note that this line was “subtracted” from the sources. The `+++` signs show the opposite: The compared new file and additions found in this file are marked with a single `+` symbol to show they were added in the new version of the file. Each “hunk” (thats what sections prefixed by `@@` are called) of the difference patch file has contextual line numbers that help the patch tool (or other processors) know where to apply this change. You can see from the "Office Space" movie reference function that weve corrected (by removing three lines) the greed of one of our software developers, who added a bit to the rounded-out interest calculation along with a comment to our function.
If you want someone else to test the changes from this tree, you could save this output from diff into a patch file:
```
$ diff -Naur sources-orig/ sources-fixed/ >myfixes.patch
```
Now you have a patch file, myfixes.patch, which can be shared with another developer to apply and test this set of changes. A fellow developer can apply the changes using the patch tool, given that their current working directory is in the base of the source code tree:
```
$ patch -p1 < ../myfixes.patch
patching file officespace/interest.go
```
Now your fellow developers source tree is patched and ready to build and test the changes that were applied via the patch. What if this developer had made changes to interest.go separately? As long as the changes do not conflict directly—for example, change the same exact lines—the patch tool should be able to solve where to merge the changes in. As an example, an interest.go file with several other changes is used in the following example run of patch:
```
$ patch -p1 < ../myfixes.patch
patching file officespace/interest.go
Hunk #1 succeeded at 26 (offset 15 lines).
```
In this case, patch warns that the changes did not apply at the original location in the file, but were offset by 15 lines. If you have heavily changed files, patch may give up trying to find where the changes fit, but it does provide options (with requisite warnings in the documentation) for turning up the matching “fuzziness” (which are beyond the scope of this article).
If you are using Git and/or GitHub, you will probably not use the diff or patch tools as standalone tools. Git offers much of this functionality so you can use the built-in capabilities of working on a shared source tree with merging and pulling other developers changes. One similar capability is to use git diff to provide the unified diff output in your local tree or between any two references (a commit identifier, the name of a tag or branch, and so on). You can even create a patch file that someone not using Git might find useful by simply piping the git diff output to a file, given that it uses the exact format of the diffcommand that patch can consume. Of course, GitHub takes these capabilities into a web-based user interface so you can view file changes on a pull request. In this view, you will note that it is effectively a unified diff view in your web browser, and GitHub allows you to download these changes as a raw patch file.
### Summary
Youve learned what a diff and a patch are, as well as the common Unix/Linux command line tools that interact with them. Unless you are a developer on a project still using a patch file-based development method—like the Linux kernel—you will consume these capabilities primarily through a source code control system like Git. But its helpful to know the background and underpinnings of features many developers use daily through higher-level tools like GitHub. And who knows—they may come in handy someday when you need to work with patches from a mailing list in the Linux world.
--------------------------------------------------------------------------------
via: https://opensource.com/article/18/8/diffs-patches
作者:[Phil Estes][a]
选题:[lujun9972](https://github.com/lujun9972)
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]:https://opensource.com/users/estesp
[1]:https://github.com/TooDumbForAName/ncsa-httpd
[2]:https://web.archive.org/web/19970615081902/http:/www.apache.org/info.html
[3]:https://en.wikipedia.org/wiki/Patch_(computing)
[4]:https://git-scm.com/
[5]:https://subversion.apache.org/
[6]:https://lkml.org/
[7]:https://lists.linuxfoundation.org/pipermail/containers/
[8]:https://patchwork.kernel.org/project/linux-fsdevel/list/
[9]:https://www.spinics.net/lists/netdev/
[10]:https://github.com/
[11]:https://kubernetes.io/
[12]:https://www.docker.com/
[13]:https://github.com/containernetworking/cni
[14]:https://istio.io/

View File

@ -0,0 +1,124 @@
差异文件diffs和补丁文件patches简介
======
![](https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/find-file-linux-code_magnifying_glass_zero.png?itok=E2HoPDg0)
如果你曾有机会在一个使用分布式开发模型的大型代码库上工作过,你就应该听说过类似下面的话,"Sue刚发过来一个补丁","Rajiv 正在签出差异文件", 可能这些词(补丁,差异文件)对你而言很陌生,而你确定很想搞懂他们到底指什么。开源软件对上述提到的名词有很大的贡献,作为从开发 Apache web 服务器到开发Linux 内核的开发模型,"基于补丁文件的开发" 这一模式贯穿了上述项目的始终。实际上,你可能不知道 Apache 的名字就来自一系列的代码补丁,他们被一一收集起来并针对原来的[NCSA HTTPd server source code][1]进行了校对
你可能认为前面说的只不过是些逸闻,但是一份早期的[capture of the Apache website][2]声称Apache 的名字就是来自于最早的“补丁文件”集合;(译注Apache 英文音和补丁相似),是“打了补丁的”服务器的英文名字简化。
好了,言归正传,程序员嘴里说的"差异"和"补丁" 到底是什么?
首先,在这篇文章里,我们可以认为这两个术语都指向同一个概念。“差异” 就是”补丁“。Unix 下的同名工具程序diff("差异")和patch("补丁")剖析了一个或多个文件之间的”差异”。下面我们看看diff 的例子:
一个"补丁"指的是文件之间一系列差异,这些差异能被 Unix 的 diff程序应用在源代码树上使之转变为程序员想要的文件状态。我们能使用diff 工具来创建“差异”( 或“补丁”),然后将他们“打” 在一个没有这个补丁的源代码版本上,此外,(我又要开始跑题了...,“补丁” 这个词真的指在计算机的早期使用打卡机的时候,用来覆盖在纸带上的用于修复代码错误的覆盖纸,那个时代纸带(上面有孔)就是在计算机处理器上运行的程序。下面的这张图,来自[Wikipedia Page][3] 真切的描绘了最初的“ 打补丁”这个词的出处:
![](https://opensource.com/sites/default/files/uploads/360px-harvard_mark_i_program_tape.agr_.jpg)
现在你对补丁和差异就了一个基本的概念,让我们来看看软件开发者是怎么使用这些工具的。如果你还没有使用过类似于[Git][4]这样的源代码版本控制工具的话,我将会一步步展示最流行的软件项目是怎么使用它们的。如果你将一个软件的生命周期看成是一条时间线的话,你就能看见这个软件的点滴变化,比如在何时源代码树加上了一个功能,在何时源代码树修复了一个功能缺陷。我们称这些改变的点为“进行了一次提交”,”提交“这个词被当今最流行的源代码版本管理工具Git使用, 当你想检查在一个提交前后的代码变化的话,(或者在许多个提交之间的代码变化),你都可以使用工具来观察文件差异。
如果你在使用 Git 开发软件的话,你开发环境本地有可能就有你想交给别的开发者的提交,为了给别的开发者你的提交,一个方法就是创建一个你本地文件的差异文件,然后将这个“补丁”发送给和你工作在同一个源代码树的别的开发者。别的开发者在“打”了你的补丁之后,就能看到在你的代码变树上的变化。
### Linux, Git, 和 GitHub
这种共享补丁的开发模型正是现今 Linux 内核社区如何处理内核修改提议而采用的模型。如果你有机会浏览任何一个主流的 Linux 内核邮件列表-主要是[LKML][6],包括[linux-containers][7][fs-devel][8][Netdev][9]等等你能看到很多开发者会贴出他们想让其他内核开发者审核测试或者合入Linux官方Git代码树某个提交的补丁。当然讨论 Git 不在这篇文章范围之内Git 是由 Linus Torvalds 开发的源代码控制系统,它支持分布式开发模型以及允许独立于主要代码仓库的补丁包,这些补丁包能被推送或拉取到不同的源代码树上并遵守这些代码树各自的开发流程。)
在继续我们的话题之前,我们当然不能忽略和补丁和差异这个概念最相关的的服务:[GitHub][10]。从它的名字就能猜想出 GitHub 是基于 Git 的,而且它还围绕着 Git对分布式开源代码开发模型提供了基于Web 和 API 的工作流管理。译注即Pull Request -- 拉取请求)。在 GitHub 上,分享补丁的方式不是像 Linux 内核社区那样通过邮件列表,而是通过创建一个 **拉取请求** 。当你提交你自己源代码的改动时你能通过创建一个针对软件项目的主仓库的“拉取请求”来分享你的代码改动译注即核心开发者维护一个主仓库开发者去“fork”这个仓库待各自的提交后再创建针对这个主仓库的拉取请求所有的拉取请求由主仓库的核心开发者批准后才能合入主代码库。GitHub 被当今很多活跃的开源社区所采用,如[Kubernetes][11],[Docker][12],[the Container Network Interface (CNI)][13],[Istio][14]等等。在 GitHub 的世界里,用户会倾向于使用基于 Web 页面的方式来审核一个拉取请求里的补丁或差异,你也可以直接访问原始的补丁并在命令行上直接使用它们。
### 该说点干货了
我们前面已经讲了在流行的开源社区了是怎么应用补丁和 diff的现在看看一些例子。
第一个例子包括一个源代码树的两个不同拷贝,其中一个有代码改动,我们想用 diff来看看这些改动是什么。这个例子里我们想看的是“合并格式”的补丁这是现在软件开发世界里最通用的格式。如果想知道更详细参数的用法以及如何生成diff请参考diff手册。原始的代码在sources-orig目录 而改动后的代码在sources-fixed目录. 如果要在你的命令行上用“合并格式”来展示补丁,请运行如下命令。(译注: 参数 N 代表如果比较的文件不存在,则认为是个空文件, a代表将所有文件都作为文本文件对待u 代表使用合并格式并输出上下文r 代表递归比较目录)
```
$ diff -Naur sources-orig/ sources-fixed/
```
...下面是 diff命令的输出:
```
diff -Naur sources-orig/officespace/interest.go sources-fixed/officespace/interest.go
--- sources-orig/officespace/interest.go        2018-08-10 16:39:11.000000000 -0400
+++ sources-fixed/officespace/interest.go       2018-08-10 16:39:40.000000000 -0400
@@ -11,15 +11,13 @@
   InterestRate float64
 }
+// compute the rounded interest for a transaction
 func computeInterest(acct *Account, t Transaction) float64 {
   interest := t.Amount * t.InterestRate
   roundedInterest := math.Floor(interest*100) / 100.0
   remainingInterest := interest - roundedInterest
-  // a little extra..
-  remainingInterest *= 1000
-
   // Save the remaining interest into an account we control:
   acct.Balance = acct.Balance + remainingInterest
```
最开始几行 diff的输出可以这样解释三个---’显示了原来文件的名字;任何在原文件(译注:不是源文件)里存在而在新文件里不存在的行将会用前缀‘-’,用来表示这些行被从源代码里‘减去’了。而‘+++’表示的则相反:在新文件里被加上的行会被放上前缀‘+’,表示这是在新文件里被'加上'的行。每一个补丁”块“(用@@作为前缀的的部分)都有上下文的行号,这能帮助补丁工具(或其他处理器)知道在代码的哪里应用这个补丁块。你能看到我们已经修改了”办公室“这部电影里提到的那个函数(移除了三行并加上了一行代码注释),电影里那个有点贪心的工程师可是偷偷的在计算利息的函数里加了点”料“哦。( LCTT译注剧情详情请见电影 https://movie.douban.com/subject/1296424/
如果你想找人来测试你的代码改动,你可以将差异保存到一个补丁里:
```
$ diff -Naur sources-orig/ sources-fixed/ >myfixes.patch
```
现在你有补丁 myfixes.patch了你能把它分享给别的开发者他们可以将这个补丁打在他们自己的源代码树上从而得到和你一样的代码并测试他们。如果一个开发者的当前工作目录就是他的源代码树的根的话他可以用下面的命令来打补丁
```
$ patch -p1 < ../myfixes.patch
patching file officespace/interest.go
```
现在这个开发者的源代码树已经打好补丁并准备好构建和测试文件的修改了。那么如果这个开发者在打补丁之前已经改动过了怎么办只要这些改动没有直接冲突译注比如改在同一行上补丁工具就能自动的合并代码的改动。例如下面的interest.go 文件他有其他几处改动然后它想打上myfixes.patch 这个补丁:
```
$ patch -p1 < ../myfixes.patch
patching file officespace/interest.go
Hunk #1 succeeded at 26 (offset 15 lines).
```
在这个例子中补丁警告说代码改动并不在文件原来的地方而是偏移了15行。如果你文件改动的很厉害补丁可能干脆说找不到要应用的地方还好补丁程序提供了提供了打开”模糊“匹配的选项这个选项在文档里有预置的警告信息对其讲解已经超出了本文的范围
如果你使用 Git 或者 GitHub 的话你可能不会直接使用diff或patch。Git 已经内置了这些功能你能使用这些功能和共享一个源代码树的其他开发者交互拉取或合并代码。Git一个比较相近的功能是可以使用 git diff 来对你的本地代码树生成全局差异,又或者对你的任意两次”引用“(可能是一个代表提交的数字,或一个标记或分支的名字,等等)做全局补丁。你甚至能简单的用管道将 git diff的输出到一个文件里这个文件必须严格符合将要被使用它的程序的输入要求然后将这个文件交给一个并不使用 Git 的开发者应用到他的代码上。当然GitHub 把这些功能放到了 Web 上,你能直接在 Web 页面上查看一个拉取请求的文件变动。在Web 上你能看到所展示的合并差异GitHub 还允许你将这些代码改动下载为原始的补丁文件。
### 总结
好了,你已经学到了”差异“和”补丁“是什么,以及在 Unix/Linux 上怎么使用命令行工具和他们交互。除非你还在像 Linux 内核开发这样的项目中工作而使用完全基于补丁的开发方式你应该会主要通过你的源代码控制系统如Git来使用补丁。但熟悉像 GitHub 这样的高级别工具的技术背景和技术底层对你的工作也是大有裨益的。谁知道会不会有一天你需要和一个来自 Linux 世界邮件列表的补丁包打交道呢?
--------------------------------------------------------------------------------
via: https://opensource.com/article/18/8/diffs-patches
作者:[Phil Estes][a]
选题:[lujun9972](https://github.com/lujun9972)
译者:[David Chen](https://github.com/DavidChenLiang)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]:https://opensource.com/users/estesp
[1]:https://github.com/TooDumbForAName/ncsa-httpd
[2]:https://web.archive.org/web/19970615081902/http:/www.apache.org/info.html
[3]:https://en.wikipedia.org/wiki/Patch_(computing)
[4]:https://git-scm.com/
[5]:https://subversion.apache.org/
[6]:https://lkml.org/
[7]:https://lists.linuxfoundation.org/pipermail/containers/
[8]:https://patchwork.kernel.org/project/linux-fsdevel/list/
[9]:https://www.spinics.net/lists/netdev/
[10]:https://github.com/
[11]:https://kubernetes.io/
[12]:https://www.docker.com/
[13]:https://github.com/containernetworking/cni
[14]:https://istio.io/