diff --git a/published/20231123 git branches- intuition - reality.md b/published/20231123 git branches- intuition - reality.md new file mode 100644 index 0000000000..bbbc60a3d7 --- /dev/null +++ b/published/20231123 git branches- intuition - reality.md @@ -0,0 +1,279 @@ +[#]: subject: "git branches: intuition & reality" +[#]: via: "https://jvns.ca/blog/2023/11/23/branches-intuition-reality/" +[#]: author: "Julia Evans https://jvns.ca/" +[#]: collector: "lujun9972/lctt-scripts-1700446145" +[#]: translator: "ChatGPT" +[#]: reviewer: "wxy" +[#]: publisher: "wxy" +[#]: url: "https://linux.cn/article-16430-1.html" + +Git 分支:直觉与现实 +====== + +![][0] + +你好!我一直在投入写作一本关于 Git 的小册,因此我对 Git 分支投入了许多思考。我不断从他人那里听说他们觉得 Git 分支的操作方式违反直觉。这使我开始思考:直觉上的分支概念可能是什么样,以及它如何与 Git 的实际操作方式区别开来? + +在这篇文章中,我想简洁地讨论以下几点内容: + +* 我认为许多人可能有的一个直觉性的思维模型 +* Git 如何在内部实现分支的表示(例如,“分支是对提交的指针”) +* 这种“直觉模型”与实际操作方式之间的紧密关联 +* 直觉模型的某些局限性,以及为何它可能引发问题 + +本文无任何突破性内容,我会尽量保持简洁。 + +### 分支的直观模型 + +当然,人们对分支有许多不同的直觉。我自己认为最符合“苹果树的一个分支”这一物理比喻的可能是下面这个。 + +我猜想许多人可能会这样理解 Git 分支:在下图中,两个红色的提交就代表一个“分支”。 + +![][1] + +我认为在这个示意图中有两点很重要: + +1. 分支上有两个提交 +2. 分支有一个“父级”(`main`),它是这个“父级”的分支 + +虽然这个观点看似合理,但实际上它并不符合 Git 对于分支的定义 — 最重要的是,Git 并没有一个分支的“父级”的概念。那么,Git 又是如何定义分支的呢? 
+ +### 在 Git 里,分支是完整的历史 + +在 Git 中,一个分支是每个过去提交的完整历史记录,而不仅仅是那个“分支”提交。因此,在我们上述的示意图中,两个分支(`main` 和 `branch`)都包含了 4 次提交。 + +我创建了一个示例仓库,地址为:。它设置的分支方式与前图一样。现在,我们来看看这两个分支: + +`main` 分支包含了 4 次提交: + +``` +$ git log --oneline main +70f727a d +f654888 c +3997a46 b +a74606f a +``` + +`mybranch` 分支也有 4 次提交。底部的两次提交是这两个分支共有的。 + +``` +$ git log --oneline mybranch +13cb960 y +9554dab x +3997a46 b +a74606f a +``` + +因此,`mybranch` 中的提交次数为 4,而不仅仅是 2 次“分支”提交,即 `13cb960` 和 `9554dab`。 + +你可以用以下方式让 Git 绘制出这两个分支的所有提交: + +``` +$ git log --all --oneline --graph +* 70f727a (HEAD -> main, origin/main) d +* f654888 c +| * 13cb960 (origin/mybranch, mybranch) y +| * 9554dab x +|/ +* 3997a46 b +* a74606f a +``` + +### 分支以提交 ID 的形式存储 + +在 Git 的内部,分支会以一种微小的文本文件的形式存储下来,其中包含了一个提交 ID。这就是我一开始提及到的“技术上正确”的定义。这个提交就是分支上最新的提交。 + +我们来看一下示例仓库中 `main` 和 `mybranch` 的文本文件: + +``` +$ cat .git/refs/heads/main +70f727acbe9ea3e3ed3092605721d2eda8ebb3f4 +$ cat .git/refs/heads/mybranch +13cb960ad86c78bfa2a85de21cd54818105692bc +``` + +这很好理解:`70f727` 是 `main` 上的最新提交,而 `13cb96` 是 `mybranch` 上的最新提交。 + +这样做的原因是,每个提交都包含一种指向其父级的指针,所以 Git 可以通过追踪这些指针链来找到分支上所有的提交。 + +正如我前文所述,这里缺失的是这两个分支之间的任何关联关系:没有任何迹象表明 `mybranch` 是 `main` 的一个分叉。 + +既然我们已经探讨了直观理解的分支概念是如何不成立的,我接下来想讨论的是,它在某些重要的方面又是如何成立的。 + +### 人们的直观感觉通常并非全然错误 + +我发现,告诉人们他们对 Git 的直觉理解是“错误的”的说法颇为流行。我觉得这样的说法有些可笑——总的来说,即使人们关于某个题目的直觉在某些方面在技术上不精确,但他们通常会有完全合理的理由来支持他们的直觉!即使是“不正确的”模型也可能极其有用。 + +现在,我们来讨论三种情况,其中直觉上的“分支”概念与我们实际在操作中如何使用 Git 非常相符。 + +### 变基操作使用的是“直观”的分支概念 + +现在,让我们回到最初的图片。 + +![][1] + +当你将 `mybranch` 变基(rebase)到 `main` 上时,它会取出“直观”分支上的提交(只有那两个红色的提交),然后将它们重新应用到 `main` 上。 + +执行结果就是,只有两次提交(`x` 和 `y`)被复制。以下是相关操作的样子: + +``` +$ git switch mybranch +$ git rebase main +$ git log --oneline mybranch +952fa64 (HEAD -> mybranch) y +7d50681 x +70f727a (origin/main, main) d +f654888 c +3997a46 b +a74606f a +``` + +在此,`git rebase` 创建了两个新的提交(`952fa64` 和 `7d50681`),这两个提交的信息来自之前的两个 `x` 和 `y` 提交。 + +所以直觉上的模型并不完全错误!它很精确地告诉你在变基中发生了什么。 + +但因为 Git 不知道 `mybranch` 是 `main`
的一个分叉,你需要显式地告诉它在何处进行变基。 + +### 合并操作也使用了“直观”的分支概念 + +合并操作并不复制提交,但它们确实需要一个“基础(base)”提交:合并的工作原理是查看两组更改(从共享基础开始),然后将它们合并。 + +我们撤销刚才完成的变基操作,然后看看合并基础是什么。 + +``` +$ git switch mybranch +$ git reset --hard 13cb960 # 撤销 rebase +$ git merge-base main mybranch +3997a466c50d2618f10d435d36ef12d5c6f62f57 +``` + +这里我们获得了分支分离出来的“基础”提交,也就是 `3997a4`。这正是你可能会基于我们的直观图片想到的提交。 + +### GitHub 的拉取请求也使用了直观的概念 + +如果我们在 GitHub 上创建一个拉取请求,打算将 `mybranch` 合并到 `main`,这个请求会展示出两次提交:也就是 `x` 和 `y`。这完全符合我们的预期,也和我们对分支的直观认识相符。 + +![][2] + +我想,如果你在 GitLab 上发起一个合并请求,那显示的内容应该会与此类似。 + +### 直观理解颇为精准,但它有一定局限性 + +这使我们对分支的直观定义看起来相当准确!这个“直观”的概念和合并、变基操作以及 GitHub 拉取请求的工作方式完全吻合。 + +当你在进行合并、变基或创建拉取请求时,你需要明确指定另一个分支(如 `git rebase main`),因为 Git 不知道你的分支是基于哪个分支的。 + +然而,关于分支的直观理解有一个比较严重的问题:你直觉上认为 `main` 分支和某个分离的分支有很大的区别,但 Git 并不清楚这点。 + +所以,现在我们要来讨论一下 Git 分支的不同种类。 + +### 主干和派生分支 + +对于人类来说,`main` 和 `mybranch` 有着显著的区别,你可能针对如何使用它们,有着截然不同的意图。 + +通常,我们会将某些分支视为“主干(trunk)”分支,同时将其他一些分支看作是“派生”。你甚至可能有派生的派生分支。 + +当然,Git 自身并没有这样的区分(“派生”是我刚刚构造的术语!),但是分支的种类确实会影响你如何处理它。 + +例如: + +* 你可能会想将 `mybranch` 变基到 `main`,但你大概不会想将 `main` 变基到 `mybranch` —— 那就太奇怪了! +* 一般来说,人们在重写“主干”分支的历史时比短期存在的派生分支更为谨慎。 + +### Git 允许你进行“反向”的变基 + +我认为人们经常对 Git 感到困惑的一点是 —— 由于 Git 并没有分支是否是另一个分支的“派生”的概念,它不会给你任何关于何时适合将分支 X 变基到分支 Y 的指引。这一切需要你自己去判断。 + +例如,你可以执行以下命令: + +``` +$ git checkout main +$ git rebase mybranch +``` + +或者 + +``` +$ git checkout mybranch +$ git rebase main +``` + +Git 将会欣然允许你进行任一操作,尽管在这个案例中 `git rebase main` 是极其正常的,而 `git rebase mybranch` 则显得格外奇怪。许多人表示他们对此感到困惑,所以我提供了一个展示两种变基类型的图片以供参考: + +![][3] + +相似地,你可以进行“反向”的合并,尽管这相较于反向变基要正常得多——将 `mybranch` 合并到 `main` 和将 `main` 合并到 `mybranch` 都有各自的益处。 + +下面是一个展示你可以进行的两种合并方式的示意图: + +![][4] + +### Git 对于分支之间缺乏层次结构感觉有些奇怪 + +我经常听到 “`main` 分支没什么特别的” 的表述,而这令我感到困惑——对于我来说,我处理的大部分仓库里,`main` 无疑是非常特别的!那么人们为何会称其为不特别呢?
+ +我觉得,重点在于:尽管分支确实存在彼此间的关系(`main` 通常是非常特别的!),但 Git 对这些关系一无所知。 + +每当你执行如 `git rebase` 或 `git merge` 这样的 `git` 命令时,你都必须明确地告诉 Git 分支间的关系,如果你出错,结果可能会相当混乱。 + +我不知道 Git 在此方面的设计究竟“对”还是“错”(无疑它有利有弊,而我已对无休止的争论感到厌倦),但我认为,它让许多人感到意外是有充分理由的。 + +### Git 关于分支的用户界面也同样怪异 + +假设你只想查看某个分支上的“派生”提交,正如我们之前讨论的,这是完全正常的需求。 + +下面是用 `git log` 查看我们分支上的两次派生提交的方法: + +``` +$ git switch mybranch +$ git log main..mybranch --oneline +13cb960 (HEAD -> mybranch, origin/mybranch) y +9554dab x +``` + +你可以用 `git diff` 这样查看同样两次提交的合并差异: + +``` +$ git diff main...mybranch +``` + +因此,如果你想使用 `git log` 查看 `x` 和 `y` 这两次提交,你需要用到两个点(`..`),但要用 `git diff` 查看同样的提交,你却需要用到三个点(`...`)。 + +我个人从来都记不住 `..` 和 `...` 的具体用意,所以虽然它们在原则上可能很有用,我还是完全避免使用它们。 + +### 在 GitHub 上,默认分支具有特殊性 + +同样值得一提的是,在 GitHub 上存在一种“特殊的分支”:每一个 GitHub 仓库都有一个“默认分支”(在 Git 术语中,就是 `HEAD` 所指向的地方),具有以下的特别之处: + +* 初次克隆仓库时,默认会检出这个分支 +* 它作为拉取请求的默认接收分支 +* GitHub 会建议你保护这个默认分支,防止被强制推送 + +很可能还有许多我未曾想到的场景。 + +### 总结 + +这些说法在回顾时看似是显而易见的,但实际上我花费了大量时间去搞清楚一个更“直观”的分支概念,这是因为我已经习惯了技术性的定义,“分支是对某次提交的引用”。 + +同样,我也没有真正思考过:每次执行 `git rebase` 或 `git merge` 命令时,Git 都要求你告诉它分支之间的层次关系。对我而言,这已经成为第二天性,并没有觉得有何困扰;但当我反思这个问题时,可以明显看出,这很容易让人混淆。 + +*(题图:MJ/a5a52832-fac8-4190-b3bd-fec70166aa16)* + +-------------------------------------------------------------------------------- + +via: https://jvns.ca/blog/2023/11/23/branches-intuition-reality/ + +作者:[Julia Evans][a] +选题:[lujun9972][b] +译者:[ChatGPT](https://linux.cn/lctt/ChatGPT) +校对:[wxy](https://github.com/wxy) + +本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出 + +[a]: https://jvns.ca/ +[b]: https://github.com/lujun9972 +[1]: https://jvns.ca/images/git-branch.png +[2]: https://jvns.ca/images/gh-pr.png +[3]: https://jvns.ca/images/backwards-rebase.png +[4]: https://jvns.ca/images/merge-two-ways.png +[0]: https://img.linux.net.cn/data/attachment/album/202312/01/004025i72vi4t0o7027cyf.png \ No newline at end of file diff --git a/sources/tech/20231123 git branches- intuition - reality.md
b/sources/tech/20231123 git branches- intuition - reality.md deleted file mode 100644 index a8713ca604..0000000000 --- a/sources/tech/20231123 git branches- intuition - reality.md +++ /dev/null @@ -1,302 +0,0 @@ -[#]: subject: "git branches: intuition & reality" -[#]: via: "https://jvns.ca/blog/2023/11/23/branches-intuition-reality/" -[#]: author: "Julia Evans https://jvns.ca/" -[#]: collector: "lujun9972/lctt-scripts-1700446145" -[#]: translator: " " -[#]: reviewer: " " -[#]: publisher: " " -[#]: url: " " - -git branches: intuition & reality -====== - -Hello! I’ve been working on writing a zine about git so I’ve been thinking about git branches a lot. I keep hearing from people that they find the way git branches work to be counterintuitive. It got me thinking: what might an “intuitive” notion of a branch be, and how is it different from how git actually works? - -So in this post I want to briefly talk about - - * an intuitive mental model I think many people have - * how git actually represents branches internally (“branches are a pointer to a commit” etc) - * how the “intuitive model” and the real way it works are actually pretty closely related - * some limits of the intuitive model and why it might cause problems - - - -Nothing in this post is remotely groundbreaking so I’m going to try to keep it pretty short. - -### an intuitive model of a branch - -Of course, people have many different intuitions about branches. Here’s the one that I think corresponds most closely to the physical “a branch of an apple tree” metaphor. - -My guess is that a lot of people think about a git branch like this: the 2 commits in pink in this picture are on a “branch”. - -![][1] - -I think there are two important things about this diagram: - - 1. the branch has 2 commits on it - 2. 
the branch has a “parent” (`main`) which it’s an offshoot of - - - -That seems pretty reasonable, but that’s not how git defines a branch – most importantly, git doesn’t have any concept of a branch’s “parent”. So how does git define a branch? - -### in git, a branch is the full history - -In git, a branch is the full history of every previous commit, not just the “offshoot” commits. So in our picture above both branches (`main` and `branch`) have 4 commits on them. - -I made an example repository at which has its branches set up the same way as in the picture above. Let’s look at the 2 branches: - -`main` has 4 commits on it: - -``` - - $ git log --oneline main - 70f727a d - f654888 c - 3997a46 b - a74606f a - -``` - -and `mybranch` has 4 commits on it too. The bottom two commits are shared between both branches. - -``` - - $ git log --oneline mybranch - 13cb960 y - 9554dab x - 3997a46 b - a74606f a - -``` - -So `mybranch` has 4 commits on it, not just the 2 commits `13cb960` and `9554dab` that are “offshoot” commits. - -You can get git to draw all the commits on both branches like this: - -``` - - $ git log --all --oneline --graph - * 70f727a (HEAD -> main, origin/main) d - * f654888 c - | * 13cb960 (origin/mybranch, mybranch) y - | * 9554dab x - |/ - * 3997a46 b - * a74606f a - -``` - -### a branch is stored as a commit ID - -Internally in git, branches are stored as tiny text files which have a commit ID in them. That commit is the latest commit on the branch. This is the “technically correct” definition I was talking about at the beginning. - -Let’s look at the text files for `main` and `mybranch` in our example repo: - -``` - - $ cat .git/refs/heads/main - 70f727acbe9ea3e3ed3092605721d2eda8ebb3f4 - $ cat .git/refs/heads/mybranch - 13cb960ad86c78bfa2a85de21cd54818105692bc - -``` - -This makes sense: `70f727` is the latest commit on `main` and `13cb96` is the latest commit on `mybranch`. 
- -The reason this works is that every commit contains a pointer to its parent(s), so git can follow the chain of pointers to get every commit on the branch. - -Like I mentioned before, the thing that’s missing here is any relationship at all between these two branches. There’s no indication that `mybranch` is an offshoot of `main`. - -Now that we’ve talked about how the intuitive notion of a branch is “wrong”, I want to talk about how it’s also right in some very important ways. - -### people’s intuition is usually not that wrong - -I think it’s pretty popular to tell people that their intuition about git is “wrong”. I find that kind of silly – in general, even if people’s intuition about a topic is technically incorrect in some ways, people usually have the intuition they do for very legitimate reasons! “Wrong” models can be super useful. - -So let’s talk about 3 ways the intuitive “offshoot” notion of a branch matches up very closely with how we actually use git in practice. - -### rebases use the “intuitive” notion of a branch - -Now let’s go back to our original picture. - -![][1] - -When you rebase `mybranch` on `main`, it takes the commits on the “intuitive” branch (just the 2 pink commits) and replays them onto `main`. - -The result is that just the 2 (`x` and `y`) get copied. Here’s what that looks like: - -``` - - $ git switch mybranch - $ git rebase main - $ git log --oneline mybranch - 952fa64 (HEAD -> mybranch) y - 7d50681 x - 70f727a (origin/main, main) d - f654888 c - 3997a46 b - a74606f a - -``` - -Here `git rebase` has created two new commits (`952fa64` and `7d50681`) whose information comes from the previous two `x` and `y` commits. - -So the intuitive model isn’t THAT wrong! It tells you exactly what happens in a rebase. - -But because git doesn’t know that `mybranch` is an offshoot of `main`, you need to tell it explicitly where to rebase the branch. 
- -### merges use the “intuitive” notion of a branch too - -Merges don’t copy commits, but they do need a “base” commit: the way merges work is that it looks at two sets of changes (starting from the shared base) and then merges them. - -Let’s undo the rebase we just did and then see what the merge base is. - -``` - - $ git switch mybranch - $ git reset --hard 13cb960 # undo the rebase - $ git merge-base main mybranch - 3997a466c50d2618f10d435d36ef12d5c6f62f57 - -``` - -This gives us the “base” commit where our branch branched off, `3997a4`. That’s exactly the commit you would think it might be based on our intuitive picture. - -### github pull requests also use the intuitive idea - -If we create a pull request on GitHub to merge `mybranch` into `main`, it’ll also show us 2 commits: the commits `x` and `y`. That makes sense and also matches our intuitive notion of a branch. - -![][2] - -I assume if you make a merge request on GitLab it shows you something similar. - -### intuition is pretty good, but it has some limits - -This leaves our intuitive definition of a branch looking pretty good actually! The “intuitive” idea of what a branch is matches exactly with how merges and rebases and GitHub pull requests work. - -You do need to explicitly specify the other branch when merging or rebasing or making a pull request (like `git rebase main`), because git doesn’t know what branch you think your offshoot is based on. - -But the intuitive notion of a branch has one fairly serious problem: the way you intuitively think about `main` and an offshoot branch are very different, and git doesn’t know that. - -So let’s talk about the different kinds of git branches. - -### trunk and offshoot branches - -To a human, `main` and `mybranch` are pretty different, and you probably have pretty different intentions around how you want to use them. - -I think it’s pretty normal to think of some branches as being “trunk” branches, and some branches as being “offshoots”. 
Also you can have an offshoot of an offshoot. - -Of course, git itself doesn’t make any such distinctions (the term “offshoot” is one I just made up!), but what kind of a branch it is definitely affects how you treat it. - -For example: - - * you might rebase `mybranch` onto `main` but you probably wouldn’t rebase `main` onto `mybranch` – that would be weird! - * in general people are much more careful around rewriting the history on “trunk” branches than short-lived offshoot branches - - - -### git lets you do rebases “backwards” - -One thing I think throws people off about git is – because git doesn’t have any notion of whether a branch is an “offshoot” of another branch, it won’t give you any guidance about if/when it’s appropriate to rebase branch X on branch Y. You just have to know. - -for example, you can do either: - -``` - - $ git checkout main - $ git rebase mybranch - -``` - -or - -``` - - $ git checkout mybranch - $ git rebase main - -``` - -Git will happily let you do either one, even though in this case `git rebase main` is extremely normal and `git rebase mybranch` is pretty weird. A lot of people said they found this confusing so here’s a picture of the two kinds of rebases: - -![][3] - -Similarly, you can do merges “backwards”, though that’s much more normal than doing a backwards rebase – merging `mybranch` into `main` and `main` into `mybranch` are both useful things to do for different reasons. - -Here’s a diagram of the two ways you can merge: - -![][4] - -### git’s lack of hierarchy between branches is a little weird - -I hear the statement “the `main` branch is not special” a lot and I’ve been puzzled about it – in most of the repositories I work in, `main` **is** pretty special! Why are people saying it’s not? - -I think the point is that even though branches **do** have relationships between them (`main` is often special!), git doesn’t know anything about those relationships. 
- -You have to tell git explicitly about the relationship between branches every single time you run a git command like `git rebase` or `git merge`, and if you make a mistake things can get really weird. - -I don’t know whether git’s design here is “right” or “wrong” (it definitely has some pros and cons, and I’m very tired of reading endless arguments about it), but I do think it’s surprising to a lot of people for good reason. - -### git’s UI around branches is weird too - -Let’s say you want to look at just the “offshoot” commits on a branch, which as we’ve discussed is a completely normal thing to want. - -Here’s how to see just the 2 offshoot commits on our branch with `git log`: - -``` - - $ git switch mybranch - $ git log main..mybranch --oneline - 13cb960 (HEAD -> mybranch, origin/mybranch) y - 9554dab x - -``` - -You can look at the combined diff for those same 2 commits with `git diff` like this: - -``` - - $ git diff main...mybranch - -``` - -So to see the 2 commits `x` and `y` with `git log`, you need to use 2 dots (`..`), but to look at the same commits with `git diff`, you need to use 3 dots (`...`). - -Personally I can never remember what `..` and `...` mean so I just avoid them completely even though in principle they seem useful. - -### in GitHub, the default branch is special - -Also, it’s worth mentioning that GitHub does have a “special branch”: every github repo has a “default branch” (in git terms, it’s what `HEAD` points at), which is special in the following ways: - - * it’s what you check out when you `git clone` the repository - * it’s the default destination for pull requests - * github will suggest that you protect the default branch from force pushes - - - -and probably even more that I’m not thinking of. - -### that’s all! 
- -This all seems extremely obvious in retrospect, but it took me a long time to figure out what a more “intuitive” idea of a branch even might be because I was so used to the technical “a branch is a reference to a commit” definition. - -I also hadn’t really thought about how git makes you tell it about the hierarchy between your branches every time you run a `git rebase` or `git merge` command – for me it’s second nature to do that and it’s not a big deal, but now that I’m thinking about it, it’s pretty easy to see how somebody could get mixed up. - --------------------------------------------------------------------------------- - -via: https://jvns.ca/blog/2023/11/23/branches-intuition-reality/ - -作者:[Julia Evans][a] -选题:[lujun9972][b] -译者:[译者ID](https://github.com/译者ID) -校对:[校对者ID](https://github.com/校对者ID) - -本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出 - -[a]: https://jvns.ca/ -[b]: https://github.com/lujun9972 -[1]: https://jvns.ca/images/git-branch.png -[2]: https://jvns.ca/images/gh-pr.png -[3]: https://jvns.ca/images/backwards-rebase.png -[4]: https://jvns.ca/images/merge-two-ways.png
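The article in this diff describes a branch as a tiny file holding one commit ID, with the full history recovered by following parent pointers. The block below is a minimal Python sketch of that model, not how git is implemented. The dictionaries `PARENTS` and `BRANCHES` and the helpers `history`, `only_on`, and `merge_base` are hypothetical names made up for illustration, and the single-letter labels stand in for the real commit hashes from the example repository. The `merge_base` helper is a simplification that is only meant for linear histories like this one.

```python
# Toy model of the example repository's commit graph (an assumption for
# illustration; real git stores commits as objects keyed by hash).
# Each commit label maps to the list of its parent commits.
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "d": ["c"],
    "x": ["b"],
    "y": ["x"],
}

# A branch is only a name pointing at one commit, like the files under
# .git/refs/heads/. Nothing records that mybranch is an offshoot of main.
BRANCHES = {"main": "d", "mybranch": "y"}


def history(tip):
    """Return every commit reachable from `tip`, starting at the tip."""
    seen, order, stack = set(), [], [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        order.append(commit)
        stack.extend(PARENTS[commit])
    return order


def only_on(branch, excluded):
    """Commits reachable from `branch` but not from `excluded`.

    This mirrors the idea behind `git log excluded..branch`.
    """
    hidden = set(history(BRANCHES[excluded]))
    return [c for c in history(BRANCHES[branch]) if c not in hidden]


def merge_base(left, right):
    """First commit in `left`'s history that `right` also contains.

    Simplified: assumes linear histories with no merge commits.
    """
    shared = set(history(BRANCHES[right]))
    for commit in history(BRANCHES[left]):
        if commit in shared:
            return commit
    return None


print(history(BRANCHES["mybranch"]))   # ['y', 'x', 'b', 'a']
print(only_on("mybranch", "main"))      # ['y', 'x']
print(merge_base("main", "mybranch"))   # b
```

In this sketch the "intuitive" branch (just `x` and `y`) only appears once a second branch name is passed to `only_on` or `merge_base`, which matches the article's point that you must name the other branch explicitly on every rebase or merge.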