diff --git a/sources/tech/20231101 Confusing git terminology.md b/sources/tech/20231101 Confusing git terminology.md new file mode 100644 index 0000000000..464fbebd12 --- /dev/null +++ b/sources/tech/20231101 Confusing git terminology.md @@ -0,0 +1,544 @@ +[#]: subject: "Confusing git terminology" +[#]: via: "https://jvns.ca/blog/2023/11/01/confusing-git-terminology/" +[#]: author: "Julia Evans https://jvns.ca/" +[#]: collector: "lujun9972/lctt-scripts-1693450080" +[#]: translator: " " +[#]: reviewer: " " +[#]: publisher: " " +[#]: url: " " + +Confusing git terminology +====== + +Hello! I’m slowly working on explaining git. One of my biggest problems is that after almost 15 years of using git, I’ve become very used to git’s idiosyncrasies and it’s easy for me to forget what’s confusing about it. + +So I asked people [on Mastodon][1]: + +> what git jargon do you find confusing? thinking of writing a blog post that explains some of git’s weirder terminology: “detached HEAD state”, “fast-forward”, “index/staging area/staged”, “ahead of ‘origin/main’ by 1 commit”, etc + +I got a lot of GREAT answers and I’ll try to summarize some of them here. Here’s a list of the terms: + + * [HEAD and “heads”][2] + * [“detached HEAD state”][3] + * [“ours” and “theirs” while merging or rebasing][4] + * [“Your branch is up to date with ‘origin/main’”][5] + * [HEAD^, HEAD~ HEAD^^, HEAD~~, HEAD^2, HEAD~2][6] + * [.. 
and …][7] + * [“can be fast-forwarded”][8] + * [“reference”, “symbolic reference”][9] + * [refspecs][10] + * [“tree-ish”][11] + * [“index”, “staged”, “cached”][12] + * [“reset”, “revert”, “restore”][13] + * [“untracked files”, “remote-tracking branch”, “track remote branch”][14] + * [checkout][15] + * [reflog][16] + * [merge vs rebase vs cherry-pick][17] + * [rebase --onto][18] + * [commit][19] + * [more confusing terms][20] + + + +I’ve done my best to explain what’s going on with these terms, but they cover basically every single major feature of git which is definitely too much for a single blog post so it’s pretty patchy in some places. + +### `HEAD` and “heads” + +A few people said they were confused by the terms `HEAD` and `refs/heads/main`, because it sounds like it’s some complicated technical internal thing. + +Here’s a quick summary: + + * “heads” are “branches”. Internally in git, branches are stored in a directory called `.git/refs/heads`. (technically the [official git glossary][21] says that the branch is all the commits on it and the head is just the most recent commit, but they’re 2 different ways to think about the same thing) + * `HEAD` is the current branch. It’s stored in `.git/HEAD`. + + + +I think that “a `head` is a branch, `HEAD` is the current branch” is a good candidate for the weirdest terminology choice in git, but it’s definitely too late for a clearer naming scheme so let’s move on. + +There are some important exceptions to “HEAD is the current branch”, which we’ll talk about next. + +### “detached HEAD state” + +You’ve probably seen this message: + +``` + + $ git checkout v0.1 + You are in 'detached HEAD' state. You can look around, make experimental + changes and commit them, and you can discard any commits you make in this + state without impacting any branches by switching back to a branch. + + [...] + +``` + +Here’s the deal with this message: + + * In Git, usually you have a “current branch” checked out, for example `main`. 
+ * The place the current branch is stored is called `HEAD`. + * Any new commits you make will get added to your current branch, and if you run `git merge other_branch`, that will also affect your current branch + * But `HEAD` doesn’t **have** to be a branch! Instead it can be a commit ID. + * Git calls this state (where HEAD is a commit ID instead of a branch) “detached HEAD state” + * For example, you can get into detached HEAD state by checking out a tag, because a tag isn’t a branch + * if you don’t have a current branch, a bunch of things break: + * `git pull` doesn’t work at all (since the whole point of it is to update your current branch) + * neither does `git push` unless you use it in a special way + * `git commit`, `git merge`, `git rebase`, and `git cherry-pick` **do** still work, but they’ll leave you with “orphaned” commits that aren’t connected to any branch, so those commits will be hard to find + * You can get out of detached HEAD state by either creating a new branch or switching to an existing branch + + + +### “ours” and “theirs” while merging or rebasing + +If you have a merge conflict, you can run `git checkout --ours file.txt` to pick the version of `file.txt` from the “ours” side. But which side is “ours” and which side is “theirs”? + +I always find this confusing and I never use `git checkout --ours` because of that, but I looked it up to see which is which. + +For merges, here’s how it works: the current branch is “ours” and the branch you’re merging in is “theirs”, like this. Seems reasonable. 
+ +``` + + $ git checkout merge-into-ours # current branch is "ours" + $ git merge from-theirs # branch we're merging in is "theirs" + +``` + +For rebases it’s the opposite – the current branch is “theirs” and the target branch we’re rebasing onto is “ours”, like this: + +``` + + $ git checkout theirs # current branch is "theirs" + $ git rebase ours # branch we're rebasing onto is "ours" + +``` + +I think the reason for this is that under the hood `git rebase main` is merging the current branch into main (it’s like `git checkout main; git merge current_branch`), but I still find it confusing. + +[This nice tiny site][22] explains the “ours” and “theirs” terms. + +A couple of people also mentioned that VSCode calls “ours”/“theirs” “current change”/“incoming change”, and that it’s confusing in the exact same way. + +### “Your branch is up to date with ‘origin/main’” + +This message seems straightforward – it’s saying that your `main` branch is up to date with the origin! + +But it’s actually a little misleading. You might think that this means that your `main` branch is up to date. It doesn’t. What it **actually** means is – if you last ran `git fetch` or `git pull` 5 days ago, then your `main` branch is up to date with all the changes **as of 5 days ago**. + +So if you don’t realize that, it can give you a false sense of security. + +I think git could theoretically give you a more useful message like “is up to date with the origin’s `main` **as of your last fetch 5 days ago** ” because the time that the most recent fetch happened is stored in the reflog, but it doesn’t. + +### `HEAD^`, `HEAD~` `HEAD^^`, `HEAD~~`, `HEAD^2`, `HEAD~2` + +I’ve known for a long time that `HEAD^` refers to the previous commit, but I’ve been confused for a long time about the difference between `HEAD~` and `HEAD^`. 
+ +I looked it up, and here’s how these relate to each other: + + * `HEAD^` and `HEAD~` are the same thing (1 commit ago) + * `HEAD^^^` and `HEAD~~~` and `HEAD~3` are the same thing (3 commits ago) + * `HEAD^3` refers to the third parent of a commit, and is different from `HEAD~3` + + + +This seems weird – why are `HEAD~` and `HEAD^` the same thing? And what’s the “third parent”? Is that the same thing as the parent’s parent’s parent? (spoiler: it isn’t) Let’s talk about it! + +Most commits have only one parent. But merge commits have multiple parents – they’re merging together 2 or more commits. In Git `HEAD^` means “the parent of the HEAD commit”. But what if HEAD is a merge commit? What does `HEAD^` refer to? + +The answer is that `HEAD^` refers to the **first** parent of the merge, `HEAD^2` is the second parent, `HEAD^3` is the third parent, etc. + +But I guess they also wanted a way to refer to “3 commits ago”, so `HEAD^3` is the third parent of the current commit (which may have many parents if it’s a merge commit), and `HEAD~3` is the parent’s parent’s parent. + +I think in the context of the merge commit ours/theirs discussion earlier, `HEAD^` is “ours” and `HEAD^2` is “theirs”. + +### `..` and `...` + +Here are two commands: + + * `git log main..test` + * `git log main...test` + + + +What’s the difference between `..` and `...`? I never use these so I had to look it up in [man git-range-diff][23]. It seems like the answer is that in this case: + +``` + + A - B main + \ + C - D test + +``` + + * `main..test` is commits C and D + * `test..main` is commit B + * `main...test` is commits B, C, and D + + + +But it gets worse: apparently `git diff` also supports `..` and `...`, but they do something completely different than they do with `git log`? I think the summary is: + + * `git log test..main` shows changes on `main` that aren’t on `test`, whereas `git log test...main` shows changes on _both_ sides. 
+ * `git diff test..main` shows `test` changes _and_ `main` changes (it diffs `B` and `D`) whereas `git diff test...main` diffs `A` and `B` (it only shows you the diff on one side, because it compares the merge base of the two branches with `main`). + + + +[this blog post][24] talks about it a bit more. + +### “can be fast-forwarded” + +Here’s a very common message you’ll see in `git status`: + +``` + + $ git status + On branch main + Your branch is behind 'origin/main' by 2 commits, and can be fast-forwarded. + (use "git pull" to update your local branch) + +``` + +What does “fast-forwarded” mean? Basically it’s trying to say that the two branches look something like this: (newest commits are on the right) + +``` + + main: A - B - C + origin/main: A - B - C - D - E + +``` + +or visualized another way: + +``` + + A - B - C - D - E (origin/main) + | + main + +``` + +Here `origin/main` just has 2 extra commits that `main` doesn’t have, so it’s easy to bring `main` up to date – we just need to add those 2 commits. Literally nothing can possibly go wrong – there’s no possibility of merge conflicts. A fast forward merge is a very good thing! It’s the easiest way to combine 2 branches. + +After running `git pull`, you’ll end up in this state: + +``` + + main: A - B - C - D - E + origin/main: A - B - C - D - E + +``` + +Here’s an example of a state which **can’t** be fast-forwarded. + +``` + + A - B - C - X (main) + | + - - D - E (origin/main) + +``` + +Here `main` has a commit that `origin/main` doesn’t have (`X`). So you can’t do a fast forward. In that case, `git status` would say: + +``` + + $ git status + Your branch and 'origin/main' have diverged, + and have 1 and 2 different commits each, respectively. + +``` + +### “reference”, “symbolic reference” + +I’ve always found the term “reference” kind of confusing. There are at least 3 things that get called “references” in git + + * branches and tags like `main` and `v0.2` + * `HEAD`, which is the current branch + * things like `HEAD^^^` which git will resolve to a commit ID. 
Technically these are probably not “references”, I guess git [calls them][25] “revision parameters” but I’ve never used that term. + + + +“symbolic reference” is a very weird term to me because personally I think the only symbolic reference I’ve ever used is `HEAD` (the current branch), and `HEAD` has a very central place in git (most of git’s core commands’ behaviour depends on the value of `HEAD`), so I’m not sure what the point of having it as a generic concept is. + +### refspecs + +When you configure a git remote in `.git/config`, there’s this `+refs/heads/main:refs/remotes/origin/main` thing. + +``` + + [remote "origin"] + url = git@github.com:jvns/pandas-cookbook + fetch = +refs/heads/main:refs/remotes/origin/main + +``` + +I don’t really know what this means, I’ve always just used whatever the default is when you do a `git clone` or `git remote add`, and I’ve never felt any motivation to learn about it or change it from the default. + +### “tree-ish” + +The man page for `git checkout` says: + +``` + + git checkout [-f|--ours|--theirs|-m|--conflict=