Update 20190407 Manage multimedia files with Git.md

修昊 2020-02-09 23:20:41 +08:00 committed by GitHub
parent 7f9ef541e4
commit 9a105ab2eb

@ -1,5 +1,5 @@
[#]: collector: (lujun9972)
[#]: translator: ( )
[#]: translator: (svtter)
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
@ -7,20 +7,22 @@
[#]: via: (https://opensource.com/article/19/4/manage-multimedia-files-git)
[#]: author: (Seth Kenlon https://opensource.com/users/seth)
Manage multimedia files with Git
Managing multimedia files with Git
======
Learn how to use Git to track large multimedia files in your projects in
the final article in our series on little-known uses of Git.
Learn how to use Git to track large multimedia files in your projects.
This is the final article in our series on little-known uses of Git.
![video editing dashboard][1]
Git is very specifically designed for source code version control, so it's rarely embraced by projects and industries that don't primarily work in plaintext. However, the advantages of an asynchronous workflow are appealing, especially in the ever-growing number of industries that combine serious computing with seriously artistic ventures, including web design, visual effects, video games, publishing, currency design (yes, that's a real industry), education… the list goes on and on.
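Git is a tool built specifically for source code version control, so it is rarely used by projects and industries that do not work mainly in plain text. The advantages of an asynchronous workflow are nonetheless very attractive, especially in the growing number of industries that combine serious computing with serious artistic ventures: web design, visual effects, video games, publishing, currency design (yes, that is a real industry), education, and many more.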
In this series leading up to Git's 14th anniversary, we've shared six little-known ways to use Git. In this final article, we'll look at software that brings the advantages of Git to managing multimedia files.
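In this series, published in the run-up to Git's 14th anniversary, we have shared six little-known ways to use Git. In this final article, we look at software that brings Git's advantages to managing multimedia files.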
### The problem with managing multimedia files with Git
It seems to be common knowledge that Git doesn't work well with non-text files, but it never hurts to challenge assumptions. Here's an example of copying a photo file using Git:
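### The problem with managing multimedia files in Git
It is widely known that Git does not handle non-text files well, but that need not stop us from trying. Here is an example of copying a photo file into a Git repository: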
```
$ du -hs
@ -35,8 +37,7 @@ $ du -hs
1.8M .
```
Nothing unusual so far; adding a 1.8MB photo to a directory results in a directory 1.8MB in size. So, let's try removing the file:
Nothing unusual so far. Adding a 1.8MB photo to a directory makes the directory 1.8MB in size. Next, let's try removing the file:
```
$ git rm dandelion.tif
@ -45,29 +46,29 @@ $ du -hs
828K .
```
You can see the problem here: Removing a large file after it's been committed increases a repository's size roughly eight times its original, barren state (from 108K to 828K). You can perform tests to get a better average, but this simple demonstration is consistent with my experience. The cost of committing files that aren't text-based is minimal at first, but the longer a project stays active, the more changes people make to static content, and the more those fractions start to add up. When a Git repository becomes very large, the major cost is usually speed. The time to perform pulls and pushes goes from being how long it takes to take a sip of coffee to how long it takes to wonder if your computer got kicked off the network.
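Here you can see the problem: removing a file after it has been committed still leaves the repository about eight times larger than its original state (from 108K to 828K). You could run more tests to get a better average, but this simple demonstration is consistent with my experience. Committing non-text files costs little space at first, but the longer a project stays active, the more changes people make to static content, and the more those fragments add up. When a Git repository becomes very large, the main cost is usually speed. The time for pulls and pushes goes from how long it takes to sip a coffee to how long it takes to wonder whether you have been kicked off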
The reason static content causes Git to grow in size is that formats based on text allow Git to pull out just the parts that have changed. Raster images and music files make as much sense to Git as they would to you if you looked at the binary data contained in a .png or .wav file. So Git just takes all the data and makes a new copy of it, even if only one pixel changes from one photo to the next.
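Why does static content make a Git repository keep growing? Text-based formats let Git extract just the parts that have changed. Raster images and music files mean as little to Git as the binary data inside a .png or .wav file would mean to you. So Git simply takes all the data and makes a new copy of it, even if only one pixel changes from one photo to the next.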
### Git-portal
In practice, many multimedia projects don't need or want to track the media's history. The media part of a project tends to have a different lifecycle than the text or code part of a project. Media assets generally progress in one direction: a picture starts as a pencil sketch, proceeds toward its destination as a digital painting, and, even if the text is rolled back to an earlier version, the art continues its forward progress. It's rare for media to be bound to a specific version of a project. The exceptions are usually graphics that reflect datasets—usually tables or graphs or charts—that can be done in text-based formats such as SVG.
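In practice, many multimedia projects neither need nor want to track the history of their media. The media part of a project tends to have a different lifecycle from the text or code part. Media assets generally move in one direction: a picture starts as a pencil sketch and reaches its destination as a digital painting, and even if the text is rolled back to an earlier version, the art keeps moving forward. Media is rarely bound to a specific version of a project. The exceptions are usually graphics that reflect datasets, typically tables, graphs, or charts, which can be produced in text-based formats such as SVG.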
So, on many projects that involve both media and text (whether it's narrative prose or code), Git is an acceptable solution to file management, as long as there's a playground outside the version control cycle for artists to play in.
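So in many projects that contain both text (whether narrative prose or code) and media, Git is an acceptable solution for file management, as long as there is a playground outside the version control cycle for the artists to play in.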
![Graphic showing relationship between art assets and Git][2]
A simple way to enable that is [Git-portal][3], a Bash script armed with Git hooks that moves your asset files to a directory outside Git's purview and replaces them with symlinks. Git commits the symlinks (sometimes called aliases or shortcuts), which are trivially small, so all you commit are your text files and whatever symlinks represent your media assets. Because the replacement files are symlinks, your project continues to function as expected because your local machine follows the symlinks to their "real" counterparts. Git-portal maintains a project's directory structure when it swaps out a file with a symlink, so it's easy to reverse the process, should you decide that Git-portal isn't right for your project or you need to build a version of your project without symlinks (for distribution, for instance).
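A simple way to enable this is [Git-portal][3], a Bash script armed with Git hooks that moves your asset files to a directory outside Git's purview and replaces them with symlinks. Git commits the symlinks (sometimes called aliases or shortcuts), which are tiny, so all you commit are your text files and the symlinks that stand in for your media assets. Because the stand-in files are symlinks, the project keeps working as expected: your local machine follows each symlink to its "real" counterpart. Git-portal preserves the project's directory structure when it swaps a file for a symlink, so the process is easy to reverse if you decide that Git-portal is not right for your project, or if you need to build a version of the project without symlinks (for distribution, for example).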
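The swap described above can be approximated in a few lines of shell. This is only a sketch of the idea, not Git-portal's own code: the function name `portal_swap` and the file name in the demo are made up for illustration, and the sketch only handles files in the top level of the current directory.

```shell
# Hypothetical helper, not Git-portal's own code: move one file from the
# current directory into _portal/ and leave a relative symlink behind.
portal_swap() {
    file=$1
    mkdir -p _portal
    mv -- "$file" "_portal/$file"
    ln -s -- "_portal/$file" "$file"
}

# Demo in a throwaway directory (the file name is only an example).
(
    cd "$(mktemp -d)" || exit 1
    printf 'pretend this is image data' > dandelion.tif
    portal_swap dandelion.tif
    ls -l dandelion.tif _portal/
)
```

After the swap, anything that opens `dandelion.tif` still reads the original bytes through the symlink, while Git only ever sees the small link.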
Git-portal also allows remote synchronization of assets over rsync, so you can set up a remote storage location as a centralized source of authority.
Git-portal can also synchronize assets remotely over rsync, so you can set up a remote storage location as a central, authoritative source.
Git-portal is ideal for multimedia projects, including video game and tabletop game design, virtual reality projects with big 3D model renders and textures, [books][4] with graphics and .odt exports, collaborative [blog websites][5], music projects, and much more. It's not uncommon for an artist to perform versioning in their application—in the form of layers (in the graphics world) and tracks (in the music world)—so Git adds nothing to multimedia project files themselves. The power of Git is leveraged for other parts of artistic projects (prose and narrative, project management, subtitle files, credits, marketing copy, documentation, and so on), and the power of structured remote backups is leveraged by the artists.
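Git-portal is an ideal solution for multimedia projects, such as video games and tabletop games, virtual reality projects with large 3D model renders and textures, [books][4] with graphics and .odt exports, collaborative [blog websites][5], music projects, and more. It is not uncommon for artists to do their versioning inside the application, in the form of layers (in the graphics world) and tracks (in the music world), so Git adds nothing to the multimedia project files themselves. Git's power is used for the other parts of an artistic project (prose and narrative, project management, subtitle files, credits, marketing copy, documentation, and so on), while the artists make use of structured remote backups.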
#### Install Git-portal
#### Installing Git-portal
There are RPM packages for Git-portal located at <https://klaatu.fedorapeople.org/git-portal>, which you can download and install.
RPM packages for Git-portal are available at <https://klaatu.fedorapeople.org/git-portal> for download and installation.
Alternately, you can install Git-portal manually from its home on GitLab. It's just a Bash script and some Git hooks (which are also Bash scripts), but it requires a quick build process so that it knows where to install itself:
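Alternatively, you can install Git-portal manually from its home on GitLab. It is just a Bash script and some Git hooks (which are also Bash scripts), but it needs a quick build step so that it knows where to install itself: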
```
@ -78,9 +79,9 @@ $ make
$ sudo make install
```
#### Use Git-portal
#### Using Git-portal
Git-portal is used alongside Git. This means, as with all large-file extensions to Git, there are some added steps to remember. But you only need Git-portal when dealing with your media assets, so it's pretty easy to remember unless you've acclimated yourself to treating large files the same as text files (which is rare for Git users). There's one setup step you must do to use Git-portal in a project:
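Git-portal is used alongside Git. This means that, as with all large-file extensions to Git, there are some extra steps to remember. But you only need Git-portal when you are handling media assets, so it is easy to remember unless you are used to treating large files the same as text files (which is rare for Git users). There is one setup step you must do to use Git-portal in a project: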
```
@ -90,9 +91,9 @@ $ git init
$ git-portal init
```
Git-portal's **init** function creates a **_portal** directory in your Git repository and adds it to your .gitignore file.
The **init** function of Git-portal creates a **_portal** directory in the Git repository and adds it to the .gitignore file.
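As a rough illustration of what that setup step amounts to, here is a minimal sketch. It is an assumption about the effect, not the actual script: the function name `portal_init` is hypothetical, the exact ignore entry Git-portal writes may differ, and the real command may do more (installing hooks, for instance).

```shell
# Hypothetical approximation of the setup step; the real script may do more.
portal_init() {
    mkdir -p _portal
    # Append the ignore rule only if an identical line is not already present.
    grep -qxF '_portal' .gitignore 2>/dev/null || echo '_portal' >> .gitignore
}
```

Checking for an existing line first keeps the function safe to run more than once in the same repository.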
Using Git-portal in a daily routine integrates smoothly with Git. A good example is a MIDI-based music project: the project files produced by the music workstation are text-based, but the MIDI files are binary data:
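Day-to-day use of Git-portal fits smoothly alongside Git. A good example is a MIDI-based music project: the project files produced by the music workstation are text-based, but the MIDI files are binary data: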
```
@ -108,7 +109,7 @@ $ git-portal song-Track*mid
$ git add song-Track*mid
```
If you look into the **_portal** directory, you'll find the original MIDI files. The files in their place are symlinks to **_portal**, which keeps the music workstation working as expected:
If you look in the **_portal** directory, you will find the original MIDI files there. In their original location they have been replaced by symlinks pointing into **_portal**, so the music workstation keeps working as expected:
```
@ -121,7 +122,7 @@ $ ls -lG
[...] song-Track_2-1.mid -> _portal/song-Track_2-1.mid*
```
As with Git, you can also add a directory of files:
As with Git, you can also add a whole directory of files:
```
@ -132,7 +133,7 @@ $ ls -lG _portal/yoshimi
[...] yoshimi.stat -> ../_portal/yoshimi/yoshimi.stat*
```
Removal works as expected, but when removing something in **_portal**, you should use **git-portal rm** instead of **git rm**. Using Git-portal ensures that the file is removed from **_portal**:
Removal also works as expected, but when you remove something that lives in **_portal**, use **git-portal rm** instead of **git rm**. Using Git-portal ensures that the file is also removed from **_portal**:
```
@ -145,7 +146,7 @@ $ ls _portal/
song-Track_1-1.mid* song-Track_2-1.mid* yoshimi/
```
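The reason a dedicated remove command helps is that two things have to go: the symlink in the working tree and the real file under **_portal**. The sketch below shows that pairing with plain shell. The name `portal_rm` is hypothetical, the function does not touch the Git index, and it is not a reimplementation of **git-portal rm**.

```shell
# Hypothetical sketch: delete a symlink together with the file it points
# to, but only when that file lives under a _portal directory.
portal_rm() {
    link=$1
    [ -L "$link" ] || { echo "$link is not a symlink" >&2; return 1; }
    target=$(readlink "$link")
    # A relative target is resolved against the directory holding the link.
    case $target in
        /*) real=$target ;;
        *)  real=$(dirname "$link")/$target ;;
    esac
    case $real in
        _portal/*|*/_portal/*) rm -f -- "$real" "$link" ;;
        *) echo "$link does not point into _portal" >&2; return 1 ;;
    esac
}
```

Refusing anything that is not a symlink into **_portal** means the function cannot delete an ordinary tracked file by accident.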
If you forget to use Git-portal, then you have to remove the portal file manually:
If you forget to use Git-portal, you need to remove the portal file manually:
```
@ -156,7 +157,7 @@ song-Track_1-1.mid* song-Track_2-1.mid* yoshimi/
$ trash _portal/song-Track_1-1.mid
```
Git-portal's only other function is to list all current symlinks and find any that may have become broken, which can sometimes happen if files move around in a project directory:
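Git-portal's only other function is to list all current symlinks and find any that have become broken, which can happen when files are moved around in the project directory: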
```
@ -167,14 +168,13 @@ bigproject.git/song-Track_2-1.mid: symbolic link to _portal/song-Track_2-1.mid
bigproject.git/foo/yoshimi/yoshimi.stat: broken symbolic link to ../_portal/yoshimi/yoshimi.stat
```
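If Git-portal is not at hand, broken symlinks can also be found with standard tools. The following is a generic `find` idiom, not necessarily what Git-portal does internally, and the function name `list_broken_links` is made up for this example.

```shell
# Generic POSIX approach, not necessarily what Git-portal does internally:
# print every symlink under a directory whose target does not exist.
list_broken_links() {
    # `test -e` follows the link, so it fails exactly when the target is missing.
    find "${1:-.}" -type l ! -exec test -e {} \; -print
}
```

Calling `list_broken_links .` from the top of a project prints one path per broken link and nothing when every link resolves.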
If you're using Git-portal for a personal project and maintaining your own backups, this is technically all you need to know about Git-portal. If you want to add in collaborators or you want Git-portal to manage backups the way (more or less) Git does, you can add a remote.
If you use Git-portal for a personal project and maintain your own backups, that is technically everything you need to know about Git-portal. If you want to add collaborators, or you want Git-portal to manage backups the way Git does, you can create a remote.
#### Add Git-portal remotes
#### Adding Git-portal remotes
Adding a remote location for Git-portal is done through Git's existing remote function. Git-portal implements Git hooks, scripts hidden in your repository's .git directory, to look at your remotes for any that begin with **_portal**. If it finds one, it attempts to **rsync** to the remote location and synchronize files. Git-portal performs this action anytime you do a Git push or a Git merge (or pull, which is really just a fetch and an automatic merge).
If you've only cloned Git repositories, then you may never have added a remote yourself. It's a standard Git procedure:
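Adding a remote location for Git-portal is done with Git's existing remote feature. Git-portal implements Git hooks, scripts hidden in your repository's .git directory, that check whether any of your remotes has a name beginning with **_portal**. If it finds one, it tries to use **rsync** to synchronize files with the remote location. Git-portal does this whenever you run a Git push or a Git merge (or a pull, which is really a fetch followed by an automatic merge).
If you have only ever cloned Git repositories, you may never have added a remote yourself. It is a standard Git procedure: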
```
$ git remote add origin git@gitdawg.com:seth/bigproject.git
@ -183,6 +183,7 @@ origin git@gitdawg.com:seth/bigproject.git (fetch)
origin git@gitdawg.com:seth/bigproject.git (push)
```
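To make the hook behaviour described earlier more concrete, here is a minimal sketch of how a hook might pick out remotes whose names begin with **_portal**. It is an assumption about the approach, not Git-portal's actual hook: `portal_remotes` is a hypothetical name, and the hosts and paths in the demo are placeholders.

```shell
# Hypothetical sketch of the hook logic; not Git-portal's actual hook.
# Reads `git remote -v` style lines on stdin and prints the push URL of
# every remote whose name starts with _portal.
portal_remotes() {
    awk '$1 ~ /^_portal/ && $3 == "(push)" { print $2 }'
}

# Demo with canned input (placeholder hosts and paths).
printf 'origin\tgit@example.com:seth/bigproject.git (push)\n_portal\tseth@example.com:/home/seth/bigproject_portal (push)\n' | portal_remotes

# Inside a real repository the result could feed rsync, for example:
# git remote -v | portal_remotes | while read -r url; do
#     rsync -av _portal/ "$url"
# done
```

Keeping the parsing separate from the `rsync` call makes it possible to check which remotes would be used before any files are transferred.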
The name **origin** is a popular convention for your main Git repository, so use it to
The name **origin** is a popular convention for your main Git repository, so it makes sense to use it for your Git data. Your Git-portal data, however, is stored separately, so you must create a second remote to tell Git-portal where to push to and pull from. Depending on your Git host, you may need a separate server because gigabytes of media assets are unlikely to be accepted by a Git host with limited space. Or maybe you're on a server that permits you to access only your Git repository and not external storage directories: