mirror of
https://github.com/LCTT/TranslateProject.git
synced 2025-01-25 23:11:02 +08:00
finished
This commit is contained in:
parent
45cb8377e3
commit
492212f2b0
@ -1,215 +0,0 @@
translating by firmianay

Splitting and Re-Assembling Files in Linux
============================================================

![csplit](https://www.linux.com/sites/lcom/files/styles/rendered_file/public/split-files.png?itok=kZTP_VF9 "csplit")

The very useful csplit command divides single files into multiple files. Carla Schroder explains. [Creative Commons Attribution][1]

Linux has several utilities for splitting up files. So why would you want to split your files? One use case is to split a large file into smaller pieces so that it fits on smaller media, like USB sticks. This is also a good trick for transferring files via USB sticks when you're stuck with FAT32, which has a maximum file size of 4GB, and your files are bigger than that. Another use case is to speed up network file transfers, because parallel transfers of small files are usually faster.

We'll learn how to use `csplit`, `split`, and `cat` to chop up files and then put them back together. These work on any file type: text, image, audio, .iso, you name it.

### Split Files With csplit

`csplit` is one of those funny little commands that has been around forever, and when you discover it you wonder how you ever made it through life without it. `csplit` divides single files into multiple files. This example demonstrates its simplest invocation, which divides the file foo.txt into three files, split at line numbers 17 and 33:

```
$ csplit foo.txt 17 33
2591
3889
2359
```

`csplit` creates three new files in the current directory, and prints the sizes of your new files in bytes. By default, each new file is named `xxNN`, where `NN` is a two-digit number starting at 00:

```
$ ls
xx00
xx01
xx02
```

You can view the first ten lines of each of your new files all at once with the `head` command:

```
$ head xx*

==> xx00 <==
Foo File
by Carla Schroder

Foo text

Foo subheading

More foo text

==> xx01 <==
Foo text

Foo subheading

More foo text

==> xx02 <==
Foo text

Foo subheading

More foo text
```

What if you want to split a file into several files that each contain the same number of lines? Specify the line number, and then enclose the number of repetitions in curly braces. `csplit` cuts just before the line number you give, so the first file holds lines 1 to 4, and each repetition then writes the next five lines to a new file. This example repeats the split 4 times, and dumps the leftover in the last file:

```
$ csplit foo.txt 5 {4}
57
1488
249
1866
1381
3798
```

You may use the asterisk wildcard to tell `csplit` to repeat your split as many times as possible. Which sounds cool, but `csplit` reports an error once the next split point falls past the end of the file:

```
$ csplit foo.txt 10 {*}
1545
2115
1848
1901
csplit: '10': line number out of range on repetition 4
1430
```

The default behavior is to delete the output files on errors. You can foil this with the `-k` option, which keeps the output files when there are errors. Another gotcha is that every time you run `csplit` it overwrites the files it created previously, so give your splits new filenames to save them. Use `--prefix=PREFIX` to set a different file prefix:

```
$ csplit -k --prefix=mine foo.txt 5 {*}
57
1488
249
1866
993
csplit: '5': line number out of range on repetition 9
437

$ ls
mine00
mine01
mine02
mine03
mine04
mine05
```
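
To see the effect of `-k` without touching real data, here is a minimal sketch. The scratch directory, the generated `sample.txt`, and the `keep` prefix are assumptions made for illustration, not part of the original article, and the options are the GNU coreutils ones used above.

```shell
# Hypothetical scratch directory and sample file (not from the article).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 23 > sample.txt            # 23 lines, one number per line

# Split before line 10 and repeat for as long as possible.
# csplit ends with an error once it runs out of lines; -k keeps
# the pieces that were already written instead of deleting them.
csplit -s -k --prefix=keep sample.txt 10 '{*}' \
  || echo "csplit reported an error, but the pieces were kept"

# Show how many lines ended up in each piece.
wc -l keep*
```

Quoting `'{*}'` is optional in most shells, but it guards against any surprise expansion.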

The `-n` option changes the number of digits used to number your files:

```
$ csplit -n 3 --prefix=mine foo.txt 5 {4}
57
1488
249
1866
1381
3798

$ ls
mine000
mine001
mine002
mine003
mine004
mine005
```
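
The line-number repeats are easier to follow on a file whose contents you know. The following is a sketch under assumptions: `sample.txt` and the `part` prefix are made up for this example, and the file is generated with `seq` so that every line holds its own line number.

```shell
# Hypothetical scratch directory and sample file (not from the article).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 30 > sample.txt            # 30 lines, one number per line

# Cut before line 5, then repeat four more times (before lines 10, 15, 20, 25).
# -s suppresses the byte counts, -n 3 asks for three-digit suffixes.
csplit -s -n 3 --prefix=part sample.txt 5 '{4}'

# The first piece has 4 lines, the next four have 5 each,
# and the last piece collects the remaining 6 lines.
wc -l part*
```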

The "c" in `csplit` stands for "context". This means you can split your files based on all manner of arbitrary matches and clever regular expressions. This example splits the file into two parts. The first file ends at the line that precedes the first line containing "fie", and the second file starts with the line that contains "fie":

```
$ csplit foo.txt /fie/
```

Split the file at every occurrence of "fie":

```
$ csplit foo.txt /fie/ {*}
```

Split the file at the first six occurrences of "fie". The pattern is applied once, and `{5}` repeats it five more times:

```
$ csplit foo.txt /fie/ {5}
```

Copy only the content that starts with the line that includes "fie", and omit everything that comes before it:

```
$ csplit myfile %fie%
```
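
The pattern forms above can be tried on a small generated file. This is only a sketch: `book.txt`, its contents, and the `ch` and `tail` prefixes are hypothetical names chosen for the example.

```shell
# Hypothetical scratch directory and sample file (not from the article).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'preface' 'Chapter 1' 'alpha' 'Chapter 2' 'beta' \
              'Chapter 3' 'gamma' > book.txt

# /REGEXP/ with {*}: start a new piece at every line that begins with "Chapter".
# ch00 holds the text before the first match, then one file per chapter.
csplit -s --prefix=ch book.txt '/^Chapter/' '{*}'
head -n 1 ch*

# %REGEXP%: skip everything before the match and keep only the rest.
csplit -s --prefix=tail book.txt '%^Chapter 2%'
cat tail00
```

Unlike the line-number form, a regular expression followed by `{*}` simply stops when there are no more matches, so no error is reported here.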

### Splitting Files into Sizes

`split` is similar to `csplit`. It splits files into pieces of a specific size, which is fabulous when you're splitting large files to copy to small media, or for network transfers. The default size is 1000 lines:

```
$ split foo.mv
$ ls -hl
266K Aug 21 16:58 xaa
267K Aug 21 16:58 xab
315K Aug 21 16:58 xac
[...]
```

They come out to a similar size, but you can specify any size you want. This example is 20 megabytes:

```
$ split -b 20M foo.mv
```

The size abbreviations are K, M, G, T, P, E, Z, Y (powers of 1024), or KB, MB, GB, and so on for powers of 1000.
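
The difference between the two suffix families shows up directly in the piece sizes. A small sketch, assuming GNU `split`; the file `blob.bin` and the `bin_` and `dec_` prefixes are invented for the example.

```shell
# Hypothetical scratch directory and sample file (not from the article).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
head -c 3000 /dev/zero > blob.bin   # 3000-byte sample file

split -b 1K  blob.bin bin_          # 1K  = 1024 bytes per piece
split -b 1KB blob.bin dec_          # 1KB = 1000 bytes per piece

# bin_* pieces: 1024, 1024, and 952 bytes; dec_* pieces: 1000 bytes each.
wc -c bin_* dec_*
```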

Choose your own prefix and suffix for the filenames:

```
$ split -a 3 --numeric-suffixes=9 --additional-suffix=mine foo.mv SB
240K Aug 21 17:44 SB009mine
214K Aug 21 17:44 SB010mine
220K Aug 21 17:44 SB011mine
```

The `-a` option sets the length of the suffix, here three digits. `--numeric-suffixes` switches from letters to numbers and sets the starting point for numbering. `--additional-suffix` appends a fixed string after the number. The default prefix is x, and you can set a different prefix by typing it after the filename.
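
Here is the same naming scheme on a file small enough to inspect by hand. It is a sketch assuming GNU `split`; `blob.bin`, the `piece` prefix, and the `.part` suffix are illustrative names only.

```shell
# Hypothetical scratch directory and sample file (not from the article).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
head -c 2500 /dev/zero > blob.bin   # 2500-byte sample file

# 1000-byte pieces, three-digit numeric suffixes starting at 9,
# prefix "piece", and ".part" appended to every name.
split -b 1000 -a 3 --numeric-suffixes=9 --additional-suffix=.part blob.bin piece

# Expect piece009.part, piece010.part (1000 bytes each) and piece011.part (500 bytes).
ls -l piece*
```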

### Putting Split Files Together

You probably want to reassemble your files at some point. Good old `cat` takes care of this:

```
$ cat SB0* > foo2.txt
```

The asterisk wildcard in the example will snag any file that starts with SB0, which may not give the results you want. You can make a more exact match with question mark wildcards, using one per character:

```
$ cat SB0?????? > foo2.txt
```
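
A quick round trip is a reasonable way to confirm that the pieces go back together byte for byte. This sketch assumes GNU `split`; `original.bin`, `rebuilt.bin`, and the `chunk.` prefix are hypothetical names.

```shell
# Hypothetical scratch directory and sample file (not from the article).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
head -c 5000 /dev/urandom > original.bin   # 5000 random bytes

# Fixed-width numeric suffixes keep the pieces in order when the shell
# expands the wildcard alphabetically.
split -b 1000 -a 3 -d original.bin chunk.

# One question mark per suffix character matches exactly the pieces.
cat chunk.??? > rebuilt.bin

# cmp exits with status 0 only if the two files are identical.
cmp original.bin rebuilt.bin && echo "files are identical"
```

Comparing with `cmp` (or a checksum tool) is worth the extra second after copying pieces across media, since a missing or misordered piece would otherwise go unnoticed.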

As always, consult the relevant man and info pages for complete command options.

_Learn more about Linux through the free ["Introduction to Linux"][3] course from The Linux Foundation and edX._

--------------------------------------------------------------------------------

via: https://www.linux.com/learn/intro-to-linux/2017/8/splitting-and-re-assembling-files-linux

Author: [CARLA SCHRODER][a]
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)

[a]:https://www.linux.com/users/cschroder
[1]:https://www.linux.com/licenses/category/creative-commons-attribution
[2]:https://www.linux.com/files/images/split-filespng
[3]:https://training.linuxfoundation.org/linux-courses/system-administration-training/introduction-to-linux
@ -0,0 +1,213 @@
Splitting and Re-Assembling Files in Linux
============================================================

![csplit](https://www.linux.com/sites/lcom/files/styles/rendered_file/public/split-files.png?itok=kZTP_VF9 "csplit")

The very useful csplit command can split a single file into multiple files. Carla Schroder explains. [Creative Commons Attribution][1]
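
Linux has several utilities for splitting files. So why would you want to split a file? One use case is to split a large file into smaller pieces so that it fits on smaller storage media, such as USB sticks. It is also a handy trick for moving files on USB sticks when you are stuck with FAT32, which has a maximum file size of 4GB, and your files are larger than that limit. Another use case is to speed up network file transfers, because parallel transfers of small files are usually faster.

We will learn how to use `csplit`, `split`, and `cat` to break files apart and then merge them back together. These operations work on any file type: text, image, audio, .iso, and so on.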

### Splitting Files with csplit

`csplit` is one of those fun little commands that has always been around, and once you start using it you cannot do without it. `csplit` splits a single file into multiple files. This example shows the simplest usage, which divides the file foo.txt into three files, using line numbers 17 and 33 as the split points:

```
$ csplit foo.txt 17 33
2591
3889
2359
```

`csplit` creates three new files in the current directory and prints the size of each new file in bytes. By default, each new file is named `xxNN`, where `NN` is a two-digit number starting at 00:

```
$ ls
xx00
xx01
xx02
```

You can use the `head` command to view the first ten lines of each new file:

```
$ head xx*

==> xx00 <==
Foo File
by Carla Schroder

Foo text

Foo subheading

More foo text

==> xx01 <==
Foo text

Foo subheading

More foo text

==> xx02 <==
Foo text

Foo subheading

More foo text
```
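
What if you want to split a file into several files that contain the same number of lines? Specify the line number, and then put the number of repetitions in curly braces. `csplit` cuts just before the line number you give, so the first file holds lines 1 to 4, and each repetition then writes the next five lines to a new file. This example repeats the split 4 times and dumps whatever is left into the last file:

```
$ csplit foo.txt 5 {4}
57
1488
249
1866
1381
3798
```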

You can use the asterisk wildcard to tell `csplit` to repeat the split as many times as possible. That sounds cool, but `csplit` reports an error once the next split point falls past the end of the file:

```
$ csplit foo.txt 10 {*}
1545
2115
1848
1901
csplit: '10': line number out of range on repetition 4
1430
```

The default behavior is to delete the output files when an error occurs. You can get around this with the `-k` option, which keeps the output files when there are errors. Another behavior to watch for is that every run of `csplit` overwrites the files it created before, so you need new filenames to keep each set of pieces. Use `--prefix=PREFIX` to set a different file prefix:

```
$ csplit -k --prefix=mine foo.txt 5 {*}
57
1488
249
1866
993
csplit: '5': line number out of range on repetition 9
437

$ ls
mine00
mine01
mine02
mine03
mine04
mine05
```

The `-n` option changes the number of digits used to number the files:

```
$ csplit -n 3 --prefix=mine foo.txt 5 {4}
57
1488
249
1866
1381
3798

$ ls
mine000
mine001
mine002
mine003
mine004
mine005
```
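
The "c" in `csplit` means "context". This means you can split files based on arbitrary matches or clever regular expressions. The following example splits the file into two parts. The first file ends at the line before the first line containing "fie", and the second file starts with the line that contains "fie":

```
$ csplit foo.txt /fie/
```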

Split the file at every occurrence of "fie":

```
$ csplit foo.txt /fie/ {*}
```

Split the file at the first six occurrences of "fie". The pattern is applied once, and `{5}` repeats it five more times:

```
$ csplit foo.txt /fie/ {5}
```

Copy only the content that starts at the line containing "fie", and omit everything before it:

```
$ csplit myfile %fie%
```

### Splitting Files into Sizes

`split` is similar to `csplit`. It splits files into pieces of a specific size, which is great when you are splitting large files to copy onto small media or to send over the network. The default size is 1000 lines:

```
$ split foo.mv
$ ls -hl
266K Aug 21 16:58 xaa
267K Aug 21 16:58 xab
315K Aug 21 16:58 xac
[...]
```

The pieces come out at similar sizes, but you can specify any size you want. This example uses 20 megabytes:

```
$ split -b 20M foo.mv
```

The size abbreviations are K, M, G, T, P, E, Z, Y (powers of 1024), or KB, MB, GB, and so on (powers of 1000).

Choose your own prefix and suffix for the filenames:

```
$ split -a 3 --numeric-suffixes=9 --additional-suffix=mine foo.mv SB
240K Aug 21 17:44 SB009mine
214K Aug 21 17:44 SB010mine
220K Aug 21 17:44 SB011mine
```
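
The `-a` option sets the length of the suffix, here three digits. `--numeric-suffixes` switches from letters to numbers and sets the starting value for numbering. `--additional-suffix` appends a fixed string after the number. The default prefix is x, and you can set a different prefix by typing it after the filename.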

### Putting Split Files Together

You will probably want to reassemble your files at some point. The familiar `cat` command handles this:

```
$ cat SB0* > foo2.txt
```
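
The asterisk wildcard in the example matches every file that starts with SB0, which may not give the result you want. You can match more precisely with question mark wildcards, using one question mark per character:

```
$ cat SB0?????? > foo2.txt
```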

As always, consult the relevant man and info pages for the complete command options.

_Learn more about Linux through the free ["Introduction to Linux"][3] course from The Linux Foundation and edX._

--------------------------------------------------------------------------------

via: https://www.linux.com/learn/intro-to-linux/2017/8/splitting-and-re-assembling-files-linux

Author: [CARLA SCHRODER][a]
Translator: [firmianay](https://github.com/firmianay)
Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)

[a]:https://www.linux.com/users/cschroder
[1]:https://www.linux.com/licenses/category/creative-commons-attribution
[2]:https://www.linux.com/files/images/split-filespng
[3]:https://training.linuxfoundation.org/linux-courses/system-administration-training/introduction-to-linux