TranslateProject/sources/tech/20221230 Working with Btrfs - Snapshots.md
DarkSun ef602cbd35 选题[tech]: 20221230 Working with Btrfs – Snapshots
sources/tech/20221230 Working with Btrfs - Snapshots.md
2023-09-19 15:08:22 +08:00

24 KiB
Raw Blame History

Working with Btrfs Snapshots

Photo by Heliberto Arias on Unsplash

This article will explore what Btrfs snapshots are, how they work, and how you can benefit from taking snapshots in every-day situations. This is part of a series that takes a closer look at Btrfs, the default filesystem for Fedora Workstation and Fedora Silverblue since Fedora Linux 33.

In case you missed it, heres the previous article from this series: https://fedoramagazine.org/working-with-btrfs-subvolumes/

Introduction

Imagine you work on a file over extended periods of time, repeatedly adding changes and undoing them. Then, at some point you realize: Parts of the changes you undid two hours ago would be very helpful now. And yesterday you had already changed this particular bit, too, before you trashed that design. But of course, because you regularly save your files, old changes are lost. Many people have probably experienced a situation like this before. Wouldnt it be great if you could recover old file versions without having to manually copy them at regular intervals?

This is just one typical situation where Btrfs snapshots can help you out. When used correctly, snapshots also give you a great backup solution for your PC.

Below you will find a lot of examples related to snapshots. If you want to follow along, you must have access to a Btrfs filesystem and root access. You can check the file system of a directory using the following command:


    $ findmnt -no FSTYPE /home
    btrfs

Here the findmnt command shows the type of filesystem for your /home/ directory. If it says btrfs , youre all set. Lets create a new directory in which to perform some experiments:


    $ mkdir ~/btrfs-snapshot-test
    $ cd ~/btrfs-snapshot-test

In the text below, you will find lots of command responses in boxes such as shown above. Please keep in mind while reading/comparing command output that the box contents may be wrapped at the end of the line. This may make it difficult to recognize long lines that are broken across multiple lines for readability. When in doubt, try to resize your browser window and see how the text behaves!

Snapshots in Btrfs

Lets start with an elementary question: What is a Btrfs snapshot? If you look in the Docs [1]4 and Wiki [2]4, you wont immediately find an answer to this question. In fact, it is nowhere to be found in the “Features” section. If you search a little, you will find snapshots mentioned extensively along with Btrfs subvolumes [3]5. So now what?

Remember that snapshots were both mentioned in the previous articles of this series? There it said:

What is the advantage of CoW? In simple terms: a history of the modified and edited files can be kept. Btrfs will keep the references to the old file versions (inodes) somewhere they can be easily accessed. This reference is a snapshot : An image of the filesystem state at some point in time.

Working with Btrfs: General Concepts

and also:

Another advantage of separating / and /home is that you can take snapshots separately. A subvolume is a boundary for snapshots, and snapshots will never contain the contents of other subvolumes below the subvolume that the snapshot is taken of.

Working with Btrfs: Subvolumes

It seems snapshots have something to do with Btrfs subvolumes. You may have heard about snapshots in other contexts before, for example with LVM, the Logical Volume Manager. While technically they serve the same purpose, they are different in terms of how they reach their goal.

Every Btrfs snapshot is a subvolume. However, not every subvolume is a snapshot. The difference is in what the subvolume contains. A snapshot is a subvolume with added content: it holds references to current and/or past versions of files (inodes). Lets see where snapshots come from!

Creating Btrfs snapshots

To use snapshots, you need a Btrfs subvolume to take snapshots of. Lets create one inside our test folder (~/btrfs-snapshot-test):


    $ cd ~/btrfs-snapshot-test
    $ sudo btrfs subvolume create demo
    Create subvolume './demo'
    $ sudo chown -R $(id -u):$(id -g) demo/
    $ cd demo

Since by default Btrfs subvolumes are owned by root, you must call chown to modify the files in the subvolume to be owned by a regular user. Now add a few files inside it:


    $ touch foo bar baz
    $ echo "Lorem ipsum dolor sit amet, " > foo

Your directory now looks something like this:


    $ ls -l
    total 4
    -rw-r--r--. 1 hartan hartan  0 Dec 20 08:11 bar
    -rw-r--r--. 1 hartan hartan  0 Dec 20 08:11 baz
    -rw-r--r--. 1 hartan hartan 29 Dec 20 08:11 foo

Lets create the very first snapshot from that:


    $ cd ..
    $ sudo btrfs subvolume snapshot demo demo-1
    Create a snapshot of 'demo' in './demo-1'

And thats it. Lets see what was achieved:


    $ ls -l
    total 0
    drwxr-xr-x. 1 hartan hartan 18 Dec 20 08:11 demo
    drwxr-xr-x. 1 hartan hartan 18 Dec 20 08:11 demo-1
    $ tree
    .
    ├── demo
    │   ├── bar
    │   ├── baz
    │   └── foo
    └── demo-1
        ├── bar
        ├── baz
        └── foo

    2 directories, 6 files

It seems it made a copy! To verify, lets read the contents of foo from the snapshot:


    $ cat demo/foo
    Lorem ipsum dolor sit amet,
    $ cat demo-1/foo
    Lorem ipsum dolor sit amet,

The real effect becomes apparent when we modify the original file:


    $ echo "consectetur adipiscing elit, " >> demo/foo
    $ cat demo/foo
    Lorem ipsum dolor sit amet,
    consectetur adipiscing elit,
    $ cat demo-1/foo
    Lorem ipsum dolor sit amet,

This shows that the snapshot still holds the “old” version of the data: The content of foo hasnt changed. So far, you could have achieved the exact same thing with a simple file copy. You can now go ahead and continue working on the old file, too:


    $ echo "sed do eiusmod tempor incididunt" >> demo-1/foo
    $ cat demo-1/foo
    Lorem ipsum dolor sit amet,
    sed do eiusmod tempor incididunt

Under the hood, however, our snapshot is in fact a new Btrfs subvolume. You can verify this with the following command:


    $ sudo btrfs subvolume list -o .
    ID 259 gen 265 top level 256 path home/hartan/btrfs-snapshot-test/demo
    ID 260 gen 264 top level 256 path home/hartan/btrfs-snapshot-test/demo-1

Btrfs snapshots vs. file copies

So whats the point of all this? Up until now snapshots seem to be a complicated way to copy files around. In fact, there is more to snapshots than meets the eye. Lets create a bigger file:


    $ dd if=/dev/urandom of=demo/bigfile bs=1M count=512
    512+0 records in
    512+0 records out
    536870912 bytes (537 MB, 512 MiB) copied, 1.3454 s, 399 MB/s

There is now a new file demo/bigfile that is 512 MiB in size. Lets make another snapshot so you dont lose it when you modify the data:


    $ sudo btrfs subvolume snapshot demo demo-2
    Create a snapshot of 'demo' in './demo-2'

Now lets simulate some changes by appending a small string to the file:


    $ echo "small changes" >> demo/bigfile

Heres the resulting file structure:


    $ tree
    .
    ├── demo
    │   ├── bar
    │   ├── baz
    │   ├── bigfile
    │   └── foo
    ├── demo-1
    │   ├── bar
    │   ├── baz
    │   └── foo
    └── demo-2
        ├── bar
        ├── baz
        ├── bigfile
        └── foo

    3 directories, 11 files

But the real magic happens somewhere else. Had you copied demo/bigfile , you would now have two files of about 512 MiB in size with mostly the same content. However, since they are distinct copies, they would occupy about 1 GiB of storage total. Keep in mind that the difference between both files is hardly more than 10 Bytes thats almost nothing compared to the original file size.

Btrfs snapshots work different than file copies: They keep references to current and past inodes instead. When you appended the change to the file, under the hood Btrfs allocated some more space to store the changes in and added a reference to this new data to the original inode. The previous contents remain untouched. If it helps your mental model, you can think of this as “storing” merely the difference between the original file and the modified version.

Lets have a look at the effect of this:


    $ sudo compsize .
    Processed 11 files, 5 regular extents (9 refs), 3 inline.
    Type       Perc     Disk Usage   Uncompressed Referenced
    TOTAL      100%      512M         512M         1.0G
    none       100%      512M         512M         1.0G

The interesting figure here is seen in line “TOTAL”:

  • “Referenced” is the total size of all the files in the current directory, summed up
  • “Disk Usage” is the amount of storage space allocated on your disk to store the files

While you have a total of 1 GiB files, it takes merely 512 MiB to store them.

Btrfs snapshots and backups

So far, in this article, you have seen how to create Btrfs snapshots and what makes them so special. One may be tempted to think: If I take a series of Btrfs snapshots locally on my PC, I have a solid backup strategy. This is not the case. If the underlying data, which is shared by Btrfs subvolumes, is accidentally damaged (by something outside of Btrfs influence, e.g. cosmic rays), all the subvolumes pointing to this data contain the same error.

To turn the snapshots into real backups you should store them on a different Btrfs filesystem, such as on an external drive. For the purposes of this article lets create a new Btrfs filesystem contained inside a file and mount it to simulate an external drive. If you have an external drive formatted with Btrfs lying around, feel free to substitute all the paths mentioned in the following commands to try it out! Lets create a new Btrfs filesystem:

Note : The commands below will create a new file of 8 GB size on your filesystem. If you want to follow the steps below, please ensure you have at least 8 GB of disk space available. Do not allocate less than 8 GB to the file, as Btrfs may otherwise encounter issues during mounting.


    $ truncate -s 8G btrfs_filesystem.img
    $ sudo mkfs.btrfs -L "backup-drive" btrfs_filesystem.img
    btrfs-progs v5.18
    See http://btrfs.wiki.kernel.org for more information.

    [ ... ]

    Devices:
       ID        SIZE  PATH
        1     8.00GiB  btrfs_filesystem.img

These commands created a new file of 8 GB in size named btrfs_filesystem.img and formatted a Btrfs filesystem inside it. Now you can mount it as if it were an external drive:


    $ mkdir backup-drive
    $ sudo mount btrfs_filesystem.img backup-drive
    $ sudo chown -R $(id -u):$(id -g) backup-drive
    $ ls -lh
    total 4.7M
    drwxr-xr-x. 1 hartan hartan    0 Dec 20 08:35 backup-drive
    -rw-r--r--. 1 hartan hartan 8.0G Dec 20 08:37 btrfs_filesystem.img
    drwxr-xr-x. 1 hartan hartan   32 Dec 20 08:14 demo
    drwxr-xr-x. 1 hartan hartan   18 Dec 20 08:11 demo-1
    drwxr-xr-x. 1 hartan hartan   32 Dec 20 08:14 demo-2

Great, now there is an independent Btrfs filesystem mounted under backup-drive! Lets try to take another snapshot and place it there:


    $ sudo btrfs subvolume snapshot demo backup-drive/demo-3
    Create a snapshot of 'demo' in 'backup-drive/demo-3'
    ERROR: cannot snapshot 'demo': Invalid cross-device link

What happened? Well, you tried to take a snapshot of demo and store it in a different Btrfs filesystem (a different device from Btrfs point of view). Remember that a Btrfs subvolume only holds references to files and their contents (inodes)? This is exactly the problem: The files and contents exist in our home filesystem, but not in the newly-created backup-drive. You have to find a way to transfer the subvolume along with its contents to the new filesystem.

Storing snapshots on a different Btrfs filesystem

The Btrfs utilities include two special commands for this purpose. Lets see how they work first:


    $ sudo btrfs send demo | sudo btrfs receive backup-drive/
    ERROR: subvolume /home/hartan/btrfs-snapshot-test/demo is not read-only
    ERROR: empty stream is not considered valid

Another error! This time it tells you that the subvolume were trying to transfer is not read-only. This is true: You can write new contents to all of the snapshots/subvolumes created so far. You can create read-only snapshots like this:


    $ sudo btrfs subvolume snapshot -r demo demo-3-ro
    Create a readonly snapshot of 'demo' in './demo-3-ro'

Unlike previously, here the -r option is added to the snapshot subcommand. This creates a read-only snapshot, which is easily verified:


    $ touch demo-3-ro/another-file
    touch: cannot touch 'demo-3-ro/another-file': Read-only file system

Now you can retry transferring the subvolumes:


    $ sudo btrfs send demo-3-ro | sudo btrfs receive backup-drive/
    At subvol demo-3-ro
    At subvol demo-3-ro
    $ tree

    ├── backup-drive
    │   └── demo-3-ro
    │       ├── bar
    │       ├── baz
    │       ├── bigfile
    │       └── foo
    ├── btrfs_filesystem.img
    ├── demo
    [ ... ]
    └── demo-3-ro
        ├── bar
        ├── baz
        ├── bigfile
        └── foo

    6 directories, 20 files

It worked! You have successfully transferred a read-only snapshot of our original subvolume demo to an external Btrfs filesystem.

Storing snapshots on non-Btrfs filesystems

Above you have seen how you can store Btrfs subvolumes/snapshots on another Btrfs filesystem. But what can you do if you do not have another Btrfs filesystem and cannot create one, for example because the external drives need a filesystem that allows compatibility with Windows or MacOS hosts? In such cases you can store subvolumes in files:


    $ sudo btrfs send -f demo-3-ro-subvolume.btrfs demo-3-ro
    At subvol demo-3-ro
    $ ls -lh demo-3-ro-subvolume.btrfs
    -rw-------. 1 root root 513M Dec 21 10:39 demo-3-ro-subvolume.btrfs

The file demo-3-ro-subvolume.btrfs now contains everything that is needed to recreate the demo-3-ro subvolume at a later point in time.

Incrementally sending subvolumes

If you perform this action repeatedly for different subvolumes, you will notice at some point that the different subvolumes do not share their file contents any more. This is because when sending a subvolume such as above, all the data needed to recreate this standalone subvolume is transferred to the target. You can, however, instruct Btrfs to only send the difference between two subvolumes to the target! This so-called incremental send will ensure that shared references remain shared between the subvolumes. To demonstrate this, add a few more changes to our original subvolume:


    $ echo "a few more changes" >> demo/bigfile

Next create another read-only snapshot:


    $ sudo btrfs subvolume snapshot -r demo demo-4-ro
    Create a readonly snapshot of 'demo' in './demo-4-ro'

And now send it:


    $ sudo btrfs send -p demo-3-ro demo-4-ro | sudo btrfs receive backup-drive
    At subvol demo-4-ro
    At snapshot demo-4-ro

In the command above, the -p option specifies a parent subvolume, against which the differences are calculated. It is important to keep in mind that both the source and target Btrfs filesystem must contain the same, unmodified parent subvolume! Ensure that the new subvolume is really there:


    $ ls backup-drive/
    demo-3-ro  demo-4-ro
    $ ls -lR backup-drive/demo-4-ro/
    backup-drive/demo-4-ro/:
    total 524296
    -rw-r--r--. 1 hartan hartan         0 Dec 20 08:11 bar
    -rw-r--r--. 1 hartan hartan         0 Dec 20 08:11 baz
    -rw-r--r--. 1 hartan hartan 536870945 Dec 21 10:49 bigfile
    -rw-r--r--. 1 hartan hartan        59 Dec 20 08:13 foo

But how do you know whether the incremental send only transferred the difference between both subvolumes? Lets transfer the data stream to a file and see how big it is:


    $ sudo btrfs send -f demo-4-ro-diff.btrfs -p demo-3-ro demo-4-ro
    At subvol demo-4-ro
    $ ls -l demo-4-ro-diff.btrfs
    -rw-------. 1 root root 315 Dec 21 10:55 demo-4-ro-diff.btrfs

According to ls, the file is merely 315 bytes in size! This means that the incremental send only transferred the changes between the two subvolumes, along with additional Btrfs-specific metadata.

Restoring subvolumes from snapshots

Before continuing, lets do some cleaning up of the things you dont need at the moment:


    $ sudo rm -rf demo-4-ro-diff.btrfs demo-3-ro-subvolume.btrfs
    $ sudo btrfs subvolume delete demo-1 demo-2 demo-3-ro demo-4-ro
    $ ls -l
    total 531516
    drwxr-xr-x. 1 hartan hartan         36 Dec 21 10:50 backup-drive
    -rw-r--r--. 1 hartan hartan 8589934592 Dec 21 10:51 btrfs_filesystem.img
    drwxr-xr-x. 1 hartan hartan         32 Dec 20 08:14 demo

So far you have managed to create read/write and read-only snapshots of Btrfs subvolumes and send them to an external location. In order to turn this into a backup strategy, however, there has to be a way to send the subvolumes back to the original filesystem and make them writable again. For this purpose, lets move the demo subvolume somewhere else and try to recreate it from the most recent snapshot. First: Rename the “broken” subvolume. It will be deleted once the restore was successful:


    $ mv demo demo-broken

Second: Transfer the most recent snapshot back to this filesystem:


    $ sudo btrfs send backup-drive/demo-4-ro | sudo btrfs receive .
    At subvol backup-drive/demo-4-ro
    At subvol demo-4-ro
    [hartan@fedora btrfs-snapshot-test]$ ls
    backup-drive  btrfs_filesystem.img  demo-4-ro  demo-broken

Third: Create a read-write subvolume from the snapshot:


    $ sudo btrfs subvolume snapshot demo-4-ro demo
    Create a snapshot of 'demo-4-ro' in './demo'
    $ ls
    backup-drive  btrfs_filesystem.img  demo  demo-4-ro  demo-broken

The last step is important: You cannot just rename demo-4-ro to demo , because it would still be a read-only subvolume! Finally you can check whether everything you need is there:


    $ tree demo
    demo
    ├── bar
    ├── baz
    ├── bigfile
    └── foo

    0 directories, 4 files
    $ tail -c -19 demo/bigfile
    a few more changes

The last command above tells you that the last 19 characters in bigfile are in fact the change last performed. At this point, you may want to copy recent changes from demo-broken to the new demo subvolume. Since you didnt perform any other changes, you can now delete the obsolete subvolumes:


    $ sudo btrfs subvolume delete demo-4-ro demo-broken
    Delete subvolume (no-commit): '/home/hartan/btrfs-snapshot-test/demo-4-ro'
    Delete subvolume (no-commit): '/home/hartan/btrfs-snapshot-test/demo-broken'

And thats it! You have successfully restored the demo subvolume from a snapshot that was previously stored on a different Btrfs filesystem (external media).

Subvolumes as boundary for snapshots

In the second article of this series I mentioned that subvolumes are boundaries for snapshots, but what exactly does that mean? In simple terms, a snapshot of a subvolume will only contain the content of this particular subvolume, and none of the nested subvolumes below. Lets have a look at this:


    $ sudo btrfs subvolume create demo/nested
    Create subvolume 'demo/nested'
    $ sudo chown -R $(id -u):$(id -g) demo/nested
    $ touch demo/nested/another_file

Lets take a snapshot as before:


    $ sudo btrfs subvolume snapshot demo demo-nested
    Create a snapshot of 'demo' in './demo-nested'

And check out the contents:


    $ tree demo-nested
    demo-nested
    ├── bar
    ├── baz
    ├── bigfile
    ├── foo
    └── nested

    1 directory, 4 files

    $ tree demo
    demo
    ├── bar
    ├── baz
    ├── bigfile
    ├── foo
    └── nested
        └── another_file

    1 directory, 5 files

Notice that another_file is missing, even though the folder nested is present. This happens because nested is a subvolume: The snapshot of demo contains the folder (mountpoint) for the nested subvolume, but its contents arent present. Currently there is no way to perform snapshots recursively to include nested subvolumes. However, we can take advantage of this to exclude folders from snapshots! This is typically useful for data that you can reproduce easily, or that will rarely change. Examples include virtual machine or container images, movies, game files and more.

Before we wrap up the article, lets remove everything we created while testing:


    $ sudo btrfs subvolume delete demo/nested demo demo-nested
    Delete subvolume (no-commit): '/home/hartan/btrfs-snapshot-test/demo/nested'
    Delete subvolume (no-commit): '/home/hartan/btrfs-snapshot-test/demo'
    Delete subvolume (no-commit): '/home/hartan/btrfs-snapshot-test/demo-nested'
    $ sudo umount backup-drive
    $ cd ..
    $ rm -rf btrfs-snapshot-test/

Final thoughts on Btrfs-based backups

If you decide you want to use Btrfs to perform regular backups of your data, you may want to use a tool that automates this task for you. The Btrfs wiki has a list of backup tools specialized on Btrfs [4]4. On this page you will also find another summary of the steps to perform Btrfs backups by hand. Personally, I have had a lot of good experiences with btrbk [5]4 and I am using it to perform my own backups. In addition to backups, btrbk can also keep a list of Btrfs snapshots locally on your PC. I use this to safeguard against accidental data deletion.

If you want to know more about performing backups using Btrfs, leave a comment below and Ill consider writing a follow-up article that deals exclusively with this topic.

Conclusion

This article investigated Btrfs snapshots, which are Btrfs subvolumes under the hood. You learned how to create read/write and read-only snapshots, and how this mechanism can help safeguard against data loss.

The next articles in this series will deal with:

  • Compression Transparently saving storage space
  • Qgroups Limiting your filesystem size
  • RAID Replace your mdadm configuration

If there are other topics related to Btrfs that you want to know more about, have a look at the Btrfs Wiki [2]4 and Docs [1]4. Dont forget to check out the first two articles of this series, if you havent already! If you feel that there is something missing from this article series, let me know in the comments below. See you in the next article!

Sources


via: https://fedoramagazine.org/working-with-btrfs-snapshots/

作者:Andreas Hartmann 选题:lujun9972 译者:译者ID 校对:校对者ID

本文由 LCTT 原创编译,Linux中国 荣誉推出