Merge pull request #27223 from geekpi/translating

translated
geekpi 2022-09-19 08:35:48 +08:00 committed by GitHub
commit c6dc461abc
2 changed files with 153 additions and 153 deletions

@@ -1,153 +0,0 @@
[#]: subject: "How I recovered my Linux system using a Live USB device"
[#]: via: "https://opensource.com/article/22/9/recover-linux-system-live-usb"
[#]: author: "David Both https://opensource.com/users/dboth"
[#]: collector: "lkxed"
[#]: translator: "geekpi"
[#]: reviewer: " "
[#]: publisher: " "
[#]: url: " "
How I recovered my Linux system using a Live USB device
======
The Fedora Live USB distribution provides an effective solution to boot and enter a recovery mode.
![USB drive][1]
Image by: Photo by [Markus Winkler][2] on [Unsplash][3]
I have a dozen or so physical computers in my home lab and even more VMs. I use most of these systems for testing and experimentation. I frequently write about using automation to make sysadmin tasks easier. I have also written in multiple places that I learn more from my own mistakes than I do in almost any other way.
I have learned a lot during the last couple of weeks.
I created a major problem for myself. Having been a sysadmin for years and written hundreds of articles and five books about Linux, I really should have known better. Then again, we all make mistakes, which is an important lesson: You're never too experienced to make a mistake.
I'm not going to discuss the details of my error. It's enough to tell you that it was a mistake and that I should have put a lot more thought into what I was doing before I did it. Besides, the details aren't really the point. Experience can't save you from every mistake you're going to make, but it can help you in recovery. And that's literally what this article is about: Using a Live USB distribution to boot and enter a recovery mode.
### The problem
First, I created the problem, which was essentially a bad configuration for the `/etc/default/grub` file. Next, I used Ansible to distribute the misconfigured file to all my physical computers and run `grub2-mkconfig`. All 12 of them. Really, really fast.
All but two failed to boot. They crashed during the very early stages of Linux startup with various errors indicating that the root (`/`) filesystem could not be located.
I could use the root password to get into "maintenance" mode, but without the root filesystem mounted, it was impossible to access even the simplest tools. Booting directly to the recovery kernel did not work either. The systems were truly broken.
### Recovery mode with Fedora
The only way to resolve this problem was to find a way to get into recovery mode. When all else fails, Fedora provides a really cool tool: The same Live USB thumb drive used to install new instances of Fedora.
After setting the BIOS to boot from the Live USB device, I booted into the Fedora 36 Xfce live user desktop. I opened two terminal sessions next to each other on the desktop and switched to root privilege in both.
I ran `lsblk` in one of them for reference and used the results to identify the root (`/`) partition and the `boot` and `efi` partitions. The output below comes from one of my VMs. There is no `efi` partition in this case because this VM does not use UEFI.
```
# lsblk
NAME          MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
loop0           7:0    0  1.5G  1 loop
loop1           7:1    0    6G  1 loop
├─live-rw     253:0    0    6G  0 dm   /
└─live-base   253:1    0    6G  1 dm  
loop2           7:2    0   32G  0 loop
└─live-rw     253:0    0    6G  0 dm   /
sda             8:0    0  120G  0 disk
├─sda1          8:1    0    1G  0 part
└─sda2          8:2    0  119G  0 part
  ├─vg01-swap 253:2    0    4G  0 lvm  
  ├─vg01-tmp  253:3    0   10G  0 lvm  
  ├─vg01-var  253:4    0   20G  0 lvm  
  ├─vg01-home 253:5    0    5G  0 lvm  
  ├─vg01-usr  253:6    0   20G  0 lvm  
  └─vg01-root 253:7    0    5G  0 lvm  
sr0            11:0    1  1.6G  0 rom  /run/initramfs/live
zram0         252:0    0    8G  0 disk [SWAP]
```
The `/dev/sda1` partition is easily identifiable as `/boot`, and the root partition is pretty obvious as well.
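As a rough sketch of the identification step, the helper below turns raw `lsblk` output into the `/dev/mapper` paths of the LVM logical volumes. The function name `list_lvm_paths` is hypothetical, and the sketch assumes the volume groups are already active in the live environment.

```shell
#!/bin/sh
# Hypothetical helper: read "NAME TYPE" pairs on stdin (the format printed by
# `lsblk --raw --noheadings --output NAME,TYPE`) and print the device-mapper
# path of every entry whose type is "lvm".
list_lvm_paths() {
    awk '$2 == "lvm" { print "/dev/mapper/" $1 }' | sort -u
}

# Typical use on the live system, as root (assumes the VGs are visible):
#   vgchange -ay    # activate volume groups if the live image did not
#   lsblk --raw --noheadings --output NAME,TYPE | list_lvm_paths
```

On the example VM this would list entries such as `/dev/mapper/vg01-root`, which are the names used in the mount commands that follow.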
In the other terminal session, I performed a series of steps to recover my systems. The specific volume group names and device partitions such as `/dev/sda1` will differ for your systems. The commands shown here are specific to my situation.
The objective is to boot and get through startup using the Live USB, then mount only the necessary filesystems in an image directory and run the `chroot` command to run Linux in the chrooted image directory. This approach bypasses the damaged GRUB (or other) configuration files, yet it still provides a complete running system with the original filesystems mounted for recovery, both as the source of the tools required and the target of the changes to be made.
Here are the steps and related commands:
1. Create the directory `/mnt/sysimage` to provide a location for the `chroot` directory.
2. Mount the root partition on `/mnt/sysimage`:
```
# mount /dev/mapper/vg01-root /mnt/sysimage
```
3. Make `/mnt/sysimage` your working directory:
```
# cd /mnt/sysimage
```
4. Mount the `/boot` and `/boot/efi` filesystems.
5. Mount the other main filesystems. Filesystems like `/home` and `/tmp` are not needed for this procedure:
```
# mount /dev/mapper/vg01-usr usr
# mount /dev/mapper/vg01-var var
```
6. Mount important but already mounted filesystems that must be shared between the chrooted system and the original Live system, which is still out there and running:
```
# mount --bind /sys sys
# mount --bind /proc proc
```
7. Be sure to do the `/dev` directory last, or the other filesystems won't mount:
```
# mount --bind /dev dev
```
8. Chroot the system image:
```
# chroot /mnt/sysimage
```
The system is now ready for whatever you need to do to recover it to a working state. In fact, I was once able to run my server for several days in this state until I could research and test real fixes. I don't really recommend that, but it can be an option in a dire emergency when things just need to get up and running now!
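The eight steps above can be collected into one small script. This is only a sketch: the device names are those of the example VM, the use of `/dev/sda1` for `/boot` follows the `lsblk` output above, and the `run` wrapper and `DRY_RUN` switch are additions of mine so the sequence can be reviewed before anything is mounted.

```shell
#!/bin/sh
# Sketch of the mount sequence for the example VM. Device names are
# assumptions for any other host; adjust them to your own lsblk output.
SYSIMAGE="${SYSIMAGE:-/mnt/sysimage}"
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

prepare_chroot() {
    run mkdir -p "$SYSIMAGE"
    run mount /dev/mapper/vg01-root "$SYSIMAGE"
    run mount /dev/sda1 "$SYSIMAGE/boot"       # no /boot/efi on this VM
    run mount /dev/mapper/vg01-usr "$SYSIMAGE/usr"
    run mount /dev/mapper/vg01-var "$SYSIMAGE/var"
    run mount --bind /sys "$SYSIMAGE/sys"
    run mount --bind /proc "$SYSIMAGE/proc"
    run mount --bind /dev "$SYSIMAGE/dev"      # /dev goes last, as in step 7
}

# Review the plan first:   prepare_chroot
# Then, as root:           DRY_RUN=0; prepare_chroot && chroot "$SYSIMAGE"
```

Defaulting to a dry run is a deliberate choice here: given how this story started, printing the commands before executing them on a dozen hosts seems like the safer habit.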
### The solution
The fix was easy once I got each system into recovery mode. Because my systems now worked just as if they had booted successfully, I simply made the necessary changes to `/etc/default/grub` and `/etc/fstab` and ran the `grub2-mkconfig > /boot/grub2/grub.cfg` command. I used the `exit` command to exit from chroot and then rebooted the host.
Of course, I could not automate the recovery from my mishap. I had to perform this entire process manually on each host—a fitting bit of karmic retribution for using automation to quickly and easily propagate my own errors.
### Lessons learned
Despite their usefulness, I used to hate the "Lessons Learned" sessions we would have at some of my sysadmin jobs, but it does appear that I need to remind myself of a few things. So here are my "Lessons Learned" from this self-inflicted fiasco.
First, the ten systems that failed to boot used a different volume group naming scheme, and my new GRUB configuration failed to consider that. I just ignored the fact that they might possibly be different.
* Think it through completely.
* Not all systems are alike.
* Test everything.
* Verify everything.
* Never make assumptions.
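One way to act on these lessons is a pre-flight check that runs before a GRUB defaults file is pushed to a host. The sketch below is not what I actually ran. It assumes the file names its volumes with `rd.lvm.lv=VG/LV` kernel arguments, and the function name `missing_vgs` is hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-flight check. Usage: missing_vgs GRUB_FILE VG [VG...]
# Prints every volume group referenced by an rd.lvm.lv=VG/LV argument in
# GRUB_FILE that is not among the VG names given on the command line.
missing_vgs() {
    grub_file=$1
    shift
    known=" $* "
    grep -o 'rd\.lvm\.lv=[^/ "]*' "$grub_file" | cut -d= -f2 | sort -u |
    while read -r vg; do
        case "$known" in
            *" $vg "*) ;;          # this host has the volume group
            *) echo "$vg" ;;       # referenced but not present
        esac
    done
}

# Possible use on each host before running grub2-mkconfig:
#   missing_vgs /etc/default/grub $(vgs --noheadings -o vg_name)
```

Empty output means every referenced volume group exists on the host. Any name printed is a reason to stop before regenerating `grub.cfg`.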
Everything now works fine. Hopefully, I am a little bit smarter, too.
--------------------------------------------------------------------------------
via: https://opensource.com/article/22/9/recover-linux-system-live-usb
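Author: [David Both][a]
Topic selection: [lkxed][b]
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)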
[a]: https://opensource.com/users/dboth
[b]: https://github.com/lkxed
[1]: https://opensource.com/sites/default/files/lead-images/markus-winkler-usb-unsplash.jpg
[2]: https://unsplash.com/@markuswinkler?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText
[3]: https://unsplash.com/s/photos/usb?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText

@@ -0,0 +1,153 @@
[#]: subject: "How I recovered my Linux system using a Live USB device"
[#]: via: "https://opensource.com/article/22/9/recover-linux-system-live-usb"
[#]: author: "David Both https://opensource.com/users/dboth"
[#]: collector: "lkxed"
[#]: translator: "geekpi"
[#]: reviewer: " "
[#]: publisher: " "
[#]: url: " "
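How I recovered my Linux system using a Live USB device
======
The Fedora Live USB distribution provides an effective solution for booting and entering recovery mode.
![USB drive][1]
Image by: [Markus Winkler][2] on [Unsplash][3]
I have a dozen or so physical computers in my home lab and even more virtual machines. I use most of these systems for testing and experimentation. I frequently write about using automation to simplify sysadmin tasks. I have also written in multiple places that I learn more from my own mistakes than I do in almost any other way.
I have learned a lot during the last couple of weeks.
I created a major problem for myself. Having been a sysadmin for years and written hundreds of articles and five books about Linux, I should have known better. Then again, we all make mistakes, which is an important lesson: experience never makes you immune to mistakes.
I'm not going to discuss the details of my error. It's enough to tell you that it was a mistake and that I should have thought more about what I was doing before I did it. Besides, the details aren't the point. Experience can't save you from every mistake you make, but it can help you recover. That is what this article is about: using a Live USB distribution to boot and enter recovery mode.
### The problem
First, I created the problem, which was essentially a bad configuration for the `/etc/default/grub` file. Next, I used Ansible to distribute the misconfigured file to all my physical computers and run `grub2-mkconfig`. All 12 of them. Really, really fast.
All but two failed to boot. They crashed during the early stages of Linux startup with various errors indicating that the root (`/`) filesystem could not be located.
I could use the root password to get into "maintenance" mode, but without the root filesystem mounted, even the simplest tools were inaccessible. Booting directly to the recovery kernel did not work either. The systems were truly broken.
### Recovery mode with Fedora
The only way to resolve this problem was to find a way to get into recovery mode. When all else fails, Fedora provides a really cool tool: the same Live USB drive used to install new instances of Fedora.
After setting the BIOS to boot from the Live USB device, I booted into the Fedora 36 Xfce live user desktop. I opened two terminal sessions next to each other on the desktop and switched to root privilege in both.
I ran `lsblk` in one of them for reference and used the results to identify the root (`/`) partition and the `boot` and `efi` partitions. The output below comes from one of my virtual machines. There is no `efi` partition in this case because this VM does not use UEFI.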
```
# lsblk
NAME          MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
loop0           7:0    0  1.5G  1 loop
loop1           7:1    0    6G  1 loop
├─live-rw     253:0    0    6G  0 dm   /
└─live-base   253:1    0    6G  1 dm
loop2           7:2    0   32G  0 loop
└─live-rw     253:0    0    6G  0 dm   /
sda             8:0    0  120G  0 disk
├─sda1          8:1    0    1G  0 part
└─sda2          8:2    0  119G  0 part
  ├─vg01-swap 253:2    0    4G  0 lvm
  ├─vg01-tmp  253:3    0   10G  0 lvm
  ├─vg01-var  253:4    0   20G  0 lvm
  ├─vg01-home 253:5    0    5G  0 lvm
  ├─vg01-usr  253:6    0   20G  0 lvm
  └─vg01-root 253:7    0    5G  0 lvm
sr0            11:0    1  1.6G  0 rom  /run/initramfs/live
zram0         252:0    0    8G  0 disk [SWAP]
```
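The `/dev/sda1` partition is easily identifiable as `/boot`, and the root partition is obvious as well.
In the other terminal session, I performed a series of steps to recover my systems. The specific volume group names and device partitions, such as `/dev/sda1`, will differ from system to system. The commands shown here are specific to my situation.
The objective is to boot and get through startup using the Live USB, then mount only the necessary filesystems in an image directory and run the `chroot` command to run Linux in the chrooted image directory. This approach bypasses the damaged GRUB (or other) configuration files, yet it still provides a complete running system with the original filesystems mounted for recovery, both as the source of the tools required and the target of the changes to be made.
Here are the steps and related commands:
1. Create the directory `/mnt/sysimage` to provide a location for the `chroot` directory.
2. Mount the root partition on `/mnt/sysimage`: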
```
# mount /dev/mapper/vg01-root /mnt/sysimage
```
3. Make `/mnt/sysimage` your working directory:
```
# cd /mnt/sysimage
```
4. Mount the `/boot` and `/boot/efi` filesystems.
5. Mount the other main filesystems. Filesystems like `/home` and `/tmp` are not needed for this procedure:
```
# mount /dev/mapper/vg01-usr usr
# mount /dev/mapper/vg01-var var
```
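6. Mount important but already mounted filesystems that must be shared between the chrooted system and the original Live system, which is still running outside it: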
```
# mount --bind /sys sys
# mount --bind /proc proc
```
7. Be sure to do the `/dev` directory last, or the other filesystems won't mount:
```
# mount --bind /dev dev
```
8. Chroot the system image:
```
# chroot /mnt/sysimage
```
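The system is now ready for whatever you need to do to recover it to a working state. In fact, I was once able to run my server for several days in this state until I could research and test real fixes. I don't recommend that, but it can be an option in an emergency when things just need to be up and running.
### The solution
The fix was easy once I got each system into recovery mode. Because my systems now worked just as if they had booted successfully, I simply made the necessary changes to `/etc/default/grub` and `/etc/fstab` and ran the `grub2-mkconfig > /boot/grub2/grub.cfg` command. I used the `exit` command to exit from chroot and then rebooted the host.
Of course, I could not automate the recovery from my mishap. I had to perform this entire process manually on each host, a fitting bit of retribution for using automation to quickly and easily propagate my own errors.
### Lessons learned
Despite their usefulness, I used to hate the "Lessons Learned" sessions held at some of my sysadmin jobs, but it does appear that I need to remind myself of a few things. So here are my "Lessons Learned" from this self-inflicted fiasco.
First, the ten systems that failed to boot used a different volume group naming scheme, and my new GRUB configuration failed to consider that. I simply ignored the fact that they might be different.
* Think it through completely.
* Not all systems are alike.
* Test everything.
* Verify everything.
* Never make assumptions.
Everything now works fine. Hopefully, I am a little bit smarter, too.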
--------------------------------------------------------------------------------
via: https://opensource.com/article/22/9/recover-linux-system-live-usb
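Author: [David Both][a]
Topic selection: [lkxed][b]
Translator: [geekpi](https://github.com/geekpi)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)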
[a]: https://opensource.com/users/dboth
[b]: https://github.com/lkxed
[1]: https://opensource.com/sites/default/files/lead-images/markus-winkler-usb-unsplash.jpg
[2]: https://unsplash.com/@markuswinkler?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText
[3]: https://unsplash.com/s/photos/usb?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText