Deal with mismatches in LVM RAID1

May 13, 2023

Let’s say you are using LVM on your Linux installation, and since you had a second identical disk lying around, you configured RAID1 to protect your installation against disk failures. Good idea, right?

Until one day, after a kernel panic or a power failure, you reboot your machine, run a routine lvchange --syncaction check and your raid_mismatch_count is in the thousands.
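For reference, this is roughly how the scrub and the counter can be driven from a shell. Take it as a minimal sketch under a few assumptions: the helper name check_mismatches and the LVM_RUN and POLL_SECONDS variables are made up for this example, vgkubuntu/root is the logical volume from my setup, and raid_sync_action and raid_mismatch_count are the lvs report fields described in lvmraid(7).

```shell
#!/bin/sh
# Hypothetical helper: start a scrub and print the resulting mismatch count.
# LVM_RUN is an illustrative hook (defaults to sudo) used to prefix the
# privileged commands.
LVM_RUN=${LVM_RUN:-sudo}

check_mismatches() {
    lv=$1
    # Start a read-only scrub of the RAID LV.
    $LVM_RUN lvchange --syncaction check "$lv" || return 1
    # Poll until the scrub is over; the field reads "idle" when nothing runs.
    while :; do
        out=$($LVM_RUN lvs --noheadings -o raid_sync_action "$lv") || return 1
        action=$(printf '%s' "$out" | tr -d ' ')
        [ "$action" = "idle" ] && break
        sleep "${POLL_SECONDS:-30}"
    done
    # Print the number of mismatches found by the scrub.
    $LVM_RUN lvs --noheadings -o raid_mismatch_count "$lv" | tr -d ' '
}

# Usage (needs a RAID LV): check_mismatches vgkubuntu/root
```

On a terabyte-sized volume the scrub takes hours, so the polling interval doesn't need to be short.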

The LVM documentation happily suggests running lvchange --syncaction repair to fix the inconsistencies, but something doesn’t add up. There is no magic in computer science, and RAID1 works by duplicating every block write on both disks. If there are only two copies, how does lvchange --syncaction repair know which of the two inconsistent copies of each mismatched sector is the right one? You read some more, and finally come across the answer:

Scrubbing Limitations
The repair mode can make the RAID LV data consistent, but it does not know which data is correct.  The result may be consistent but incorrect data.  When two different blocks of data must be made consistent, it chooses the block from the device that would be used during RAID initialization.

TL;DR, it doesn’t. Repair just takes the content of the inconsistent blocks from the first disk and copies it to the second one. There does not seem to be any guarantee that this operation won’t permanently destroy good data by overwriting it with bad data.
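To see why two copies alone can’t settle the question, here is a toy model in which two ordinary files stand in for the two disks. It is only an illustration of the reasoning above, not of how the kernel RAID code works; the 4-byte block size and the file names are arbitrary choices for the example.

```shell
#!/bin/sh
# Toy model of a two-way mirror: two ordinary files stand in for the two disks.
set -eu
work=$(mktemp -d)
printf 'AAAABBBBCCCC' > "$work/leg0"   # three 4-byte "blocks"
cp "$work/leg0" "$work/leg1"

# Simulate an interrupted write: the new block 1 reached only the second leg.
printf 'bbbb' | dd of="$work/leg1" bs=4 seek=1 conv=notrunc 2>/dev/null

# "check": count the blocks that differ. Nothing here says which copy is newer.
count_mismatched_blocks() {
    # cmp -l prints one line per differing byte, starting with a 1-based offset.
    cmp -l "$1" "$2" | awk -v bs="$3" '{ seen[int(($1 - 1) / bs)] = 1 }
        END { n = 0; for (b in seen) n++; print n }'
}
mismatches=$(count_mismatched_blocks "$work/leg0" "$work/leg1" 4)
echo "mismatched blocks: $mismatches"

# "repair": copy the first leg over the second, whether it is right or wrong.
cp "$work/leg0" "$work/leg1"
repaired=$(cat "$work/leg1")
echo "after repair: $repaired"
rm -rf "$work"
```

This prints "mismatched blocks: 1" followed by "after repair: AAAABBBBCCCC": the mirror is consistent again, and the newer "bbbb" block is gone for good.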

What good news, you think. Fixing thousands of inconsistencies is now starting to look like a long operation. Let’s try to track those inconsistencies down, then.
Reading the documentation further, we’re in for another surprise, though.

The check mode can only report the number of inconsistent blocks, it cannot report which blocks are inconsistent.  This makes it impossible to know which device has errors, or if the errors affect file system data, metadata or nothing at all.

What the f…? Who’s the &$&%# who designed this? Isn’t RAID meant to, you know, protect your data? How come the handling of inconsistencies seems to be basically unimplemented?

A lot more searching across the Internet and… nothing turns up. Only people in the same situation asking what to do and being met with dead silence, such as this one:

https://superuser.com/questions/1746635/how-to-deal-with-lvm-raid-mismatches-are-there-benign-causes

Rather than running a blind “repair” and risking an unbootable machine, followed by reinstalling the entire OS and restoring backups, I started thinking about how to get out of this situation.
The best idea I initially came up with was a kind of black-box approach: somehow mount both RAID images read-only at the same time and compare all the files.
However, I don’t know LVM well enough to split a volume group with redundancy into two volumes without redundancy and mount them separately. Modifying the volume group to do this and then changing it back afterwards carries the risk of messing up and being forced to reinstall everything.

After thinking about it a little further, it became evident that a better strategy would be to somehow compare two filesystems NOT mounted at the same time, so as not to modify the LVM configuration. And it turns out that I had written a tool to do so a while ago for unrelated reasons.

In case someone else ends up in the same situation, here’s how I fixed it.
You’ll need a USB stick with a live Linux distro that also allows persistent storage (I used Kubuntu), and you’ll need to be able to connect to the Internet from your live distro.

First, boot your distro, then download and build directorydiffmerge:

sudo apt update
sudo apt install g++ cmake make git libboost-program-options-dev libcrypto++-dev --no-install-recommends
git clone https://github.com/fedetft/directorydiffmerge.git
cd directorydiffmerge
mkdir build
cd build
cmake ..
make

Then bring up your LVM volume with only the first disk. I did it graphically from KDE partitionmanager by unlocking only the first disk. My setup is an encrypted LVM, so it’s easy to unlock only one disk.
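I did this step from the GUI, so I haven’t run the following myself; it’s a sketch of what a command-line equivalent could look like, under stated assumptions. The device path /dev/sdX3 and the mapping name firstdisk_crypt are placeholders, and the run wrapper and the DRY_RUN variable are made up for this example so that the script only prints the commands unless told otherwise. The degraded activation mode is the one lvmraid(7) describes for activating RAID volumes with a missing disk.

```shell
#!/bin/sh
# Prints the commands by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Unlock one LUKS device and activate the volume group without the other disk.
unlock_single_leg() {
    device=$1; name=$2; vg=$3
    run sudo cryptsetup open "$device" "$name"
    run sudo vgchange -ay --activationmode degraded "$vg"
}

# Deactivate the volume group and lock the disk again (unmount first).
release_leg() {
    name=$1; vg=$2
    run sudo vgchange -an "$vg"
    run sudo cryptsetup close "$name"
}

# Placeholder device and mapping name; vgkubuntu is the VG from this post.
unlock_single_leg /dev/sdX3 firstdisk_crypt vgkubuntu
```

The release_leg half is what you’d use when it’s time to swap disks.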

sudo pvs
  WARNING: Couldn't find device with uuid <uuid>.
  WARNING: VG vgkubuntu is missing PV <uuid> (last written to /dev/mapper/sdb1_crypt).
  PV                          VG        Fmt  Attr PSize   PFree
  /dev/mapper/luks-<uuid> vgkubuntu lvm2 a--  929,32g <1,91g
  [unknown]               vgkubuntu lvm2 a-m  929,32g <1,91g
sudo vgs
  WARNING: Couldn't find device with uuid <uuid>.
  WARNING: VG vgkubuntu is missing PV <uuid> (last written to /dev/mapper/sdb1_crypt).
  VG        #PV #LV #SN Attr   VSize  VFree
  vgkubuntu   2   1   0 wz-pn- <1,82t 3,81g
sudo lvs
  WARNING: Couldn't find device with uuid <uuid>.
  WARNING: VG vgkubuntu is missing PV <uuid> (last written to /dev/mapper/sdb1_crypt).
  LV   VG        Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  root vgkubuntu rwi-a-r-p- 927,41g                                    100,00

Then mount the logical volume in read-only mode, and use directorydiffmerge to compute the hash of all files in the first LVM image (this will take a while):

mkdir mylv
sudo mount -o ro /dev/vgkubuntu/root mylv
sudo ./ddm ls mylv -o firstdisk.txt
sudo umount mylv

Ignore any hardlink or unsupported file type warnings; they are not relevant.
What directorydiffmerge did was save in the metadata file firstdisk.txt information on all files and directories found on the disk, including a hash of each file. This makes it possible to compare the content of the two LVM copies without them being mounted at the same time.
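If you want a feel for the idea without reading the tool’s source, a rough stand-in can be put together from find and sha256sum. This is only a sketch of the concept: it is NOT the format ddm writes, it records file content hashes only (no permissions, owners or timestamps), and the function name make_manifest is made up for the example.

```shell
#!/bin/sh
# Write one "hash  ./relative/path" line per regular file, sorted by path.
make_manifest() {
    # $1: directory to scan, $2: manifest file to write
    # -xdev keeps find from wandering into other mounted filesystems.
    ( cd "$1" && find . -xdev -type f -exec sha256sum {} + | LC_ALL=C sort -k 2 ) > "$2"
}

# Usage, run as root so every file is readable:
#   make_manifest mylv firstdisk.sha256
```

Because the paths are relative to the mount point, a manifest taken from one disk lines up with one taken later from the other disk.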

Then, do the same for the second disk. I did it graphically from KDE partitionmanager by unmounting the volume group, locking the first disk, and unlocking the second disk. The logical volume was automatically created again, this time complaining about the lack of the first disk.

sudo pvs
  WARNING: Couldn't find device with uuid <uuid>.
  WARNING: VG vgkubuntu is missing PV <uuid> (last written to /dev/mapper/sda3_crypt).
  PV                           VG        Fmt  Attr PSize   PFree
  /dev/mapper/luks-<uuid> vgkubuntu lvm2 a--  929,32g <1,91g
  [unknown]               vgkubuntu lvm2 a-m  929,32g <1,91g
sudo vgs
  WARNING: Couldn't find device with uuid <uuid>.
  WARNING: VG vgkubuntu is missing PV <uuid> (last written to /dev/mapper/sda3_crypt).
  VG        #PV #LV #SN Attr   VSize  VFree
  vgkubuntu   2   1   0 wz-pn- <1,82t 3,81g
sudo lvs
  WARNING: Couldn't find device with uuid <uuid>.
  WARNING: VG vgkubuntu is missing PV <uuid> (last written to /dev/mapper/sda3_crypt).
  LV   VG        Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  root vgkubuntu rwi-a-r-p- 927,41g                                    100,00

Then mount the logical volume in read-only mode and compute the hash of all files in the second LVM image:

sudo mount -o ro /dev/vgkubuntu/root mylv
sudo ./ddm ls mylv -o seconddisk.txt
sudo umount mylv

At this point, you can finally compare the state of the two filesystems:

sudo ./ddm diff firstdisk.txt seconddisk.txt
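To make it concrete what such a comparison involves, here is a minimal sketch that compares two plain listings with one "hash, two spaces, path" line per file, which is the format sha256sum prints. It is not how ddm diff is implemented and it does not read ddm’s metadata files; the function name diff_manifests and the output labels are made up for the example.

```shell
#!/bin/sh
# Compare two "hash  path" listings and report paths that differ or are missing.
diff_manifests() {
    # $1: listing from the first disk, $2: listing from the second disk
    awk '
        { hash = $1; path = substr($0, length($1) + 3) }   # skip hash + 2 spaces
        FILENAME == ARGV[1] { first[path] = hash; next }
        {
            seen[path] = 1
            if (!(path in first)) print "only in second: " path
            else if (first[path] != hash) print "differs: " path
        }
        END { for (p in first) if (!(p in seen)) print "only in first: " p }
    ' "$1" "$2"
}

# Usage: diff_manifests firstdisk.sha256 seconddisk.sha256
```

A file that shows up as "differs" is one where the two mirror copies disagree; files only on one side point at damaged directory metadata instead.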

Directorydiffmerge will print the list of all files that differ in their content or metadata. Note down the differing files, bring up the LVM again with the first disk, and copy the differing files out of the LVM volume. Do the same with the second disk to get the second version of each differing file. Then inspect the files one by one and decide which version is right.
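With more than a handful of files, copying them out by hand gets tedious. Here is a small sketch that does it from a list, assuming you saved the differing paths (relative to the mount point, one per line) in a text file. The function name copy_out, the list file differing.txt and the rescued/ directories are all made up for the example.

```shell
#!/bin/sh
# Copy the listed files out of a mounted volume, keeping their relative paths.
copy_out() {
    # $1: mount point, $2: file with one relative path per line, $3: destination
    while IFS= read -r rel; do
        [ -n "$rel" ] || continue               # skip empty lines
        mkdir -p "$3/$(dirname "$rel")"
        cp -p "$1/$rel" "$3/$rel"               # -p keeps mode and timestamps
    done < "$2"
}

# Usage, from a root shell, once per disk:
#   copy_out mylv differing.txt rescued/disk1
#   (swap disks, mount again)
#   copy_out mylv differing.txt rescued/disk2
```

Keep the destination on the live distro’s persistent storage or another disk, not on the volume you are about to repair. Once both sets are out, diff -r rescued/disk1 rescued/disk2 is a quick way to start inspecting them.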

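Finally, bring up LVM with BOTH disks and repair the logical volume (vgkubuntu/root in my case; lvchange needs to be told which one to act on):

sudo lvchange --syncaction repair vgkubuntu/root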

This command will “repair” the RAID, that is, unconditionally overwrite the differing sectors of the second disk with those of the first disk. However, since you took both versions of the differing files out of the two disks, you now have a chance after the “repair” to overwrite the “repaired” files with the versions you checked are truly correct. Enjoy.
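The last step, putting the verified files back, could be scripted along these lines. It’s a sketch that assumes you collected the versions you trust in one directory that mirrors the layout of the volume, and that the repaired volume is mounted read-write; the function name restore_chosen and the directory name chosen are made up for the example.

```shell
#!/bin/sh
# Copy every file under the "chosen" directory back onto the repaired volume
# and confirm byte for byte that the copy on the volume now matches.
restore_chosen() {
    # $1: directory with the trusted versions, $2: mount point of the volume
    ( cd "$1" && find . -type f ) | while IFS= read -r rel; do
        mkdir -p "$2/$(dirname "$rel")"
        if cp -p "$1/$rel" "$2/$rel" && cmp -s "$1/$rel" "$2/$rel"; then
            echo "restored: $rel"
        else
            echo "FAILED: $rel" >&2
        fi
    done
}

# Usage, from a root shell with the volume mounted read-write:
#   restore_chosen chosen mylv
```

Files where the first disk already held the right version can simply be left out of the directory, since the “repair” kept that copy.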

Summarizing, it looks to me like the design of LVM RAID1 fails to acknowledge that disk failures are not the only way you can lose data. If kernel panics or power failures occur while disk I/O is happening, there HAS to be a way to let the user fix inconsistencies, but there isn’t…