CVE-2022-48901 Information
Description
In the Linux kernel, the following vulnerability has been resolved:
btrfs: do not start relocation until in progress drops are done
We hit a bug with a recovering relocation on mount for one of our file systems in production. I reproduced this locally by injecting errors into snapshot delete with balance running at the same time. This presented as an error while looking up an extent item:
WARNING: CPU: 5 PID: 1501 at fs/btrfs/extent-tree.c:866 lookup_inline_extent_backref+0x647/0x680
CPU: 5 PID: 1501 Comm: btrfs-balance Not tainted 5.16.0-rc8+ #8
RIP: 0010:lookup_inline_extent_backref+0x647/0x680
RSP: 0018:ffffae0a023ab960 EFLAGS: 00010202
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 000000000000000c RDI: 0000000000000000
RBP: ffff943fd2a39b60 R08: 0000000000000000 R09: 0000000000000001
R10: 0001434088152de0 R11: 0000000000000000 R12: 0000000001d05000
R13: ffff943fd2a39b60 R14: ffff943fdb96f2a0 R15: ffff9442fc923000
FS: 0000000000000000(0000) GS:ffff944e9eb40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1157b1fca8 CR3: 000000010f092000 CR4: 0000000000350ee0
Call Trace:
Normally, snapshot deletion and relocation are excluded from running at the same time by the fs_info->cleaner_mutex. However, if a pending balance was waiting to get the ->cleaner_mutex while a snapshot deletion was running, and the box then crashed, we would come up in a state where we have a half-deleted snapshot.
Again, in the normal case the snapshot deletion needs to complete before relocation can start, but in this case relocation could very well start before the snapshot deletion completes, as we simply add the root to the dead roots list and wait for the next time the cleaner runs to clean up the snapshot.
Fix this by setting a bit on the fs_info if we have any DEAD_ROOTs that had a pending drop_progress key. If any do, we know we were in the middle of the drop operation. Balance can then wait until this bit is cleared before starting up again.
If there are DEAD_ROOTs that don’t have a drop_progress set, then we’re safe to start balance right away, as we’ll be properly protected by the cleaner_mutex.
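The mount-time check and the wait described above can be modeled in userspace roughly as follows. This is a minimal sketch under stated assumptions, not the kernel patch: `FsInfo`, `DeadRoot`, `mount`, `finish_drop`, and `wait_for_drops` are hypothetical names chosen for illustration, and a Python condition variable stands in for whatever wait mechanism the kernel actually uses.

```python
import threading
from dataclasses import dataclass


@dataclass
class DeadRoot:
    """Hypothetical stand-in for a snapshot root queued for deletion."""
    root_id: int
    drop_progress: int = 0  # non-zero: a drop was interrupted part-way


class FsInfo:
    """Toy model of filesystem-wide state; not the kernel's fs_info."""

    def __init__(self):
        self._cond = threading.Condition()
        self._unfinished = set()  # ids of roots with an interrupted drop

    def mount(self, dead_roots):
        # At mount, remember every dead root whose drop was in progress.
        with self._cond:
            for root in dead_roots:
                if root.drop_progress:
                    self._unfinished.add(root.root_id)

    @property
    def unfinished_drops(self):
        # Plays the role of the "bit on the fs_info" from the description.
        with self._cond:
            return bool(self._unfinished)

    def finish_drop(self, root):
        # Called by the cleaner once a root is fully deleted.
        with self._cond:
            self._unfinished.discard(root.root_id)
            if not self._unfinished:
                self._cond.notify_all()

    def wait_for_drops(self, timeout=None):
        # Balance calls this before relocating; returns False on timeout.
        with self._cond:
            return self._cond.wait_for(lambda: not self._unfinished, timeout)
```

In this model, a root mounted with a non-zero `drop_progress` keeps `wait_for_drops` blocked until `finish_drop` is called for it, while a dead root without `drop_progress` never blocks balance. The second case mirrors the one the description leaves to the cleaner_mutex.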
Reference
https://git.kernel.org/stable/c/6599d5e8bd758d897fd2ef4dc388ae50278b1f7e
https://git.kernel.org/stable/c/5e70bc827b563caf22e1203428cc3719643de5aa
https://git.kernel.org/stable/c/b4be6aefa73c9a6899ef3ba9c5faaa8a66e333ef