CVE-2024-57975 Information

Description

In the Linux kernel, the following vulnerability has been resolved:

btrfs: do proper folio cleanup when run_delalloc_nocow() failed

[BUG] With CONFIG_DEBUG_VM set, test case generic/476 has some chance to crash with the following VM_BUG_ON_FOLIO():

BTRFS error (device dm-3): cow_file_range failed start 1146880 end 1253375 len 106496 ret -28
BTRFS error (device dm-3): run_delalloc_nocow failed start 1146880 end 1253375 len 106496 ret -28
page: refcount:4 mapcount:0 mapping:00000000592787cc index:0x12 pfn:0x10664
aops:btrfs_aops [btrfs] ino:101 dentry name(?): 1774\n
flags: 0x2fffff80004028(uptodate|lru|private|node=0|zone=2|lastcpupid=0xfffff)
page dumped because: VM_BUG_ON_FOLIO(!folio_test_locked(folio))
------------[ cut here ]------------
kernel BUG at mm/page-writeback.c:2992!
Internal error: Oops - BUG: 00000000f2000800 [1] SMP
CPU: 2 UID: 0 PID: 3943513 Comm: kworker/u24:15 Tainted: G OE 6.12.0-rc7-custom+ 87
Tainted: [O]=OOT_MODULE [E]=UNSIGNED_MODULE
Hardware name: QEMU KVM Virtual Machine BIOS unknown 2/2/2022
Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs]
pc : folio_clear_dirty_for_io+0x128/0x258
lr : folio_clear_dirty_for_io+0x128/0x258
Call trace:
 folio_clear_dirty_for_io+0x128/0x258
 btrfs_folio_clamp_clear_dirty+0x80/0xd0 [btrfs]
 __process_folios_contig+0x154/0x268 [btrfs]
 extent_clear_unlock_delalloc+0x5c/0x80 [btrfs]
 run_delalloc_nocow+0x5f8/0x760 [btrfs]
 btrfs_run_delalloc_range+0xa8/0x220 [btrfs]
 writepage_delalloc+0x230/0x4c8 [btrfs]
 extent_writepage+0xb8/0x358 [btrfs]
 extent_write_cache_pages+0x21c/0x4e8 [btrfs]
 btrfs_writepages+0x94/0x150 [btrfs]
 do_writepages+0x74/0x190
 filemap_fdatawrite_wbc+0x88/0xc8
 start_delalloc_inodes+0x178/0x3a8 [btrfs]
 btrfs_start_delalloc_roots+0x174/0x280 [btrfs]
 shrink_delalloc+0x114/0x280 [btrfs]
 flush_space+0x250/0x2f8 [btrfs]
 btrfs_async_reclaim_data_space+0x180/0x228 [btrfs]
 process_one_work+0x164/0x408
 worker_thread+0x25c/0x388
 kthread+0x100/0x118
 ret_from_fork+0x10/0x20
Code: 910a8021 a90363f7 a9046bf9 94012379 (d4210000)
---[ end trace 0000000000000000 ]---

[CAUSE] The first two lines of extra debug messages show that the problem is caused by the error handling of run_delalloc_nocow().

E.g. we have the following dirtied range (4K blocksize, 4K page size):

0                 16K                  32K
|//////////////////////////////////////|
|  Pre-allocated  |

And the range [0, 16K) has a preallocated extent.

  • Enter run_delalloc_nocow() for range [0, 16K)
    It finds that range [0, 16K) is preallocated, so it can do the proper NOCOW write.

  • Enter fallback_to_cow() for range [16K, 32K)
    Since the range [16K, 32K) is not backed by a preallocated extent, we have to go COW.

  • cow_file_range() failed for range [16K, 32K)
    So cow_file_range() does the cleanup by clearing the folio dirty flags and unlocking the folios.

    Now the folios in range [16K, 32K) are unlocked.

  • Enter extent_clear_unlock_delalloc() from run_delalloc_nocow()
    It is called with PAGE_START_WRITEBACK to start page writeback. But a folio can only be marked writeback when it is properly locked, so this triggers the VM_BUG_ON_FOLIO().

Furthermore, there is another hidden but common bug: run_delalloc_nocow() does not clear the folio dirty flags in its error handling path. This bug is shared between run_delalloc_nocow() and cow_file_range().

[FIX]

  • Clear folio dirty for range [@start, @cur_offset)
    Introduce a helper, cleanup_dirty_folios(), which will find and lock the folios in the range, clear the dirty flag, and start/end the writeback, with extra handling for the @locked_folio.

  • Introduce a helper to clear folio dirty, and to start and end writeback

  • Introduce a helper to record the last failed COW range end
    This is to track which range we should skip, to avoid double unlocking.

  • Skip the failed COW range for the e


(truncated)

Reference

https://git.kernel.org/stable/c/2434533f1c963e7317c45880c98287e5bed98325
https://git.kernel.org/stable/c/5ae72abbf91eb172ce3a838a4dc34be3c9707296
https://git.kernel.org/stable/c/c2b47df81c8e20a8e8cd94f0d7df211137ae94ed
