CVE-2024-50082 Information
Description
In the Linux kernel, the following vulnerability has been resolved:
blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
We’re seeing crashes from rq_qos_wake_function that look like this:
BUG: unable to handle page fault for address: ffffafe180a40084
PF: supervisor write access in kernel mode
PF: error_code(0x0002) - not-present page
PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0
Oops: Oops: 0002 [#1] PREEMPT SMP PTI
CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11
Hardware name: QEMU Standard PC (i440FX + PIIX 1996) BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40
Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00
So rq_qos_wake_function() calls wake_up_process(data->task), which calls try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock).
p comes from data->task, and data comes from the waitqueue entry, which is stored on the waiter’s stack in rq_qos_wait(). Analyzing the core dump with drgn, I found that the waiter had already woken up and moved on to a completely unrelated code path, clobbering what was previously data->task. Meanwhile, the waker was passing the clobbered garbage in data->task to wake_up_process(), leading to the crash.
What’s happening is that in between rq_qos_wake_function() deleting the waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding that it already got a token and returning. The race looks like this:
rq_qos_wait()                           rq_qos_wake_function()
==============================================================
prepare_to_wait_exclusive()
                                        data->got_token = true;
                                        list_del_init(&curr->entry);
if (data.got_token)
        break;
finish_wait(&rqw->wait, &data.wq);
  ^- returns immediately because
     list_empty_careful(&wq_entry->entry)
     is true
… return, go do something else …
                                        wake_up_process(data->task)
                                          (NO LONGER VALID!)-^
Normally, finish_wait() is supposed to synchronize against the waker. But, as noted above, it is returning immediately because the waitqueue entry has already been removed from the waitqueue.
The bug is that rq_qos_wake_function() is accessing the waitqueue entry AFTER deleting it. Note that autoremove_wake_function() wakes the waiter and THEN deletes the waitqueue entry, which is the proper order.
Fix it by swapping the order. We also need to use list_del_init_careful() to match the list_empty_careful() in finish_wait().
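The ordering problem can be illustrated with a small single-threaded userspace model. This is only a sketch, not the kernel code: every name prefixed with model_ is hypothetical, the waitqueue entry is reduced to an on_queue flag, and the waiter's stack frame being reused is simulated by overwriting the task field with -1. The model assumes the worst-case interleaving, in which the waiter runs as soon as it is allowed to.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the on-stack data used by rq_qos_wait(). */
struct model_wait_data {
    bool on_queue;   /* models the waitqueue entry still being linked */
    bool got_token;  /* models data->got_token */
    int task;        /* models data->task; -1 means "stack was reused" */
};

/* Records what the modelled wake_up_process() was handed. */
static int last_woken;

/*
 * Models the waiter getting CPU time. Once it has the token and its entry
 * is off the queue, finish_wait() returns immediately, the function
 * returns, and the stack slot that held the data is clobbered.
 */
static void model_waiter_may_run(struct model_wait_data *d)
{
    if (d->got_token && !d->on_queue)
        d->task = -1;
}

static void model_wake_up_process(int task)
{
    last_woken = task;
}

/* Buggy order: delete the entry first, then read data->task. */
int model_wake_buggy(struct model_wait_data *d)
{
    d->got_token = true;
    d->on_queue = false;             /* stands in for list_del_init() */
    model_waiter_may_run(d);         /* waiter is free to leave here */
    model_wake_up_process(d->task);  /* reads the clobbered value */
    return 1;
}

/* Fixed order: wake first, delete the entry as the final access. */
int model_wake_fixed(struct model_wait_data *d)
{
    d->got_token = true;
    model_waiter_may_run(d);         /* still queued, so waiter cannot leave */
    model_wake_up_process(d->task);  /* last read of the waiter's data */
    d->on_queue = false;             /* stands in for list_del_init_careful() */
    model_waiter_may_run(d);         /* clobbering is now harmless */
    return 1;
}
```

With a task value of 42, model_wake_buggy() leaves last_woken at -1 (the clobbered value), while model_wake_fixed() leaves it at 42. The model does not capture the memory-ordering side of the fix, which is why the real change also pairs list_del_init_careful() with list_empty_careful().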
CVSS Vector
CVSS:3.1/AV:L/AC:H/PR:L/UI:N/S:U/C:N/I:N/A:H
Reference
https://git.kernel.org/stable/c/3bc6d0f8b70a9101456cf02ab99acb75254e1852
https://git.kernel.org/stable/c/455a469758e57a6fe070e3e342db12e4a629e0eb
https://git.kernel.org/stable/c/b5e900a3612b69423a0e1b0ab67841a1fb4af80f
https://git.kernel.org/stable/c/4c5b123ab289767afe940389dbb963c5c05e594e
https://git.kernel.org/stable/c/04f283fc16c8d5db641b6bffd2d8310aa7eccebc
https://git.kernel.org/stable/c/e972b08b91ef48488bae9789f03cfedb148667fb
Attack Complexity
HIGH
Privileges Required
LOW
User Interaction Required
NONE
Scope
UNCHANGED
Confidentiality Impact
NONE
Integrity Impact
NONE
Availability Impact
HIGH
Base Score
4.7
Base Severity
MEDIUM
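The base score of 4.7 can be reproduced from the vector CVSS:3.1/AV:L/AC:H/PR:L/UI:N/S:U/C:N/I:N/A:H. The sketch below applies the CVSS v3.1 base-score equations for the Scope: Unchanged case only. The function names are hypothetical, and the numeric weights are the ones the CVSS v3.1 specification assigns to each metric value in this vector.

```c
/*
 * Round up to one decimal place using the integer-based procedure
 * described in the CVSS v3.1 specification. Assumes x >= 0.
 */
static double cvss_roundup(double x)
{
    long scaled = (long)(x * 100000.0 + 0.5);  /* round to nearest integer */

    if (scaled % 10000 == 0)
        return (double)scaled / 100000.0;
    return (double)(scaled / 10000 + 1) / 10.0;
}

/* CVSS v3.1 base score, valid for Scope: Unchanged only. */
static double cvss31_base_unchanged(double av, double ac, double pr, double ui,
                                    double c, double i, double a)
{
    double iss = 1.0 - (1.0 - c) * (1.0 - i) * (1.0 - a);
    double impact = 6.42 * iss;
    double exploitability = 8.22 * av * ac * pr * ui;
    double sum = impact + exploitability;

    if (impact <= 0.0)
        return 0.0;
    return cvss_roundup(sum < 10.0 ? sum : 10.0);
}

/*
 * Weights for AV:L (0.55), AC:H (0.44), PR:L with unchanged scope (0.62),
 * UI:N (0.85), C:N (0), I:N (0), A:H (0.56).
 */
double cve_2024_50082_base_score(void)
{
    return cvss31_base_unchanged(0.55, 0.44, 0.62, 0.85, 0.0, 0.0, 0.56);
}
```

Here the impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is about 1.05, so the sum of roughly 4.64 rounds up to 4.7, which falls in the MEDIUM band (4.0 to 6.9) of the CVSS v3.1 rating scale.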