CVE-2024-56592 Information
Description
In the Linux kernel, the following vulnerability has been resolved:
bpf: Call free_htab_elem() after htab_unlock_bucket()
For htab of maps, when the map is removed from the htab, it may hold the last reference of the map. bpf_map_fd_put_ptr() will invoke bpf_map_free_id() to free the id of the removed map element. However, bpf_map_fd_put_ptr() is invoked while holding a bucket lock (raw_spinlock_t), and bpf_map_free_id() attempts to acquire map_idr_lock (spinlock_t), triggering the following lockdep warning:
=============================
[ BUG: Invalid wait context ]
6.11.0-rc4+ #49 Not tainted
test_maps/4881 is trying to lock:
ffffffff84884578 (map_idr_lock){+...}-{3:3}, at: bpf_map_free_id.part.0+0x21/0x70
other info that might help us debug this:
context-{5:5}
2 locks held by test_maps/4881:
 #0: ffffffff846caf60 (rcu_read_lock){....}-{1:3}, at: bpf_fd_htab_map_update_elem+0xf9/0x270
 #1: ffff888149ced148 (&htab->lockdep_key#2){....}-{2:2}, at: htab_map_update_elem+0x178/0xa80
stack backtrace:
CPU: 0 UID: 0 PID: 4881 Comm: test_maps Not tainted 6.11.0-rc4+ #49
Hardware name: QEMU Standard PC (i440FX + PIIX 1996) …
Call Trace:
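The warning can be read from the wait-type pairs printed after each lock name (3:3 for map_idr_lock, 2:2 for the bucket lock, 5:5 for the task context). The following is a simplified model of that check, not lockdep's actual code: the enum names, the struct, and the helper functions are hypothetical and exist only for illustration. It assumes that the first number of a pair is the lock's outer wait type and the second its inner wait type, and that acquiring a lock is rejected when its outer type exceeds the smallest inner type among the current context and the locks already held.

```c
#include <assert.h>
#include <stddef.h>

/* Wait types, from most to least restrictive. The numeric values match the
 * pairs in the log; the names are labels invented for this sketch. */
enum wait_type {
    WAIT_FREE   = 1, /* may be taken anywhere, e.g. rcu_read_lock (outer) */
    WAIT_SPIN   = 2, /* raw_spinlock_t: never sleeps */
    WAIT_CONFIG = 3, /* spinlock_t: may sleep on PREEMPT_RT kernels */
    WAIT_SLEEP  = 4, /* sleeping locks */
    WAIT_MAX    = 5  /* plain task context */
};

struct lock_class {
    const char *name;
    int outer; /* context required to acquire this lock */
    int inner; /* context imposed on code running under this lock */
};

/* Returns 1 if `next` may be acquired with `held` locks in `context`. */
static int wait_context_ok(int context, const struct lock_class *held,
                           size_t nheld, const struct lock_class *next)
{
    int curr_inner = context;

    /* The effective context is the tightest inner type seen so far. */
    for (size_t i = 0; i < nheld; i++)
        if (held[i].inner < curr_inner)
            curr_inner = held[i].inner;

    return next->outer <= curr_inner;
}

/* The situation in the log: rcu_read_lock and the bucket lock are held. */
static int idr_lock_under_bucket_ok(void)
{
    const struct lock_class held[] = {
        { "rcu_read_lock", WAIT_FREE, WAIT_CONFIG },
        { "bucket lock",   WAIT_SPIN, WAIT_SPIN   },
    };
    const struct lock_class idr = { "map_idr_lock", WAIT_CONFIG, WAIT_CONFIG };

    return wait_context_ok(WAIT_MAX, held, 2, &idr);
}

/* After the fix: the bucket lock has been dropped before the id is freed. */
static int idr_lock_after_unlock_ok(void)
{
    const struct lock_class held[] = {
        { "rcu_read_lock", WAIT_FREE, WAIT_CONFIG },
    };
    const struct lock_class idr = { "map_idr_lock", WAIT_CONFIG, WAIT_CONFIG };

    return wait_context_ok(WAIT_MAX, held, 1, &idr);
}
```

Under these assumptions, idr_lock_under_bucket_ok() returns 0 because the bucket lock narrows the context to 2 while map_idr_lock needs 3, and idr_lock_after_unlock_ok() returns 1.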
One way to fix the lockdep warning is to use raw_spinlock_t for map_idr_lock as well. However, bpf_map_alloc_id() invokes idr_alloc_cyclic() after acquiring map_idr_lock, so that change would trigger a similar lockdep warning, because the slab’s lock (s->cpu_slab->lock) is still a spinlock.
Instead of changing map_idr_lock’s type, fix the issue by invoking htab_put_fd_value() after htab_unlock_bucket(). However, only deferring the invocation of htab_put_fd_value() is not enough, because the old map pointers in htab of maps cannot be saved during batched deletion. Therefore, also defer the invocation of free_htab_elem(), so that these to-be-freed elements can be linked together, similar to the lru map.
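The deferral idea can be sketched in user-space C. This is a minimal model under stated assumptions, not the kernel patch: struct bucket, struct elem, the free_link field, and all function names are hypothetical, and the bucket lock is modelled as a plain flag so that the sketch can count how many frees happen while it is "held".

```c
#include <assert.h>
#include <stdlib.h>

struct elem {
    int key;
    struct elem *next;      /* link in the bucket's element list */
    struct elem *free_link; /* link in the local to-be-freed list */
};

struct bucket {
    int locked;             /* 1 while the modelled bucket lock is held */
    struct elem *head;
};

static int frees_under_lock; /* frees that happened with the lock held */
static int frees_total;

static void bucket_lock(struct bucket *b)   { assert(!b->locked); b->locked = 1; }
static void bucket_unlock(struct bucket *b) { assert(b->locked);  b->locked = 0; }

/* Stand-in for the element free routine, which may need other locks. */
static void free_elem(struct bucket *b, struct elem *e)
{
    if (b->locked)
        frees_under_lock++;
    frees_total++;
    free(e);
}

static int bucket_insert(struct bucket *b, int key)
{
    struct elem *e = calloc(1, sizeof(*e));

    if (!e)
        return -1;
    e->key = key;
    bucket_lock(b);
    e->next = b->head;
    b->head = e;
    bucket_unlock(b);
    return 0;
}

/* Batched delete: only unlink under the lock, free after unlocking. */
static int bucket_delete_all(struct bucket *b)
{
    struct elem *to_free = NULL, *e;
    int n = 0;

    bucket_lock(b);
    while ((e = b->head) != NULL) {
        b->head = e->next;      /* remove from the bucket */
        e->free_link = to_free; /* chain onto the to-be-freed list */
        to_free = e;
        n++;
    }
    bucket_unlock(b);

    while ((e = to_free) != NULL) {
        to_free = e->free_link;
        free_elem(b, e);        /* runs with the bucket lock released */
    }
    return n;
}
```

Inserting a few keys and then calling bucket_delete_all() should leave frees_under_lock at zero, which is the property the fix is after: the free path is free to take other locks because the bucket lock is no longer held.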
There are four callers for ->map_fd_put_ptr:
(1) alloc_htab_elem() (through htab_put_fd_value()): It invokes ->map_fd_put_ptr() under a raw_spinlock_t. The invocation of htab_put_fd_value() cannot simply be moved after htab_unlock_bucket(), because the old element has already been stashed in htab->extra_elems. It may be reused immediately after htab_unlock_bucket(), and invoking htab_put_fd_value() after htab_unlock_bucket() may then release the newly-added element incorrectly. Therefore, save the map pointer of the old element for htab of maps before unlocking the bucket, and release the map_ptr after unlock. Besides the map pointer, the special fields in the old element need the same treatment.
(2) free_htab_elem() (through htab_put_fd_value()): Its callers include __htab_map_lookup_and_delete_elem(), htab_map_delete_elem(), and __htab_map_lookup_and_delete_batch().
For htab_map_delete_elem(), simply invoke free_htab_elem() after htab_unlock_bucket(). For __htab_map_lookup_and_delete_batch(), just like the lru map, link the to-be-freed elements into the node_to_free list and invoke free_htab_elem() for these elements after unlock. It is safe to reuse batch_flink as the link for node_to_free because these elements have been removed from the hash llist.
Because htab of maps doesn’t support the lookup_and_delete operation, __htab_map_lookup_and_delete_elem() doesn’t have the problem, so kept it as
truncated—
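Case (1) above, where the old element is stashed for reuse and only its map pointer is saved before the unlock, can be illustrated with a small user-space sketch. Everything here is a hypothetical model rather than the kernel code: struct table, its single slot, the extra field standing in for htab->extra_elems, and map_put() are assumptions made for illustration, and the lock is again a plain flag.

```c
#include <assert.h>
#include <stddef.h>

struct map  { int refcnt; };
struct elem { int key; struct map *map_ptr; };

struct table {
    int locked;         /* 1 while the modelled bucket lock is held */
    struct elem *slot;  /* the live element of this toy bucket */
    struct elem *extra; /* stashed old element, available for reuse */
};

static int puts_under_lock; /* releases that happened with the lock held */

/* Stand-in for dropping the reference held by an element. */
static void map_put(struct table *t, struct map *m)
{
    if (t->locked)
        puts_under_lock++;
    m->refcnt--;
}

/* Replace the live element with new_elem. */
static void table_update(struct table *t, struct elem *new_elem)
{
    struct map *old_map = NULL;

    assert(!t->locked);
    t->locked = 1;
    if (t->slot) {
        old_map = t->slot->map_ptr; /* save the pointer before unlocking */
        t->extra = t->slot;         /* old element is stashed for reuse */
    }
    t->slot = new_elem;
    t->locked = 0;

    /* t->extra may be reused by a later update from here on, so only the
     * saved pointer is used to release the old map. */
    if (old_map)
        map_put(t, old_map);
}
```

Releasing through the saved pointer rather than through the stashed element is the point of the design: once the lock is dropped, the contents of the stashed element can no longer be trusted to describe the old map.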
Reference
https://git.kernel.org/stable/c/10e8a2dec9ff1b81de8e892b0850924038adbc6d
https://git.kernel.org/stable/c/a50b4aa3007e63a590d501341f304676ebc74b3b
https://git.kernel.org/stable/c/b9e9ed90b10c82a4e9d4d70a2890f06bfcdd3b78