CVE-2024-41010 Information

Description

In the Linux kernel, the following vulnerability has been resolved:

bpf: Fix too early release of tcx_entry

Pedro Pinto, and later independently Hyunwoo Kim and Wongi Lee, reported that the tcx_entry can be released too early, leading to a use-after-free (UAF), when an active old-style ingress or clsact qdisc with a shared tc block is later replaced by another ingress or clsact instance.

Essentially the sequence to trigger the UAF (one example) can be as follows:

  1. A network namespace is created.

  2. An ingress qdisc is created. This allocates a tcx_entry and &tcx_entry->miniq is stored in the qdisc’s miniqp->p_miniq. At the same time a tcf block with index 1 is created.

  3. chain0 is attached to the tcf block. chain0 must be connected to the block linked to the ingress qdisc to later reach the function tcf_chain0_head_change_cb_del() which triggers the UAF.

  4. A clsact qdisc is created and grafted. This causes the ingress qdisc created in step 2 to be removed, thus freeing the previously linked tcx_entry:

    rtnetlink_rcv_msg() => tc_modify_qdisc() => qdisc_create() => clsact_init() [a] => qdisc_graft() => qdisc_destroy() => __qdisc_destroy() => ingress_destroy() [b] => tcx_entry_free() => kfree_rcu() // tcx_entry freed

  5. Finally, the network namespace is closed. This registers the cleanup_net worker, and while releasing the remaining clsact qdisc it accesses the tcx_entry that was already freed in step 4, causing the UAF:

    cleanup_net() => ops_exit_list() => default_device_exit_batch() => unregister_netdevice_many() => unregister_netdevice_many_notify() => dev_shutdown() => qdisc_put() => clsact_destroy() [c] => tcf_block_put_ext() => tcf_chain0_head_change_cb_del() => tcf_chain_head_change_item() => clsact_chain_head_change() => mini_qdisc_pair_swap() // UAF
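The premature release in step 4 can be illustrated with a small user-space model. This is only a sketch under simplifying assumptions, not kernel code: `toy_flag_entry`, `toy_qdisc_init()`, `toy_qdisc_destroy()` and `toy_replay_flag_bug()` are hypothetical names, the `freed` field merely stands in for kfree_rcu() having run, and attached BPF programs are ignored. It assumes that, before the fix, each qdisc sets a single shared "active" flag on init and clears it on destroy.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kernel's tcx_entry. */
struct toy_flag_entry {
    bool miniq_active; /* pre-fix assumption: one flag shared by all qdiscs */
    bool freed;        /* stands in for kfree_rcu() having released the entry */
};

/* Models ingress/clsact init: mark the miniq as in use. */
static void toy_qdisc_init(struct toy_flag_entry *e)
{
    assert(!e->freed); /* a qdisc must never attach to a freed entry */
    e->miniq_active = true;
}

/* Models ingress/clsact destroy: clear the flag, free if nothing is active. */
static void toy_qdisc_destroy(struct toy_flag_entry *e)
{
    e->miniq_active = false;
    if (!e->miniq_active)
        e->freed = true;
}

/*
 * Replays steps 2 and 4 of the sequence. Returns true if the entry was
 * freed while the replacement clsact qdisc still refers to it.
 */
static bool toy_replay_flag_bug(void)
{
    struct toy_flag_entry e = { false, false };

    toy_qdisc_init(&e);    /* step 2: ingress qdisc created */
    toy_qdisc_init(&e);    /* step 4 [a]: clsact created, flag is already true */
    toy_qdisc_destroy(&e); /* step 4 [b]: old ingress qdisc destroyed */

    return e.freed;        /* clsact is still alive at this point */
}
```

In this model `toy_replay_flag_bug()` returns true: a boolean cannot record that two qdiscs are using the entry at once, so the first destroy releases it while the second user is still active.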

There are also other variants. The gist is to add an ingress (or clsact) qdisc with a specific shared block, then replace that qdisc, wait for the tcx_entry kfree_rcu() to be executed, and subsequently access the currently active qdisc's miniq one way or another.

The correct fix is to turn the miniq_active boolean into a counter. With this change, the counter transitions from 0->1 at step 2 above, from 1->2 at [a] (so that the miniq object remains active during the replacement), from 2->1 at [b], and finally from 1->0 at [c], followed by the eventual release. The reference counter in general ranges over [0,2], and it does not need to be atomic since all access to the counter is protected by the rtnl mutex. With this in place, the UAF no longer occurs and the tcx_entry is freed at the correct time.
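The counter transitions described above can be traced with a minimal user-space sketch. The names `toy_counted_entry`, `toy_miniq_inc()`, `toy_miniq_dec()` and `toy_replay_counter()` are hypothetical and are not the kernel's helpers; `freed` again stands in for kfree_rcu(), and a plain non-atomic integer is used on the assumption that a single lock (the rtnl mutex in the kernel) serializes all access.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a tcx_entry with a use counter. */
struct toy_counted_entry {
    unsigned int miniq_active; /* post-fix: counts qdiscs using the miniq */
    bool freed;                /* stands in for kfree_rcu() having run */
};

/* Models qdisc init: take one reference on the miniq. */
static void toy_miniq_inc(struct toy_counted_entry *e)
{
    assert(!e->freed);
    e->miniq_active++;
}

/* Models qdisc destroy: drop one reference, free only when it reaches 0. */
static void toy_miniq_dec(struct toy_counted_entry *e)
{
    assert(e->miniq_active > 0);
    e->miniq_active--;
    if (e->miniq_active == 0)
        e->freed = true;
}

/*
 * Replays step 2, [a], [b] and [c], recording the counter after each one.
 * Returns true if the entry survived [b] and was freed only after [c].
 */
static bool toy_replay_counter(unsigned int trace[4])
{
    struct toy_counted_entry e = { 0, false };
    bool freed_early;

    toy_miniq_inc(&e); trace[0] = e.miniq_active; /* step 2: 0->1 */
    toy_miniq_inc(&e); trace[1] = e.miniq_active; /* [a]:    1->2 */
    toy_miniq_dec(&e); trace[2] = e.miniq_active; /* [b]:    2->1 */
    freed_early = e.freed;
    toy_miniq_dec(&e); trace[3] = e.miniq_active; /* [c]:    1->0 */

    return !freed_early && e.freed;
}
```

Calling `toy_replay_counter()` fills the trace with 1, 2, 1, 0 and returns true: the entry stays alive while the replacement qdisc holds a reference and is released only when the last user goes away.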

Reference

https://git.kernel.org/stable/c/1cb6f0bae50441f4b4b32a28315853b279c7404e
https://git.kernel.org/stable/c/230bb13650b0f186f540500fd5f5f7096a822a2a
https://git.kernel.org/stable/c/f61ecf1bd5b562ebfd7d430ccb31619857e80857
