CVE-2021-47094 Information
Description
In the Linux kernel, the following vulnerability has been resolved:
KVM: x86/mmu: Don’t advance iterator after restart due to yielding
After dropping mmu_lock in the TDP MMU, restart the iterator during tdp_iter_next() and do not advance the iterator. Advancing the iterator results in skipping the top-level SPTE and all its children, which is fatal if any of the skipped SPTEs were not visited before yielding.
When zapping all SPTEs, i.e. when min_level == root_level, restarting the iter and then invoking tdp_iter_next() is always fatal if the current gfn has a valid SPTE, as advancing the iterator results in try_step_side() skipping the current gfn, which wasn’t visited before yielding.
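The min_level == root_level case can be illustrated with a toy userspace model. This is only a sketch, not the kernel code: the flat array stands in for the top-level entries, and every name prefixed with toy_ or TOY_ is hypothetical. It assumes the walker yields before processing the current entry and then uses `continue`, so the loop's step function runs next.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_ENTRIES 8

/* Toy stand-in for struct tdp_iter: walks a flat array, not a page table. */
struct toy_iter {
    size_t gfn;          /* index of the current top-level entry */
    size_t yielded_gfn;  /* entry at which the walk last yielded */
    bool valid;          /* false once the walk has run off the end */
    bool yielded;        /* true between a yield and the next step call */
};

static void toy_iter_start(struct toy_iter *it)
{
    it->gfn = 0;
    it->yielded_gfn = 0;   /* so the walk does not yield before making progress */
    it->valid = true;
    it->yielded = false;
}

/* Pre-fix behavior: always step sideways, even right after a yield. */
void toy_iter_next_buggy(struct toy_iter *it)
{
    it->yielded = false;
    it->gfn++;
    it->valid = it->gfn < TOY_NR_ENTRIES;
}

/* Post-fix behavior: a pending yield is consumed by staying on the same entry. */
void toy_iter_next_fixed(struct toy_iter *it)
{
    if (it->yielded) {
        it->yielded = false;   /* "restart" at the current gfn, do not advance */
        return;
    }
    it->gfn++;
    it->valid = it->gfn < TOY_NR_ENTRIES;
}

/* Pretend the lock is dropped at every yield_every-th entry (0 disables yielding). */
static bool toy_cond_resched(struct toy_iter *it, size_t yield_every)
{
    if (yield_every == 0 || it->gfn == it->yielded_gfn || it->gfn % yield_every != 0)
        return false;
    it->yielded_gfn = it->gfn;
    it->yielded = true;
    return true;
}

/* Zap every present entry; returns how many entries were actually zapped. */
size_t toy_zap_all(bool present[TOY_NR_ENTRIES], size_t yield_every,
                   void (*next)(struct toy_iter *))
{
    struct toy_iter it;
    size_t zapped = 0;

    for (toy_iter_start(&it); it.valid; next(&it)) {
        if (toy_cond_resched(&it, yield_every))
            continue;          /* current entry has NOT been processed yet */
        assert(it.gfn < TOY_NR_ENTRIES);
        if (present[it.gfn]) {
            present[it.gfn] = false;
            zapped++;
        }
    }
    return zapped;
}
```

With all eight entries present and yield_every set to 3, toy_zap_all() with toy_iter_next_fixed zaps all 8 entries, while the same walk with toy_iter_next_buggy zaps only 6 and leaves entries 3 and 6 behind, which are exactly the entries at which the walk yielded.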
Sprinkle WARNs on iter->yielded being true in various helpers that are often used in conjunction with yielding, and tag the helper with __must_check to reduce the probability of improper usage.
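The hardening described above can be sketched in userspace C as follows. This is an illustration under assumptions, not the kernel's helpers: all demo_ names are hypothetical, demo_warn_on() is a simplified stand-in for WARN_ON(), and demo_must_check uses the GCC/Clang warn_unused_result attribute, which is what the kernel's __must_check expands to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-in for __must_check (GCC/Clang attribute). */
#define demo_must_check __attribute__((warn_unused_result))

/* Number of warnings raised so far; lets callers and tests observe misuse. */
unsigned int demo_warn_count;

/* Simplified stand-in for WARN_ON(): logs, counts, and returns the condition. */
static bool demo_warn_on(bool cond, const char *what)
{
    if (cond) {
        demo_warn_count++;
        fprintf(stderr, "WARNING: %s\n", what);
    }
    return cond;
}

/* Hypothetical, heavily reduced iterator: only the fields this sketch needs. */
struct demo_iter {
    unsigned long gfn;
    bool yielded;   /* set on yield; the step function would clear it on restart */
};

/*
 * Returns true if the walk "yielded". The compiler warns if a caller drops
 * the return value, because ignoring it means the caller may go on to touch
 * an iterator that must be restarted first.
 */
demo_must_check bool demo_cond_resched(struct demo_iter *it, bool need_resched)
{
    assert(it != NULL);
    demo_warn_on(it->yielded, "cond_resched on an iterator that already yielded");
    if (!need_resched)
        return false;
    it->yielded = true;
    return true;
}

/* Any helper that modifies an entry warns if the iterator was not restarted. */
void demo_set_spte(struct demo_iter *it, unsigned long *spte, unsigned long val)
{
    assert(it != NULL && spte != NULL);
    demo_warn_on(it->yielded, "SPTE update on an iterator that yielded");
    *spte = val;
}
```

In this sketch the warning does not block the write; it only makes the misuse visible. A caller that checks the result of demo_cond_resched() and lets the iterator restart before touching an entry never trips it.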
Failing to zap a top-level SPTE manifests in one of two ways. If a valid SPTE is skipped by both kvm_tdp_mmu_zap_all() and kvm_tdp_mmu_put_root(), the shadow page will be leaked, and KVM will WARN accordingly.
WARNING: CPU: 1 PID: 3509 at arch/x86/kvm/mmu/tdp_mmu.c:46 [kvm]
RIP: 0010:kvm_mmu_uninit_tdp_mmu+0x3e/0x50 [kvm]
Call Trace:
If kvm_tdp_mmu_zap_all() skips a gfn/SPTE but that SPTE is then zapped by kvm_tdp_mmu_put_root(), KVM triggers a use-after-free in the form of marking a struct page as dirty/accessed after it has been put back on the free list. This directly triggers a WARN due to encountering a page with page_count() == 0, but it can also lead to data corruption and additional errors in the kernel.
WARNING: CPU: 7 PID: 1995658 at arch/x86/kvm/../../../virt/kvm/kvm_main.c:171
RIP: 0010:kvm_is_zone_device_pfn.part.0+0x9e/0xd0 [kvm]
Call Trace:
Note, the underlying bug existed even before commit 1af4a96025b3 ("KVM: x86/mmu: Yield in TDU MMU iter even if no SPTES changed") moved calls to tdp_mmu_iter_cond_resched() to the beginning of loops, as KVM could still incorrectly advance past a top-level entry when yielding on a lower-level entry. But with respect to leaking shadow pages, the bug was introduced by yielding before processing the current gfn.
Alternatively, tdp_mmu_iter_cond_resched() could simply fall through, or callers could jump to their "retry" label. The downside of that approach is that tdp_mmu_iter_cond_resched() must be called before anything else in the loop, and there’s no easy way to enforce that requirement.
Ideally, KVM would handle the cond_resched() fully within the iterator macro (the code is actually quite clean) and avoid this entire class of bugs, but that is extremely difficult to do wh
truncated—
Reference
https://git.kernel.org/stable/c/d884eefd75cc54887bc2e9e724207443525dfb2c
https://git.kernel.org/stable/c/3a0f64de479cae75effb630a2e0a237ca0d0623c