CVE-2023-52738 Information
Description
In the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini
Currently, amdgpu calls drm_sched_fini() from the fence driver sw fini routine. That function is expected to be called only after the respective init function, drm_sched_init(), has executed successfully.
We recently hit a driver probe failure on the Steam Deck in which drm_sched_fini() was called even though its counterpart had never been called, causing the following oops:
amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli 338
Hardware name: Valve Jupiter/Jupiter BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[…]
Call Trace:
To prevent that, check whether the drm_sched was properly initialized for a given ring before calling its fini counterpart.
Note that ideally we’d use sched.ready for that; this field is set as the last step of drm_sched_init(). But amdgpu seems to "override" the meaning of this field: in the above oops, for example, it was a GFX ring that caused the crash, and sched.ready had been set to true in the ring init routine regardless of the state of the DRM scheduler. Hence we ended up using sched.ops, as per Christian’s suggestion [0], and also removed the no_scheduler check [1].
[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/
[1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/
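The guard described above can be illustrated with a small user-space sketch. This is not the kernel patch itself: the mock_* types and functions are hypothetical stand-ins for the DRM scheduler and the amdgpu ring teardown, and the only idea taken from the description is that the ops pointer, rather than the ready flag, is used to decide whether fini may run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct mock_sched_ops {
    int unused;
};

struct mock_sched {
    const struct mock_sched_ops *ops; /* set only by a successful init */
    bool ready;                       /* the driver may set this independently */
    int fini_calls;                   /* bookkeeping for the example only */
};

static const struct mock_sched_ops mock_ops = { 0 };

/* Models a successful scheduler init: ops becomes non-NULL. */
static int mock_sched_init(struct mock_sched *sched)
{
    sched->ops = &mock_ops;
    sched->ready = true;
    return 0;
}

/*
 * Models the scheduler fini: it is only valid after a successful init.
 * The assert stands in for the NULL pointer dereference in the oops.
 */
static void mock_sched_fini(struct mock_sched *sched)
{
    assert(sched->ops != NULL);
    sched->fini_calls++;
    sched->ready = false;
}

/*
 * Guarded teardown: skip fini when init never ran for this ring.
 * Checking 'ready' here would be unreliable, because in this model
 * the driver can set it to true without initializing the scheduler.
 * Returns true if fini was actually called.
 */
static bool mock_ring_sw_fini(struct mock_sched *sched)
{
    if (!sched->ops)
        return false;

    mock_sched_fini(sched);
    return true;
}
```

In this model, a ring whose ready flag was set by the driver but whose scheduler was never initialized is skipped during teardown, while a properly initialized ring is torn down exactly once.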
Reference
https://git.kernel.org/stable/c/2e557c8ca2c585bdef591b8503ba83b85f5d0afd
https://git.kernel.org/stable/c/2bcbbef9cace772f5b7128b11401c515982de34b
https://git.kernel.org/stable/c/5ad7bbf3dba5c4a684338df1f285080f2588b535