
[0/5,V2,SRU,M/L] Fix AMDGPU: the screen freeze with W7500

Message ID 20230719150112.1883903-1-koba.ko@canonical.com

Message

Koba Ko July 19, 2023, 3:01 p.m. UTC
BugLink: https://bugs.launchpad.net/bugs/2027957

[impact]
While booting into OOBE, the screen freezes [AMD W7500 only].

[fix]
After ASPM is enabled, AMDGPU switches the PCIe gen/lane settings dynamically.
Intel CPUs may not support this dynamic lane/speed switching.

The solution is to:
- Detect Intel x86 systems that don't support dynamic switching.
- Override the input caps to the maximum supported by that system.
- Force all PCIe levels to use the same settings, rather than trying to configure each level differently.
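
The three steps above can be modelled in a few lines of Python. This is only an illustrative sketch, not the kernel code: `PcieLevel`, `select_pcie_levels`, and the `dynamic_switching_ok` flag are hypothetical names, and the platform detection is assumed to have been done by the caller.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PcieLevel:
    gen: int    # PCIe generation requested for this power level
    lanes: int  # link width requested for this power level


def select_pcie_levels(levels: List[PcieLevel], max_gen: int, max_lanes: int,
                       dynamic_switching_ok: bool) -> List[PcieLevel]:
    """Return the per-level PCIe settings to program (hypothetical model).

    max_gen / max_lanes are the maximum values the platform supports.
    dynamic_switching_ok is assumed to be the result of the platform check.
    """
    if dynamic_switching_ok:
        # Levels may differ; just keep each one within the platform caps.
        return [PcieLevel(min(l.gen, max_gen), min(l.lanes, max_lanes))
                for l in levels]
    # No dynamic switching: every level gets the same, maximum supported setting.
    return [PcieLevel(max_gen, max_lanes) for _ in levels]
```

With `dynamic_switching_ok=False`, every level ends up with identical gen/lane values, so the link never has to change speed or width at runtime.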

[test cases]
1. Boot with a W7500.
2. Confirm that the screen doesn't freeze and that the following error message does not appear in dmesg:
"amdgpu: [drm] *ERROR* [CRTC:72:crtc-0] flip_done timed out"

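The dmesg check in step 2 could be scripted roughly as follows. This is a sketch under assumptions: the helper name `has_flip_done_timeout` is hypothetical, and the pattern accepts any CRTC id, not only the `CRTC:72:crtc-0` seen in the report.

```python
import re

# Matches the error quoted in the test case, for any CRTC number.
FLIP_DONE_ERROR = re.compile(
    r"\*ERROR\* \[CRTC:\d+:crtc-\d+\] flip_done timed out"
)


def has_flip_done_timeout(dmesg_text: str) -> bool:
    """Return True if any line of the given dmesg output contains the error."""
    return any(FLIP_DONE_ERROR.search(line) for line in dmesg_text.splitlines())
```

The function takes the log as a string, so it can be fed saved `dmesg` output from the test machine; the test passes when it returns `False`.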
[where the issue could happen]
Low. This could cause issues if the override sets higher speeds than the platform supports.

[Misc]
1. Jammy: amdgpu isn't loaded on this platform with 5.15-73-generic.
2. Kinetic: amdgpu fails to probe the VGA controller with 5.19-46-generic.
3. Passed cbd build against Mantic and Lunar.
4. Lunar needs a modification to the function sienna_cichlid_update_pcie_parameters,
   so a separate patch is provided for e701156ccc6c (drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters).
~~~
V2: cherry-picked from linux-next

Comments

Tim Gardner July 19, 2023, 3:28 p.m. UTC | #1
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Stefan Bader July 20, 2023, 7:28 a.m. UTC | #2
I was already OK with the first set (implicitly assuming we would update
the SHA1/sob to linux-next), so:

Acked-by: Stefan Bader <stefan.bader@canonical.com>
Andrea Righi July 27, 2023, 6:47 a.m. UTC | #3
Already applied to mantic/linux-unstable via periodic upstream rebase.

-Andrea
Roxana Nicolescu Aug. 2, 2023, 6:49 a.m. UTC | #4
Applied to lunar:master-next. Thanks!

Roxana