Message ID | 20200602060441.1719138-1-vicamo.yang@canonical.com |
---|---|
Series | PCI: Avoid FLR for AMD Matisse/Starship HD Audio & USB 3.0 |
The changes are sane with restricted scope and impact:
Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Acked-by: Sultan Alsawaf <sultan.alsawaf@canonical.com>

On Tue, Jun 02, 2020 at 02:04:38PM +0800, You-Sheng Yang wrote:
> BugLink: https://bugs.launchpad.net/bugs/1865988
>
> [Impact]
>
> Devices affected:
>
> * [1022:148c] USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Starship
>   USB 3.0 Host Controller
> * [1022:149c] USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Matisse
>   USB 3.0 Host Controller
> * [1022:1487] Audio device [0403]: Advanced Micro Devices, Inc. [AMD]
>   Starship/Matisse HD Audio Controller
>
> Despite advertising FLReset device capabilities, performing a function level
> reset of any of these devices causes the system to lock up. This is of
> particular issue where these devices appear in their own IOMMU groups and are
> well suited to VFIO passthrough.
>
> Issue was introduced in AMD's "AGESA Combo-AM4 1.0.0.4 Patch B" microcode
> update, and affects dozens of motherboard models across various vendors.
>
> Additional discussion of this issue:
> https://www.reddit.com/r/VFIO/comments/eba5mh/workaround_patch_for_passing_through_usb_and/
>
> [Fix]
>
> Two commits currently landed in linux-pci pci/virtualization:
> * 0d14f06cd665 PCI: Avoid FLR for AMD Matisse HD Audio & USB 3.0
> * 5727043c73fd PCI: Avoid FLR for AMD Starship USB 3.0
>
> [Test Case]
>
> Perform the test on an impacted system:
>
> * B350, B450, X370, X470, X570 motherboards (practically anything with an AM4
>   socket);
> * Ryzen 3000-series CPU (2000-series possibly also affected);
> * BIOS/UEFI firmware that includes "AGESA Combo-AM4 1.0.0.4 Patch B" (check
>   vendor release notes)
>
> In the above case where '0000:10:00.3' is the USB controller '1022:149c', issue
> a reset command:
>
>   $ echo 1 | sudo tee /sys/bus/pci/devices/0000\:10\:00.3/reset
>
> Impacted systems will not return successfully and become unstable, requiring a
> reboot. `/var/log/syslog` will show something resembling the following:
>
>   xhci_hcd 0000:10:00.3: not ready 1023ms after FLR; waiting
>   xhci_hcd 0000:10:00.3: not ready 2047ms after FLR; waiting
>   xhci_hcd 0000:10:00.3: not ready 4095ms after FLR; waiting
>   xhci_hcd 0000:10:00.3: not ready 8191ms after FLR; waiting
>   xhci_hcd 0000:10:00.3: not ready 16383ms after FLR; waiting
>   xhci_hcd 0000:10:00.3: not ready 32767ms after FLR; waiting
>   xhci_hcd 0000:10:00.3: not ready 65535ms after FLR; giving up
>   clocksource: timekeeping watchdog on CPU14: Marking clocksource 'tsc' as unstable because the skew is too large:
>   clocksource: 'hpet' wd_now: f63fcfe wd_last: d468894 mask: ffffffff
>   clocksource: 'tsc' cs_now: 60e67e17758 cs_last: 60d2a81ce24 mask: ffffffffffffffff
>   tsc: Marking TSC unstable due to clocksource watchdog
>   TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
>   sched_clock: Marking unstable (1817664630139, 314261908)<-(1817981099530, -2209419)
>
> [Regression Risk]
> Low. These two patches affect only systems with a device that needs the fix.
>
> [Other Info]
> v2: update origin of cherry-pick line
>
> Kevin Buettner (1):
>   UBUNTU: SAUCE: PCI: Avoid FLR for AMD Starship USB 3.0
>
> Marcos Scriven (1):
>   UBUNTU: SAUCE: PCI: Avoid FLR for AMD Matisse HD Audio & USB 3.0
>
>  drivers/pci/quirks.c | 20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
>
> --
> 2.25.1
>
>
> --
> kernel-team mailing list
> kernel-team@lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
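
For anyone repeating the test case, the two checks it relies on (is one of the three listed controllers present, and did a reset attempt end in the FLR timeout) can be scripted. The following is only a rough sketch: the function names and the optional sysfs-root argument are made up for illustration and are not part of the patches or of any existing tool. It assumes the standard sysfs `vendor` and `device` attributes, which hold values such as `0x1022`.

```shell
#!/bin/sh
# Vendor:device IDs taken from the cover letter's "Devices affected" list.
AFFECTED_IDS="1022:148c 1022:149c 1022:1487"

# Hypothetical helper: print "<PCI address> <vendor:device>" for every
# matching function. The optional argument overrides the sysfs root and
# exists only so the function can be tried against a fake tree.
find_affected_devices() {
    sysfs_root="${1:-/sys}"
    for dev in "$sysfs_root"/bus/pci/devices/*; do
        [ -r "$dev/vendor" ] && [ -r "$dev/device" ] || continue
        # sysfs reports IDs with a "0x" prefix; strip it before comparing.
        vendor=$(cat "$dev/vendor"); vendor=${vendor#0x}
        device=$(cat "$dev/device"); device=${device#0x}
        for id in $AFFECTED_IDS; do
            if [ "$vendor:$device" = "$id" ]; then
                echo "${dev##*/} $id"
            fi
        done
    done
}

# Hypothetical helper: read kernel log text on stdin and succeed if it
# contains the final "giving up" message shown in the test case.
flr_gave_up() {
    grep -Eq 'not ready [0-9]+ms after FLR; giving up'
}
```

On a test machine this might be used as `find_affected_devices` to pick the address for the reset command, and after a reboot as `flr_gave_up < /var/log/syslog && echo "FLR timeout seen"`. Neither helper triggers a reset itself, since on an impacted system that hangs the machine.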
On 2.6.2020 9.04, You-Sheng Yang wrote:
> [...]

Applied to oem-5.6-next, thanks.
Acked-By: AceLan Kao <acelan.kao@canonical.com>
On Tue, Jun 02, 2020 at 02:04:38PM +0800, You-Sheng Yang wrote:
> [...]

Applied to unstable/master, thanks!