
[SRU,M,0/1] Fix iwlwifi crash on BE200

Message ID 20240327124552.106215-1-aaron.ma@canonical.com

Message

Aaron Ma March 27, 2024, 12:45 p.m. UTC
BugLink: https://bugs.launchpad.net/bugs/2058808

[Impact]
iwlwifi crashed with the following error log:
[  282.045897] Invalid rxb from HW 0
[  282.045941] WARNING: CPU: 3 PID: 784 at drivers/net/wireless/intel/iwlwifi/pcie/rx.c:1489 iwl_pcie_rx_handle+0x3ce/0x640 [iwlwifi]
......
[  282.046175] CPU: 3 PID: 784 Comm: irq/185-iwlwifi Not tainted 6.5.0-1016-oem #17-Ubuntu
[  282.046181] RIP: 0010:iwl_pcie_rx_handle+0x3ce/0x640 [iwlwifi]
[  282.046247] Call Trace:
[  282.046250]  <IRQ>
[  282.046254]  ? show_regs+0x6d/0x80
[  282.046264]  ? __warn+0x89/0x160
[  282.046269]  ? iwl_pcie_rx_handle+0x3ce/0x640 [iwlwifi]
[  282.046308]  ? report_bug+0x17e/0x1b0
[  282.046315]  ? handle_bug+0x46/0x90  
[  282.046319]  ? exc_invalid_op+0x18/0x80
[  282.046323]  ? asm_exc_invalid_op+0x1b/0x20
[  282.046331]  ? iwl_pcie_rx_handle+0x3ce/0x640 [iwlwifi]
[  282.046366]  ? iwl_pcie_rx_handle+0x3ce/0x640 [iwlwifi]
[  282.046400]  ? enqueue_task+0x10/0x1a0
[  282.046405]  iwl_pcie_napi_poll_msix+0x32/0x100 [iwlwifi]
[  282.046440]  __napi_poll+0x30/0x1f0  
[  282.046445]  net_rx_action+0x181/0x2e0
[  282.046449]  ? __irq_wake_thread+0x42/0x50
[  282.046455]  __do_softirq+0xd9/0x349 
[  282.046461]  ? __pfx_irq_thread_fn+0x10/0x10
[  282.046465]  do_softirq.part.0+0x41/0x80
[  282.046471]  </IRQ>
[  282.046472]  <TASK>
[  282.046473]  __local_bh_enable_ip+0x72/0x80
[  282.046479]  iwl_pcie_irq_rx_msix_handler+0xd7/0x1a0 [iwlwifi]
[  282.046515]  irq_thread_fn+0x21/0x70 
[  282.046519]  irq_thread+0xf8/0x1c0
[  282.046549]  ? __pfx_irq_thread_dtor+0x10/0x10
[  282.046554]  ? __pfx_irq_thread+0x10/0x10
[  282.046558]  kthread+0xef/0x120
[  282.046564]  ? __pfx_kthread+0x10/0x10
[  282.046570]  ret_from_fork+0x44/0x70
[  282.046575]  ? __pfx_kthread+0x10/0x10
[  282.046580]  ret_from_fork_asm+0x1b/0x30
[  282.046586]  </TASK>
[  282.046587] ---[ end trace 0000000000000000 ]---
[  282.046976] iwlwifi 0000:09:00.0: Microcode SW error detected. Restarting 0x0.

[Fix]
Commit c1c1039135c3 ("wifi: iwlwifi: increase number of RX buffers for EHT devices"),
which came in through stable updates, increased the number of RX buffers for
the new BE200 wifi card. It needs one more commit so that the RB status /
write pointer of the bigger queue is read correctly.
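
To illustrate why a bigger queue can need a different way of reading the
write pointer, here is a minimal sketch in Python. It is a hypothetical
model, not the driver code: the 12-bit mask, the function names and the
example index are all assumptions made for illustration and are not taken
from this cover letter.

```python
# Hypothetical model of reading a hardware RX write pointer ("RB status").
# The mask width and the example values are illustrative assumptions.

LEGACY_MASK = 0xFFF  # assumed: a reader that keeps only the low 12 bits


def read_rb_status_masked(raw: int) -> int:
    """Always mask to 12 bits; any index above 4095 wraps around."""
    return raw & LEGACY_MASK


def read_rb_status(raw: int, large_queue: bool) -> int:
    """Keep the full 16-bit value for large queues, mask otherwise."""
    if large_queue:
        return raw & 0xFFFF
    return raw & LEGACY_MASK
```

With these assumptions, read_rb_status_masked(5000) returns 904, so the
reader would look at the wrong buffer, while read_rb_status(5000, True)
returns 5000.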

[Test]
Tested on hardware: Intel BE200 works fine after a 20-minute iperf3 stress run.
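
One way to check a stress run for the failure is to scan the kernel log for
the two messages shown in [Impact]. The sketch below is only a suggestion:
the function name and the choice of signatures are assumptions, and it
expects log text with the usual "[ seconds.micros]" timestamp prefix.

```python
import re

# Signatures copied from the [Impact] log in this cover letter.
CRASH_SIGNATURES = (
    "Invalid rxb from HW",
    "Microcode SW error detected",
)

# Matches the "[  282.045897] " prefix of a kernel log line.
TIMESTAMP_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def find_crash_lines(log_text: str) -> list[tuple[float, str]]:
    """Return (timestamp, message) for every line matching a signature."""
    hits = []
    for line in log_text.splitlines():
        match = TIMESTAMP_RE.match(line)
        if not match:
            continue  # skip lines without a timestamp prefix
        timestamp, message = float(match.group(1)), match.group(2)
        if any(sig in message for sig in CRASH_SIGNATURES):
            hits.append((timestamp, message))
    return hits
```

Passing it the saved dmesg output after the iperf3 run should give an empty
list on a fixed kernel.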

[Where problems could occur]
The change touches the iwlwifi PCIe RX path, so a regression could break
the Intel wifi driver.

Commit c1c1039135c3 is in the 6.7 kernel and was backported to the 6.5 kernel.
This SRU targets 6.5 mantic and oem-6.5.

Johannes Berg (1):
  wifi: iwlwifi: pcie: fix RB status reading

 drivers/net/wireless/intel/iwlwifi/pcie/internal.h |  8 ++++----
 drivers/net/wireless/intel/iwlwifi/pcie/rx.c       |  2 +-
 drivers/net/wireless/intel/iwlwifi/pcie/trans.c    | 12 ++++--------
 3 files changed, 9 insertions(+), 13 deletions(-)

Comments

Timo Aaltonen March 27, 2024, 1:46 p.m. UTC | #1
Acked-by: Timo Aaltonen <timo.aaltonen@canonical.com>
Roxana Nicolescu March 27, 2024, 1:56 p.m. UTC | #2
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Stefan Bader March 27, 2024, 2:38 p.m. UTC | #3
Applied to mantic:linux/master-next (potential s-cycle candidate). Thanks.

-Stefan