| Message ID | 20250513105529.241745-2-michal.kubiak@intel.com |
|---|---|
| State | Accepted |
| Delegated to | Anthony Nguyen |
| Series | Fix XDP loading on machines with many CPUs |
On Tue, May 13, 2025 at 12:55:27PM +0200, Michal Kubiak wrote:
> When the XDP program is loaded, the XDP callback adds new Tx queues.
> This means that the callback must update the Tx scheduler with the new
> queue number. In the event of a Tx scheduler failure, the XDP callback
> should also fail and roll back any changes previously made for XDP
> preparation.
>
> The previous implementation had a bug that not all changes made by the
> XDP callback were rolled back. This caused the crash with the following
> call trace:
>
> [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5
> [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [#1] SMP NOPTI
> [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ #14 PREEMPT(voluntary)
> [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022
> [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice]
>
> [...]
>
> [ +0.002715] Call Trace:
> [ +0.002452] <IRQ>
> [ +0.002021] ? __die_body.cold+0x19/0x29
> [ +0.003922] ? die_addr+0x3c/0x60
> [ +0.003319] ? exc_general_protection+0x17c/0x400
> [ +0.004707] ? asm_exc_general_protection+0x26/0x30
> [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice]
> [ +0.004835] ice_napi_poll+0x665/0x680 [ice]
> [ +0.004320] __napi_poll+0x28/0x190
> [ +0.003500] net_rx_action+0x198/0x360
> [ +0.003752] ? update_rq_clock+0x39/0x220
> [ +0.004013] handle_softirqs+0xf1/0x340
> [ +0.003840] ? sched_clock_cpu+0xf/0x1f0
> [ +0.003925] __irq_exit_rcu+0xc2/0xe0
> [ +0.003665] common_interrupt+0x85/0xa0
> [ +0.003839] </IRQ>
> [ +0.002098] <TASK>
> [ +0.002106] asm_common_interrupt+0x26/0x40
> [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690
>
> Fix this by performing the missing unmapping of XDP queues from
> q_vectors and setting the XDP rings pointer back to NULL after all those
> queues are released.
> Also, add an immediate exit from the XDP callback in case of ring
> preparation failure.
>
> Fixes: efc2214b6047 ("ice: Add support for XDP")
> Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>

Reviewed-by: Simon Horman <horms@kernel.org>
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Michal Kubiak
> Sent: Tuesday, May 13, 2025 4:25 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Fijalkowski, Maciej <maciej.fijalkowski@intel.com>; Lobakin, Aleksander <aleksander.lobakin@intel.com>; Kitszel, Przemyslaw <przemyslaw.kitszel@intel.com>; dawid.osuchowski@linux.intel.com; Keller, Jacob E <jacob.e.keller@intel.com>; Brandeburg, Jesse <jbrandeburg@cloudflare.com>; netdev@vger.kernel.org; Kubiak, Michal <michal.kubiak@intel.com>; Loktionov, Aleksandr <aleksandr.loktionov@intel.com>
> Subject: [Intel-wired-lan] [PATCH iwl-net v3 1/3] ice: fix Tx scheduler error handling in XDP callback
>
> When the XDP program is loaded, the XDP callback adds new Tx queues.
> This means that the callback must update the Tx scheduler with the new
> queue number. In the event of a Tx scheduler failure, the XDP callback should
> also fail and roll back any changes previously made for XDP preparation.
>
> The previous implementation had a bug that not all changes made by the XDP
> callback were rolled back. This caused the crash with the following call trace:
>
> [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5
> [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [#1] SMP NOPTI
> [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ #14 PREEMPT(voluntary)
> [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022
> [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice]
>
> [...]
>
> [ +0.002715] Call Trace:
> [ +0.002452] <IRQ>
> [ +0.002021] ? __die_body.cold+0x19/0x29
> [ +0.003922] ? die_addr+0x3c/0x60
> [ +0.003319] ? exc_general_protection+0x17c/0x400
> [ +0.004707] ? asm_exc_general_protection+0x26/0x30
> [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice]
> [ +0.004835] ice_napi_poll+0x665/0x680 [ice]
> [ +0.004320] __napi_poll+0x28/0x190
> [ +0.003500] net_rx_action+0x198/0x360
> [ +0.003752] ? update_rq_clock+0x39/0x220
> [ +0.004013] handle_softirqs+0xf1/0x340
> [ +0.003840] ? sched_clock_cpu+0xf/0x1f0
> [ +0.003925] __irq_exit_rcu+0xc2/0xe0
> [ +0.003665] common_interrupt+0x85/0xa0
> [ +0.003839] </IRQ>
> [ +0.002098] <TASK>
> [ +0.002106] asm_common_interrupt+0x26/0x40
> [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690
>
> Fix this by performing the missing unmapping of XDP queues from q_vectors
> and setting the XDP rings pointer back to NULL after all those queues are
> released.
> Also, add an immediate exit from the XDP callback in case of ring preparation
> failure.
>
> Fixes: efc2214b6047 ("ice: Add support for XDP")
> Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_main.c | 47 ++++++++++++++++-------
>  1 file changed, 33 insertions(+), 14 deletions(-)
>

Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel)
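The failure mode described in the commit message (and visible in the ice_napi_poll frame of the trace) is a partial-rollback bug: the XDP Tx rings are freed, but the q_vectors and vsi->xdp_rings still point at them, so the next NAPI poll dereferences freed memory. The sketch below is only an illustration of the ordering the fix enforces, not driver code; struct ring, struct vector, and rollback() are made-up stand-ins for ice_tx_ring, the q_vector Tx list, and the unwind path the patch adds.

```c
#include <stdio.h>
#include <stdlib.h>

struct ring { int id; };                  /* stand-in for struct ice_tx_ring   */
struct vector { struct ring *tx_ring; };  /* stand-in for q_vector->tx.tx_ring */

/*
 * Roll back in the order the fix enforces: unmap first (point the vector back
 * at the ring it polled before XDP setup), then free the XDP ring, then clear
 * the owning pointer. Skipping the first or last step leaves a dangling
 * reference for the next poll -- the use-after-free behind the GPF above.
 */
static void rollback(struct vector *v, struct ring *pre_xdp_ring,
		     struct ring **xdp_ring)
{
	v->tx_ring = pre_xdp_ring;  /* analogous to ice_unmap_xdp_rings()     */
	free(*xdp_ring);            /* analogous to freeing vsi->xdp_rings[i] */
	*xdp_ring = NULL;           /* analogous to vsi->xdp_rings = NULL     */
}

int main(void)
{
	struct ring pre_xdp = { .id = 1 };
	struct ring *xdp = calloc(1, sizeof(*xdp));
	struct vector v = { .tx_ring = xdp };

	if (!xdp)
		return 1;

	rollback(&v, &pre_xdp, &xdp);
	printf("vector polls ring %d again, xdp ring pointer: %p\n",
	       v.tx_ring->id, (void *)xdp);
	return 0;
}
```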
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 8119913b5f69..34df104ac567 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -2740,6 +2740,27 @@ void ice_map_xdp_rings(struct ice_vsi *vsi)
 	}
 }
 
+/**
+ * ice_unmap_xdp_rings - Unmap XDP rings from interrupt vectors
+ * @vsi: the VSI with XDP rings being unmapped
+ */
+static void ice_unmap_xdp_rings(struct ice_vsi *vsi)
+{
+	int v_idx;
+
+	ice_for_each_q_vector(vsi, v_idx) {
+		struct ice_q_vector *q_vector = vsi->q_vectors[v_idx];
+		struct ice_tx_ring *ring;
+
+		ice_for_each_tx_ring(ring, q_vector->tx)
+			if (!ring->tx_buf || !ice_ring_is_xdp(ring))
+				break;
+
+		/* restore the value of last node prior to XDP setup */
+		q_vector->tx.tx_ring = ring;
+	}
+}
+
 /**
  * ice_prepare_xdp_rings - Allocate, configure and setup Tx rings for XDP
  * @vsi: VSI to bring up Tx rings used by XDP
@@ -2803,7 +2824,7 @@ int ice_prepare_xdp_rings(struct ice_vsi *vsi, struct bpf_prog *prog,
 	if (status) {
 		dev_err(dev, "Failed VSI LAN queue config for XDP, error: %d\n",
 			status);
-		goto clear_xdp_rings;
+		goto unmap_xdp_rings;
 	}
 
 	/* assign the prog only when it's not already present on VSI;
@@ -2819,6 +2840,8 @@ int ice_prepare_xdp_rings(struct ice_vsi *vsi, struct bpf_prog *prog,
 		ice_vsi_assign_bpf_prog(vsi, prog);
 
 	return 0;
+unmap_xdp_rings:
+	ice_unmap_xdp_rings(vsi);
 clear_xdp_rings:
 	ice_for_each_xdp_txq(vsi, i)
 		if (vsi->xdp_rings[i]) {
@@ -2835,6 +2858,8 @@ int ice_prepare_xdp_rings(struct ice_vsi *vsi, struct bpf_prog *prog,
 	mutex_unlock(&pf->avail_q_mutex);
 
 	devm_kfree(dev, vsi->xdp_rings);
+	vsi->xdp_rings = NULL;
+
 	return -ENOMEM;
 }
 
@@ -2850,7 +2875,7 @@ int ice_destroy_xdp_rings(struct ice_vsi *vsi, enum ice_xdp_cfg cfg_type)
 {
 	u16 max_txqs[ICE_MAX_TRAFFIC_CLASS] = { 0 };
 	struct ice_pf *pf = vsi->back;
-	int i, v_idx;
+	int i;
 
 	/* q_vectors are freed in reset path so there's no point in detaching
 	 * rings
@@ -2858,17 +2883,7 @@ int ice_destroy_xdp_rings(struct ice_vsi *vsi, enum ice_xdp_cfg cfg_type)
 	if (cfg_type == ICE_XDP_CFG_PART)
 		goto free_qmap;
 
-	ice_for_each_q_vector(vsi, v_idx) {
-		struct ice_q_vector *q_vector = vsi->q_vectors[v_idx];
-		struct ice_tx_ring *ring;
-
-		ice_for_each_tx_ring(ring, q_vector->tx)
-			if (!ring->tx_buf || !ice_ring_is_xdp(ring))
-				break;
-
-		/* restore the value of last node prior to XDP setup */
-		q_vector->tx.tx_ring = ring;
-	}
+	ice_unmap_xdp_rings(vsi);
 
 free_qmap:
 	mutex_lock(&pf->avail_q_mutex);
@@ -3013,11 +3028,14 @@ ice_xdp_setup_prog(struct ice_vsi *vsi, struct bpf_prog *prog,
 		xdp_ring_err = ice_vsi_determine_xdp_res(vsi);
 		if (xdp_ring_err) {
 			NL_SET_ERR_MSG_MOD(extack, "Not enough Tx resources for XDP");
+			goto resume_if;
 		} else {
 			xdp_ring_err = ice_prepare_xdp_rings(vsi, prog,
 							     ICE_XDP_CFG_FULL);
-			if (xdp_ring_err)
+			if (xdp_ring_err) {
 				NL_SET_ERR_MSG_MOD(extack, "Setting up XDP Tx resources failed");
+				goto resume_if;
+			}
 		}
 		xdp_features_set_redirect_target(vsi->netdev, true);
 		/* reallocate Rx queues that are used for zero-copy */
@@ -3035,6 +3053,7 @@ ice_xdp_setup_prog(struct ice_vsi *vsi, struct bpf_prog *prog,
 			NL_SET_ERR_MSG_MOD(extack, "Freeing XDP Rx resources failed");
 	}
 
+resume_if:
 	if (if_running)
 		ret = ice_up(vsi);
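For context on the shape of the fix: the patch keeps to the usual kernel goto-unwind idiom, where each setup step gets a cleanup label and a failure jumps to the label that undoes exactly the steps completed so far, in reverse order. The self-contained sketch below uses made-up names (setup(), unmap_rings, free_rings), not the driver's, purely to show why adding the new unmap label ahead of the existing free label, and clearing the caller-visible pointer before returning, completes the unwind.

```c
#include <stdio.h>
#include <stdlib.h>

struct res { int mapped; };

/*
 * Illustrative setup with two failure points. Each goto target unwinds the
 * steps completed so far, in reverse order of setup, and the caller-visible
 * pointer is cleared before returning the error.
 */
static int setup(struct res **rings_out, int map_ok, int sched_ok)
{
	struct res *rings;

	rings = calloc(1, sizeof(*rings));   /* step 1: allocate rings        */
	if (!rings)
		return -1;

	if (!map_ok)                         /* step 2: map rings to vectors  */
		goto free_rings;
	rings->mapped = 1;

	if (!sched_ok)                       /* step 3: Tx scheduler config   */
		goto unmap_rings;

	*rings_out = rings;
	return 0;

unmap_rings:
	rings->mapped = 0;   /* undo step 2 -- the part the old error path skipped */
free_rings:
	free(rings);         /* undo step 1 */
	*rings_out = NULL;   /* leave no stale pointer behind for the caller */
	return -1;
}

int main(void)
{
	struct res *rings = NULL;

	if (setup(&rings, 1, 0))
		printf("setup failed, rings=%p (safe for the caller to retry)\n",
		       (void *)rings);
	return 0;
}
```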