[iwl-net,v2,5/6] ice: remove ICE_CFG_BUSY locking from AF_XDP code

Message ID 20240724164840.2536605-6-larysa.zaremba@intel.com
State Changes Requested
Delegated to: Anthony Nguyen
Series ice: fix synchronization between .ndo_bpf() and reset

Commit Message

Larysa Zaremba July 24, 2024, 4:48 p.m. UTC
Locking used in ice_qp_ena() and ice_qp_dis() does pretty much nothing,
because ICE_CFG_BUSY is a state flag that is supposed to be set in a PF
state, not VSI one. Therefore it does not protect the queue pair from
e.g. reset.

Despite being useless, it can still deadlock the unfortunate functions that
have fallen into the same ICE_CFG_BUSY-VSI trap. This happens if ice_qp_ena()
returns an error.

Remove ICE_CFG_BUSY locking from ice_qp_dis() and ice_qp_ena().

Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_xsk.c | 9 ---------
 1 file changed, 9 deletions(-)
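
For readers who want to see the failure mode in isolation, here is a minimal
userspace sketch, not driver code. It assumes a C11 atomic_flag as a stand-in
for the ICE_CFG_BUSY bit in vsi->state; the demo_* names, the -EIO error value
and the bounded waiter are hypothetical and exist only for illustration. The
"disable" side takes the bit, and the "enable" side clears it only on its
success path, which matches where clear_bit() sits in the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the ICE_CFG_BUSY bit kept in vsi->state. */
struct demo_vsi {
	atomic_flag cfg_busy;
};

/* Models ice_qp_dis(): take the bit, giving up after 50 attempts. */
static int demo_qp_dis(struct demo_vsi *vsi)
{
	int timeout = 50;

	while (atomic_flag_test_and_set(&vsi->cfg_busy)) {
		timeout--;
		if (!timeout)
			return -EBUSY;
		/* the driver sleeps here: usleep_range(1000, 2000) */
	}
	return 0;
}

/* Models ice_qp_ena(): the bit is cleared only when no error occurred. */
static int demo_qp_ena(struct demo_vsi *vsi, bool fail)
{
	if (fail)
		return -EIO;	/* early return, the bit stays set */

	atomic_flag_clear(&vsi->cfg_busy);
	return 0;
}

/*
 * Models a waiter that loops on the same bit. The loop is bounded here
 * only so that the sketch terminates; a loop without a timeout would
 * spin forever once the bit has been leaked.
 */
static bool demo_waiter_can_acquire(struct demo_vsi *vsi, int max_spins)
{
	while (max_spins--) {
		if (!atomic_flag_test_and_set(&vsi->cfg_busy))
			return true;
	}
	return false;
}
```

After a failed demo_qp_ena(), every later attempt to take the bit fails: the
timed loop returns -EBUSY and an untimed loop never gets out.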

Comments

Jacob Keller July 24, 2024, 6:37 p.m. UTC | #1
On 7/24/2024 9:48 AM, Larysa Zaremba wrote:
> Locking used in ice_qp_ena() and ice_qp_dis() does pretty much nothing,
> because ICE_CFG_BUSY is a state flag that is supposed to be set in a PF
> state, not VSI one. Therefore it does not protect the queue pair from
> e.g. reset.
> 

Yea, unfortunately a lot of places accidentally use the wrong flags. I
wonder if this is something sparse could help with identifying by having
the flags tagged in some way...

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

> Despite being useless, it still can deadlock the unfortunate functions that
> have fell into the same ICE_CFG_BUSY-VSI trap. This happens if ice_qp_ena
> returns an error.
> 

This wording makes it sound like other functions have this issue. Is it
only these two left?

Seems like there are a few other places which check this:

> ice_xsk.c
> 176:    while (test_and_set_bit(ICE_CFG_BUSY, vsi->state)) {
> 253:    clear_bit(ICE_CFG_BUSY, vsi->state);
> 

These two are fixed by your patch.

> ice_main.c
> 334:    while (test_and_set_bit(ICE_CFG_BUSY, vsi->state))
> 475:    clear_bit(ICE_CFG_BUSY, vsi->state);

These two appear to be ice_vsi_sync_fltr.

> 3791:   while (test_and_set_bit(ICE_CFG_BUSY, vsi->state))
> 3828:   clear_bit(ICE_CFG_BUSY, vsi->state);

These two appear to be ice_vlan_rx_add_vid.

> 3854:   while (test_and_set_bit(ICE_CFG_BUSY, vsi->state))
> 3897:   clear_bit(ICE_CFG_BUSY, vsi->state);

These two appear to be ice_vlan_rx_kill_vid.

> ice.h
> 299:    ICE_CFG_BUSY,
>

This is part of the ice_pf_state enumeration. So yes, we really
shouldn't be checking it in the vsi->state. In the strictest sense this
could be leading to an out-of-bounds read or set, but we happen to luck
into working because DECLARE_BITMAP uses longs, so there is junk data
after the end of the actual state bit size. The bit functions don't get
passed the size, so they can't have annotations which would catch this.

Obviously not your fault, and it doesn't need to be fixed in this series,
but it's at least a semantic bug, if not actually triggerable by
anything. It looks like the VLAN functions *are* using this flag
intentionally, if incorrectly. It's unclear to me offhand what the correct
fix is. Perhaps just creating a VSI-specific flag for VLANs... or
perhaps replacing the flag with a regular synchronization primitive....

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
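
The "luck into working" point above can be illustrated with a small userspace
sketch. Everything in it is an assumption for illustration: the demo_* enums
and their values are hypothetical and do not mirror the real ice.h layout, and
the two macros only imitate how a bitmap declaration rounds its storage up to
whole longs.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Userspace imitations of the kernel's bitmap sizing helpers. */
#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITS_TO_LONGS(n) \
	(((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* Hypothetical VSI-level flags: only two bits are actually declared. */
enum demo_vsi_state {
	DEMO_VSI_DOWN,
	DEMO_VSI_NEEDS_RESTART,
	DEMO_VSI_STATE_NBITS,
};

/* Hypothetical PF-level flags: the busy flag has a much larger index. */
enum demo_pf_state {
	DEMO_PF_TESTING,
	DEMO_CFG_BUSY = 20,
	DEMO_PF_STATE_NBITS,
};

/* Storage is rounded up to whole longs, as with a bitmap declaration. */
struct demo_vsi {
	unsigned long state[DEMO_BITS_TO_LONGS(DEMO_VSI_STATE_NBITS)];
};

/* True if the bit index is one of the flags the bitmap was sized for. */
static int demo_bit_is_declared(size_t bit, size_t nbits)
{
	return bit < nbits;
}

/* True if the bit index still lands inside the allocated longs. */
static int demo_bit_fits_in_storage(size_t bit, size_t nlongs)
{
	return bit < nlongs * DEMO_BITS_PER_LONG;
}
```

With these made-up values, DEMO_CFG_BUSY is not a declared VSI flag, yet it
still falls inside the single long backing the VSI bitmap. Setting it
therefore touches padding bits rather than neighbouring memory, which is why
nothing visibly breaks.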

> Remove ICE_CFG_BUSY locking from ice_qp_dis() and ice_qp_ena().
> 
> Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_xsk.c | 9 ---------
>  1 file changed, 9 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> index 5dd50a2866cc..d23fd4ea9129 100644
> --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> @@ -163,7 +163,6 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
>  	struct ice_q_vector *q_vector;
>  	struct ice_tx_ring *tx_ring;
>  	struct ice_rx_ring *rx_ring;
> -	int timeout = 50;
>  	int err;
>  
>  	if (q_idx >= vsi->num_rxq || q_idx >= vsi->num_txq)
> @@ -173,13 +172,6 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
>  	rx_ring = vsi->rx_rings[q_idx];
>  	q_vector = rx_ring->q_vector;
>  
> -	while (test_and_set_bit(ICE_CFG_BUSY, vsi->state)) {
> -		timeout--;
> -		if (!timeout)
> -			return -EBUSY;
> -		usleep_range(1000, 2000);
> -	}
> -
>  	ice_qvec_dis_irq(vsi, rx_ring, q_vector);
>  	ice_qvec_toggle_napi(vsi, q_vector, false);
>  
> @@ -250,7 +242,6 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
>  	ice_qvec_ena_irq(vsi, q_vector);
>  
>  	netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
> -	clear_bit(ICE_CFG_BUSY, vsi->state);
>  
>  	return 0;
>  }
Rout, ChandanX Aug. 8, 2024, 2:17 a.m. UTC | #2
>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
>Zaremba, Larysa
>Sent: Wednesday, July 24, 2024 10:19 PM
>To: intel-wired-lan@lists.osuosl.org
>Cc: Drewek, Wojciech <wojciech.drewek@intel.com>; Fijalkowski, Maciej
><maciej.fijalkowski@intel.com>; Jesper Dangaard Brouer <hawk@kernel.org>;
>Daniel Borkmann <daniel@iogearbox.net>; Zaremba, Larysa
><larysa.zaremba@intel.com>; netdev@vger.kernel.org; John Fastabend
><john.fastabend@gmail.com>; Alexei Starovoitov <ast@kernel.org>; linux-
>kernel@vger.kernel.org; Eric Dumazet <edumazet@google.com>; Kubiak,
>Michal <michal.kubiak@intel.com>; Nguyen, Anthony L
><anthony.l.nguyen@intel.com>; Nambiar, Amritha
><amritha.nambiar@intel.com>; Keller, Jacob E <jacob.e.keller@intel.com>;
>Jakub Kicinski <kuba@kernel.org>; bpf@vger.kernel.org; Paolo Abeni
><pabeni@redhat.com>; David S. Miller <davem@davemloft.net>; Karlsson,
>Magnus <magnus.karlsson@intel.com>
>Subject: [Intel-wired-lan] [PATCH iwl-net v2 5/6] ice: remove ICE_CFG_BUSY
>locking from AF_XDP code
>
>Locking used in ice_qp_ena() and ice_qp_dis() does pretty much nothing,
>because ICE_CFG_BUSY is a state flag that is supposed to be set in a PF state,
>not VSI one. Therefore it does not protect the queue pair from e.g. reset.
>
>Despite being useless, it still can deadlock the unfortunate functions that have
>fell into the same ICE_CFG_BUSY-VSI trap. This happens if ice_qp_ena returns
>an error.
>
>Remove ICE_CFG_BUSY locking from ice_qp_dis() and ice_qp_ena().
>
>Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
>Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
>Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
>---
> drivers/net/ethernet/intel/ice/ice_xsk.c | 9 ---------
> 1 file changed, 9 deletions(-)
>

Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Maciej Fijalkowski Aug. 12, 2024, 1:03 p.m. UTC | #3
On Wed, Jul 24, 2024 at 06:48:36PM +0200, Larysa Zaremba wrote:
> Locking used in ice_qp_ena() and ice_qp_dis() does pretty much nothing,
> because ICE_CFG_BUSY is a state flag that is supposed to be set in a PF
> state, not VSI one. Therefore it does not protect the queue pair from
> e.g. reset.
> 
> Despite being useless, it still can deadlock the unfortunate functions that
> have fell into the same ICE_CFG_BUSY-VSI trap. This happens if ice_qp_ena
> returns an error.
> 
> Remove ICE_CFG_BUSY locking from ice_qp_dis() and ice_qp_ena().

Why not just check the pf->state ? And address other broken callsites?

> 
> Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_xsk.c | 9 ---------
>  1 file changed, 9 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> index 5dd50a2866cc..d23fd4ea9129 100644
> --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> @@ -163,7 +163,6 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
>  	struct ice_q_vector *q_vector;
>  	struct ice_tx_ring *tx_ring;
>  	struct ice_rx_ring *rx_ring;
> -	int timeout = 50;
>  	int err;
>  
>  	if (q_idx >= vsi->num_rxq || q_idx >= vsi->num_txq)
> @@ -173,13 +172,6 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
>  	rx_ring = vsi->rx_rings[q_idx];
>  	q_vector = rx_ring->q_vector;
>  
> -	while (test_and_set_bit(ICE_CFG_BUSY, vsi->state)) {
> -		timeout--;
> -		if (!timeout)
> -			return -EBUSY;
> -		usleep_range(1000, 2000);
> -	}
> -
>  	ice_qvec_dis_irq(vsi, rx_ring, q_vector);
>  	ice_qvec_toggle_napi(vsi, q_vector, false);
>  
> @@ -250,7 +242,6 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
>  	ice_qvec_ena_irq(vsi, q_vector);
>  
>  	netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
> -	clear_bit(ICE_CFG_BUSY, vsi->state);
>  
>  	return 0;
>  }
> -- 
> 2.43.0
>
Larysa Zaremba Aug. 12, 2024, 3:59 p.m. UTC | #4
On Mon, Aug 12, 2024 at 03:03:19PM +0200, Maciej Fijalkowski wrote:
> On Wed, Jul 24, 2024 at 06:48:36PM +0200, Larysa Zaremba wrote:
> > Locking used in ice_qp_ena() and ice_qp_dis() does pretty much nothing,
> > because ICE_CFG_BUSY is a state flag that is supposed to be set in a PF
> > state, not VSI one. Therefore it does not protect the queue pair from
> > e.g. reset.
> > 
> > Despite being useless, it still can deadlock the unfortunate functions that
> > have fell into the same ICE_CFG_BUSY-VSI trap. This happens if ice_qp_ena
> > returns an error.
> > 
> > Remove ICE_CFG_BUSY locking from ice_qp_dis() and ice_qp_ena().
> 
> Why not just check the pf->state ?

I would just cite Jakub: "you lose lockdep and all other infra normal mutex 
would give you." [0]

[0] https://lore.kernel.org/netdev/20240612140935.54981c49@kernel.org/

> And address other broken callsites?

Because the current state of synchronization does not allow me to assume this
would fix anything, and testing all the places would be out of scope for this
series.

With Dawid's patch [1], a mutex for XDP and miscellaneous changes from this
series, I think we would probably come pretty close to being able to get rid of
ICE_CFG_BUSY, at least when locking software resources.

[1] 
https://lore.kernel.org/netdev/20240812125009.62635-1-dawid.osuchowski@linux.intel.com/

> > 
> > Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
> > Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > ---
> >  drivers/net/ethernet/intel/ice/ice_xsk.c | 9 ---------
> >  1 file changed, 9 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > index 5dd50a2866cc..d23fd4ea9129 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > @@ -163,7 +163,6 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
> >  	struct ice_q_vector *q_vector;
> >  	struct ice_tx_ring *tx_ring;
> >  	struct ice_rx_ring *rx_ring;
> > -	int timeout = 50;
> >  	int err;
> >  
> >  	if (q_idx >= vsi->num_rxq || q_idx >= vsi->num_txq)
> > @@ -173,13 +172,6 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
> >  	rx_ring = vsi->rx_rings[q_idx];
> >  	q_vector = rx_ring->q_vector;
> >  
> > -	while (test_and_set_bit(ICE_CFG_BUSY, vsi->state)) {
> > -		timeout--;
> > -		if (!timeout)
> > -			return -EBUSY;
> > -		usleep_range(1000, 2000);
> > -	}
> > -
> >  	ice_qvec_dis_irq(vsi, rx_ring, q_vector);
> >  	ice_qvec_toggle_napi(vsi, q_vector, false);
> >  
> > @@ -250,7 +242,6 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
> >  	ice_qvec_ena_irq(vsi, q_vector);
> >  
> >  	netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
> > -	clear_bit(ICE_CFG_BUSY, vsi->state);
> >  
> >  	return 0;
> >  }
> > -- 
> > 2.43.0
> >
Maciej Fijalkowski Aug. 13, 2024, 10:28 a.m. UTC | #5
On Mon, Aug 12, 2024 at 05:59:21PM +0200, Larysa Zaremba wrote:
> On Mon, Aug 12, 2024 at 03:03:19PM +0200, Maciej Fijalkowski wrote:
> > On Wed, Jul 24, 2024 at 06:48:36PM +0200, Larysa Zaremba wrote:
> > > Locking used in ice_qp_ena() and ice_qp_dis() does pretty much nothing,
> > > because ICE_CFG_BUSY is a state flag that is supposed to be set in a PF
> > > state, not VSI one. Therefore it does not protect the queue pair from
> > > e.g. reset.
> > > 
> > > Despite being useless, it still can deadlock the unfortunate functions that
> > > have fell into the same ICE_CFG_BUSY-VSI trap. This happens if ice_qp_ena
> > > returns an error.
> > > 
> > > Remove ICE_CFG_BUSY locking from ice_qp_dis() and ice_qp_ena().
> > 
> > Why not just check the pf->state ?
> 
> I would just cite Jakub: "you lose lockdep and all other infra normal mutex 
> would give you." [0]

I was not sure why you're bringing up mutex here but I missed 2nd patch
somehow :) let me start from the beginning.

> 
> [0] https://lore.kernel.org/netdev/20240612140935.54981c49@kernel.org/
> 
> > And address other broken callsites?
> 
> Because the current state of sychronization does not allow me to assume this 
> would fix anything and testing all the places would be out of scope for theese 
> series.
> 
> With Dawid's patch [1], a mutex for XDP and miscellaneous changes from these 
> series I think we would probably come pretty close being able to get rid of 
> ICE_CFG_BUSY at least when locking software resources.
> 
> [1] 
> https://lore.kernel.org/netdev/20240812125009.62635-1-dawid.osuchowski@linux.intel.com/
> 
> > > 
> > > Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
> > > Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
> > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > ---
> > >  drivers/net/ethernet/intel/ice/ice_xsk.c | 9 ---------
> > >  1 file changed, 9 deletions(-)
> > > 
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > > index 5dd50a2866cc..d23fd4ea9129 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> > > +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > > @@ -163,7 +163,6 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
> > >  	struct ice_q_vector *q_vector;
> > >  	struct ice_tx_ring *tx_ring;
> > >  	struct ice_rx_ring *rx_ring;
> > > -	int timeout = 50;
> > >  	int err;
> > >  
> > >  	if (q_idx >= vsi->num_rxq || q_idx >= vsi->num_txq)
> > > @@ -173,13 +172,6 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
> > >  	rx_ring = vsi->rx_rings[q_idx];
> > >  	q_vector = rx_ring->q_vector;
> > >  
> > > -	while (test_and_set_bit(ICE_CFG_BUSY, vsi->state)) {
> > > -		timeout--;
> > > -		if (!timeout)
> > > -			return -EBUSY;
> > > -		usleep_range(1000, 2000);
> > > -	}
> > > -
> > >  	ice_qvec_dis_irq(vsi, rx_ring, q_vector);
> > >  	ice_qvec_toggle_napi(vsi, q_vector, false);
> > >  
> > > @@ -250,7 +242,6 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
> > >  	ice_qvec_ena_irq(vsi, q_vector);
> > >  
> > >  	netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
> > > -	clear_bit(ICE_CFG_BUSY, vsi->state);
> > >  
> > >  	return 0;
> > >  }
> > > -- 
> > > 2.43.0
> > >
Patch

diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 5dd50a2866cc..d23fd4ea9129 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -163,7 +163,6 @@  static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
 	struct ice_q_vector *q_vector;
 	struct ice_tx_ring *tx_ring;
 	struct ice_rx_ring *rx_ring;
-	int timeout = 50;
 	int err;
 
 	if (q_idx >= vsi->num_rxq || q_idx >= vsi->num_txq)
@@ -173,13 +172,6 @@  static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
 	rx_ring = vsi->rx_rings[q_idx];
 	q_vector = rx_ring->q_vector;
 
-	while (test_and_set_bit(ICE_CFG_BUSY, vsi->state)) {
-		timeout--;
-		if (!timeout)
-			return -EBUSY;
-		usleep_range(1000, 2000);
-	}
-
 	ice_qvec_dis_irq(vsi, rx_ring, q_vector);
 	ice_qvec_toggle_napi(vsi, q_vector, false);
 
@@ -250,7 +242,6 @@  static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
 	ice_qvec_ena_irq(vsi, q_vector);
 
 	netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
-	clear_bit(ICE_CFG_BUSY, vsi->state);
 
 	return 0;
 }