
xen-netback: don't populate the hash cache on XenBus disconnect

Message ID 1551363086-29652-1-git-send-email-igor.druzhinin@citrix.com
State Accepted
Delegated to: David Miller

Commit Message

Igor Druzhinin Feb. 28, 2019, 2:11 p.m. UTC
Occasionally, during the disconnection procedure on XenBus which
includes hash cache deinitialization there might be some packets
still in-flight on other processors. Handling of these packets includes
hashing and hash cache population that finally results in hash cache
data structure corruption.

In order to avoid this we prevent hashing of those packets if there
are no queues initialized. In that case RCU protection of queues guards
the hash cache as well.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
---

Found this while applying the previous patch to our patchqueue. Seems it
never went to the mailing list and, to my knowledge, the problem is still
present. From my recollection, it only happened on stress frontend on/off
test with Windows guests (since only those detach the frontend completely).
So better late than never.

---
 drivers/net/xen-netback/hash.c      | 2 ++
 drivers/net/xen-netback/interface.c | 7 +++++++
 2 files changed, 9 insertions(+)
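
The guard described in the commit message can be sketched as a small userspace model. This is only an illustration under stated assumptions: the `model_*` names, the C11 atomic load standing in for `READ_ONCE()`, the `assert()` standing in for `BUG_ON()`, and the modulo mapping from hash to queue are all hypothetical simplifications, not the driver's actual code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace model; these are not the kernel's types. */
struct model_hash_cache {
	unsigned int count;	/* number of cached entries */
};

struct model_vif {
	atomic_uint num_queues;	/* 0 while the frontend is disconnected */
	struct model_hash_cache cache;
};

/* Mirrors the BUG_ON() added to xenvif_init_hash(): the cache must be
 * empty whenever it is (re)initialized. */
static void model_init_hash(struct model_vif *vif)
{
	assert(vif->cache.count == 0);
}

/* Stand-in for hashing a packet and populating the hash cache. */
static uint32_t model_hash_and_cache(struct model_vif *vif, uint32_t flow)
{
	vif->cache.count++;
	return flow * 2654435761u;
}

/* Mirrors the guard added to xenvif_select_queue(): take one snapshot of
 * num_queues and skip hashing entirely when no queues are set up. */
static uint16_t model_select_queue(struct model_vif *vif, uint32_t flow)
{
	unsigned int num_queues =
		atomic_load_explicit(&vif->num_queues, memory_order_relaxed);

	if (num_queues < 1)
		return 0;	/* the packet would be dropped anyway */

	/* Simplified mapping; the real driver does not pick queues this way. */
	return (uint16_t)(model_hash_and_cache(vif, flow) % num_queues);
}
```

With `num_queues` at zero, `model_select_queue()` returns 0 and leaves `cache.count` untouched, which is the property the patch relies on so that a later re-initialization finds an empty cache. The model is single-threaded and does not reproduce the RCU protection mentioned in the commit message.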

Comments

Paul Durrant Feb. 28, 2019, 2:37 p.m. UTC | #1
> -----Original Message-----
> From: Igor Druzhinin [mailto:igor.druzhinin@citrix.com]
> Sent: 28 February 2019 14:11
> To: xen-devel@lists.xenproject.org; netdev@vger.kernel.org; linux-kernel@vger.kernel.org
> Cc: Wei Liu <wei.liu2@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>; davem@davemloft.net; Igor
> Druzhinin <igor.druzhinin@citrix.com>
> Subject: [PATCH] xen-netback: don't populate the hash cache on XenBus disconnect
> 
> Occasionally, during the disconnection procedure on XenBus which
> includes hash cache deinitialization there might be some packets
> still in-flight on other processors. Handling of these packets includes
> hashing and hash cache population that finally results in hash cache
> data structure corruption.
> 
> In order to avoid this we prevent hashing of those packets if there
> are no queues initialized. In that case RCU protection of queues guards
> the hash cache as well.
> 
> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
> 
> Found this while applying the previous patch to our patchqueue. Seems it
> never went to the mailing list and, to my knowledge, the problem is still
> present. From my recollection, it only happened on stress frontend on/off
> test with Windows guests (since only those detach the frontend completely).
> So better late than never.
> 
> ---
>  drivers/net/xen-netback/hash.c      | 2 ++
>  drivers/net/xen-netback/interface.c | 7 +++++++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/drivers/net/xen-netback/hash.c b/drivers/net/xen-netback/hash.c
> index 0ccb021..10d580c 100644
> --- a/drivers/net/xen-netback/hash.c
> +++ b/drivers/net/xen-netback/hash.c
> @@ -454,6 +454,8 @@ void xenvif_init_hash(struct xenvif *vif)
>  	if (xenvif_hash_cache_size == 0)
>  		return;
> 
> +	BUG_ON(vif->hash.cache.count);
> +
>  	spin_lock_init(&vif->hash.cache.lock);
>  	INIT_LIST_HEAD(&vif->hash.cache.list);
>  }
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index 182d677..6da1251 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -153,6 +153,13 @@ static u16 xenvif_select_queue(struct net_device *dev, struct sk_buff *skb,
>  {
>  	struct xenvif *vif = netdev_priv(dev);
>  	unsigned int size = vif->hash.size;
> +	unsigned int num_queues;
> +
> +	/* If queues are not set up internally - always return 0
> +	 * as the packet going to be dropped anyway */
> +	num_queues = READ_ONCE(vif->num_queues);
> +	if (num_queues < 1)
> +		return 0;
> 
>  	if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE)
>  		return fallback(dev, skb, NULL) % dev->real_num_tx_queues;
> --
> 2.7.4

David Miller Feb. 28, 2019, 8:51 p.m. UTC | #2
From: Igor Druzhinin <igor.druzhinin@citrix.com>
Date: Thu, 28 Feb 2019 14:11:26 +0000

> Occasionally, during the disconnection procedure on XenBus which
> includes hash cache deinitialization there might be some packets
> still in-flight on other processors. Handling of these packets includes
> hashing and hash cache population that finally results in hash cache
> data structure corruption.
> 
> In order to avoid this we prevent hashing of those packets if there
> are no queues initialized. In that case RCU protection of queues guards
> the hash cache as well.
> 
> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>

Applied and queued up for -stable, thanks.

Patch

diff --git a/drivers/net/xen-netback/hash.c b/drivers/net/xen-netback/hash.c
index 0ccb021..10d580c 100644
--- a/drivers/net/xen-netback/hash.c
+++ b/drivers/net/xen-netback/hash.c
@@ -454,6 +454,8 @@  void xenvif_init_hash(struct xenvif *vif)
 	if (xenvif_hash_cache_size == 0)
 		return;
 
+	BUG_ON(vif->hash.cache.count);
+
 	spin_lock_init(&vif->hash.cache.lock);
 	INIT_LIST_HEAD(&vif->hash.cache.list);
 }
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 182d677..6da1251 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -153,6 +153,13 @@  static u16 xenvif_select_queue(struct net_device *dev, struct sk_buff *skb,
 {
 	struct xenvif *vif = netdev_priv(dev);
 	unsigned int size = vif->hash.size;
+	unsigned int num_queues;
+
+	/* If queues are not set up internally - always return 0
+	 * as the packet going to be dropped anyway */
+	num_queues = READ_ONCE(vif->num_queues);
+	if (num_queues < 1)
+		return 0;
 
 	if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE)
 		return fallback(dev, skb, NULL) % dev->real_num_tx_queues;