
[net-next,v2,8/9] xen-netback: Timeout packets in RX path

Message ID 1386892097-15502-9-git-send-email-zoltan.kiss@citrix.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Zoltan Kiss Dec. 12, 2013, 11:48 p.m. UTC
A malicious or buggy guest can leave its queue filled indefinitely, in which
case qdisc starts to queue packets for that VIF. If those packets came from
another guest, they can block that guest's slots and prevent it from shutting
down. To avoid that, we make sure the queue is drained every 10 seconds.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
---
 drivers/net/xen-netback/common.h    |    5 +++++
 drivers/net/xen-netback/interface.c |   21 ++++++++++++++++++++-
 drivers/net/xen-netback/netback.c   |   10 ++++++++++
 3 files changed, 35 insertions(+), 1 deletion(-)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Wei Liu Dec. 13, 2013, 3:44 p.m. UTC | #1
On Thu, Dec 12, 2013 at 11:48:16PM +0000, Zoltan Kiss wrote:
> A malicious or buggy guest can leave its queue filled indefinitely, in which
> case qdisc starts to queue packets for that VIF. If those packets came from
> another guest, they can block that guest's slots and prevent it from shutting
> down. To avoid that, we make sure the queue is drained every 10 seconds.
> 

Oh I see where the 10 second constraint in the previous patch comes from.

Could you define a macro for this constant and then use it everywhere?

> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
> ---
[...]
> +static void xenvif_wake_queue(unsigned long data)
> +{
> +	struct xenvif *vif = (struct xenvif *)data;
> +
> +	netdev_err(vif->dev, "timer fires\n");

What timer? This error message needs to be more specific.

> +	if (netif_queue_stopped(vif->dev)) {
> +		netdev_err(vif->dev, "draining TX queue\n");
> +		netif_wake_queue(vif->dev);
> +	}
> +}
> +
>  static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  {
>  	struct xenvif *vif = netdev_priv(dev);
> @@ -141,8 +152,13 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  	 * then turn off the queue to give the ring a chance to
>  	 * drain.
>  	 */
> -	if (!xenvif_rx_ring_slots_available(vif, min_slots_needed))
> +	if (!xenvif_rx_ring_slots_available(vif, min_slots_needed)) {
> +		vif->wake_queue.function = xenvif_wake_queue;
> +		vif->wake_queue.data = (unsigned long)vif;
>  		xenvif_stop_queue(vif);
> +		mod_timer(&vif->wake_queue,
> +			jiffies + rx_drain_timeout_jiffies);
> +	}
>  

Do you need to use jiffies_64 instead of jiffies?

This timer is only armed when ring is full. So what happens when the
ring is not full and some other parts of the system holds on to the
packets forever? Can this happen?

Wei.
Zoltan Kiss Dec. 16, 2013, 5:16 p.m. UTC | #2
On 13/12/13 15:44, Wei Liu wrote:
> On Thu, Dec 12, 2013 at 11:48:16PM +0000, Zoltan Kiss wrote:
>> A malicious or buggy guest can leave its queue filled indefinitely, in which
>> case qdisc starts to queue packets for that VIF. If those packets came from
>> another guest, they can block that guest's slots and prevent it from shutting
>> down. To avoid that, we make sure the queue is drained every 10 seconds.
>>
>
> Oh I see where the 10 second constraint in the previous patch comes from.
>
> Could you define a macro for this constant and then use it everywhere?
Well, they are not entirely the same thing, but it is worth making them the 
same. How about using "unmap_timeout > (rx_drain_timeout_msecs/1000)" in 
xenvif_free()? Then netback won't complain about a stuck page if another 
guest is permitted to hold on to it.

>
>> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
>> ---
> [...]
>> +static void xenvif_wake_queue(unsigned long data)
>> +{
>> +	struct xenvif *vif = (struct xenvif *)data;
>> +
>> +	netdev_err(vif->dev, "timer fires\n");
>
> What timer? This error message needs to be more specific.
I forgot to remove this; I used it only for debugging. The other message 
two lines below is the important one.

>
>> +	if (netif_queue_stopped(vif->dev)) {
>> +		netdev_err(vif->dev, "draining TX queue\n");
>> +		netif_wake_queue(vif->dev);
>> +	}
>> +}
>> +
>>   static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev)
>>   {
>>   	struct xenvif *vif = netdev_priv(dev);
>> @@ -141,8 +152,13 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev)
>>   	 * then turn off the queue to give the ring a chance to
>>   	 * drain.
>>   	 */
>> -	if (!xenvif_rx_ring_slots_available(vif, min_slots_needed))
>> +	if (!xenvif_rx_ring_slots_available(vif, min_slots_needed)) {
>> +		vif->wake_queue.function = xenvif_wake_queue;
>> +		vif->wake_queue.data = (unsigned long)vif;
>>   		xenvif_stop_queue(vif);
>> +		mod_timer(&vif->wake_queue,
>> +			jiffies + rx_drain_timeout_jiffies);
>> +	}
>>
>
> Do you need to use jiffies_64 instead of jiffies?
Well, we don't use time_after_eq here, just set the timer. AFAIK that 
should be OK.

> This timer is only armed when ring is full. So what happens when the
> ring is not full and some other parts of the system holds on to the
> packets forever? Can this happen?
This timer is not to protect the receiving guest, but to protect the 
sender. If the ring is not full, netback puts the packet there and 
releases the skb back.
This patch replaces the delayed copy mechanism from classic kernel 
times. There we handled this problem on the sender side: after a timer 
expired we made a local copy of the packet and released the pages back. 
That gave stronger guarantees that a guest always gets its pages back, 
but it also caused more unnecessary copies when the system was already 
loaded and we should really just drop the packet. Unfortunately we 
can't do that here, as the sender is no longer in control.
Instead I chose this more lightweight solution, because in practice 
another guest's queue is the only place where a packet can get stuck, 
especially if that guest is malicious, buggy, or too slow.
Other parts of the system (e.g. a driver) can also hold on to a packet 
if they are buggy, but then we should fix that bug rather than feed it 
more guest pages.

Zoli
Wei Liu Dec. 16, 2013, 7:03 p.m. UTC | #3
On Mon, Dec 16, 2013 at 05:16:17PM +0000, Zoltan Kiss wrote:
> On 13/12/13 15:44, Wei Liu wrote:
> >On Thu, Dec 12, 2013 at 11:48:16PM +0000, Zoltan Kiss wrote:
> >>A malicious or buggy guest can leave its queue filled indefinitely, in which
> >>case qdisc starts to queue packets for that VIF. If those packets came from
> >>another guest, they can block that guest's slots and prevent it from shutting
> >>down. To avoid that, we make sure the queue is drained every 10 seconds.
> >>
> >
> >Oh I see where the 10 second constraint in the previous patch comes from.
> >
> >Could you define a macro for this constant and then use it everywhere?
> Well, they are not entirely the same thing, but worth making them
> the same. How about using "unmap_timeout >
> (rx_drain_timeout_msecs/1000)" in xenvif_free()? Then netback won't
> complain about a stucked page if an another guest is permitted to
> hold on to it.
> 

Thanks for the clarification. I see the difference. If they are not the same
by definition, then we need to think more about making them the same in
practice.

If we use "unmap_timeout > (rx_drain_timeout_msecs/1000)" then we
basically assume that the guest RX path is the part of the system most
likely to hold a packet for the longest time.

Wei.

Patch

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index e022812..a834818 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -128,6 +128,8 @@  struct xenvif {
 	 */
 	bool rx_event;
 
+	struct timer_list wake_queue;
+
 	/* Given MAX_BUFFER_OFFSET of 4096 the worst case is that each
 	 * head/fragment page uses 2 copy operations because it
 	 * straddles two buffers in the frontend.
@@ -223,4 +225,7 @@  void xenvif_idx_unmap(struct xenvif *vif, u16 pending_idx);
 
 extern bool separate_tx_rx_irq;
 
+extern unsigned int rx_drain_timeout_msecs;
+extern unsigned int rx_drain_timeout_jiffies;
+
 #endif /* __XEN_NETBACK__COMMON_H__ */
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 7aa3535..eaf406f 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -114,6 +114,17 @@  static irqreturn_t xenvif_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
+static void xenvif_wake_queue(unsigned long data)
+{
+	struct xenvif *vif = (struct xenvif *)data;
+
+	netdev_err(vif->dev, "timer fires\n");
+	if (netif_queue_stopped(vif->dev)) {
+		netdev_err(vif->dev, "draining TX queue\n");
+		netif_wake_queue(vif->dev);
+	}
+}
+
 static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct xenvif *vif = netdev_priv(dev);
@@ -141,8 +152,13 @@  static int xenvif_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	 * then turn off the queue to give the ring a chance to
 	 * drain.
 	 */
-	if (!xenvif_rx_ring_slots_available(vif, min_slots_needed))
+	if (!xenvif_rx_ring_slots_available(vif, min_slots_needed)) {
+		vif->wake_queue.function = xenvif_wake_queue;
+		vif->wake_queue.data = (unsigned long)vif;
 		xenvif_stop_queue(vif);
+		mod_timer(&vif->wake_queue,
+			jiffies + rx_drain_timeout_jiffies);
+	}
 
 	skb_queue_tail(&vif->rx_queue, skb);
 	xenvif_kick_thread(vif);
@@ -341,6 +357,8 @@  struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 	init_timer(&vif->credit_timeout);
 	vif->credit_window_start = get_jiffies_64();
 
+	init_timer(&vif->wake_queue);
+
 	dev->netdev_ops	= &xenvif_netdev_ops;
 	dev->hw_features = NETIF_F_SG |
 		NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
@@ -515,6 +533,7 @@  void xenvif_disconnect(struct xenvif *vif)
 		xenvif_carrier_off(vif);
 
 	if (vif->task) {
+		del_timer_sync(&vif->wake_queue);
 		kthread_stop(vif->task);
 		vif->task = NULL;
 	}
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 1078ae8..e6c56b5 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -64,6 +64,14 @@  static unsigned int fatal_skb_slots = FATAL_SKB_SLOTS_DEFAULT;
 module_param(fatal_skb_slots, uint, 0444);
 
 /*
+ * When the guest ring is filled up, qdisc queues the packets for us, but we
+ * have to time them out, otherwise other guests' packets can get stuck there
+ */
+unsigned int rx_drain_timeout_msecs = 10000;
+module_param(rx_drain_timeout_msecs, uint, 0444);
+unsigned int rx_drain_timeout_jiffies;
+
+/*
  * To avoid confusion, we define XEN_NETBK_LEGACY_SLOTS_MAX indicating
  * the maximum slots a valid packet can use. Now this value is defined
  * to be XEN_NETIF_NR_SLOTS_MIN, which is supposed to be supported by
@@ -2051,6 +2059,8 @@  static int __init netback_init(void)
 	if (rc)
 		goto failed_init;
 
+	rx_drain_timeout_jiffies = msecs_to_jiffies(rx_drain_timeout_msecs);
+
 	return 0;
 
 failed_init: