Message ID | 1252396189.16528.19.camel@dengdd-desktop |
---|---|
State | Deferred, archived |
Delegated to: | David Miller |
Headers | show |
On Tue, 2009-09-08 at 15:49 +0800, DDD wrote: > This race will break the messages order. > > Sequence of events in the problem case: > > Assume there is one skb "A" in skb_queue and the next action of > netpoll_send_skb() is: sending out "B" skb. > The right order of messages should be: send A first, then send B. > > But as following orders, it will send B first, then send A. I would say no, the order of messages A and B queued on different CPUs is undefined. The only issue is that we can queue a message A on CPU0, then causally trigger a message on CPU1 B that arrives first. But bear in mind that the message A >>may never arrive<< because the message is about a lockup that kills processing of delayed work. Generally speaking, queueing should be a last ditch effort. We should instead aim to deliver all messages immediately, even if they might be out of order. Because out of order is better than not arriving at all.
On Tue, 2009-09-08 at 12:27 -0500, Matt Mackall wrote: > On Tue, 2009-09-08 at 15:49 +0800, DDD wrote: > > This race will break the messages order. > > > > Sequence of events in the problem case: > > > > Assume there is one skb "A" in skb_queue and the next action of > > netpoll_send_skb() is: sending out "B" skb. > > The right order of messages should be: send A first, then send B. > > > > But as following orders, it will send B first, then send A. > > I would say no, the order of messages A and B queued on different CPUs > is undefined. The only issue is that we can queue a message A on CPU0, > then causally trigger a message on CPU1 B that arrives first. But bear > in mind that the message A >>may never arrive<< because the message is > about a lockup that kills processing of delayed work. > > Generally speaking, queueing should be a last ditch effort. We should > instead aim to deliver all messages immediately, even if they might be > out of order. Because out of order is better than not arriving at all. Ah, yes, out of order is better than not arriving at all. :-) Especially it is based on UDP, so I don't think the order of messages is important. :-) I take back this patch. Thank you very much, Dongdong -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h index 2524267..3757141 100644 --- a/include/linux/netpoll.h +++ b/include/linux/netpoll.h @@ -27,6 +27,7 @@ struct netpoll_info { atomic_t refcnt; int rx_flags; spinlock_t rx_lock; + __u32 txq_skbuff_len; struct netpoll *rx_np; /* netpoll that registered an rx_hook */ struct sk_buff_head arp_tx; /* list of arp requests to reply to */ struct sk_buff_head txq; diff --git a/net/core/netpoll.c b/net/core/netpoll.c index 1b76eb1..6d22426 100644 --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -78,11 +78,14 @@ static void queue_process(struct work_struct *work) __netif_tx_unlock(txq); local_irq_restore(flags); + npinfo->txq_skbuff_len = skb_queue_len(&npinfo->txq); schedule_delayed_work(&npinfo->tx_work, HZ/10); return; } __netif_tx_unlock(txq); local_irq_restore(flags); + + npinfo->txq_skbuff_len = skb_queue_len(&npinfo->txq); } } @@ -291,7 +294,7 @@ static void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb) } /* don't get messages out of order, and no recursion */ - if (skb_queue_len(&npinfo->txq) == 0 && !netpoll_owner_active(dev)) { + if (npinfo->txq_skbuff_len == 0 && !netpoll_owner_active(dev)) { struct netdev_queue *txq; unsigned long flags; @@ -329,7 +332,8 @@ static void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb) if (status != NETDEV_TX_OK) { skb_queue_tail(&npinfo->txq, skb); - schedule_delayed_work(&npinfo->tx_work,0); + npinfo->txq_skbuff_len = skb_queue_len(&npinfo->txq); + schedule_delayed_work(&npinfo->tx_work, 0); } }
This race will break the messages order. Sequence of events in the problem case: Assume there is one skb "A" in skb_queue and the next action of netpoll_send_skb() is: sending out "B" skb. The right order of messages should be: send A first, then send B. But as following orders, it will send B first, then send A. CPU0 CPU1 queue_process() netpoll_send_skb(B) A = skb_dequeue() /**** A was removed from skb_queue, so skb_queue_len = 0 ****/ Check skb_queue_len == 0 Yes sendout (B) sendout (A) The solution is to introduce the txq_skbuff_len in netpoll_info struct. The txq_skbuff_len will keep the skb queue lenght utill the skb was sent out, so that this race condition will gone. Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com> --- include/linux/netpoll.h | 1 + net/core/netpoll.c | 8 ++++++-- 2 files changed, 7 insertions(+), 2 deletions(-)