From patchwork Tue Mar 23 20:25:53 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Duyck, Alexander H"
X-Patchwork-Id: 48363
X-Patchwork-Delegate: davem@davemloft.net
From: Alexander Duyck
Subject: [RFC PATCH] net: add additional lock to qdisc to increase enqueue/dequeue fairness
To: netdev@vger.kernel.org
Date: Tue, 23 Mar 2010 13:25:53 -0700
Message-ID: <20100323202553.21598.10754.stgit@gitlad.jf.intel.com>
User-Agent: StGIT/0.14.2
X-Mailing-List: netdev@vger.kernel.org

The qdisc layer shows a significant problem once you start transmitting from multiple CPUs: the transmit rate drops sharply, and I believe this is because the qdisc spinlock is shared between the single dequeuing thread and the n-1 enqueuing CPU threads.

To improve the situation, this patch adds one additional lock that must be obtained during the enqueue portion of the path. This essentially allows sch_direct_xmit to jump to near the head of the line when it tries to re-acquire the qdisc lock after completing a transmit.

Running the script below, I saw the transmit rate increase from 200K packets per second to 1.07M packets per second as a result of this patch.
for j in `seq 0 15`; do
	for i in `seq 0 7`; do
		netperf -H -t UDP_STREAM -l 600 -N -T $i -- -m 6 &
	done
done

Signed-off-by: Alexander Duyck
---
 include/net/sch_generic.h |    3 ++-
 net/core/dev.c            |    7 ++++++-
 net/sched/sch_generic.c   |    1 +
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 67dc08e..f5088a9 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -51,7 +51,6 @@ struct Qdisc {
 	struct list_head	list;
 	u32			handle;
 	u32			parent;
-	atomic_t		refcnt;
 	struct gnet_stats_rate_est	rate_est;
 	int			(*reshape_fail)(struct sk_buff *skb,
 					struct Qdisc *q);
@@ -65,6 +64,8 @@ struct Qdisc {
 	struct netdev_queue	*dev_queue;
 	struct Qdisc		*next_sched;
 
+	atomic_t		refcnt;
+	spinlock_t		enqueue_lock;
 	struct sk_buff		*gso_skb;
 	/*
 	 * For performance sake on SMP, we put highly modified fields at the end
diff --git a/net/core/dev.c b/net/core/dev.c
index a03aab4..8a2d508 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2006,8 +2006,10 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 	spinlock_t *root_lock = qdisc_lock(q);
 	int rc;
 
+	spin_lock(&q->enqueue_lock);
 	spin_lock(root_lock);
 	if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
+		spin_unlock(&q->enqueue_lock);
 		kfree_skb(skb);
 		rc = NET_XMIT_DROP;
 	} else if ((q->flags & TCQ_F_CAN_BYPASS) && !qdisc_qlen(q) &&
@@ -2018,7 +2020,9 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 		 * xmit the skb directly.
 		 */
 		__qdisc_update_bstats(q, skb->len);
-		if (sch_direct_xmit(skb, q, dev, txq, root_lock))
+		spin_unlock(&q->enqueue_lock);
+		rc = sch_direct_xmit(skb, q, dev, txq, root_lock);
+		if (rc)
 			__qdisc_run(q);
 		else
 			clear_bit(__QDISC_STATE_RUNNING, &q->state);
@@ -2026,6 +2030,7 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 		rc = NET_XMIT_SUCCESS;
 	} else {
 		rc = qdisc_enqueue_root(skb, q);
+		spin_unlock(&q->enqueue_lock);
 		qdisc_run(q);
 	}
 	spin_unlock(root_lock);
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 5173c1e..4b5e125 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -538,6 +538,7 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
 	sch = (struct Qdisc *) QDISC_ALIGN((unsigned long) p);
 	sch->padded = (char *) sch - (char *) p;
 
+	spin_lock_init(&sch->enqueue_lock);
 	INIT_LIST_HEAD(&sch->list);
 	skb_queue_head_init(&sch->q);
 	sch->ops = ops;
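
As an aside, for anyone who wants to play with the locking pattern outside the kernel, here is a minimal userspace sketch of the idea. It is illustrative only, not kernel code: gate_lock and queue_lock stand in for q->enqueue_lock and the qdisc root lock, and the "queue" is just a counter. Producers take gate_lock before queue_lock, so at most one producer is ever waiting on queue_lock; the consumer re-acquires queue_lock directly, so after each transmit it competes with at most one producer rather than all of them.

/*
 * Illustrative userspace sketch only -- not kernel code.  gate_lock and
 * queue_lock stand in for q->enqueue_lock and the qdisc root lock, and
 * the "queue" is just a counter.  Build with: gcc -O2 -pthread
 */
#include <pthread.h>
#include <stdio.h>

#define NR_PRODUCERS		7
#define PACKETS_PER_PRODUCER	100000

static pthread_mutex_t gate_lock  = PTHREAD_MUTEX_INITIALIZER;	/* ~enqueue_lock */
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;	/* ~qdisc root lock */
static long queued, sent;

static void *producer(void *arg)
{
	for (int i = 0; i < PACKETS_PER_PRODUCER; i++) {
		pthread_mutex_lock(&gate_lock);
		pthread_mutex_lock(&queue_lock);
		queued++;				/* "enqueue" one packet */
		pthread_mutex_unlock(&gate_lock);	/* next producer may now line up */
		pthread_mutex_unlock(&queue_lock);
	}
	return arg;
}

static void *consumer(void *arg)
{
	while (sent < (long)NR_PRODUCERS * PACKETS_PER_PRODUCER) {
		pthread_mutex_lock(&queue_lock);	/* no gate_lock on this side */
		if (queued > sent)
			sent++;				/* "dequeue" one packet */
		pthread_mutex_unlock(&queue_lock);	/* "transmit" outside the lock */
	}
	return arg;
}

int main(void)
{
	pthread_t prod[NR_PRODUCERS], cons;

	pthread_create(&cons, NULL, consumer, NULL);
	for (int i = 0; i < NR_PRODUCERS; i++)
		pthread_create(&prod[i], NULL, producer, NULL);
	for (int i = 0; i < NR_PRODUCERS; i++)
		pthread_join(prod[i], NULL);
	pthread_join(cons, NULL);
	printf("queued=%ld sent=%ld\n", queued, sent);
	return 0;
}

The unlock ordering in producer() mirrors the patch: q->enqueue_lock is dropped as soon as the skb has been queued (or handed to sch_direct_xmit), so it is never held across the hardware transmit itself.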