From patchwork Thu Jul 9 12:03:50 2009
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Thomas Gleixner
X-Patchwork-Id: 29626
X-Patchwork-Delegate: davem@davemloft.net
Date: Thu, 9 Jul 2009 14:03:50 +0200 (CEST)
From: Thomas Gleixner
To: Jarek Poplawski
cc: Andres Freund, Joao Correia, Arun R Bharadwaj, Stephen Hemminger,
 netdev@vger.kernel.org, LKML, Patrick McHardy, Peter Zijlstra
Subject: Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 (possibly
 caused by netem)
In-Reply-To: <20090709104412.GA3651@ami.dom.local>
References: <200907031326.21822.andres@anarazel.de>
 <200907071811.27570.andres@anarazel.de>
 <20090708080852.GC3148@ami.dom.local>
 <200907090023.18040.andres@anarazel.de>
 <20090708224828.GD3666@ami.dom.local>
 <20090709104412.GA3651@ami.dom.local>
User-Agent: Alpine 2.00 (LFD 1167 2008-08-23)
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> >
> > I have the feeling that the code relies on some implicit cpu
> > boundness, which is no longer guaranteed with the timer migration
> > changes, but that's a question for the network experts.
>
> As a matter of fact, I've just looked at this __netif_schedule(),
> which really is cpu bound, so you might be 100% right.

So the watchdog is the one which causes the trouble.

The patch below should fix this.

Thanks,

	tglx
---
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 24d17ce..fbe554f 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -485,7 +485,7 @@ void qdisc_watchdog_schedule(struct qdisc_watchdog *wd, psched_time_t expires)
 	wd->qdisc->flags |= TCQ_F_THROTTLED;
 	time = ktime_set(0, 0);
 	time = ktime_add_ns(time, PSCHED_TICKS2NS(expires));
-	hrtimer_start(&wd->timer, time, HRTIMER_MODE_ABS);
+	hrtimer_start(&wd->timer, time, HRTIMER_MODE_ABS_PINNED);
 }
 EXPORT_SYMBOL(qdisc_watchdog_schedule);
 
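
For readers unfamiliar with the CPU-boundness assumption discussed above, here is a rough userspace sketch of the idea. It is only a toy model, not kernel code: toy_timer, toy_timer_start(), toy_timer_expire(), toy_schedule() and output_queue[] are all hypothetical names made up for illustration, and the real soft lockup may involve more than this model captures. It merely shows that a callback which queues work on "the CPU it runs on" lands on the arming CPU only if the timer is pinned.

```c
#include <stdbool.h>

#define NR_CPUS 2

/* Hypothetical stand-in for a per-CPU output queue: counts queued qdiscs. */
static int output_queue[NR_CPUS];

/* Toy timer: remembers the arming CPU and whether it may migrate. */
struct toy_timer {
	int armed_on;	/* CPU that armed the timer */
	bool pinned;	/* true: must expire on armed_on */
};

static void toy_timer_start(struct toy_timer *t, int cpu, bool pinned)
{
	t->armed_on = cpu;
	t->pinned = pinned;
}

/* CPU-bound step: queues work on whichever CPU runs the callback. */
static void toy_schedule(int this_cpu)
{
	output_queue[this_cpu]++;
}

/*
 * Expire the timer. migrate_to is the CPU an unpinned timer would have
 * been moved to (an assumption of this model). Returns the CPU on which
 * the callback ran.
 */
static int toy_timer_expire(const struct toy_timer *t, int migrate_to)
{
	int cpu = t->pinned ? t->armed_on : migrate_to;

	toy_schedule(cpu);
	return cpu;
}
```

In this model, arming on CPU 0 with pinned set and expiring with migrate_to = 1 queues the work on CPU 0; with pinned cleared, the same sequence queues it on CPU 1, which is the situation the one-line change to HRTIMER_MODE_ABS_PINNED is meant to rule out.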