Message ID | 20150922215049.3088.32475.stgit@ahduyck-vm-fedora22 |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
From: Alexander Duyck <aduyck@mirantis.com> Date: Tue, 22 Sep 2015 14:56:08 -0700 > Rather than carry around a value of budget that is 0 or less we can instead > just loop through and pass 0 to each napi->poll call. If any driver > returns a value for work done that is non-zero then we can report that > driver and continue rather than allowing a bad actor to make the budget > value negative and pass that negative value to napi->poll. Unfortunately we have drivers that won't do any TX work if the budget is zero. Using the budget for TX work is unfortunate and not the recommended way for drivers to do things, but it's not explicitly disallowed either. So I'm not applying this because it definitely has the potential to break something. Sorry. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/26/2015 10:36 PM, David Miller wrote: > From: Alexander Duyck <aduyck@mirantis.com> > Date: Tue, 22 Sep 2015 14:56:08 -0700 > >> Rather than carry around a value of budget that is 0 or less we can instead >> just loop through and pass 0 to each napi->poll call. If any driver >> returns a value for work done that is non-zero then we can report that >> driver and continue rather than allowing a bad actor to make the budget >> value negative and pass that negative value to napi->poll. > Unfortunately we have drivers that won't do any TX work if the budget > is zero. Well that is what we are doing right now. The fact is the call starts out with a budget of 0, and it is somewhat hidden from the call since the budget is assigned a value of 0 in netpoll_poll_dev. That is one of the things I was wanting do address because that is clear as mud from looking at poll_one_napi. Based on the code you would assume budget starts out as a non-zero value and it doesn't. > Using the budget for TX work is unfortunate and not the recommended > way for drivers to do things, but it's not explicitly disallowed > either. > > So I'm not applying this because it definitely has the potential > to break something. > > Sorry. I don't see how this introduces a regression when all I am doing is avoiding tracking a value that should be 0 assuming everything is working correctly. If work returns a non-zero value with the code as it currently is then the WARN_ONCE is triggered, and the value of budget is becoming negative. I would consider a negative budget value worse than a 0 budget value. I'll go back through the patch and rebase it since it looks like Neil had to submit a v3 of his patch and it may have impacted mine. However perhaps we need to revisit this code if you think it is risky as the only thing my changes did is remove the ability for the budget value to go from 0 to negative and then passing that negative value into the function. - Alex -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Alexander Duyck <alexander.duyck@gmail.com> Date: Sun, 27 Sep 2015 15:58:56 -0700 > On 09/26/2015 10:36 PM, David Miller wrote: >> From: Alexander Duyck <aduyck@mirantis.com> >> Date: Tue, 22 Sep 2015 14:56:08 -0700 >> >>> Rather than carry around a value of budget that is 0 or less we can >>> instead >>> just loop through and pass 0 to each napi->poll call. If any driver >>> returns a value for work done that is non-zero then we can report that >>> driver and continue rather than allowing a bad actor to make the >>> budget >>> value negative and pass that negative value to napi->poll. >> Unfortunately we have drivers that won't do any TX work if the budget >> is zero. > > Well that is what we are doing right now. The fact is the call starts > out with a budget of 0, and it is somewhat hidden from the call since > the budget is assigned a value of 0 in netpoll_poll_dev. That is one > of the things I was wanting do address because that is clear as mud > from looking at poll_one_napi. Based on the code you would assume > budget starts out as a non-zero value and it doesn't. I see, thanks for explaining. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/core/netpoll.c b/net/core/netpoll.c index 9312b665ff73..df9b2fd5fee8 100644 --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -140,16 +140,16 @@ static void queue_process(struct work_struct *work) * case. Further, we test the poll_owner to avoid recursion on UP * systems where the lock doesn't exist. */ -static int poll_one_napi(struct napi_struct *napi, int budget) +static void poll_one_napi(struct napi_struct *napi) { - int work = 0; + int work; /* net_rx_action's ->poll() invocations and our's are * synchronized by this test which is only made while * holding the napi->poll_lock. */ if (!test_bit(NAPI_STATE_SCHED, &napi->state)) - return budget; + return; /* * If we set this bit but see that it has already been set, @@ -158,26 +158,26 @@ static int poll_one_napi(struct napi_struct *napi, int budget) */ if(test_and_set_bit(NAPI_STATE_NPSVC, &napi->state)) - goto out; + return; - work = napi->poll(napi, budget); - WARN_ONCE(work > budget, "%pF exceeded budget in poll\n", napi->poll); + /* We explicilty pass the polling call a budget of 0 to + * indicate that we are clearing the Tx path only. + */ + work = napi->poll(napi, 0); + WARN_ONCE(work, "%pF exceeded budget in poll\n", napi->poll); trace_napi_poll(napi); clear_bit(NAPI_STATE_NPSVC, &napi->state); - -out: - return budget - work; } -static void poll_napi(struct net_device *dev, int budget) +static void poll_napi(struct net_device *dev) { struct napi_struct *napi; list_for_each_entry(napi, &dev->napi_list, dev_list) { if (napi->poll_owner != smp_processor_id() && spin_trylock(&napi->poll_lock)) { - budget = poll_one_napi(napi, budget); + poll_one_napi(napi); spin_unlock(&napi->poll_lock); } } @@ -187,7 +187,6 @@ static void netpoll_poll_dev(struct net_device *dev) { const struct net_device_ops *ops; struct netpoll_info *ni = rcu_dereference_bh(dev->npinfo); - int budget = 0; /* Don't do any rx activity if the dev_lock mutex is held * the dev_open/close paths use this to block netpoll activity @@ -210,7 +209,7 @@ static void netpoll_poll_dev(struct net_device *dev) /* Process pending work on NIC */ ops->ndo_poll_controller(dev); - poll_napi(dev, budget); + poll_napi(dev); up(&ni->dev_lock);
For some reason we were carrying the budget value around between the various calls to napi->poll. If for example one of the drivers called had a bug in which it returned a non-zero value for work this could result in the budget value becoming negative. Rather than carry around a value of budget that is 0 or less we can instead just loop through and pass 0 to each napi->poll call. If any driver returns a value for work done that is non-zero then we can report that driver and continue rather than allowing a bad actor to make the budget value negative and pass that negative value to napi->poll. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> --- This patch is meant to be applied after Neil's patch: [PATCH v2] netpoll: Close race condition between poll_one_napi and napi_disable net/core/netpoll.c | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html