diff mbox

net: Prevent multiple NAPI instances co-existing in the list

Message ID CA+U0gVh+TyEUJ+qmg+FE=UhvvQywfNxcouCyv1sutZ3fav5FAw@mail.gmail.com
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

Dennis Chen Jan. 8, 2015, 8:22 a.m. UTC
Some drivers may clear the NAPI_STATE_SCHED bit upon the state of the
NAPI instance after exhaust the budget in the poll function, which
will open a window for next device interrupt handler to insert a same
instance to the list after calling list_add_tail(&n->poll_list,
repoll) if we don't set this bit.

Signed-off-by: Dennis Chen <kernel.org.gnu@gmail.com>
---
 net/core/dev.c |    8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Herbert Xu Jan. 8, 2015, 11:15 a.m. UTC | #1
Dennis Chen <kernel.org.gnu@gmail.com> wrote:
> Some drivers may clear the NAPI_STATE_SCHED bit upon the state of the
> NAPI instance after exhaust the budget in the poll function, which
> will open a window for next device interrupt handler to insert a same
> instance to the list after calling list_add_tail(&n->poll_list,
> repoll) if we don't set this bit.
> 
> Signed-off-by: Dennis Chen <kernel.org.gnu@gmail.com>

Which driver is doing this?

Cheers,
Sergei Shtylyov Jan. 8, 2015, 1:04 p.m. UTC | #2
Hello.

On 1/8/2015 11:22 AM, Dennis Chen wrote:

> Some drivers may clear the NAPI_STATE_SCHED bit upon the state of the
> NAPI instance after exhaust the budget in the poll function, which
> will open a window for next device interrupt handler to insert a same
> instance to the list after calling list_add_tail(&n->poll_list,
> repoll) if we don't set this bit.

> Signed-off-by: Dennis Chen <kernel.org.gnu@gmail.com>
> ---
>   net/core/dev.c |    8 ++++++++
>   1 file changed, 8 insertions(+)

> diff --git a/net/core/dev.c b/net/core/dev.c
> index 683d493..b3107ac 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4619,6 +4619,14 @@ static int napi_poll(struct napi_struct *n,
> struct list_head *repoll)
>                               n->dev ? n->dev->name : "backlog");
>                  goto out_unlock;
>          }
> +
> +       /* Some drivers may exit the polling mode when exhaust the

    s/exhaust/exhausting/.

> +        * budget. Set the NAPI_STATE_SCHED bit to prevent multiple NAPI
> +        * instances in the list in case of next device interrupt raised.
> +        */
> +       if (unlikely(!test_and_set_bit(NAPI_STATE_SCHED, &n->state)))
> +               pr_warn_once("%s: exit polling mode after exhaust the budget\n",

    Likewise. And s/exit/exiting/.

WBR, Sergei

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Jan. 8, 2015, 2:51 p.m. UTC | #3
On Thu, 2015-01-08 at 16:22 +0800, Dennis Chen wrote:
> Some drivers may clear the NAPI_STATE_SCHED bit upon the state of the
> NAPI instance after exhaust the budget in the poll function, which
> will open a window for next device interrupt handler to insert a same
> instance to the list after calling list_add_tail(&n->poll_list,
> repoll) if we don't set this bit.
> 
> Signed-off-by: Dennis Chen <kernel.org.gnu@gmail.com>
> ---


Well no.

I am removing some atomic ops in the stack, please do not add new ones,
especially if no driver is that buggy.

The unlikely() wont help the expensive stuff being done here.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dennis Chen Jan. 9, 2015, 2:24 a.m. UTC | #4
On Thu, Jan 8, 2015 at 7:15 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> Dennis Chen <kernel.org.gnu@gmail.com> wrote:
>> Some drivers may clear the NAPI_STATE_SCHED bit upon the state of the
>> NAPI instance after exhaust the budget in the poll function, which
>> will open a window for next device interrupt handler to insert a same
>> instance to the list after calling list_add_tail(&n->poll_list,
>> repoll) if we don't set this bit.
>>
>> Signed-off-by: Dennis Chen <kernel.org.gnu@gmail.com>
>
> Which driver is doing this?
>
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Hi Herbert, please see this code piece in napi_poll:

     /* Some drivers may have called napi_schedule
     * prior to exhausting their budget.
     */
    if (unlikely(!list_empty(&n->poll_list))) {
        pr_warn_once("%s: Budget exhausted after napi rescheduled\n",
                 n->dev ? n->dev->name : "backlog");
        goto out_unlock;
    }

Here "Some drivers" may have called  napi_schedule to make
n->poll_list is not empty, does that mean "Some drivers" will clear
NAPI_STATE_SCHED bit, otherwise the napi_schedule() will do nothing,
does that make sense for you question? ;-)
Dennis Chen Jan. 9, 2015, 2:26 a.m. UTC | #5
I am very curious about the reason that you're removing the atomic ops
in the stack, what's the benifit?

On Thu, Jan 8, 2015 at 10:51 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2015-01-08 at 16:22 +0800, Dennis Chen wrote:
>> Some drivers may clear the NAPI_STATE_SCHED bit upon the state of the
>> NAPI instance after exhaust the budget in the poll function, which
>> will open a window for next device interrupt handler to insert a same
>> instance to the list after calling list_add_tail(&n->poll_list,
>> repoll) if we don't set this bit.
>>
>> Signed-off-by: Dennis Chen <kernel.org.gnu@gmail.com>
>> ---
>
>
> Well no.
>
> I am removing some atomic ops in the stack, please do not add new ones,
> especially if no driver is that buggy.
>
> The unlikely() wont help the expensive stuff being done here.
>
>
Herbert Xu Jan. 9, 2015, 2:27 a.m. UTC | #6
On Fri, Jan 09, 2015 at 10:24:18AM +0800, Dennis Chen wrote:
>
> Hi Herbert, please see this code piece in napi_poll:
> 
>      /* Some drivers may have called napi_schedule
>      * prior to exhausting their budget.
>      */
>     if (unlikely(!list_empty(&n->poll_list))) {
>         pr_warn_once("%s: Budget exhausted after napi rescheduled\n",
>                  n->dev ? n->dev->name : "backlog");
>         goto out_unlock;
>     }
> 
> Here "Some drivers" may have called  napi_schedule to make
> n->poll_list is not empty, does that mean "Some drivers" will clear
> NAPI_STATE_SCHED bit, otherwise the napi_schedule() will do nothing,
> does that make sense for you question? ;-)

No it tells me that you don't understand the problem at all.
Those drivers will end up resetting the NAPI_STATE_SCHED bit
after clearing it.

Cheers,
Dennis Chen Jan. 9, 2015, 2:28 a.m. UTC | #7
hello,

On Thu, Jan 8, 2015 at 9:04 PM, Sergei Shtylyov
<sergei.shtylyov@cogentembedded.com> wrote:
> Hello.
>
> On 1/8/2015 11:22 AM, Dennis Chen wrote:
>
>> Some drivers may clear the NAPI_STATE_SCHED bit upon the state of the
>> NAPI instance after exhaust the budget in the poll function, which
>> will open a window for next device interrupt handler to insert a same
>> instance to the list after calling list_add_tail(&n->poll_list,
>> repoll) if we don't set this bit.
>
>
>> Signed-off-by: Dennis Chen <kernel.org.gnu@gmail.com>
>> ---
>>   net/core/dev.c |    8 ++++++++
>>   1 file changed, 8 insertions(+)
>
>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 683d493..b3107ac 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -4619,6 +4619,14 @@ static int napi_poll(struct napi_struct *n,
>> struct list_head *repoll)
>>                               n->dev ? n->dev->name : "backlog");
>>                  goto out_unlock;
>>          }
>> +
>> +       /* Some drivers may exit the polling mode when exhaust the
>
>
>    s/exhaust/exhausting/.
>
>> +        * budget. Set the NAPI_STATE_SCHED bit to prevent multiple NAPI
>> +        * instances in the list in case of next device interrupt raised.
>> +        */
>> +       if (unlikely(!test_and_set_bit(NAPI_STATE_SCHED, &n->state)))
>> +               pr_warn_once("%s: exit polling mode after exhaust the
>> budget\n",
>
>
>    Likewise. And s/exit/exiting/.
>
> WBR, Sergei
>

will be updated if applied :-)
Dennis Chen Jan. 9, 2015, 2:32 a.m. UTC | #8
On Fri, Jan 9, 2015 at 10:27 AM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Fri, Jan 09, 2015 at 10:24:18AM +0800, Dennis Chen wrote:
>>
>> Hi Herbert, please see this code piece in napi_poll:
>>
>>      /* Some drivers may have called napi_schedule
>>      * prior to exhausting their budget.
>>      */
>>     if (unlikely(!list_empty(&n->poll_list))) {
>>         pr_warn_once("%s: Budget exhausted after napi rescheduled\n",
>>                  n->dev ? n->dev->name : "backlog");
>>         goto out_unlock;
>>     }
>>
>> Here "Some drivers" may have called  napi_schedule to make
>> n->poll_list is not empty, does that mean "Some drivers" will clear
>> NAPI_STATE_SCHED bit, otherwise the napi_schedule() will do nothing,
>> does that make sense for you question? ;-)
>
> No it tells me that you don't understand the problem at all.
> Those drivers will end up resetting the NAPI_STATE_SCHED bit
> after clearing it.
>
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Thanks, would you pls give me an example of those drivers? I'll study
it further...
Herbert Xu Jan. 9, 2015, 2:34 a.m. UTC | #9
On Fri, Jan 09, 2015 at 10:32:13AM +0800, Dennis Chen wrote:
>
> Thanks, would you pls give me an example of those drivers? I'll study
> it further...

There are no known drivers doing this currently since that would
be a bug and they would be fixed quickly.  But if you look back
in history virtio_net was doing this.

Cheers,
Dennis Chen Jan. 9, 2015, 2:50 a.m. UTC | #10
On Fri, Jan 9, 2015 at 10:34 AM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Fri, Jan 09, 2015 at 10:32:13AM +0800, Dennis Chen wrote:
>>
>> Thanks, would you pls give me an example of those drivers? I'll study
>> it further...
>
> There are no known drivers doing this currently since that would
> be a bug and they would be fixed quickly.  But if you look back
> in history virtio_net was doing this.
>
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

So the only reason for the code '!list_empty()' I can see is to avoid
possible bug code in new drivers, thus pls think if it's possible
that there's a window opened for creating multiple NAPI instances in
the list for the new drivers?
Dennis Chen Jan. 9, 2015, 3:07 a.m. UTC | #11
On Fri, Jan 9, 2015 at 10:34 AM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Fri, Jan 09, 2015 at 10:32:13AM +0800, Dennis Chen wrote:
>>
>> Thanks, would you pls give me an example of those drivers? I'll study
>> it further...
>
> There are no known drivers doing this currently since that would
> be a bug and they would be fixed quickly.  But if you look back
> in history virtio_net was doing this.
>
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Is this the below patch you mentioned?
https://lkml.org/lkml/2014/7/23/134
David Miller Jan. 9, 2015, 3:12 a.m. UTC | #12
From: Dennis Chen <kernel.org.gnu@gmail.com>
Date: Fri, 9 Jan 2015 10:26:48 +0800

> I am very curious about the reason that you're removing the atomic ops
> in the stack, what's the benifit?

1) Do not top post.

2) The reason is, obviously, performance.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 683d493..b3107ac 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4619,6 +4619,14 @@  static int napi_poll(struct napi_struct *n,
struct list_head *repoll)
                             n->dev ? n->dev->name : "backlog");
                goto out_unlock;
        }
+
+       /* Some drivers may exit the polling mode when exhaust the
+        * budget. Set the NAPI_STATE_SCHED bit to prevent multiple NAPI
+        * instances in the list in case of next device interrupt raised.
+        */
+       if (unlikely(!test_and_set_bit(NAPI_STATE_SCHED, &n->state)))
+               pr_warn_once("%s: exit polling mode after exhaust the budget\n",
+                            n->dev ? n->dev->name : "backlog");

        list_add_tail(&n->poll_list, repoll);