diff mbox

[net-next,2/4] udp offload: allow GRO on 0 checksum packets

Message ID 60369c715ef9948b088416639f0bec800f632f9a.1467907022.git.pabeni@redhat.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Paolo Abeni July 7, 2016, 3:58 p.m. UTC
currently, UDP packets with zero checksum are not allowed to
use udp offload's GRO. This patch admits such packets to
GRO, if the related socket settings allow it: ipv6 packets
are not admitted if the sockets don't have the no_check6_rx
flag set.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/ipv4/udp_offload.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Alexander H Duyck July 8, 2016, 4:46 p.m. UTC | #1
On Thu, Jul 7, 2016 at 8:58 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> currently, UDP packets with zero checksum are not allowed to
> use udp offload's GRO. This patch admits such packets to
> GRO, if the related socket settings allow it: ipv6 packets
> are not admitted if the sockets don't have the no_check6_rx
> flag set.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  net/ipv4/udp_offload.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index 9c37338..ac783f4 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -257,7 +257,7 @@ struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb,
>         struct sock *sk;
>
>         if (NAPI_GRO_CB(skb)->encap_mark ||
> -           (skb->ip_summed != CHECKSUM_PARTIAL &&
> +           (uh->check && skb->ip_summed != CHECKSUM_PARTIAL &&
>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>              !NAPI_GRO_CB(skb)->csum_valid))
>                 goto out;

So now all zero checksum UDP traffic will be targeted for GRO if I am
understanding this right.  Have you looked into how much of an impact
this might have on performance for non-tunnel UDP traffic using a zero
checksum?  I'm thinking it will be negative.  The issue is you are now
going to be performing an extra socket lookup for all incoming UDP
frames that have a zero checksum.

Also in the line below this line we are setting the encap_mark.  That
will probably need to be moved down to the point just before we call
gro_receive so that we can avoid setting unneeded data in the
NAPI_GRO_CB.

> @@ -271,6 +271,10 @@ struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb,
>         if (!sk || !udp_sk(sk)->gro_receive)
>                 goto out_unlock;
>
> +       if (!uh->check && skb->protocol == cpu_to_be16(ETH_P_IPV6) &&
> +           !udp_sk(sk)->no_check6_rx)
> +               goto out_unlock;
> +
>         flush = 0;
>
>         for (p = *head; p; p = p->next) {

So I am pretty sure this check doesn't pass the sniff test.
Specifically I don't believe you can use skb->protocol like you
currently are as it could be an 8021q frame for instance that is being
aggregated so the skb->protocol would be invalid.  I think what you
should probably be using is NAPI_GRO_CB(skb)->is_ipv6 although it
occurs to me that in the case of tunnels I don't know if that value is
being reset for IPv4 like it should be.

- Alex
Hannes Frederic Sowa July 8, 2016, 4:56 p.m. UTC | #2
On 08.07.2016 12:46, Alexander Duyck wrote:
> On Thu, Jul 7, 2016 at 8:58 AM, Paolo Abeni <pabeni@redhat.com> wrote:
>> currently, UDP packets with zero checksum are not allowed to
>> use udp offload's GRO. This patch admits such packets to
>> GRO, if the related socket settings allow it: ipv6 packets
>> are not admitted if the sockets don't have the no_check6_rx
>> flag set.
>>
>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>> ---
>>  net/ipv4/udp_offload.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
>> index 9c37338..ac783f4 100644
>> --- a/net/ipv4/udp_offload.c
>> +++ b/net/ipv4/udp_offload.c
>> @@ -257,7 +257,7 @@ struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb,
>>         struct sock *sk;
>>
>>         if (NAPI_GRO_CB(skb)->encap_mark ||
>> -           (skb->ip_summed != CHECKSUM_PARTIAL &&
>> +           (uh->check && skb->ip_summed != CHECKSUM_PARTIAL &&
>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>              !NAPI_GRO_CB(skb)->csum_valid))
>>                 goto out;
> 
> So now all zero checksum UDP traffic will be targeted for GRO if I am
> understanding this right.  Have you looked into how much of an impact
> this might have on performance for non-tunnel UDP traffic using a zero
> checksum?  I'm thinking it will be negative.  The issue is you are now
> going to be performing an extra socket lookup for all incoming UDP
> frames that have a zero checksum.

Are zero checksummed UDP protocols rare and only happen in case where we
have tunneling protocols, which need the socket lookup anyway? That
said, we haven't really focused on the impact here and thought it
shouldn't matter to try to speed up zero checksum UDP protocols over
zero ones.

> Also in the line below this line we are setting the encap_mark.  That
> will probably need to be moved down to the point just before we call
> gro_receive so that we can avoid setting unneeded data in the
> NAPI_GRO_CB.

Ack.

>> @@ -271,6 +271,10 @@ struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb,
>>         if (!sk || !udp_sk(sk)->gro_receive)
>>                 goto out_unlock;
>>
>> +       if (!uh->check && skb->protocol == cpu_to_be16(ETH_P_IPV6) &&
>> +           !udp_sk(sk)->no_check6_rx)
>> +               goto out_unlock;
>> +
>>         flush = 0;
>>
>>         for (p = *head; p; p = p->next) {
> 
> So I am pretty sure this check doesn't pass the sniff test.
> Specifically I don't believe you can use skb->protocol like you
> currently are as it could be an 8021q frame for instance that is being
> aggregated so the skb->protocol would be invalid.  I think what you
> should probably be using is NAPI_GRO_CB(skb)->is_ipv6 although it
> occurs to me that in the case of tunnels I don't know if that value is
> being reset for IPv4 like it should be.

Thanks, we probably should switch to sk->sk_family (we don't allow dual
family sockets with tunnel drivers so far)?

Bye,
Hannes
Alexander H Duyck July 8, 2016, 5:08 p.m. UTC | #3
On Fri, Jul 8, 2016 at 9:56 AM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> On 08.07.2016 12:46, Alexander Duyck wrote:
>> On Thu, Jul 7, 2016 at 8:58 AM, Paolo Abeni <pabeni@redhat.com> wrote:
>>> currently, UDP packets with zero checksum are not allowed to
>>> use udp offload's GRO. This patch admits such packets to
>>> GRO, if the related socket settings allow it: ipv6 packets
>>> are not admitted if the sockets don't have the no_check6_rx
>>> flag set.
>>>
>>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>>> ---
>>>  net/ipv4/udp_offload.c | 6 +++++-
>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
>>> index 9c37338..ac783f4 100644
>>> --- a/net/ipv4/udp_offload.c
>>> +++ b/net/ipv4/udp_offload.c
>>> @@ -257,7 +257,7 @@ struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb,
>>>         struct sock *sk;
>>>
>>>         if (NAPI_GRO_CB(skb)->encap_mark ||
>>> -           (skb->ip_summed != CHECKSUM_PARTIAL &&
>>> +           (uh->check && skb->ip_summed != CHECKSUM_PARTIAL &&
>>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>>              !NAPI_GRO_CB(skb)->csum_valid))
>>>                 goto out;
>>
>> So now all zero checksum UDP traffic will be targeted for GRO if I am
>> understanding this right.  Have you looked into how much of an impact
>> this might have on performance for non-tunnel UDP traffic using a zero
>> checksum?  I'm thinking it will be negative.  The issue is you are now
>> going to be performing an extra socket lookup for all incoming UDP
>> frames that have a zero checksum.
>
> Are zero checksummed UDP protocols rare and only happen in case where we
> have tunneling protocols, which need the socket lookup anyway? That
> said, we haven't really focused on the impact here and thought it
> shouldn't matter to try to speed up zero checksum UDP protocols over
> zero ones.

I'm not sure how rare they are, but I know they are used for more than
just tunnels, especially in the case of IPv4.  What I suspect will
happen with this being implemented is that we will end up with all
sorts of people coming forward complaining about performance
regressions when we add an extra socket lookup to their fast-path.
I'm sure Jesper's pktgen tests would show a some negatives with
something like this as pktgen uses a 0 UDP checksum as I recall.
However I would suspect he probably runs such tests with GRO already
disabled.

>> Also in the line below this line we are setting the encap_mark.  That
>> will probably need to be moved down to the point just before we call
>> gro_receive so that we can avoid setting unneeded data in the
>> NAPI_GRO_CB.
>
> Ack.
>
>>> @@ -271,6 +271,10 @@ struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb,
>>>         if (!sk || !udp_sk(sk)->gro_receive)
>>>                 goto out_unlock;
>>>
>>> +       if (!uh->check && skb->protocol == cpu_to_be16(ETH_P_IPV6) &&
>>> +           !udp_sk(sk)->no_check6_rx)
>>> +               goto out_unlock;
>>> +
>>>         flush = 0;
>>>
>>>         for (p = *head; p; p = p->next) {
>>
>> So I am pretty sure this check doesn't pass the sniff test.
>> Specifically I don't believe you can use skb->protocol like you
>> currently are as it could be an 8021q frame for instance that is being
>> aggregated so the skb->protocol would be invalid.  I think what you
>> should probably be using is NAPI_GRO_CB(skb)->is_ipv6 although it
>> occurs to me that in the case of tunnels I don't know if that value is
>> being reset for IPv4 like it should be.

I just looked at the function and verified the v4 path for UDP GRO is
setting this to zero as it should.

> Thanks, we probably should switch to sk->sk_family (we don't allow dual
> family sockets with tunnel drivers so far)?

I don't know what the situation there is.  I just now that for v4 vs
v6 UDP the NAPI_GRO_CB has a field called is_ipv6 which is populated
just before calling into the tunnel GRO path.  If you use that you can
guarantee that you are looking at the right type for the protocol
instead of guessing at it based on skb->protocol.

- Alex
Tom Herbert July 8, 2016, 9:03 p.m. UTC | #4
On Thu, Jul 7, 2016 at 10:58 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> currently, UDP packets with zero checksum are not allowed to
> use udp offload's GRO. This patch admits such packets to
> GRO, if the related socket settings allow it: ipv6 packets
> are not admitted if the sockets don't have the no_check6_rx
> flag set.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  net/ipv4/udp_offload.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index 9c37338..ac783f4 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -257,7 +257,7 @@ struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb,
>         struct sock *sk;
>
>         if (NAPI_GRO_CB(skb)->encap_mark ||
> -           (skb->ip_summed != CHECKSUM_PARTIAL &&
> +           (uh->check && skb->ip_summed != CHECKSUM_PARTIAL &&
>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>              !NAPI_GRO_CB(skb)->csum_valid))

Paolo,

I think you might be misunderstanding the intent of this conditional.
It is trying to deduce that that the inner TCP checksum has likely
been validated or can be validated without doing  packet checksum
calculation. This is trying to avoid doing host side checksum
calculation in the GRO path and really has little to do with rather
uh->check is zero or not. The assumption was that we shouldn't compute
whole packet checksums in the GRO path because of performance. If this
assumption is no longer valid (i.e. there's good data saying doing
checksums in GRO path is a benefit) then all the checksum parts of
this conditional should be removed.

Tom

>                 goto out;
> @@ -271,6 +271,10 @@ struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb,
>         if (!sk || !udp_sk(sk)->gro_receive)
>                 goto out_unlock;
>
> +       if (!uh->check && skb->protocol == cpu_to_be16(ETH_P_IPV6) &&
> +           !udp_sk(sk)->no_check6_rx)
> +               goto out_unlock;
> +
>         flush = 0;
>
>         for (p = *head; p; p = p->next) {
> --
> 1.8.3.1
>
Paolo Abeni July 11, 2016, 1:21 p.m. UTC | #5
On Fri, 2016-07-08 at 16:03 -0500, Tom Herbert wrote:
> On Thu, Jul 7, 2016 at 10:58 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> > currently, UDP packets with zero checksum are not allowed to
> > use udp offload's GRO. This patch admits such packets to
> > GRO, if the related socket settings allow it: ipv6 packets
> > are not admitted if the sockets don't have the no_check6_rx
> > flag set.
> >
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> >  net/ipv4/udp_offload.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> > index 9c37338..ac783f4 100644
> > --- a/net/ipv4/udp_offload.c
> > +++ b/net/ipv4/udp_offload.c
> > @@ -257,7 +257,7 @@ struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb,
> >         struct sock *sk;
> >
> >         if (NAPI_GRO_CB(skb)->encap_mark ||
> > -           (skb->ip_summed != CHECKSUM_PARTIAL &&
> > +           (uh->check && skb->ip_summed != CHECKSUM_PARTIAL &&
> >              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
> >              !NAPI_GRO_CB(skb)->csum_valid))
> 
> Paolo,
> 
> I think you might be misunderstanding the intent of this conditional.
> It is trying to deduce that that the inner TCP checksum has likely
> been validated or can be validated without doing  packet checksum
> calculation. This is trying to avoid doing host side checksum
> calculation in the GRO path and really has little to do with rather
> uh->check is zero or not. The assumption was that we shouldn't compute
> whole packet checksums in the GRO path because of performance. If this
> assumption is no longer valid (i.e. there's good data saying doing
> checksums in GRO path is a benefit) then all the checksum parts of
> this conditional should be removed.

Oh, my bad! I was hit by an ixgbe errata (82599 sometimes marks zero
checksum udp packets with CHECKSUM_NONE), so in my tests the above
condition was matched by 0 csum UDP packets. Than I misread csum_cnt
documentation, assuming it was not incremented for zero checksum UDP
packets: I thought that the matches I saw were due to !uh->check
instead of missing offload.

Thank you for the clarification,

Paolo
diff mbox

Patch

diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 9c37338..ac783f4 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -257,7 +257,7 @@  struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb,
 	struct sock *sk;
 
 	if (NAPI_GRO_CB(skb)->encap_mark ||
-	    (skb->ip_summed != CHECKSUM_PARTIAL &&
+	    (uh->check && skb->ip_summed != CHECKSUM_PARTIAL &&
 	     NAPI_GRO_CB(skb)->csum_cnt == 0 &&
 	     !NAPI_GRO_CB(skb)->csum_valid))
 		goto out;
@@ -271,6 +271,10 @@  struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb,
 	if (!sk || !udp_sk(sk)->gro_receive)
 		goto out_unlock;
 
+	if (!uh->check && skb->protocol == cpu_to_be16(ETH_P_IPV6) &&
+	    !udp_sk(sk)->no_check6_rx)
+		goto out_unlock;
+
 	flush = 0;
 
 	for (p = *head; p; p = p->next) {