diff mbox

[net-next] net-gre-gro: Fix a bug that breaks the forwarding path

Message ID 1405262780-20232-1-git-send-email-hkchu@google.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Jerry Chu July 13, 2014, 2:46 p.m. UTC
From: Jerry Chu <hkchu@google.com>

Fixed a bug that was introduced by my GRE-GRO patch
(bf5a755f5e9186406bbf50f4087100af5bd68e40 net-gre-gro: Add GRE
support to the GRO stack) that breaks the forwarding path
because various GSO related fields were not set. The bug will
cause on the egress path either the GSO code to fail, or a
GRE-TSO capable (NETIF_F_GSO_GRE) NICs to choke. The following
fix has been tested for both cases.

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
---
 net/core/dev.c           | 2 ++
 net/ipv4/af_inet.c       | 3 +++
 net/ipv4/gre_offload.c   | 3 +++
 net/ipv4/tcp_offload.c   | 2 +-
 net/ipv6/tcpv6_offload.c | 2 +-
 5 files changed, 10 insertions(+), 2 deletions(-)

Comments

Or Gerlitz July 14, 2014, 7:05 p.m. UTC | #1
On Sun, Jul 13, 2014 at 5:46 PM, H.K. Jerry Chu <hkchu@google.com> wrote:
>
> From: Jerry Chu <hkchu@google.com>
>
> Fixed a bug that was introduced by my GRE-GRO patch
> (bf5a755f5e9186406bbf50f4087100af5bd68e40 net-gre-gro: Add GRE
> support to the GRO stack) that breaks the forwarding path
> because various GSO related fields were not set. The bug will
> cause on the egress path either the GSO code to fail, or a
> GRE-TSO capable (NETIF_F_GSO_GRE) NICs to choke. The following
> fix has been tested for both cases.


Hi Jerry,

Anything different in this version vs. the one you posted earlier on
February or this is a plain re-post?

Or.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jerry Chu July 14, 2014, 7:16 p.m. UTC | #2
Hi Or,

On Mon, Jul 14, 2014 at 12:05 PM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
>
> On Sun, Jul 13, 2014 at 5:46 PM, H.K. Jerry Chu <hkchu@google.com> wrote:
> >
> > From: Jerry Chu <hkchu@google.com>
> >
> > Fixed a bug that was introduced by my GRE-GRO patch
> > (bf5a755f5e9186406bbf50f4087100af5bd68e40 net-gre-gro: Add GRE
> > support to the GRO stack) that breaks the forwarding path
> > because various GSO related fields were not set. The bug will
> > cause on the egress path either the GSO code to fail, or a
> > GRE-TSO capable (NETIF_F_GSO_GRE) NICs to choke. The following
> > fix has been tested for both cases.
>
>
> Hi Jerry,
>
> Anything different in this version vs. the one you posted earlier on
> February or this is a plain re-post?

I simply moved the patch against the latest net-next and resolved some small
conflict so yes it's pretty much the same. Also I don't see the subsequent
discussion on skb->encapsulation affects the validity of this patch so
i'm resubmitting
it. Also the patch has been confirmed to address the problem Wolfgang reported
last week. Feel free to test against the configuration (VXLAN?) you
had some question about earlier.

Thanks,

Jerry

>
>
> Or.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Or Gerlitz July 14, 2014, 8:38 p.m. UTC | #3
On Mon, Jul 14, 2014 at 10:16 PM, Jerry Chu <hkchu@google.com> wrote:
> On Mon, Jul 14, 2014 at 12:05 PM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
>> On Sun, Jul 13, 2014 at 5:46 PM, H.K. Jerry Chu <hkchu@google.com> wrote:
>>> Fixed a bug that was introduced by my GRE-GRO patch
>>> (bf5a755f5e9186406bbf50f4087100af5bd68e40 net-gre-gro: Add GRE
>>> support to the GRO stack) that breaks the forwarding path
>>> because various GSO related fields were not set. The bug will
>>> cause on the egress path either the GSO code to fail, or a
>>> GRE-TSO capable (NETIF_F_GSO_GRE) NICs to choke. The following
>>> fix has been tested for both cases.

>> Anything different in this version vs. the one you posted earlier on
>> February or this is a plain re-post?

> I simply moved the patch against the latest net-next and resolved some small
> conflict so yes it's pretty much the same.

OK, got it.

> Also I don't see the subsequent
> discussion on skb->encapsulation affects the validity of this patch so
> i'm resubmitting it. Also the patch has been confirmed to address the problem
> Wolfgang reported last week. Feel free to test against the configuration (VXLAN?)
> you had some question about earlier.

Well, re-reading that thread, we were not very decisive there... my
comment of setting the inner network header twice in inet_gro_complete
doesn't apply only to vxlan but to other tunneling protocols. Also, if
we really need (why do we? or explained it on the Feb thread) to set
skb->encapsulation for the sake of TX in a protocol (GRE) gro
complete, looks a bit fishy to me... but saying this I think brings us
back to that incomplete discussion [1]  sounds as this needs some
plumbering... Still, this way or another I understand a regression was
introduced here and should be fixed.

Or.

[1] http://marc.info/?t=139353642700003&r=1&w=2
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jerry Chu July 14, 2014, 9:30 p.m. UTC | #4
On Mon, Jul 14, 2014 at 1:38 PM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
> On Mon, Jul 14, 2014 at 10:16 PM, Jerry Chu <hkchu@google.com> wrote:
>> On Mon, Jul 14, 2014 at 12:05 PM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
>>> On Sun, Jul 13, 2014 at 5:46 PM, H.K. Jerry Chu <hkchu@google.com> wrote:
>>>> Fixed a bug that was introduced by my GRE-GRO patch
>>>> (bf5a755f5e9186406bbf50f4087100af5bd68e40 net-gre-gro: Add GRE
>>>> support to the GRO stack) that breaks the forwarding path
>>>> because various GSO related fields were not set. The bug will
>>>> cause on the egress path either the GSO code to fail, or a
>>>> GRE-TSO capable (NETIF_F_GSO_GRE) NICs to choke. The following
>>>> fix has been tested for both cases.
>
>>> Anything different in this version vs. the one you posted earlier on
>>> February or this is a plain re-post?
>
>> I simply moved the patch against the latest net-next and resolved some small
>> conflict so yes it's pretty much the same.
>
> OK, got it.
>
>> Also I don't see the subsequent
>> discussion on skb->encapsulation affects the validity of this patch so
>> i'm resubmitting it. Also the patch has been confirmed to address the problem
>> Wolfgang reported last week. Feel free to test against the configuration (VXLAN?)
>> you had some question about earlier.
>
> Well, re-reading that thread, we were not very decisive there... my
> comment of setting the inner network header twice in inet_gro_complete
> doesn't apply only to vxlan but to other tunneling protocols.

Understood, but my reply about it applying to the inner most hdr in the end
is not limited to vxlan either.

> Also, if
> we really need (why do we? or explained it on the Feb thread) to set
> skb->encapsulation for the sake of TX in a protocol (GRE) gro
> complete, looks a bit fishy to me...

Why does it look fishy? When h/w support sLRO on tunneled pkts
driver will set that bit. So it seems very reasonable for the GRO stack
that supports tunneled pkts to set that bit too.

> but saying this I think brings us
> back to that incomplete discussion [1]  sounds as this needs some
> plumbering... Still, this way or another I understand a regression was
> introduced here and should be fixed.

Yep. I thought we've made reasonable progress on the skb->encapsualtion
even though we can't close all the questions/issues (but we have a real
bug that needs to be fixed here).

Jerry

>
> Or.
>
> [1] http://marc.info/?t=139353642700003&r=1&w=2
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller July 14, 2014, 10:05 p.m. UTC | #5
From: "H.K. Jerry Chu" <hkchu@google.com>
Date: Sun, 13 Jul 2014 07:46:20 -0700

> From: Jerry Chu <hkchu@google.com>
> 
> Fixed a bug that was introduced by my GRE-GRO patch
> (bf5a755f5e9186406bbf50f4087100af5bd68e40 net-gre-gro: Add GRE
> support to the GRO stack) that breaks the forwarding path
> because various GSO related fields were not set. The bug will
> cause on the egress path either the GSO code to fail, or a
> GRE-TSO capable (NETIF_F_GSO_GRE) NICs to choke. The following
> fix has been tested for both cases.
> 
> Signed-off-by: H.K. Jerry Chu <hkchu@google.com>

As a regression fix, this should be targetted at 'net' rather than
'net-next' so I can queue it up for -stable too.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jerry Chu July 14, 2014, 10:18 p.m. UTC | #6
On Mon, Jul 14, 2014 at 3:05 PM, David Miller <davem@davemloft.net> wrote:
>
> From: "H.K. Jerry Chu" <hkchu@google.com>
> Date: Sun, 13 Jul 2014 07:46:20 -0700
>
> > From: Jerry Chu <hkchu@google.com>
> >
> > Fixed a bug that was introduced by my GRE-GRO patch
> > (bf5a755f5e9186406bbf50f4087100af5bd68e40 net-gre-gro: Add GRE
> > support to the GRO stack) that breaks the forwarding path
> > because various GSO related fields were not set. The bug will
> > cause on the egress path either the GSO code to fail, or a
> > GRE-TSO capable (NETIF_F_GSO_GRE) NICs to choke. The following
> > fix has been tested for both cases.
> >
> > Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
>
> As a regression fix, this should be targetted at 'net' rather than
> 'net-next' so I can queue it up for -stable too.

Ok, will submit a separate patch for 'net' soon.

Thanks,

Jerry

>
> Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Or Gerlitz July 15, 2014, 2:06 p.m. UTC | #7
On Tue, Jul 15, 2014 at 12:30 AM, Jerry Chu <hkchu@google.com> wrote:
>
> On Mon, Jul 14, 2014 at 1:38 PM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
> > On Mon, Jul 14, 2014 at 10:16 PM, Jerry Chu <hkchu@google.com> wrote:
> >> On Mon, Jul 14, 2014 at 12:05 PM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
> >>> On Sun, Jul 13, 2014 at 5:46 PM, H.K. Jerry Chu <hkchu@google.com> wrote:
> >>>> Fixed a bug that was introduced by my GRE-GRO patch
> >>>> (bf5a755f5e9186406bbf50f4087100af5bd68e40 net-gre-gro: Add GRE
> >>>> support to the GRO stack) that breaks the forwarding path
> >>>> because various GSO related fields were not set. The bug will
> >>>> cause on the egress path either the GSO code to fail, or a
> >>>> GRE-TSO capable (NETIF_F_GSO_GRE) NICs to choke. The following
> >>>> fix has been tested for both cases.
> >
> >>> Anything different in this version vs. the one you posted earlier on
> >>> February or this is a plain re-post?
> >
> >> I simply moved the patch against the latest net-next and resolved some small
> >> conflict so yes it's pretty much the same.
> >
> > OK, got it.
> >
> >> Also I don't see the subsequent
> >> discussion on skb->encapsulation affects the validity of this patch so
> >> i'm resubmitting it. Also the patch has been confirmed to address the problem
> >> Wolfgang reported last week. Feel free to test against the configuration (VXLAN?)
> >> you had some question about earlier.
> >
> > Well, re-reading that thread, we were not very decisive there... my
> > comment of setting the inner network header twice in inet_gro_complete
> > doesn't apply only to vxlan but to other tunneling protocols.
>
> Understood, but my reply about it applying to the inner most hdr in the end
> is not limited to vxlan either.
>
> > Also, if
> > we really need (why do we? or explained it on the Feb thread) to set
> > skb->encapsulation for the sake of TX in a protocol (GRE) gro
> > complete, looks a bit fishy to me...
>
> Why does it look fishy? When h/w support sLRO on tunneled pkts
> driver will set that bit. So it seems very reasonable for the GRO stack
> that supports tunneled pkts to set that bit too.


what do you mean by "h/w support sLRO on tunneled pkts" here? as far
as I understand, the driver should set the bit when they know it's an
encapsulated
packet for which the HW offloaded checksum verification.

So my fishiness comment was referring to the fact that we see here
that the stack
is setting this bit too sometimes.

> > but saying this I think brings us
> > back to that incomplete discussion [1]  sounds as this needs some
> > plumbering... Still, this way or another I understand a regression was
> > introduced here and should be fixed.

> Yep. I thought we've made reasonable progress on the skb->encapsualtion
> even though we can't close all the questions/issues

where/how exactly (or more or less) this progress comes into play, is it
by the description exchanged in the emails sent over the Feb/Mar thread
or in Tom's 3.15/16/17 patches?

> (but we have a real bug that needs to be fixed here).

agree...
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jerry Chu July 15, 2014, 5:17 p.m. UTC | #8
On Tue, Jul 15, 2014 at 7:06 AM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
> On Tue, Jul 15, 2014 at 12:30 AM, Jerry Chu <hkchu@google.com> wrote:
>>
>> On Mon, Jul 14, 2014 at 1:38 PM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
>> > On Mon, Jul 14, 2014 at 10:16 PM, Jerry Chu <hkchu@google.com> wrote:
>> >> On Mon, Jul 14, 2014 at 12:05 PM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
>> >>> On Sun, Jul 13, 2014 at 5:46 PM, H.K. Jerry Chu <hkchu@google.com> wrote:
>> >>>> Fixed a bug that was introduced by my GRE-GRO patch
>> >>>> (bf5a755f5e9186406bbf50f4087100af5bd68e40 net-gre-gro: Add GRE
>> >>>> support to the GRO stack) that breaks the forwarding path
>> >>>> because various GSO related fields were not set. The bug will
>> >>>> cause on the egress path either the GSO code to fail, or a
>> >>>> GRE-TSO capable (NETIF_F_GSO_GRE) NICs to choke. The following
>> >>>> fix has been tested for both cases.
>> >
>> >>> Anything different in this version vs. the one you posted earlier on
>> >>> February or this is a plain re-post?
>> >
>> >> I simply moved the patch against the latest net-next and resolved some small
>> >> conflict so yes it's pretty much the same.
>> >
>> > OK, got it.
>> >
>> >> Also I don't see the subsequent
>> >> discussion on skb->encapsulation affects the validity of this patch so
>> >> i'm resubmitting it. Also the patch has been confirmed to address the problem
>> >> Wolfgang reported last week. Feel free to test against the configuration (VXLAN?)
>> >> you had some question about earlier.
>> >
>> > Well, re-reading that thread, we were not very decisive there... my
>> > comment of setting the inner network header twice in inet_gro_complete
>> > doesn't apply only to vxlan but to other tunneling protocols.
>>
>> Understood, but my reply about it applying to the inner most hdr in the end
>> is not limited to vxlan either.
>>
>> > Also, if
>> > we really need (why do we? or explained it on the Feb thread) to set
>> > skb->encapsulation for the sake of TX in a protocol (GRE) gro
>> > complete, looks a bit fishy to me...
>>
>> Why does it look fishy? When h/w support sLRO on tunneled pkts
>> driver will set that bit. So it seems very reasonable for the GRO stack
>> that supports tunneled pkts to set that bit too.
>
>
> what do you mean by "h/w support sLRO on tunneled pkts" here? as far
> as I understand, the driver should set the bit when they know it's an
> encapsulated
> packet for which the HW offloaded checksum verification.
>
> So my fishiness comment was referring to the fact that we see here
> that the stack
> is setting this bit too sometimes.

There are many other places in the "stack" that will set that bit if you
grep it. I think many of them are there to make GSO work on tunneled pkts.
The code here is no different.

>
>> > but saying this I think brings us
>> > back to that incomplete discussion [1]  sounds as this needs some
>> > plumbering... Still, this way or another I understand a regression was
>> > introduced here and should be fixed.
>
>> Yep. I thought we've made reasonable progress on the skb->encapsualtion
>> even though we can't close all the questions/issues
>
> where/how exactly (or more or less) this progress comes into play, is it
> by the description exchanged in the emails sent over the Feb/Mar thread
> or in Tom's 3.15/16/17 patches?

Davem wanted a more precise definition of skb->encapsulation and I thought
some worthy discussion has happened as part of the thread of the
original patch (even though it left with some more questions but
that's often the case
with complex kernel code, right?)

>
>> (but we have a real bug that needs to be fixed here).
>
> agree...
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Or Gerlitz July 15, 2014, 6:23 p.m. UTC | #9
On Tue, Jul 15, 2014 at 8:17 PM, Jerry Chu <hkchu@google.com> wrote:
> On Tue, Jul 15, 2014 at 7:06 AM, Or Gerlitz <or.gerlitz@gmail.com> wrote:

>> what do you mean by "h/w support sLRO on tunneled pkts" here? as far
>> as I understand, the driver should set the bit when they know it's an
>> encapsulated  packet for which the HW offloaded checksum verification.

can you clarify the sLRO comment?

>> So my fishiness comment was referring to the fact that we see here
>> that the stack is setting this bit too sometimes.

> There are many other places in the "stack" that will set that bit if you
> grep it. I think many of them are there to make GSO work on tunneled pkts.
> The code here is no different.

I am not near the code now, but AFAIK, the "stack" sets it in the TX
path and the driver sets it in the RX path, any deviation you see
there for this convension except for the change introduced by this
patch.

> Davem wanted a more precise definition of skb->encapsulation and I thought
> some worthy discussion has happened as part of the thread of the
> original patch (even though it left with some more questions but
> that's often the case with complex kernel code, right?)


complex as it may be, if the design is valid (e.g doesn't contradict
itself, etc...) it should be subject to proper documentation and
implementation. So in that repspect, I read Dave's comment as saying
-- guys, this isn't the case with that skb encapsulation bit.

For a nearby example, see the documentation of the semantic of the
different checksum values (e.g none/unnecessary/complete/
partial) that the stack and drivers are setting, it wasn't very
clear/percise since maybe kernel 2.4 and only over one of the last
kernel cycles on Dec 2013 this patch made it clear "net: skbuff:
improve comment on checksumming"
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tom Herbert July 15, 2014, 6:44 p.m. UTC | #10
On Tue, Jul 15, 2014 at 10:17 AM, Jerry Chu <hkchu@google.com> wrote:
> On Tue, Jul 15, 2014 at 7:06 AM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
>> On Tue, Jul 15, 2014 at 12:30 AM, Jerry Chu <hkchu@google.com> wrote:
>>>
>>> On Mon, Jul 14, 2014 at 1:38 PM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
>>> > On Mon, Jul 14, 2014 at 10:16 PM, Jerry Chu <hkchu@google.com> wrote:
>>> >> On Mon, Jul 14, 2014 at 12:05 PM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
>>> >>> On Sun, Jul 13, 2014 at 5:46 PM, H.K. Jerry Chu <hkchu@google.com> wrote:
>>> >>>> Fixed a bug that was introduced by my GRE-GRO patch
>>> >>>> (bf5a755f5e9186406bbf50f4087100af5bd68e40 net-gre-gro: Add GRE
>>> >>>> support to the GRO stack) that breaks the forwarding path
>>> >>>> because various GSO related fields were not set. The bug will
>>> >>>> cause on the egress path either the GSO code to fail, or a
>>> >>>> GRE-TSO capable (NETIF_F_GSO_GRE) NICs to choke. The following
>>> >>>> fix has been tested for both cases.
>>> >
>>> >>> Anything different in this version vs. the one you posted earlier on
>>> >>> February or this is a plain re-post?
>>> >
>>> >> I simply moved the patch against the latest net-next and resolved some small
>>> >> conflict so yes it's pretty much the same.
>>> >
>>> > OK, got it.
>>> >
>>> >> Also I don't see the subsequent
>>> >> discussion on skb->encapsulation affects the validity of this patch so
>>> >> i'm resubmitting it. Also the patch has been confirmed to address the problem
>>> >> Wolfgang reported last week. Feel free to test against the configuration (VXLAN?)
>>> >> you had some question about earlier.
>>> >
>>> > Well, re-reading that thread, we were not very decisive there... my
>>> > comment of setting the inner network header twice in inet_gro_complete
>>> > doesn't apply only to vxlan but to other tunneling protocols.
>>>
>>> Understood, but my reply about it applying to the inner most hdr in the end
>>> is not limited to vxlan either.
>>>
>>> > Also, if
>>> > we really need (why do we? or explained it on the Feb thread) to set
>>> > skb->encapsulation for the sake of TX in a protocol (GRE) gro
>>> > complete, looks a bit fishy to me...
>>>
>>> Why does it look fishy? When h/w support sLRO on tunneled pkts
>>> driver will set that bit. So it seems very reasonable for the GRO stack
>>> that supports tunneled pkts to set that bit too.
>>
>>
>> what do you mean by "h/w support sLRO on tunneled pkts" here? as far
>> as I understand, the driver should set the bit when they know it's an
>> encapsulated
>> packet for which the HW offloaded checksum verification.
>>
>> So my fishiness comment was referring to the fact that we see here
>> that the stack
>> is setting this bit too sometimes.
>
> There are many other places in the "stack" that will set that bit if you
> grep it. I think many of them are there to make GSO work on tunneled pkts.
> The code here is no different.
>
>>
>>> > but saying this I think brings us
>>> > back to that incomplete discussion [1]  sounds as this needs some
>>> > plumbering... Still, this way or another I understand a regression was
>>> > introduced here and should be fixed.
>>
>>> Yep. I thought we've made reasonable progress on the skb->encapsualtion
>>> even though we can't close all the questions/issues
>>
>> where/how exactly (or more or less) this progress comes into play, is it
>> by the description exchanged in the emails sent over the Feb/Mar thread
>> or in Tom's 3.15/16/17 patches?
>
> Davem wanted a more precise definition of skb->encapsulation and I thought
> some worthy discussion has happened as part of the thread of the
> original patch (even though it left with some more questions but
> that's often the case
> with complex kernel code, right?)
>
Unfortunately, skb->encapsulation is overloaded with different
meanings in TX and RX path. I think the invariant should be that if it
is set, the inner headers are valid, so this is currently only useful
on TX. For Rx, where it's used to indicate inner checksum was
validated, I intend to a add a separate field in skbuf (probably two
bits to allow for multiple level encapsulations).

>>
>>> (but we have a real bug that needs to be fixed here).
>>
>> agree...
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jerry Chu July 15, 2014, 8:22 p.m. UTC | #11
On Tue, Jul 15, 2014 at 11:21 AM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
> On Tue, Jul 15, 2014 at 8:17 PM, Jerry Chu <hkchu@google.com> wrote:
>>
>> On Tue, Jul 15, 2014 at 7:06 AM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
>> > On Tue, Jul 15, 2014 at 12:30 AM, Jerry Chu <hkchu@google.com> wrote:
>> >>
>> >> On Mon, Jul 14, 2014 at 1:38 PM, Or Gerlitz <or.gerlitz@gmail.com>
>> >> wrote:
>> >> > On Mon, Jul 14, 2014 at 10:16 PM, Jerry Chu <hkchu@google.com> wrote:
>> >> >> On Mon, Jul 14, 2014 at 12:05 PM, Or Gerlitz <or.gerlitz@gmail.com>
>> >> >> wrote:
>> >> >>> On Sun, Jul 13, 2014 at 5:46 PM, H.K. Jerry Chu <hkchu@google.com>
>> >> >>> wrote:
>> >> >>>> Fixed a bug that was introduced by my GRE-GRO patch
>> >> >>>> (bf5a755f5e9186406bbf50f4087100af5bd68e40 net-gre-gro: Add GRE
>> >> >>>> support to the GRO stack) that breaks the forwarding path
>> >> >>>> because various GSO related fields were not set. The bug will
>> >> >>>> cause on the egress path either the GSO code to fail, or a
>> >> >>>> GRE-TSO capable (NETIF_F_GSO_GRE) NICs to choke. The following
>> >> >>>> fix has been tested for both cases.
>> >> >
>> >> >>> Anything different in this version vs. the one you posted earlier
>> >> >>> on
>> >> >>> February or this is a plain re-post?
>> >> >
>> >> >> I simply moved the patch against the latest net-next and resolved
>> >> >> some small
>> >> >> conflict so yes it's pretty much the same.
>> >> >
>> >> > OK, got it.
>> >> >
>> >> >> Also I don't see the subsequent
>> >> >> discussion on skb->encapsulation affects the validity of this patch
>> >> >> so
>> >> >> i'm resubmitting it. Also the patch has been confirmed to address
>> >> >> the problem
>> >> >> Wolfgang reported last week. Feel free to test against the
>> >> >> configuration (VXLAN?)
>> >> >> you had some question about earlier.
>> >> >
>> >> > Well, re-reading that thread, we were not very decisive there... my
>> >> > comment of setting the inner network header twice in
>> >> > inet_gro_complete
>> >> > doesn't apply only to vxlan but to other tunneling protocols.
>> >>
>> >> Understood, but my reply about it applying to the inner most hdr in the
>> >> end
>> >> is not limited to vxlan either.
>> >>
>> >> > Also, if
>> >> > we really need (why do we? or explained it on the Feb thread) to set
>> >> > skb->encapsulation for the sake of TX in a protocol (GRE) gro
>> >> > complete, looks a bit fishy to me...
>> >>
>> >> Why does it look fishy? When h/w support sLRO on tunneled pkts
>> >> driver will set that bit. So it seems very reasonable for the GRO stack
>> >> that supports tunneled pkts to set that bit too.
>> >
>> >
>> > what do you mean by "h/w support sLRO on tunneled pkts" here? as far
>> > as I understand, the driver should set the bit when they know it's an
>> > encapsulated
>> > packet for which the HW offloaded checksum verification.
>> >
>> > So my fishiness comment was referring to the fact that we see here
>> > that the stack
>> > is setting this bit too sometimes.
>>
>> There are many other places in the "stack" that will set that bit if you
>> grep it. I think many of them are there to make GSO work on tunneled pkts.
>> The code here is no different.
>
>
> I am not near the code now, but AFAIK, the "stack" sets it in the TX path
> and the driver sets it in the RX path, any deviation you see there for this
> convension except for the change introduced by this patch.

Where does the forwarding path (e.g., the case at hand) belong then?

AFAICT the current dev_hard_start_xmit() code pretty much requires
skb->encapsulation to be set on all tunneled pkts in order for the proper
TX offload to be possible.

>
>
>>
>>
>> >
>> >> > but saying this I think brings us
>> >> > back to that incomplete discussion [1]  sounds as this needs some
>> >> > plumbering... Still, this way or another I understand a regression
>> >> > was
>> >> > introduced here and should be fixed.
>> >
>> >> Yep. I thought we've made reasonable progress on the skb->encapsualtion
>> >> even though we can't close all the questions/issues
>> >
>> > where/how exactly (or more or less) this progress comes into play, is it
>> > by the description exchanged in the emails sent over the Feb/Mar thread
>> > or in Tom's 3.15/16/17 patches?
>>
>> Davem wanted a more precise definition of skb->encapsulation and I thought
>> some worthy discussion has happened as part of the thread of the
>> original patch (even though it left with some more questions but
>> that's often the case with complex kernel code, right?)
>
>
> complex as it may be, if the design is valid (e.g doesn't contradict itself,
> etc...) it should be subject to proper documentation and implementation. So
> in that repspect, I read Dave's comment as saying -- guys, this isn't the
> case with that skb encapsulation bit.
>
> For a nearby example, see the documentation of the semantic of the different
> checksum values (e.g none/unnecessary/complete/partial) that the stack and
> drivers are setting, it wasn't very clear/percise since maybe kernel 2.4 and
> only over one of the last kernel cycles on Dec 2013 this patch made it clear
> "net: skbuff: improve comment on checksumming"
>
> Or.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Or Gerlitz July 16, 2014, 12:12 p.m. UTC | #12
On Tue, Jul 15, 2014 at 9:44 PM, Tom Herbert <therbert@google.com> wrote:
> Unfortunately, skb->encapsulation is overloaded with different
> meanings in TX and RX path. I think the invariant should be that if it
> is set, the inner headers are valid, so this is currently only useful
> on TX. For Rx, where it's used to indicate inner checksum was validated

I think what Jerry is trying to say here is that the TX part of the
forwarding path uses some pre-knowledge
from the RX part which sets some fields

> I intend to a add a separate field in skbuf (probably two
> bits to allow for multiple level encapsulations).
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Or Gerlitz July 16, 2014, 12:15 p.m. UTC | #13
On Tue, Jul 15, 2014 at 11:22 PM, Jerry Chu <hkchu@google.com> wrote:
> On Tue, Jul 15, 2014 at 11:21 AM, Or Gerlitz <or.gerlitz@gmail.com> wrote:

>> I am not near the code now, but AFAIK, the "stack" sets it in the TX path
>> and the driver sets it in the RX path, any deviation you see there for this
>> convension except for the change introduced by this patch.

> Where does the forwarding path (e.g., the case at hand) belong then?

yep, so forwarding flow is something like

HW --> driver RX --> stack RX --> some stack processing --> stack TX
--> driver TX --> HW

and this is different from a path of

application --> stack TX --> driver TX --> HW

It seems as of even before we throw in the tunneling thing, the GRO stack
which is part of that "stack RX" code I mentioned above sets the
gso_type of SKBs
for the sake of forwarding  so these things already happen.

> AFAICT the current dev_hard_start_xmit() code pretty much requires
> skb->encapsulation to be set on all tunneled pkts in order for the proper
> TX offload to be possible.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 6e2a2cd..d603c14 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4089,6 +4089,8 @@  static void napi_reuse_skb(struct napi_struct *napi, struct sk_buff *skb)
 	skb->vlan_tci = 0;
 	skb->dev = napi->dev;
 	skb->skb_iif = 0;
+	skb->encapsulation = 0;
+	skb_shinfo(skb)->gso_type = 0;
 	skb->truesize = SKB_TRUESIZE(skb_end_offset(skb));
 
 	napi->skb = skb;
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index d5e6836..d156b3c 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1429,6 +1429,9 @@  static int inet_gro_complete(struct sk_buff *skb, int nhoff)
 	int proto = iph->protocol;
 	int err = -ENOSYS;
 
+	if (skb->encapsulation)
+		skb_set_inner_network_header(skb, nhoff);
+
 	csum_replace2(&iph->check, iph->tot_len, newlen);
 	iph->tot_len = newlen;
 
diff --git a/net/ipv4/gre_offload.c b/net/ipv4/gre_offload.c
index eb92deb..f0bdd47 100644
--- a/net/ipv4/gre_offload.c
+++ b/net/ipv4/gre_offload.c
@@ -263,6 +263,9 @@  static int gre_gro_complete(struct sk_buff *skb, int nhoff)
 	int err = -ENOENT;
 	__be16 type;
 
+	skb->encapsulation = 1;
+	skb_shinfo(skb)->gso_type = SKB_GSO_GRE;
+
 	type = greh->protocol;
 	if (greh->flags & GRE_KEY)
 		grehlen += GRE_HEADER_SECTION;
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index 4e86c59..55046ec 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -309,7 +309,7 @@  static int tcp4_gro_complete(struct sk_buff *skb, int thoff)
 
 	th->check = ~tcp_v4_check(skb->len - thoff, iph->saddr,
 				  iph->daddr, 0);
-	skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
+	skb_shinfo(skb)->gso_type |= SKB_GSO_TCPV4;
 
 	return tcp_gro_complete(skb);
 }
diff --git a/net/ipv6/tcpv6_offload.c b/net/ipv6/tcpv6_offload.c
index 8517d3c..01b0ff9 100644
--- a/net/ipv6/tcpv6_offload.c
+++ b/net/ipv6/tcpv6_offload.c
@@ -73,7 +73,7 @@  static int tcp6_gro_complete(struct sk_buff *skb, int thoff)
 
 	th->check = ~tcp_v6_check(skb->len - thoff, &iph->saddr,
 				  &iph->daddr, 0);
-	skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
+	skb_shinfo(skb)->gso_type |= SKB_GSO_TCPV6;
 
 	return tcp_gro_complete(skb);
 }