diff mbox series

[net-next] net: udp Allow CHECKSUM_UNNECESSARY packets to do GRO.

Message ID 1548214428-114642-1-git-send-email-maowenan@huawei.com
State Changes Requested
Delegated to: David Miller
Headers show
Series [net-next] net: udp Allow CHECKSUM_UNNECESSARY packets to do GRO. | expand

Commit Message

maowenan Jan. 23, 2019, 3:33 a.m. UTC
When udp4_gro_receive() get one packet that uh->check=0,
skb_gro_checksum_validate_zero_check() will set the
skb->ip_summed = CHECKSUM_UNNECESSARY;
skb->csum_level = 0;
Then udp_gro_receive() will flush the packet which is not CHECKSUM_PARTIAL,
It is not our expect,  because check=0 in udp header indicates this
packet is no need to caculate checksum, we should go further to do GRO.

This patch changes the value of csum_cnt according to skb->csum_level.
---
 include/linux/netdevice.h | 1 +
 1 file changed, 1 insertion(+)

Comments

maowenan Jan. 29, 2019, 3 a.m. UTC | #1
Hi all,
Do you have any comments about this change?


On 2019/1/23 11:33, Mao Wenan wrote:
> When udp4_gro_receive() get one packet that uh->check=0,
> skb_gro_checksum_validate_zero_check() will set the
> skb->ip_summed = CHECKSUM_UNNECESSARY;
> skb->csum_level = 0;
> Then udp_gro_receive() will flush the packet which is not CHECKSUM_PARTIAL,
> It is not our expect,  because check=0 in udp header indicates this
> packet is no need to caculate checksum, we should go further to do GRO.
> 
> This patch changes the value of csum_cnt according to skb->csum_level.
> ---
>  include/linux/netdevice.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 1377d08..9c819f1 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2764,6 +2764,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
>  		 * during GRO. This saves work if we fallback to normal path.
>  		 */
>  		__skb_incr_checksum_unnecessary(skb);
> +		NAPI_GRO_CB(skb)->csum_cnt = skb->csum_level + 1;
>  	}
>  }
>  
>
Tom Herbert Jan. 29, 2019, 4:01 a.m. UTC | #2
On Mon, Jan 28, 2019 at 7:00 PM maowenan <maowenan@huawei.com> wrote:
>
> Hi all,
> Do you have any comments about this change?
>
>
> On 2019/1/23 11:33, Mao Wenan wrote:
> > When udp4_gro_receive() get one packet that uh->check=0,
> > skb_gro_checksum_validate_zero_check() will set the
> > skb->ip_summed = CHECKSUM_UNNECESSARY;
> > skb->csum_level = 0;
> > Then udp_gro_receive() will flush the packet which is not CHECKSUM_PARTIAL,
> > It is not our expect,  because check=0 in udp header indicates this
> > packet is no need to caculate checksum, we should go further to do GRO.
> >
> > This patch changes the value of csum_cnt according to skb->csum_level.
> > ---
> >  include/linux/netdevice.h | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 1377d08..9c819f1 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -2764,6 +2764,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
> >                * during GRO. This saves work if we fallback to normal path.
> >                */
> >               __skb_incr_checksum_unnecessary(skb);
> > +             NAPI_GRO_CB(skb)->csum_cnt = skb->csum_level + 1;

That doesn't look right. This would be reinitializing the GRO
checksums from the beginning.

> >       }
> >  }
> >
> >
>
I assume the code is bailing on this conditional:

if (NAPI_GRO_CB(skb)->encap_mark ||
            (skb->ip_summed != CHECKSUM_PARTIAL &&
             NAPI_GRO_CB(skb)->csum_cnt == 0 &&
             !NAPI_GRO_CB(skb)->csum_valid) ||
            !udp_sk(sk)->gro_receive)
                goto out_unlock;

I am trying to remember why this needs to check csum_cnt. If there was
a csum_cnt for the UDP csum being zero from checksum-unnecessary, it
was consumed by skb_gro_checksum_validate_zero_check in UDP4 GRO
received.
maowenan Jan. 29, 2019, 6:04 a.m. UTC | #3
On 2019/1/29 12:01, Tom Herbert wrote:
> On Mon, Jan 28, 2019 at 7:00 PM maowenan <maowenan@huawei.com> wrote:
>>
>> Hi all,
>> Do you have any comments about this change?
>>
>>
>> On 2019/1/23 11:33, Mao Wenan wrote:
>>> When udp4_gro_receive() get one packet that uh->check=0,
>>> skb_gro_checksum_validate_zero_check() will set the
>>> skb->ip_summed = CHECKSUM_UNNECESSARY;
>>> skb->csum_level = 0;
>>> Then udp_gro_receive() will flush the packet which is not CHECKSUM_PARTIAL,
>>> It is not our expect,  because check=0 in udp header indicates this
>>> packet is no need to caculate checksum, we should go further to do GRO.
>>>
>>> This patch changes the value of csum_cnt according to skb->csum_level.
>>> ---
>>>  include/linux/netdevice.h | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>> index 1377d08..9c819f1 100644
>>> --- a/include/linux/netdevice.h
>>> +++ b/include/linux/netdevice.h
>>> @@ -2764,6 +2764,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
>>>                * during GRO. This saves work if we fallback to normal path.
>>>                */
>>>               __skb_incr_checksum_unnecessary(skb);
>>> +             NAPI_GRO_CB(skb)->csum_cnt = skb->csum_level + 1;
> 
> That doesn't look right. This would be reinitializing the GRO
> checksums from the beginning.
> 
>>>       }
>>>  }
>>>
>>>
>>
> I assume the code is bailing on this conditional:
> 
> if (NAPI_GRO_CB(skb)->encap_mark ||
>             (skb->ip_summed != CHECKSUM_PARTIAL &&
>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>              !NAPI_GRO_CB(skb)->csum_valid) ||
>             !udp_sk(sk)->gro_receive)
>                 goto out_unlock;
> 
> I am trying to remember why this needs to check csum_cnt. If there was
> a csum_cnt for the UDP csum being zero from checksum-unnecessary, it
> was consumed by skb_gro_checksum_validate_zero_check in UDP4 GRO
> received.

We have met the scene about two VMs in different host with vxlan packets, when udp4_gro_receive receives
one packet with ip_summed=CHECKSUM_NONE,csum_cnt=0,csum_valid=0,and udp->check=0, then skb_gro_checksum_validate_zero_check()->
skb_gro_incr_csum_unnecessary() validate it and set ip_summed=CHECKSUM_UNNECESSARY,csum_level=0, but csum_cnt and csum_valid
keep zero value. Then it will be flushed in udp_gro_receive(), the codes as you have showed.

so I think it forgets to modify csum_cnt since csum_level is changed in skb_gro_incr_csum_unnecessary()->__skb_incr_checksum_unnecessary().

> 
> .
>
Tom Herbert Jan. 29, 2019, 6:24 a.m. UTC | #4
On Mon, Jan 28, 2019 at 10:04 PM maowenan <maowenan@huawei.com> wrote:
>
>
>
> On 2019/1/29 12:01, Tom Herbert wrote:
> > On Mon, Jan 28, 2019 at 7:00 PM maowenan <maowenan@huawei.com> wrote:
> >>
> >> Hi all,
> >> Do you have any comments about this change?
> >>
> >>
> >> On 2019/1/23 11:33, Mao Wenan wrote:
> >>> When udp4_gro_receive() get one packet that uh->check=0,
> >>> skb_gro_checksum_validate_zero_check() will set the
> >>> skb->ip_summed = CHECKSUM_UNNECESSARY;
> >>> skb->csum_level = 0;
> >>> Then udp_gro_receive() will flush the packet which is not CHECKSUM_PARTIAL,
> >>> It is not our expect,  because check=0 in udp header indicates this
> >>> packet is no need to caculate checksum, we should go further to do GRO.
> >>>
> >>> This patch changes the value of csum_cnt according to skb->csum_level.
> >>> ---
> >>>  include/linux/netdevice.h | 1 +
> >>>  1 file changed, 1 insertion(+)
> >>>
> >>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> >>> index 1377d08..9c819f1 100644
> >>> --- a/include/linux/netdevice.h
> >>> +++ b/include/linux/netdevice.h
> >>> @@ -2764,6 +2764,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
> >>>                * during GRO. This saves work if we fallback to normal path.
> >>>                */
> >>>               __skb_incr_checksum_unnecessary(skb);
> >>> +             NAPI_GRO_CB(skb)->csum_cnt = skb->csum_level + 1;
> >
> > That doesn't look right. This would be reinitializing the GRO
> > checksums from the beginning.
> >
> >>>       }
> >>>  }
> >>>
> >>>
> >>
> > I assume the code is bailing on this conditional:
> >
> > if (NAPI_GRO_CB(skb)->encap_mark ||
> >             (skb->ip_summed != CHECKSUM_PARTIAL &&
> >              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
> >              !NAPI_GRO_CB(skb)->csum_valid) ||
> >             !udp_sk(sk)->gro_receive)
> >                 goto out_unlock;
> >
> > I am trying to remember why this needs to check csum_cnt. If there was
> > a csum_cnt for the UDP csum being zero from checksum-unnecessary, it
> > was consumed by skb_gro_checksum_validate_zero_check in UDP4 GRO
> > received.
>
> We have met the scene about two VMs in different host with vxlan packets, when udp4_gro_receive receives
> one packet with ip_summed=CHECKSUM_NONE,csum_cnt=0,csum_valid=0,and udp->check=0, then skb_gro_checksum_validate_zero_check()->
> skb_gro_incr_csum_unnecessary() validate it and set ip_summed=CHECKSUM_UNNECESSARY,csum_level=0, but csum_cnt and csum_valid
> keep zero value. Then it will be flushed in udp_gro_receive(), the codes as you have showed.
>
> so I think it forgets to modify csum_cnt since csum_level is changed in skb_gro_incr_csum_unnecessary()->__skb_incr_checksum_unnecessary().
>
Yes, but the csum_level is changing since we've gone beyond the
checksums initially reported inc checksum-unnecessary. GRO csum_cnt is
initialized to skb->csum_level + 1 at the start of GRO processing.

If I recall, the rule is that UDP GRO requires at least one non-zero
checksum to be verified. The idea is that if we end up computing
packet checksums on the host for inner checksums like TCP during GRO,
then that's negating the performance benefits of GRO. Had UDP check
not been zero then we would do checksum unnecessary conversion and so
csum_valid would be set for the remainded of GRO processing. The
existing code is following the rule I believe, so this may be working
as intended.

Tom

> >
> > .
> >
>
maowenan Jan. 29, 2019, 8:08 a.m. UTC | #5
On 2019/1/29 14:24, Tom Herbert wrote:
> On Mon, Jan 28, 2019 at 10:04 PM maowenan <maowenan@huawei.com> wrote:
>>
>>
>>
>> On 2019/1/29 12:01, Tom Herbert wrote:
>>> On Mon, Jan 28, 2019 at 7:00 PM maowenan <maowenan@huawei.com> wrote:
>>>>
>>>> Hi all,
>>>> Do you have any comments about this change?
>>>>
>>>>
>>>> On 2019/1/23 11:33, Mao Wenan wrote:
>>>>> When udp4_gro_receive() get one packet that uh->check=0,
>>>>> skb_gro_checksum_validate_zero_check() will set the
>>>>> skb->ip_summed = CHECKSUM_UNNECESSARY;
>>>>> skb->csum_level = 0;
>>>>> Then udp_gro_receive() will flush the packet which is not CHECKSUM_PARTIAL,
>>>>> It is not our expect,  because check=0 in udp header indicates this
>>>>> packet is no need to caculate checksum, we should go further to do GRO.
>>>>>
>>>>> This patch changes the value of csum_cnt according to skb->csum_level.
>>>>> ---
>>>>>  include/linux/netdevice.h | 1 +
>>>>>  1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>>>> index 1377d08..9c819f1 100644
>>>>> --- a/include/linux/netdevice.h
>>>>> +++ b/include/linux/netdevice.h
>>>>> @@ -2764,6 +2764,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
>>>>>                * during GRO. This saves work if we fallback to normal path.
>>>>>                */
>>>>>               __skb_incr_checksum_unnecessary(skb);
>>>>> +             NAPI_GRO_CB(skb)->csum_cnt = skb->csum_level + 1;
>>>
>>> That doesn't look right. This would be reinitializing the GRO
>>> checksums from the beginning.
>>>
>>>>>       }
>>>>>  }
>>>>>
>>>>>
>>>>
>>> I assume the code is bailing on this conditional:
>>>
>>> if (NAPI_GRO_CB(skb)->encap_mark ||
>>>             (skb->ip_summed != CHECKSUM_PARTIAL &&
>>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>>              !NAPI_GRO_CB(skb)->csum_valid) ||
>>>             !udp_sk(sk)->gro_receive)
>>>                 goto out_unlock;
>>>
>>> I am trying to remember why this needs to check csum_cnt. If there was
>>> a csum_cnt for the UDP csum being zero from checksum-unnecessary, it
>>> was consumed by skb_gro_checksum_validate_zero_check in UDP4 GRO
>>> received.
>>
>> We have met the scene about two VMs in different host with vxlan packets, when udp4_gro_receive receives
>> one packet with ip_summed=CHECKSUM_NONE,csum_cnt=0,csum_valid=0,and udp->check=0, then skb_gro_checksum_validate_zero_check()->
>> skb_gro_incr_csum_unnecessary() validate it and set ip_summed=CHECKSUM_UNNECESSARY,csum_level=0, but csum_cnt and csum_valid
>> keep zero value. Then it will be flushed in udp_gro_receive(), the codes as you have showed.
>>
>> so I think it forgets to modify csum_cnt since csum_level is changed in skb_gro_incr_csum_unnecessary()->__skb_incr_checksum_unnecessary().
>>
> Yes, but the csum_level is changing since we've gone beyond the
> checksums initially reported inc checksum-unnecessary. GRO csum_cnt is
> initialized to skb->csum_level + 1 at the start of GRO processing.
> 
> If I recall, the rule is that UDP GRO requires at least one non-zero
> checksum to be verified. The idea is that if we end up computing
> packet checksums on the host for inner checksums like TCP during GRO,
> then that's negating the performance benefits of GRO. Had UDP check
> not been zero then we would do checksum unnecessary conversion and so
> csum_valid would be set for the remainded of GRO processing. The
> existing code is following the rule I believe, so this may be working
> as intended.

Do you have any suggestion if I need do GRO as udp->check is zero?
My previous modification which works fine as below:
	if (NAPI_GRO_CB(skb)->encap_mark ||
	    (skb->ip_summed != CHECKSUM_PARTIAL &&
+	     skb->ip_summed != CHECKSUM_UNNECESSARY &&
	     NAPI_GRO_CB(skb)->csum_cnt == 0 &&
	     !NAPI_GRO_CB(skb)->csum_valid) ||
	    !udp_sk(sk)->gro_receive)
		goto out_unlock;


> 
> Tom
> 
>>>
>>> .
>>>
>>
> 
> .
>
Tom Herbert Jan. 29, 2019, 8:24 p.m. UTC | #6
On Tue, Jan 29, 2019 at 12:08 AM maowenan <maowenan@huawei.com> wrote:
>
>
>
> On 2019/1/29 14:24, Tom Herbert wrote:
> > On Mon, Jan 28, 2019 at 10:04 PM maowenan <maowenan@huawei.com> wrote:
> >>
> >>
> >>
> >> On 2019/1/29 12:01, Tom Herbert wrote:
> >>> On Mon, Jan 28, 2019 at 7:00 PM maowenan <maowenan@huawei.com> wrote:
> >>>>
> >>>> Hi all,
> >>>> Do you have any comments about this change?
> >>>>
> >>>>
> >>>> On 2019/1/23 11:33, Mao Wenan wrote:
> >>>>> When udp4_gro_receive() get one packet that uh->check=0,
> >>>>> skb_gro_checksum_validate_zero_check() will set the
> >>>>> skb->ip_summed = CHECKSUM_UNNECESSARY;
> >>>>> skb->csum_level = 0;
> >>>>> Then udp_gro_receive() will flush the packet which is not CHECKSUM_PARTIAL,
> >>>>> It is not our expect,  because check=0 in udp header indicates this
> >>>>> packet is no need to caculate checksum, we should go further to do GRO.
> >>>>>
> >>>>> This patch changes the value of csum_cnt according to skb->csum_level.
> >>>>> ---
> >>>>>  include/linux/netdevice.h | 1 +
> >>>>>  1 file changed, 1 insertion(+)
> >>>>>
> >>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> >>>>> index 1377d08..9c819f1 100644
> >>>>> --- a/include/linux/netdevice.h
> >>>>> +++ b/include/linux/netdevice.h
> >>>>> @@ -2764,6 +2764,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
> >>>>>                * during GRO. This saves work if we fallback to normal path.
> >>>>>                */
> >>>>>               __skb_incr_checksum_unnecessary(skb);
> >>>>> +             NAPI_GRO_CB(skb)->csum_cnt = skb->csum_level + 1;
> >>>
> >>> That doesn't look right. This would be reinitializing the GRO
> >>> checksums from the beginning.
> >>>
> >>>>>       }
> >>>>>  }
> >>>>>
> >>>>>
> >>>>
> >>> I assume the code is bailing on this conditional:
> >>>
> >>> if (NAPI_GRO_CB(skb)->encap_mark ||
> >>>             (skb->ip_summed != CHECKSUM_PARTIAL &&
> >>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
> >>>              !NAPI_GRO_CB(skb)->csum_valid) ||
> >>>             !udp_sk(sk)->gro_receive)
> >>>                 goto out_unlock;
> >>>
> >>> I am trying to remember why this needs to check csum_cnt. If there was
> >>> a csum_cnt for the UDP csum being zero from checksum-unnecessary, it
> >>> was consumed by skb_gro_checksum_validate_zero_check in UDP4 GRO
> >>> received.
> >>
> >> We have met the scene about two VMs in different host with vxlan packets, when udp4_gro_receive receives
> >> one packet with ip_summed=CHECKSUM_NONE,csum_cnt=0,csum_valid=0,and udp->check=0, then skb_gro_checksum_validate_zero_check()->
> >> skb_gro_incr_csum_unnecessary() validate it and set ip_summed=CHECKSUM_UNNECESSARY,csum_level=0, but csum_cnt and csum_valid
> >> keep zero value. Then it will be flushed in udp_gro_receive(), the codes as you have showed.
> >>
> >> so I think it forgets to modify csum_cnt since csum_level is changed in skb_gro_incr_csum_unnecessary()->__skb_incr_checksum_unnecessary().
> >>
> > Yes, but the csum_level is changing since we've gone beyond the
> > checksums initially reported inc checksum-unnecessary. GRO csum_cnt is
> > initialized to skb->csum_level + 1 at the start of GRO processing.
> >
> > If I recall, the rule is that UDP GRO requires at least one non-zero
> > checksum to be verified. The idea is that if we end up computing
> > packet checksums on the host for inner checksums like TCP during GRO,
> > then that's negating the performance benefits of GRO. Had UDP check
> > not been zero then we would do checksum unnecessary conversion and so
> > csum_valid would be set for the remainded of GRO processing. The
> > existing code is following the rule I believe, so this may be working
> > as intended.
>
> Do you have any suggestion if I need do GRO as udp->check is zero?
> My previous modification which works fine as below:
>         if (NAPI_GRO_CB(skb)->encap_mark ||
>             (skb->ip_summed != CHECKSUM_PARTIAL &&
> +            skb->ip_summed != CHECKSUM_UNNECESSARY &&

That's effectively disabling the rule that we need a real checksum
calculation to proceed with GRO. Besides that, the device returning
one checksum-unnecessary level because UDP csum is zero is pretty
pointelss; we can just as easily deduce get to same state just by
looking at the field with CHECKSUM_NONE. What we really want to see
for GRO is a real checksum computation being done on the packet.

A few questions:

What type of packets are being GROed? Are these TCP? What performance
difference do you see with our patch? Can you try enabling UDP
checksums, and even RCO with VXLAN? With UDP encapsulation we
generally see better performance with checksum enabled since UDP
checksum offload is ubiquitous and we can easily convert
checksum-unnecessary (with non-zero csum) to checksum-complete.

Tom




>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>              !NAPI_GRO_CB(skb)->csum_valid) ||
>             !udp_sk(sk)->gro_receive)
>                 goto out_unlock;
>
>
> >
> > Tom
> >
> >>>
> >>> .
> >>>
> >>
> >
> > .
> >
>
maowenan Jan. 31, 2019, 1:57 a.m. UTC | #7
On 2019/1/30 4:24, Tom Herbert wrote:
> On Tue, Jan 29, 2019 at 12:08 AM maowenan <maowenan@huawei.com> wrote:
>>
>>
>>
>> On 2019/1/29 14:24, Tom Herbert wrote:
>>> On Mon, Jan 28, 2019 at 10:04 PM maowenan <maowenan@huawei.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2019/1/29 12:01, Tom Herbert wrote:
>>>>> On Mon, Jan 28, 2019 at 7:00 PM maowenan <maowenan@huawei.com> wrote:
>>>>>>
>>>>>> Hi all,
>>>>>> Do you have any comments about this change?
>>>>>>
>>>>>>
>>>>>> On 2019/1/23 11:33, Mao Wenan wrote:
>>>>>>> When udp4_gro_receive() get one packet that uh->check=0,
>>>>>>> skb_gro_checksum_validate_zero_check() will set the
>>>>>>> skb->ip_summed = CHECKSUM_UNNECESSARY;
>>>>>>> skb->csum_level = 0;
>>>>>>> Then udp_gro_receive() will flush the packet which is not CHECKSUM_PARTIAL,
>>>>>>> It is not our expect,  because check=0 in udp header indicates this
>>>>>>> packet is no need to caculate checksum, we should go further to do GRO.
>>>>>>>
>>>>>>> This patch changes the value of csum_cnt according to skb->csum_level.
>>>>>>> ---
>>>>>>>  include/linux/netdevice.h | 1 +
>>>>>>>  1 file changed, 1 insertion(+)
>>>>>>>
>>>>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>>>>>> index 1377d08..9c819f1 100644
>>>>>>> --- a/include/linux/netdevice.h
>>>>>>> +++ b/include/linux/netdevice.h
>>>>>>> @@ -2764,6 +2764,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
>>>>>>>                * during GRO. This saves work if we fallback to normal path.
>>>>>>>                */
>>>>>>>               __skb_incr_checksum_unnecessary(skb);
>>>>>>> +             NAPI_GRO_CB(skb)->csum_cnt = skb->csum_level + 1;
>>>>>
>>>>> That doesn't look right. This would be reinitializing the GRO
>>>>> checksums from the beginning.
>>>>>
>>>>>>>       }
>>>>>>>  }
>>>>>>>
>>>>>>>
>>>>>>
>>>>> I assume the code is bailing on this conditional:
>>>>>
>>>>> if (NAPI_GRO_CB(skb)->encap_mark ||
>>>>>             (skb->ip_summed != CHECKSUM_PARTIAL &&
>>>>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>>>>              !NAPI_GRO_CB(skb)->csum_valid) ||
>>>>>             !udp_sk(sk)->gro_receive)
>>>>>                 goto out_unlock;
>>>>>
>>>>> I am trying to remember why this needs to check csum_cnt. If there was
>>>>> a csum_cnt for the UDP csum being zero from checksum-unnecessary, it
>>>>> was consumed by skb_gro_checksum_validate_zero_check in UDP4 GRO
>>>>> received.
>>>>
>>>> We have met the scene about two VMs in different host with vxlan packets, when udp4_gro_receive receives
>>>> one packet with ip_summed=CHECKSUM_NONE,csum_cnt=0,csum_valid=0,and udp->check=0, then skb_gro_checksum_validate_zero_check()->
>>>> skb_gro_incr_csum_unnecessary() validate it and set ip_summed=CHECKSUM_UNNECESSARY,csum_level=0, but csum_cnt and csum_valid
>>>> keep zero value. Then it will be flushed in udp_gro_receive(), the codes as you have showed.
>>>>
>>>> so I think it forgets to modify csum_cnt since csum_level is changed in skb_gro_incr_csum_unnecessary()->__skb_incr_checksum_unnecessary().
>>>>
>>> Yes, but the csum_level is changing since we've gone beyond the
>>> checksums initially reported inc checksum-unnecessary. GRO csum_cnt is
>>> initialized to skb->csum_level + 1 at the start of GRO processing.
>>>
>>> If I recall, the rule is that UDP GRO requires at least one non-zero
>>> checksum to be verified. The idea is that if we end up computing
>>> packet checksums on the host for inner checksums like TCP during GRO,
>>> then that's negating the performance benefits of GRO. Had UDP check
>>> not been zero then we would do checksum unnecessary conversion and so
>>> csum_valid would be set for the remainded of GRO processing. The
>>> existing code is following the rule I believe, so this may be working
>>> as intended.
>>
>> Do you have any suggestion if I need do GRO as udp->check is zero?
>> My previous modification which works fine as below:
>>         if (NAPI_GRO_CB(skb)->encap_mark ||
>>             (skb->ip_summed != CHECKSUM_PARTIAL &&
>> +            skb->ip_summed != CHECKSUM_UNNECESSARY &&
> 
> That's effectively disabling the rule that we need a real checksum
> calculation to proceed with GRO. Besides that, the device returning
> one checksum-unnecessary level because UDP csum is zero is pretty
> pointelss; we can just as easily deduce get to same state just by
> looking at the field with CHECKSUM_NONE. What we really want to see
> for GRO is a real checksum computation being done on the packet.
> 
> A few questions:
> 
> What type of packets are being GROed? Are these TCP? What performance
> difference do you see with our patch? Can you try enabling UDP
> checksums, and even RCO with VXLAN? With UDP encapsulation we
> generally see better performance with checksum enabled since UDP
> checksum offload is ubiquitous and we can easily convert
> checksum-unnecessary (with non-zero csum) to checksum-complete.

We use the physical network card calculate the checksum of the inner packet with checksum offload.
Set the udp checksum of the vxlan header is 0.

With this patch, the bandwidth of TCP between two VMs increase from 2Gbit/s to 6Gbit/s.

> 
> Tom
> 
> 
> 
> 
>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>              !NAPI_GRO_CB(skb)->csum_valid) ||
>>             !udp_sk(sk)->gro_receive)
>>                 goto out_unlock;
>>
>>
>>>
>>> Tom
>>>
>>>>>
>>>>> .
>>>>>
>>>>
>>>
>>> .
>>>
>>
> 
> .
>
Tom Herbert Jan. 31, 2019, 2:43 a.m. UTC | #8
On Wed, Jan 30, 2019 at 5:58 PM maowenan <maowenan@huawei.com> wrote:
>
>
>
> On 2019/1/30 4:24, Tom Herbert wrote:
> > On Tue, Jan 29, 2019 at 12:08 AM maowenan <maowenan@huawei.com> wrote:
> >>
> >>
> >>
> >> On 2019/1/29 14:24, Tom Herbert wrote:
> >>> On Mon, Jan 28, 2019 at 10:04 PM maowenan <maowenan@huawei.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 2019/1/29 12:01, Tom Herbert wrote:
> >>>>> On Mon, Jan 28, 2019 at 7:00 PM maowenan <maowenan@huawei.com> wrote:
> >>>>>>
> >>>>>> Hi all,
> >>>>>> Do you have any comments about this change?
> >>>>>>
> >>>>>>
> >>>>>> On 2019/1/23 11:33, Mao Wenan wrote:
> >>>>>>> When udp4_gro_receive() get one packet that uh->check=0,
> >>>>>>> skb_gro_checksum_validate_zero_check() will set the
> >>>>>>> skb->ip_summed = CHECKSUM_UNNECESSARY;
> >>>>>>> skb->csum_level = 0;
> >>>>>>> Then udp_gro_receive() will flush the packet which is not CHECKSUM_PARTIAL,
> >>>>>>> It is not our expect,  because check=0 in udp header indicates this
> >>>>>>> packet is no need to caculate checksum, we should go further to do GRO.
> >>>>>>>
> >>>>>>> This patch changes the value of csum_cnt according to skb->csum_level.
> >>>>>>> ---
> >>>>>>>  include/linux/netdevice.h | 1 +
> >>>>>>>  1 file changed, 1 insertion(+)
> >>>>>>>
> >>>>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> >>>>>>> index 1377d08..9c819f1 100644
> >>>>>>> --- a/include/linux/netdevice.h
> >>>>>>> +++ b/include/linux/netdevice.h
> >>>>>>> @@ -2764,6 +2764,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
> >>>>>>>                * during GRO. This saves work if we fallback to normal path.
> >>>>>>>                */
> >>>>>>>               __skb_incr_checksum_unnecessary(skb);
> >>>>>>> +             NAPI_GRO_CB(skb)->csum_cnt = skb->csum_level + 1;
> >>>>>
> >>>>> That doesn't look right. This would be reinitializing the GRO
> >>>>> checksums from the beginning.
> >>>>>
> >>>>>>>       }
> >>>>>>>  }
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>> I assume the code is bailing on this conditional:
> >>>>>
> >>>>> if (NAPI_GRO_CB(skb)->encap_mark ||
> >>>>>             (skb->ip_summed != CHECKSUM_PARTIAL &&
> >>>>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
> >>>>>              !NAPI_GRO_CB(skb)->csum_valid) ||
> >>>>>             !udp_sk(sk)->gro_receive)
> >>>>>                 goto out_unlock;
> >>>>>
> >>>>> I am trying to remember why this needs to check csum_cnt. If there was
> >>>>> a csum_cnt for the UDP csum being zero from checksum-unnecessary, it
> >>>>> was consumed by skb_gro_checksum_validate_zero_check in UDP4 GRO
> >>>>> received.
> >>>>
> >>>> We have met the scene about two VMs in different host with vxlan packets, when udp4_gro_receive receives
> >>>> one packet with ip_summed=CHECKSUM_NONE,csum_cnt=0,csum_valid=0,and udp->check=0, then skb_gro_checksum_validate_zero_check()->
> >>>> skb_gro_incr_csum_unnecessary() validate it and set ip_summed=CHECKSUM_UNNECESSARY,csum_level=0, but csum_cnt and csum_valid
> >>>> keep zero value. Then it will be flushed in udp_gro_receive(), the codes as you have showed.
> >>>>
> >>>> so I think it forgets to modify csum_cnt since csum_level is changed in skb_gro_incr_csum_unnecessary()->__skb_incr_checksum_unnecessary().
> >>>>
> >>> Yes, but the csum_level is changing since we've gone beyond the
> >>> checksums initially reported inc checksum-unnecessary. GRO csum_cnt is
> >>> initialized to skb->csum_level + 1 at the start of GRO processing.
> >>>
> >>> If I recall, the rule is that UDP GRO requires at least one non-zero
> >>> checksum to be verified. The idea is that if we end up computing
> >>> packet checksums on the host for inner checksums like TCP during GRO,
> >>> then that's negating the performance benefits of GRO. Had UDP check
> >>> not been zero then we would do checksum unnecessary conversion and so
> >>> csum_valid would be set for the remainded of GRO processing. The
> >>> existing code is following the rule I believe, so this may be working
> >>> as intended.
> >>
> >> Do you have any suggestion if I need do GRO as udp->check is zero?
> >> My previous modification which works fine as below:
> >>         if (NAPI_GRO_CB(skb)->encap_mark ||
> >>             (skb->ip_summed != CHECKSUM_PARTIAL &&
> >> +            skb->ip_summed != CHECKSUM_UNNECESSARY &&
> >
> > That's effectively disabling the rule that we need a real checksum
> > calculation to proceed with GRO. Besides that, the device returning
> > one checksum-unnecessary level because UDP csum is zero is pretty
> > pointelss; we can just as easily deduce get to same state just by
> > looking at the field with CHECKSUM_NONE. What we really want to see
> > for GRO is a real checksum computation being done on the packet.
> >
> > A few questions:
> >
> > What type of packets are being GROed? Are these TCP? What performance
> > difference do you see with our patch? Can you try enabling UDP
> > checksums, and even RCO with VXLAN? With UDP encapsulation we
> > generally see better performance with checksum enabled since UDP
> > checksum offload is ubiquitous and we can easily convert
> > checksum-unnecessary (with non-zero csum) to checksum-complete.
>
> We use the physical network card calculate the checksum of the inner packet with checksum offload.
> Set the udp checksum of the vxlan header is 0.
>
I see. It sounds like the device is really verifying two checksums in
the packet, the outer UDP checksum (which is zero for UDP) and an
inner checksum, but only reporting one checksum was verified. The
driver needs to set csum_level to 1 in this case (meaning two
checksums have been verified for checksum-unnecessary). What NIC are
you using?

Tom


> With this patch, the bandwidth of TCP between two VMs increase from 2Gbit/s to 6Gbit/s.
>
> >
> > Tom
> >
> >
> >
> >
> >>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
> >>              !NAPI_GRO_CB(skb)->csum_valid) ||
> >>             !udp_sk(sk)->gro_receive)
> >>                 goto out_unlock;
> >>
> >>
> >>>
> >>> Tom
> >>>
> >>>>>
> >>>>> .
> >>>>>
> >>>>
> >>>
> >>> .
> >>>
> >>
> >
> > .
> >
>
maowenan Jan. 31, 2019, 2:58 a.m. UTC | #9
On 2019/1/31 10:43, Tom Herbert wrote:
> On Wed, Jan 30, 2019 at 5:58 PM maowenan <maowenan@huawei.com> wrote:
>>
>>
>>
>> On 2019/1/30 4:24, Tom Herbert wrote:
>>> On Tue, Jan 29, 2019 at 12:08 AM maowenan <maowenan@huawei.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2019/1/29 14:24, Tom Herbert wrote:
>>>>> On Mon, Jan 28, 2019 at 10:04 PM maowenan <maowenan@huawei.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2019/1/29 12:01, Tom Herbert wrote:
>>>>>>> On Mon, Jan 28, 2019 at 7:00 PM maowenan <maowenan@huawei.com> wrote:
>>>>>>>>
>>>>>>>> Hi all,
>>>>>>>> Do you have any comments about this change?
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019/1/23 11:33, Mao Wenan wrote:
>>>>>>>>> When udp4_gro_receive() get one packet that uh->check=0,
>>>>>>>>> skb_gro_checksum_validate_zero_check() will set the
>>>>>>>>> skb->ip_summed = CHECKSUM_UNNECESSARY;
>>>>>>>>> skb->csum_level = 0;
>>>>>>>>> Then udp_gro_receive() will flush the packet which is not CHECKSUM_PARTIAL,
>>>>>>>>> It is not our expect,  because check=0 in udp header indicates this
>>>>>>>>> packet is no need to caculate checksum, we should go further to do GRO.
>>>>>>>>>
>>>>>>>>> This patch changes the value of csum_cnt according to skb->csum_level.
>>>>>>>>> ---
>>>>>>>>>  include/linux/netdevice.h | 1 +
>>>>>>>>>  1 file changed, 1 insertion(+)
>>>>>>>>>
>>>>>>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>>>>>>>> index 1377d08..9c819f1 100644
>>>>>>>>> --- a/include/linux/netdevice.h
>>>>>>>>> +++ b/include/linux/netdevice.h
>>>>>>>>> @@ -2764,6 +2764,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
>>>>>>>>>                * during GRO. This saves work if we fallback to normal path.
>>>>>>>>>                */
>>>>>>>>>               __skb_incr_checksum_unnecessary(skb);
>>>>>>>>> +             NAPI_GRO_CB(skb)->csum_cnt = skb->csum_level + 1;
>>>>>>>
>>>>>>> That doesn't look right. This would be reinitializing the GRO
>>>>>>> checksums from the beginning.
>>>>>>>
>>>>>>>>>       }
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>> I assume the code is bailing on this conditional:
>>>>>>>
>>>>>>> if (NAPI_GRO_CB(skb)->encap_mark ||
>>>>>>>             (skb->ip_summed != CHECKSUM_PARTIAL &&
>>>>>>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>>>>>>              !NAPI_GRO_CB(skb)->csum_valid) ||
>>>>>>>             !udp_sk(sk)->gro_receive)
>>>>>>>                 goto out_unlock;
>>>>>>>
>>>>>>> I am trying to remember why this needs to check csum_cnt. If there was
>>>>>>> a csum_cnt for the UDP csum being zero from checksum-unnecessary, it
>>>>>>> was consumed by skb_gro_checksum_validate_zero_check in UDP4 GRO
>>>>>>> received.
>>>>>>
>>>>>> We have met the scene about two VMs in different host with vxlan packets, when udp4_gro_receive receives
>>>>>> one packet with ip_summed=CHECKSUM_NONE,csum_cnt=0,csum_valid=0,and udp->check=0, then skb_gro_checksum_validate_zero_check()->
>>>>>> skb_gro_incr_csum_unnecessary() validate it and set ip_summed=CHECKSUM_UNNECESSARY,csum_level=0, but csum_cnt and csum_valid
>>>>>> keep zero value. Then it will be flushed in udp_gro_receive(), the codes as you have showed.
>>>>>>
>>>>>> so I think it forgets to modify csum_cnt since csum_level is changed in skb_gro_incr_csum_unnecessary()->__skb_incr_checksum_unnecessary().
>>>>>>
>>>>> Yes, but the csum_level is changing since we've gone beyond the
>>>>> checksums initially reported inc checksum-unnecessary. GRO csum_cnt is
>>>>> initialized to skb->csum_level + 1 at the start of GRO processing.
>>>>>
>>>>> If I recall, the rule is that UDP GRO requires at least one non-zero
>>>>> checksum to be verified. The idea is that if we end up computing
>>>>> packet checksums on the host for inner checksums like TCP during GRO,
>>>>> then that's negating the performance benefits of GRO. Had UDP check
>>>>> not been zero then we would do checksum unnecessary conversion and so
>>>>> csum_valid would be set for the remainded of GRO processing. The
>>>>> existing code is following the rule I believe, so this may be working
>>>>> as intended.
>>>>
>>>> Do you have any suggestion if I need do GRO as udp->check is zero?
>>>> My previous modification which works fine as below:
>>>>         if (NAPI_GRO_CB(skb)->encap_mark ||
>>>>             (skb->ip_summed != CHECKSUM_PARTIAL &&
>>>> +            skb->ip_summed != CHECKSUM_UNNECESSARY &&
>>>
>>> That's effectively disabling the rule that we need a real checksum
>>> calculation to proceed with GRO. Besides that, the device returning
>>> one checksum-unnecessary level because UDP csum is zero is pretty
>>> pointelss; we can just as easily deduce get to same state just by
>>> looking at the field with CHECKSUM_NONE. What we really want to see
>>> for GRO is a real checksum computation being done on the packet.
>>>
>>> A few questions:
>>>
>>> What type of packets are being GROed? Are these TCP? What performance
>>> difference do you see with our patch? Can you try enabling UDP
>>> checksums, and even RCO with VXLAN? With UDP encapsulation we
>>> generally see better performance with checksum enabled since UDP
>>> checksum offload is ubiquitous and we can easily convert
>>> checksum-unnecessary (with non-zero csum) to checksum-complete.
>>
>> We use the physical network card calculate the checksum of the inner packet with checksum offload.
>> Set the udp checksum of the vxlan header is 0.
>>
> I see. It sounds like the device is really verifying two checksums in
> the packet, the outer UDP checksum (which is zero for UDP) and an
> inner checksum, but only reporting one checksum was verified. The
> driver needs to set csum_level to 1 in this case (meaning two
> checksums have been verified for checksum-unnecessary). What NIC are
> you using?

Currently it is 82599, whose driver can't recognize the vxlan packet.
I guess so many NICs can't do this checking in it's driver, so I think this is a
common case, will we fix it in stack?

> 
> Tom
> 
> 
>> With this patch, the bandwidth of TCP between two VMs increase from 2Gbit/s to 6Gbit/s.
>>
>>>
>>> Tom
>>>
>>>
>>>
>>>
>>>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>>>              !NAPI_GRO_CB(skb)->csum_valid) ||
>>>>             !udp_sk(sk)->gro_receive)
>>>>                 goto out_unlock;
>>>>
>>>>
>>>>>
>>>>> Tom
>>>>>
>>>>>>>
>>>>>>> .
>>>>>>>
>>>>>>
>>>>>
>>>>> .
>>>>>
>>>>
>>>
>>> .
>>>
>>
> 
> .
>
Tom Herbert Jan. 31, 2019, 4:33 a.m. UTC | #10
On Wed, Jan 30, 2019 at 6:59 PM maowenan <maowenan@huawei.com> wrote:
>
>
>
> On 2019/1/31 10:43, Tom Herbert wrote:
> > On Wed, Jan 30, 2019 at 5:58 PM maowenan <maowenan@huawei.com> wrote:
> >>
> >>
> >>
> >> On 2019/1/30 4:24, Tom Herbert wrote:
> >>> On Tue, Jan 29, 2019 at 12:08 AM maowenan <maowenan@huawei.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 2019/1/29 14:24, Tom Herbert wrote:
> >>>>> On Mon, Jan 28, 2019 at 10:04 PM maowenan <maowenan@huawei.com> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 2019/1/29 12:01, Tom Herbert wrote:
> >>>>>>> On Mon, Jan 28, 2019 at 7:00 PM maowenan <maowenan@huawei.com> wrote:
> >>>>>>>>
> >>>>>>>> Hi all,
> >>>>>>>> Do you have any comments about this change?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 2019/1/23 11:33, Mao Wenan wrote:
> >>>>>>>>> When udp4_gro_receive() get one packet that uh->check=0,
> >>>>>>>>> skb_gro_checksum_validate_zero_check() will set the
> >>>>>>>>> skb->ip_summed = CHECKSUM_UNNECESSARY;
> >>>>>>>>> skb->csum_level = 0;
> >>>>>>>>> Then udp_gro_receive() will flush the packet which is not CHECKSUM_PARTIAL,
> >>>>>>>>> It is not our expect,  because check=0 in udp header indicates this
> >>>>>>>>> packet is no need to caculate checksum, we should go further to do GRO.
> >>>>>>>>>
> >>>>>>>>> This patch changes the value of csum_cnt according to skb->csum_level.
> >>>>>>>>> ---
> >>>>>>>>>  include/linux/netdevice.h | 1 +
> >>>>>>>>>  1 file changed, 1 insertion(+)
> >>>>>>>>>
> >>>>>>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> >>>>>>>>> index 1377d08..9c819f1 100644
> >>>>>>>>> --- a/include/linux/netdevice.h
> >>>>>>>>> +++ b/include/linux/netdevice.h
> >>>>>>>>> @@ -2764,6 +2764,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
> >>>>>>>>>                * during GRO. This saves work if we fallback to normal path.
> >>>>>>>>>                */
> >>>>>>>>>               __skb_incr_checksum_unnecessary(skb);
> >>>>>>>>> +             NAPI_GRO_CB(skb)->csum_cnt = skb->csum_level + 1;
> >>>>>>>
> >>>>>>> That doesn't look right. This would be reinitializing the GRO
> >>>>>>> checksums from the beginning.
> >>>>>>>
> >>>>>>>>>       }
> >>>>>>>>>  }
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>> I assume the code is bailing on this conditional:
> >>>>>>>
> >>>>>>> if (NAPI_GRO_CB(skb)->encap_mark ||
> >>>>>>>             (skb->ip_summed != CHECKSUM_PARTIAL &&
> >>>>>>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
> >>>>>>>              !NAPI_GRO_CB(skb)->csum_valid) ||
> >>>>>>>             !udp_sk(sk)->gro_receive)
> >>>>>>>                 goto out_unlock;
> >>>>>>>
> >>>>>>> I am trying to remember why this needs to check csum_cnt. If there was
> >>>>>>> a csum_cnt for the UDP csum being zero from checksum-unnecessary, it
> >>>>>>> was consumed by skb_gro_checksum_validate_zero_check in UDP4 GRO
> >>>>>>> received.
> >>>>>>
> >>>>>> We have met the scene about two VMs in different host with vxlan packets, when udp4_gro_receive receives
> >>>>>> one packet with ip_summed=CHECKSUM_NONE,csum_cnt=0,csum_valid=0,and udp->check=0, then skb_gro_checksum_validate_zero_check()->
> >>>>>> skb_gro_incr_csum_unnecessary() validate it and set ip_summed=CHECKSUM_UNNECESSARY,csum_level=0, but csum_cnt and csum_valid
> >>>>>> keep zero value. Then it will be flushed in udp_gro_receive(), the codes as you have showed.
> >>>>>>
> >>>>>> so I think it forgets to modify csum_cnt since csum_level is changed in skb_gro_incr_csum_unnecessary()->__skb_incr_checksum_unnecessary().
> >>>>>>
> >>>>> Yes, but the csum_level is changing since we've gone beyond the
> >>>>> checksums initially reported inc checksum-unnecessary. GRO csum_cnt is
> >>>>> initialized to skb->csum_level + 1 at the start of GRO processing.
> >>>>>
> >>>>> If I recall, the rule is that UDP GRO requires at least one non-zero
> >>>>> checksum to be verified. The idea is that if we end up computing
> >>>>> packet checksums on the host for inner checksums like TCP during GRO,
> >>>>> then that's negating the performance benefits of GRO. Had UDP check
> >>>>> not been zero then we would do checksum unnecessary conversion and so
> >>>>> csum_valid would be set for the remainded of GRO processing. The
> >>>>> existing code is following the rule I believe, so this may be working
> >>>>> as intended.
> >>>>
> >>>> Do you have any suggestion if I need do GRO as udp->check is zero?
> >>>> My previous modification which works fine as below:
> >>>>         if (NAPI_GRO_CB(skb)->encap_mark ||
> >>>>             (skb->ip_summed != CHECKSUM_PARTIAL &&
> >>>> +            skb->ip_summed != CHECKSUM_UNNECESSARY &&
> >>>
> >>> That's effectively disabling the rule that we need a real checksum
> >>> calculation to proceed with GRO. Besides that, the device returning
> >>> one checksum-unnecessary level because UDP csum is zero is pretty
> >>> pointelss; we can just as easily deduce get to same state just by
> >>> looking at the field with CHECKSUM_NONE. What we really want to see
> >>> for GRO is a real checksum computation being done on the packet.
> >>>
> >>> A few questions:
> >>>
> >>> What type of packets are being GROed? Are these TCP? What performance
> >>> difference do you see with our patch? Can you try enabling UDP
> >>> checksums, and even RCO with VXLAN? With UDP encapsulation we
> >>> generally see better performance with checksum enabled since UDP
> >>> checksum offload is ubiquitous and we can easily convert
> >>> checksum-unnecessary (with non-zero csum) to checksum-complete.
> >>
> >> We use the physical network card calculate the checksum of the inner packet with checksum offload.
> >> Set the udp checksum of the vxlan header is 0.
> >>
> > I see. It sounds like the device is really verifying two checksums in
> > the packet, the outer UDP checksum (which is zero for UDP) and an
> > inner checksum, but only reporting one checksum was verified. The
> > driver needs to set csum_level to 1 in this case (meaning two
> > checksums have been verified for checksum-unnecessary). What NIC are
> > you using?
>
> Currently it is 82599, whose driver can't recognize the vxlan packet.
> I guess so many NICs can't do this checking in it's driver, so I think this is a
> common case, will we fix it in stack?
>
Mao,

The problem isn't in the stack, it seems to be in the driver. If the
device reports a verified checksum for an encpasulated packet then the
driver needs to set csum_level to 1. Otherwise, the stack can't just
assume that the inner checksum was verified. *How* a driver deduces
that the device is reporting about an encapsulated checksum is
specific to the device and its driver. I'm not sure which driver your
running, but if you search the code there should be something like
"skb->csum_level =1" that would be a clue about support. A good
example is ixgbe, if the device reports checksum verified and that
packet was VXLAN, it deduces that the inner checksum was verified and
so the driver sets CHECKSUM_UNNECESSARY and skb->csum_level. Of course
all this complexity goes away when devices just provide
checksum-complete.

Tom

> >
> > Tom
> >
> >
> >> With this patch, the bandwidth of TCP between two VMs increase from 2Gbit/s to 6Gbit/s.
> >>
> >>>
> >>> Tom
> >>>
> >>>
> >>>
> >>>
> >>>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
> >>>>              !NAPI_GRO_CB(skb)->csum_valid) ||
> >>>>             !udp_sk(sk)->gro_receive)
> >>>>                 goto out_unlock;
> >>>>
> >>>>
> >>>>>
> >>>>> Tom
> >>>>>
> >>>>>>>
> >>>>>>> .
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>> .
> >>>>>
> >>>>
> >>>
> >>> .
> >>>
> >>
> >
> > .
> >
>
maowenan Feb. 20, 2019, 8:32 a.m. UTC | #11
On 2019/1/31 12:33, Tom Herbert wrote:
> On Wed, Jan 30, 2019 at 6:59 PM maowenan <maowenan@huawei.com> wrote:
>>
>>
>>
>> On 2019/1/31 10:43, Tom Herbert wrote:
>>> On Wed, Jan 30, 2019 at 5:58 PM maowenan <maowenan@huawei.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2019/1/30 4:24, Tom Herbert wrote:
>>>>> On Tue, Jan 29, 2019 at 12:08 AM maowenan <maowenan@huawei.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2019/1/29 14:24, Tom Herbert wrote:
>>>>>>> On Mon, Jan 28, 2019 at 10:04 PM maowenan <maowenan@huawei.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019/1/29 12:01, Tom Herbert wrote:
>>>>>>>>> On Mon, Jan 28, 2019 at 7:00 PM maowenan <maowenan@huawei.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>> Do you have any comments about this change?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2019/1/23 11:33, Mao Wenan wrote:
>>>>>>>>>>> When udp4_gro_receive() get one packet that uh->check=0,
>>>>>>>>>>> skb_gro_checksum_validate_zero_check() will set the
>>>>>>>>>>> skb->ip_summed = CHECKSUM_UNNECESSARY;
>>>>>>>>>>> skb->csum_level = 0;
>>>>>>>>>>> Then udp_gro_receive() will flush the packet which is not CHECKSUM_PARTIAL,
>>>>>>>>>>> It is not our expect,  because check=0 in udp header indicates this
>>>>>>>>>>> packet is no need to caculate checksum, we should go further to do GRO.
>>>>>>>>>>>
>>>>>>>>>>> This patch changes the value of csum_cnt according to skb->csum_level.
>>>>>>>>>>> ---
>>>>>>>>>>>  include/linux/netdevice.h | 1 +
>>>>>>>>>>>  1 file changed, 1 insertion(+)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>>>>>>>>>> index 1377d08..9c819f1 100644
>>>>>>>>>>> --- a/include/linux/netdevice.h
>>>>>>>>>>> +++ b/include/linux/netdevice.h
>>>>>>>>>>> @@ -2764,6 +2764,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
>>>>>>>>>>>                * during GRO. This saves work if we fallback to normal path.
>>>>>>>>>>>                */
>>>>>>>>>>>               __skb_incr_checksum_unnecessary(skb);
>>>>>>>>>>> +             NAPI_GRO_CB(skb)->csum_cnt = skb->csum_level + 1;
>>>>>>>>>
>>>>>>>>> That doesn't look right. This would be reinitializing the GRO
>>>>>>>>> checksums from the beginning.
>>>>>>>>>
>>>>>>>>>>>       }
>>>>>>>>>>>  }
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> I assume the code is bailing on this conditional:
>>>>>>>>>
>>>>>>>>> if (NAPI_GRO_CB(skb)->encap_mark ||
>>>>>>>>>             (skb->ip_summed != CHECKSUM_PARTIAL &&
>>>>>>>>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>>>>>>>>              !NAPI_GRO_CB(skb)->csum_valid) ||
>>>>>>>>>             !udp_sk(sk)->gro_receive)
>>>>>>>>>                 goto out_unlock;
>>>>>>>>>
>>>>>>>>> I am trying to remember why this needs to check csum_cnt. If there was
>>>>>>>>> a csum_cnt for the UDP csum being zero from checksum-unnecessary, it
>>>>>>>>> was consumed by skb_gro_checksum_validate_zero_check in UDP4 GRO
>>>>>>>>> received.
>>>>>>>>
>>>>>>>> We have met the scene about two VMs in different host with vxlan packets, when udp4_gro_receive receives
>>>>>>>> one packet with ip_summed=CHECKSUM_NONE,csum_cnt=0,csum_valid=0,and udp->check=0, then skb_gro_checksum_validate_zero_check()->
>>>>>>>> skb_gro_incr_csum_unnecessary() validate it and set ip_summed=CHECKSUM_UNNECESSARY,csum_level=0, but csum_cnt and csum_valid
>>>>>>>> keep zero value. Then it will be flushed in udp_gro_receive(), the codes as you have showed.
>>>>>>>>
>>>>>>>> so I think it forgets to modify csum_cnt since csum_level is changed in skb_gro_incr_csum_unnecessary()->__skb_incr_checksum_unnecessary().
>>>>>>>>
>>>>>>> Yes, but the csum_level is changing since we've gone beyond the
>>>>>>> checksums initially reported inc checksum-unnecessary. GRO csum_cnt is
>>>>>>> initialized to skb->csum_level + 1 at the start of GRO processing.
>>>>>>>
>>>>>>> If I recall, the rule is that UDP GRO requires at least one non-zero
>>>>>>> checksum to be verified. The idea is that if we end up computing
>>>>>>> packet checksums on the host for inner checksums like TCP during GRO,
>>>>>>> then that's negating the performance benefits of GRO. Had UDP check
>>>>>>> not been zero then we would do checksum unnecessary conversion and so
>>>>>>> csum_valid would be set for the remainded of GRO processing. The
>>>>>>> existing code is following the rule I believe, so this may be working
>>>>>>> as intended.
>>>>>>
>>>>>> Do you have any suggestion if I need do GRO as udp->check is zero?
>>>>>> My previous modification which works fine as below:
>>>>>>         if (NAPI_GRO_CB(skb)->encap_mark ||
>>>>>>             (skb->ip_summed != CHECKSUM_PARTIAL &&
>>>>>> +            skb->ip_summed != CHECKSUM_UNNECESSARY &&
>>>>>
>>>>> That's effectively disabling the rule that we need a real checksum
>>>>> calculation to proceed with GRO. Besides that, the device returning
>>>>> one checksum-unnecessary level because UDP csum is zero is pretty
>>>>> pointelss; we can just as easily deduce get to same state just by
>>>>> looking at the field with CHECKSUM_NONE. What we really want to see
>>>>> for GRO is a real checksum computation being done on the packet.
>>>>>
>>>>> A few questions:
>>>>>
>>>>> What type of packets are being GROed? Are these TCP? What performance
>>>>> difference do you see with our patch? Can you try enabling UDP
>>>>> checksums, and even RCO with VXLAN? With UDP encapsulation we
>>>>> generally see better performance with checksum enabled since UDP
>>>>> checksum offload is ubiquitous and we can easily convert
>>>>> checksum-unnecessary (with non-zero csum) to checksum-complete.
>>>>
>>>> We use the physical network card calculate the checksum of the inner packet with checksum offload.
>>>> Set the udp checksum of the vxlan header is 0.
>>>>
>>> I see. It sounds like the device is really verifying two checksums in
>>> the packet, the outer UDP checksum (which is zero for UDP) and an
>>> inner checksum, but only reporting one checksum was verified. The
>>> driver needs to set csum_level to 1 in this case (meaning two
>>> checksums have been verified for checksum-unnecessary). What NIC are
>>> you using?
>>
>> Currently it is 82599, whose driver can't recognize the vxlan packet.
>> I guess so many NICs can't do this checking in it's driver, so I think this is a
>> common case, will we fix it in stack?
>>
> Mao,
> 
> The problem isn't in the stack, it seems to be in the driver. If the
> device reports a verified checksum for an encpasulated packet then the
> driver needs to set csum_level to 1. Otherwise, the stack can't just
> assume that the inner checksum was verified. *How* a driver deduces
> that the device is reporting about an encapsulated checksum is
> specific to the device and its driver. I'm not sure which driver your
> running, but if you search the code there should be something like
> "skb->csum_level =1" that would be a clue about support. A good
> example is ixgbe, if the device reports checksum verified and that
> packet was VXLAN, it deduces that the inner checksum was verified and
> so the driver sets CHECKSUM_UNNECESSARY and skb->csum_level. Of course
> all this complexity goes away when devices just provide
> checksum-complete.
> 
> Tom

Hi Tom,

I have checked code in ixgbe_main.c ixgbe_rx_checksum(). UDP frames with a 0 checksum can be marked as
checksum errors. If it returns here, skb->ip_summed will be set to CHECKSUM_NONE.
Then the packet will be flush to stack in udp_gro_receive().
Do you think whether it is the fault of driver or not?

ixgbe_rx_checksum():
	if (ixgbe_test_staterr(rx_desc, IXGBE_RXDADV_ERR_TCPE)) {
		/*
		 * 82599 errata, UDP frames with a 0 checksum can be marked as
		 * checksum errors.
		 */
		if ((pkt_info & cpu_to_le16(IXGBE_RXDADV_PKTTYPE_UDP)) &&
		    test_bit(__IXGBE_RX_CSUM_UDP_ZERO_ERR, &ring->state))
			return;

		ring->rx_stats.csum_err++;
		return;
	}

udp_gro_receive():
if (NAPI_GRO_CB(skb)->encap_mark ||
	    (skb->ip_summed != CHECKSUM_PARTIAL &&
	     NAPI_GRO_CB(skb)->csum_cnt == 0 &&
	     !NAPI_GRO_CB(skb)->csum_valid) ||
	    !udp_sk(sk)->gro_receive)
		goto out_unlock;

> 
>>>
>>> Tom
>>>
>>>
>>>> With this patch, the bandwidth of TCP between two VMs increase from 2Gbit/s to 6Gbit/s.
>>>>
>>>>>
>>>>> Tom
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>>>>>              !NAPI_GRO_CB(skb)->csum_valid) ||
>>>>>>             !udp_sk(sk)->gro_receive)
>>>>>>                 goto out_unlock;
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Tom
>>>>>>>
>>>>>>>>>
>>>>>>>>> .
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> .
>>>>>>>
>>>>>>
>>>>>
>>>>> .
>>>>>
>>>>
>>>
>>> .
>>>
>>
> 
> .
>
diff mbox series

Patch

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 1377d08..9c819f1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2764,6 +2764,7 @@  static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
 		 * during GRO. This saves work if we fallback to normal path.
 		 */
 		__skb_incr_checksum_unnecessary(skb);
+		NAPI_GRO_CB(skb)->csum_cnt = skb->csum_level + 1;
 	}
 }