AW: big ICMP requests get disrupted on IPSec tunnel activation

Message ID EB8510AA7A943D43916A72C9B8F4181F62A0981D@cvk038.intra.cvk.de
State RFC
Delegated to: David Miller

Commit Message

Bartschies, Thomas Oct. 16, 2019, 6:54 p.m. UTC
Hello,

I had to adapt the second half to my test kernel. I had to make some guesses, but I tested it and it works.
Your conclusions are correct. I had suspected something like that, but with no knowledge of the inner
workings of the IP stack I had very little chance of finding the right fix by myself.

Thank you very much. I will now retest for a secondary forwarding problem that's much harder to reproduce.

Regards,
--
Thomas Bartschies
CVK IT-Systeme

-----Original Message-----
From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
Sent: Wednesday, October 16, 2019 17:41
To: Bartschies, Thomas <Thomas.Bartschies@cvk.de>; 'David Ahern' <dsahern@gmail.com>; 'netdev@vger.kernel.org' <netdev@vger.kernel.org>
Subject: Re: big ICMP requests get disrupted on IPSec tunnel activation

On 10/16/19 8:31 AM, Eric Dumazet wrote:
> 
> 
> On 10/16/19 5:57 AM, Bartschies, Thomas wrote:
>> Hello,
>>
>> I did another test, this time in a different order: I first triggered the IPSec policy and then tried to ping in parallel with a big packet size.
>> I could reproduce the issue again, but the trace was completely different. Maybe this time I've got the trace for the problematic connection?
>>
> 
> This one was probably a false positive.
> 
> The other one, I finally understood what was going on.
> 
> You told us you removed netfilter, but it seems you still have the ip defrag modules there.
> 
> (For a pure forwarding node, no reassembly-defrag should be needed)
> 
> When ip_forward() is used, it correctly clears skb->tstamp.
> 
> But later, ip_do_fragment() might re-use the skbs found attached to 
> the master skb, and we do not properly initialize their skb->tstamp.
> 
> The master skb->tstamp should be copied to the children.
> 
> I will send a patch asap.
> 
> Thanks.
> 

Can you try :

                                ip_fraglist_prepare(skb, &iter);
                        }
 
+                       skb->tstamp = tstamp;
                        err = output(net, sk, skb);
 
                        if (!err)

Comments

Eric Dumazet Oct. 16, 2019, 7:54 p.m. UTC | #1
On 10/16/19 11:54 AM, Bartschies, Thomas wrote:
> Hello,
> 
> I had to adapt the second half to my test kernel. I had to make some guesses, but I tested it and it works.
> Your conclusions are correct. I had suspected something like that, but with no knowledge of the inner
> workings of the IP stack I had very little chance of finding the right fix by myself.
> 
> Thank you very much. I will now retest for a secondary forwarding problem that's much harder to reproduce.
> 

Sorry for the delay.

The patch backported to 5.2.18 would be something like :

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 8c2ec35b6512f1486cf2ea01f4a19444c7422642..96c02146be0af1e66230627b401c35757f9dc702 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -626,6 +626,7 @@ int ip_do_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
        if (skb_has_frag_list(skb)) {
                struct sk_buff *frag, *frag2;
                unsigned int first_len = skb_pagelen(skb);
+               ktime_t tstamp = skb->tstamp;
 
                if (first_len - hlen > mtu ||
                    ((first_len - hlen) & 7) ||
@@ -687,6 +688,7 @@ int ip_do_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
                                ip_send_check(iph);
                        }
 
+                       skb->tstamp = tstamp;
                        err = output(net, sk, skb);
 
                        if (!err)
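
For readers who want to see the effect outside the kernel, here is a minimal user-space sketch of the situation described in this thread. It is only a toy model, not kernel code: `struct toy_skb`, `toy_forward()`, `toy_propagate_tstamp()` and `toy_count_mismatched()` are hypothetical names invented for illustration, and the fragment list is reduced to a simple linked list. It shows that clearing the timestamp on the master buffer alone leaves the attached fragments with their old value, and that copying the master's value to each fragment (what the added `skb->tstamp = tstamp;` line does before each call to `output()`) removes the mismatch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct sk_buff: only the two fields this sketch needs. */
struct toy_skb {
    int64_t tstamp;            /* models skb->tstamp */
    struct toy_skb *frag_next; /* next buffer in the (simplified) fragment list */
};

/* Models the forwarding path clearing the timestamp on the master only. */
void toy_forward(struct toy_skb *master)
{
    assert(master != NULL);
    master->tstamp = 0;
}

/* Models the fix: save the master's timestamp once, then give every
 * fragment the same value. The real patch assigns it inside the output
 * loop, one skb at a time; this sketch does it in a single pass. */
void toy_propagate_tstamp(struct toy_skb *master)
{
    assert(master != NULL);
    int64_t tstamp = master->tstamp;

    for (struct toy_skb *skb = master->frag_next; skb; skb = skb->frag_next)
        skb->tstamp = tstamp;
}

/* Counts fragments whose timestamp differs from the master's. */
int toy_count_mismatched(const struct toy_skb *master)
{
    assert(master != NULL);
    int mismatched = 0;

    for (const struct toy_skb *skb = master->frag_next; skb; skb = skb->frag_next)
        if (skb->tstamp != master->tstamp)
            mismatched++;
    return mismatched;
}
```

With a master and two fragments that all start with the same non-zero timestamp, calling `toy_forward()` leaves both fragments mismatched; calling `toy_propagate_tstamp()` afterwards brings the count back to zero.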



> Regards,
> --
> Thomas Bartschies
> CVK IT-Systeme
> 
> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> Sent: Wednesday, October 16, 2019 17:41
> To: Bartschies, Thomas <Thomas.Bartschies@cvk.de>; 'David Ahern' <dsahern@gmail.com>; 'netdev@vger.kernel.org' <netdev@vger.kernel.org>
> Subject: Re: big ICMP requests get disrupted on IPSec tunnel activation
> 
> On 10/16/19 8:31 AM, Eric Dumazet wrote:
>>
>>
>> On 10/16/19 5:57 AM, Bartschies, Thomas wrote:
>>> Hello,
>>>
>>> I did another test, this time in a different order: I first triggered the IPSec policy and then tried to ping in parallel with a big packet size.
>>> I could reproduce the issue again, but the trace was completely different. Maybe this time I've got the trace for the problematic connection?
>>>
>>
>> This one was probably a false positive.
>>
>> The other one, I finally understood what was going on.
>>
>> You told us you removed netfilter, but it seems you still have the ip defrag modules there.
>>
>> (For a pure forwarding node, no reassembly-defrag should be needed)
>>
>> When ip_forward() is used, it correctly clears skb->tstamp.
>>
>> But later, ip_do_fragment() might re-use the skbs found attached to 
>> the master skb, and we do not properly initialize their skb->tstamp.
>>
>> The master skb->tstamp should be copied to the children.
>>
>> I will send a patch asap.
>>
>> Thanks.
>>
> 
> Can you try :
> 
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 28fca408812c5576fc4ea957c1c4dec97ec8faf3..c880229a01712ba5a9ed413f8aab2b56dfe93c82 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -808,6 +808,7 @@ int ip_do_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
>         if (skb_has_frag_list(skb)) {
>                 struct sk_buff *frag, *frag2;
>                 unsigned int first_len = skb_pagelen(skb);
> +               ktime_t tstamp = skb->tstamp;
>  
>                 if (first_len - hlen > mtu ||
>                     ((first_len - hlen) & 7) ||
> @@ -846,6 +847,7 @@ int ip_do_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
>                                 ip_fraglist_prepare(skb, &iter);
>                         }
>  
> +                       skb->tstamp = tstamp;
>                         err = output(net, sk, skb);
>  
>                         if (!err)
>
Patch

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 28fca408812c5576fc4ea957c1c4dec97ec8faf3..c880229a01712ba5a9ed413f8aab2b56dfe93c82 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -808,6 +808,7 @@  int ip_do_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
        if (skb_has_frag_list(skb)) {
                struct sk_buff *frag, *frag2;
                unsigned int first_len = skb_pagelen(skb);
+               ktime_t tstamp = skb->tstamp;
 
                if (first_len - hlen > mtu ||
                    ((first_len - hlen) & 7) ||
@@ -846,6 +847,7 @@ int ip_do_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,