diff mbox

route:ip_rt_frag_needed always return unzero

Message ID 1318921469-25599-1-git-send-email-gaofeng@cn.fujitsu.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Gao feng Oct. 18, 2011, 7:04 a.m. UTC
int function ip_rt_frag_need,if peer is null,
there is no need to do ipprot->err_handler.
I am right?

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 net/ipv4/route.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Comments

Eric Dumazet Oct. 18, 2011, 9:23 a.m. UTC | #1
Le mardi 18 octobre 2011 à 15:04 +0800, Gao feng a écrit :
> int function ip_rt_frag_need,if peer is null,
> there is no need to do ipprot->err_handler.
> I am right?
> 
> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
> ---
>  net/ipv4/route.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 075212e..6cde0fa 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -1574,7 +1574,7 @@ unsigned short ip_rt_frag_needed(struct net *net, const struct iphdr *iph,
>  
>  		atomic_inc(&__rt_peer_genid);
>  	}
> -	return est_mtu ? : new_mtu;
> +	return est_mtu;
>  }
>  
>  static void check_peer_pmtu(struct dst_entry *dst, struct inet_peer *peer)

No idea why you want this, your changelog is a bit cryptic :)

Wont this bypass the raw_icmp_error(skb, protocol, info);
call in icmp_unreach() as well ?



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gao feng Oct. 19, 2011, 1:34 a.m. UTC | #2
2011.10.18 17:23, Eric Dumazet wrote:
> Le mardi 18 octobre 2011 à 15:04 +0800, Gao feng a écrit :
>> int function ip_rt_frag_need,if peer is null,
>> there is no need to do ipprot->err_handler.
>> I am right?
>>
>> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
>> ---
>>  net/ipv4/route.c |    2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
>> index 075212e..6cde0fa 100644
>> --- a/net/ipv4/route.c
>> +++ b/net/ipv4/route.c
>> @@ -1574,7 +1574,7 @@ unsigned short ip_rt_frag_needed(struct net *net, const struct iphdr *iph,
>>  
>>  		atomic_inc(&__rt_peer_genid);
>>  	}
>> -	return est_mtu ? : new_mtu;
>> +	return est_mtu;
>>  }
>>  
>>  static void check_peer_pmtu(struct dst_entry *dst, struct inet_peer *peer)
> 
> No idea why you want this, your changelog is a bit cryptic :)
> 
> Wont this bypass the raw_icmp_error(skb, protocol, info);
> call in icmp_unreach() as well ?
> 
> 

thanks Eric!

I mean that the pmtu is update by inet_peer->pmtu_learned as I know.
so in function ip_rt_frag_needed,
if inet_peer is null or someting else make the setting of inet_peer->pmtu_learned failed.
there is no need to call function tcp_v4_err.

the call stack is
icmp_unreach
  |
  |--->ip_rt_frag_needed(fill inet_peer)
  |
  |--->raw_icmp_error()
  |
  |--->ipprot->err_handler(tcp_v4_err or something else)
	|
	|--->tcp_v4_err(frag need icmp is triggered by tcp packet)
		|
		|--->do_pmtu_discovery
		(in this function both __sk_dst_check or dst->ops->update_pmtu
		need struct inet_peer to update pmtu)

so,I think when set inet_peer->pmtu_learned failed,
in func icmp_unreach we should goto out immediately.

And it's confuse me that why func ping_err and udp_err not update the pmtu?
What I miss?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Oct. 19, 2011, 3:49 a.m. UTC | #3
Le mercredi 19 octobre 2011 à 09:34 +0800, Gao feng a écrit :
> 2011.10.18 17:23, Eric Dumazet wrote:
> > Le mardi 18 octobre 2011 à 15:04 +0800, Gao feng a écrit :
> >> int function ip_rt_frag_need,if peer is null,
> >> there is no need to do ipprot->err_handler.
> >> I am right?
> >>
> >> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
> >> ---
> >>  net/ipv4/route.c |    2 +-
> >>  1 files changed, 1 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> >> index 075212e..6cde0fa 100644
> >> --- a/net/ipv4/route.c
> >> +++ b/net/ipv4/route.c
> >> @@ -1574,7 +1574,7 @@ unsigned short ip_rt_frag_needed(struct net *net, const struct iphdr *iph,
> >>  
> >>  		atomic_inc(&__rt_peer_genid);
> >>  	}
> >> -	return est_mtu ? : new_mtu;
> >> +	return est_mtu;
> >>  }
> >>  
> >>  static void check_peer_pmtu(struct dst_entry *dst, struct inet_peer *peer)
> > 
> > No idea why you want this, your changelog is a bit cryptic :)
> > 
> > Wont this bypass the raw_icmp_error(skb, protocol, info);
> > call in icmp_unreach() as well ?
> > 
> > 
> 
> thanks Eric!
> 
> I mean that the pmtu is update by inet_peer->pmtu_learned as I know.
> so in function ip_rt_frag_needed,
> if inet_peer is null or someting else make the setting of inet_peer->pmtu_learned failed.
> there is no need to call function tcp_v4_err.
> 
> the call stack is
> icmp_unreach
>   |
>   |--->ip_rt_frag_needed(fill inet_peer)
>   |
>   |--->raw_icmp_error()
>   |
>   |--->ipprot->err_handler(tcp_v4_err or something else)
> 	|
> 	|--->tcp_v4_err(frag need icmp is triggered by tcp packet)
> 		|
> 		|--->do_pmtu_discovery
> 		(in this function both __sk_dst_check or dst->ops->update_pmtu
> 		need struct inet_peer to update pmtu)
> 
> so,I think when set inet_peer->pmtu_learned failed,
> in func icmp_unreach we should goto out immediately.
> 
> And it's confuse me that why func ping_err and udp_err not update the pmtu?
> What I miss?

You dont answer my question : After your patch, we now dont call
raw_icmp_error() anymore. Why is is valid ?

Not finding/create inet_peer is very unlikely : This occurs only under
high stress and out of memory condition. Is it really happening on your
machines ?



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gao feng Oct. 19, 2011, 5:20 a.m. UTC | #4
于 2011年10月19日 11:49, Eric Dumazet 写道:
> Le mercredi 19 octobre 2011 à 09:34 +0800, Gao feng a écrit :
>> 2011.10.18 17:23, Eric Dumazet wrote:
>>> Le mardi 18 octobre 2011 à 15:04 +0800, Gao feng a écrit :
>>>> int function ip_rt_frag_need,if peer is null,
>>>> there is no need to do ipprot->err_handler.
>>>> I am right?
>>>>
>>>> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
>>>> ---
>>>>  net/ipv4/route.c |    2 +-
>>>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
>>>> index 075212e..6cde0fa 100644
>>>> --- a/net/ipv4/route.c
>>>> +++ b/net/ipv4/route.c
>>>> @@ -1574,7 +1574,7 @@ unsigned short ip_rt_frag_needed(struct net *net, const struct iphdr *iph,
>>>>  
>>>>  		atomic_inc(&__rt_peer_genid);
>>>>  	}
>>>> -	return est_mtu ? : new_mtu;
>>>> +	return est_mtu;
>>>>  }
>>>>  
>>>>  static void check_peer_pmtu(struct dst_entry *dst, struct inet_peer *peer)
>>>
>>> No idea why you want this, your changelog is a bit cryptic :)
>>>
>>> Wont this bypass the raw_icmp_error(skb, protocol, info);
>>> call in icmp_unreach() as well ?
>>>
>>>
>>
>> thanks Eric!
>>
>> I mean that the pmtu is update by inet_peer->pmtu_learned as I know.
>> so in function ip_rt_frag_needed,
>> if inet_peer is null or someting else make the setting of inet_peer->pmtu_learned failed.
>> there is no need to call function tcp_v4_err.
>>
>> the call stack is
>> icmp_unreach
>>   |
>>   |--->ip_rt_frag_needed(fill inet_peer)
>>   |
>>   |--->raw_icmp_error()
>>   |
>>   |--->ipprot->err_handler(tcp_v4_err or something else)
>> 	|
>> 	|--->tcp_v4_err(frag need icmp is triggered by tcp packet)
>> 		|
>> 		|--->do_pmtu_discovery
>> 		(in this function both __sk_dst_check or dst->ops->update_pmtu
>> 		need struct inet_peer to update pmtu)
>>
>> so,I think when set inet_peer->pmtu_learned failed,
>> in func icmp_unreach we should goto out immediately.
>>
>> And it's confuse me that why func ping_err and udp_err not update the pmtu?
>> What I miss?
> 
> You dont answer my question : After your patch, we now dont call
> raw_icmp_error() anymore. Why is is valid ?

After my patch
raw_icmp_error don't call only when setting inet_peer failed(ip_rt_frag_needed return zero).
And I think it's unexpected,should goto out immediately.

In orig ip_rt_frag_need,
zero can be return only when pmtu(get from icmp packet) is zero and peer is NULL.
in this case,raw_icmp_error will not be call too.this is valid??

The return value of ip_rt_frag_needed is not very clear.
Eric,can you tell me something?

> 
> Not finding/create inet_peer is very unlikely : This occurs only under
> high stress and out of memory condition. Is it really happening on your
> machines ?
> 
> 

It's not happening on my machines.

thanks.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Oct. 19, 2011, 5:47 a.m. UTC | #5
Le mercredi 19 octobre 2011 à 13:20 +0800, Gao feng a écrit :

> In orig ip_rt_frag_need,
> zero can be return only when pmtu(get from icmp packet) is zero and peer is NULL.
> in this case,raw_icmp_error will not be call too.this is valid??

I dont know, tell me.

I am afraid you're adding a possible regression here.

pmtu from icmp_packet is not zero in the common case.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gao feng Oct. 19, 2011, 6:36 a.m. UTC | #6
2011.10.19 13:47, Eric Dumazet wrote:
> Le mercredi 19 octobre 2011 à 13:20 +0800, Gao feng a écrit :
> 
>> In orig ip_rt_frag_need,
>> zero can be return only when pmtu(get from icmp packet) is zero and peer is NULL.
>> in this case,raw_icmp_error will not be call too.this is valid??
> 
> I dont know, tell me.
> 
> I am afraid you're adding a possible regression here.
> 
> pmtu from icmp_packet is not zero in the common case.
> 
> 

AS raw socket is special,I think it's should call raw_icmp_error even the icmp packet is invalid.
if I am right,I think it's another BUG.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steffen Klassert Oct. 19, 2011, 7:26 a.m. UTC | #7
On Wed, Oct 19, 2011 at 01:20:28PM +0800, Gao feng wrote:
> 于 2011年10月19日 11:49, Eric Dumazet 写道:
> > Le mercredi 19 octobre 2011 à 09:34 +0800, Gao feng a écrit :
> >>
> >> I mean that the pmtu is update by inet_peer->pmtu_learned as I know.
> >> so in function ip_rt_frag_needed,
> >> if inet_peer is null or someting else make the setting of inet_peer->pmtu_learned failed.
> >> there is no need to call function tcp_v4_err.
> >>
> >> the call stack is
> >> icmp_unreach
> >>   |
> >>   |--->ip_rt_frag_needed(fill inet_peer)
> >>   |
> >>   |--->raw_icmp_error()
> >>   |
> >>   |--->ipprot->err_handler(tcp_v4_err or something else)
> >> 	|
> >> 	|--->tcp_v4_err(frag need icmp is triggered by tcp packet)
> >> 		|
> >> 		|--->do_pmtu_discovery
> >> 		(in this function both __sk_dst_check or dst->ops->update_pmtu
> >> 		need struct inet_peer to update pmtu)
> >>
> >> so,I think when set inet_peer->pmtu_learned failed,
> >> in func icmp_unreach we should goto out immediately.
> >>
> >> And it's confuse me that why func ping_err and udp_err not update the pmtu?
> >> What I miss?

On udp and raw sockets, the user is responsible to adjust the packet
size according to the mtu value he may find in the socket's error queue.
So we shoud provide the user with this information, even in the unlikely
case where we could not create an inet_peer.

> > 
> > You dont answer my question : After your patch, we now dont call
> > raw_icmp_error() anymore. Why is is valid ?
> 
> After my patch
> raw_icmp_error don't call only when setting inet_peer failed(ip_rt_frag_needed return zero).
> And I think it's unexpected,should goto out immediately.
> 
> In orig ip_rt_frag_need,
> zero can be return only when pmtu(get from icmp packet) is zero and peer is NULL.
> in this case,raw_icmp_error will not be call too.this is valid??
> 

It is valid in the sense that we should not provide the user
with a mtu information if we know that the value we got from
the icmp packet ist bogus. But perhaps we can think about
making the check for a valid mtu unconditionally and let
ip_rt_frag_needed return a valid mtu in any case.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gao feng Oct. 19, 2011, 8:07 a.m. UTC | #8
2011.10.19 15:26, Steffen Klassert wrote:

> 
> On udp and raw sockets, the user is responsible to adjust the packet
> size according to the mtu value he may find in the socket's error queue.
> So we shoud provide the user with this information, even in the unlikely
> case where we could not create an inet_peer.
> 

Agree.Maybe we should modify mtu immediately when create inet_peer failed

> 
> It is valid in the sense that we should not provide the user
> with a mtu information if we know that the value we got from
> the icmp packet ist bogus. But perhaps we can think about
> making the check for a valid mtu unconditionally and let
> ip_rt_frag_needed return a valid mtu in any case.
> 
> 

I think we should return the pmtu in icmp packet to the raw socket,
and the valid mtu to tcp_v4_err or something else.

thanks Steffen.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steffen Klassert Oct. 19, 2011, 8:38 a.m. UTC | #9
On Wed, Oct 19, 2011 at 04:07:46PM +0800, Gao feng wrote:
> 2011.10.19 15:26, Steffen Klassert wrote:
> 
> > 
> > It is valid in the sense that we should not provide the user
> > with a mtu information if we know that the value we got from
> > the icmp packet ist bogus. But perhaps we can think about
> > making the check for a valid mtu unconditionally and let
> > ip_rt_frag_needed return a valid mtu in any case.
> > 
> > 
> 
> I think we should return the pmtu in icmp packet to the raw socket,
> and the valid mtu to tcp_v4_err or something else.
> 

Why you want to handle raw sockets different here?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gao feng Oct. 19, 2011, 8:59 a.m. UTC | #10
2011.10.19 16:38, Steffen Klassert wrote:
> On Wed, Oct 19, 2011 at 04:07:46PM +0800, Gao feng wrote:
>> 2011.10.19 15:26, Steffen Klassert wrote:
>>
>>>
>>> It is valid in the sense that we should not provide the user
>>> with a mtu information if we know that the value we got from
>>> the icmp packet ist bogus. But perhaps we can think about
>>> making the check for a valid mtu unconditionally and let
>>> ip_rt_frag_needed return a valid mtu in any case.
>>>
>>>
>>
>> I think we should return the pmtu in icmp packet to the raw socket,
>> and the valid mtu to tcp_v4_err or something else.
>>
> 
> Why you want to handle raw sockets different here?
> 

Return the pmtu in icmp packet to the raw socket,
let user space program to decide what to do.
BUT this is not important.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 075212e..6cde0fa 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1574,7 +1574,7 @@  unsigned short ip_rt_frag_needed(struct net *net, const struct iphdr *iph,
 
 		atomic_inc(&__rt_peer_genid);
 	}
-	return est_mtu ? : new_mtu;
+	return est_mtu;
 }
 
 static void check_peer_pmtu(struct dst_entry *dst, struct inet_peer *peer)