net: take care of bonding in build_skb_flow_key

Message ID 1452059368-7527-1-git-send-email-wen.gang.wang@oracle.com
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Wengang Wang Jan. 6, 2016, 5:49 a.m. UTC
A problem is found that we are looking for route basing a bonding device and
deal with path MTU there: The path MTU is set to the active slave device, not
the bonding master.

The patch tries to fix the issue by letting build_skb_flow_key() take care
of the transition of device index from bonding slave to the master.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
---
 net/ipv4/route.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Comments

David Miller Jan. 6, 2016, 6:18 a.m. UTC | #1
From: Wengang Wang <wen.gang.wang@oracle.com>
Date: Wed,  6 Jan 2016 13:49:28 +0800

> @@ -523,11 +523,20 @@ static void build_skb_flow_key(struct flowi4 *fl4, const struct sk_buff *skb,
>  			       const struct sock *sk)
>  {
>  	const struct iphdr *iph = ip_hdr(skb);
> -	int oif = skb->dev->ifindex;
> +	int oif;
> +	struct net_device *master = NULL;
> +
>  	u8 tos = RT_TOS(iph->tos);
>  	u8 prot = iph->protocol;
>  	u32 mark = skb->mark;
>  

Please fix the style of these variable declarations:

1) Order them from longest line to shortest line, also known
   as "reverse christmas tree" layout.

2) Do not add an empty line in the middle of these variable
   declarations.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
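For reference, a minimal sketch of how the declarations in the hunk above might look in reverse christmas tree order. The `demo_*` structs, the `u8`/`u32` typedefs and `demo_rt_tos()` are simplified userspace stand-ins assumed for illustration, not the kernel's definitions; only the ordering of the local declarations is the point.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;
typedef uint32_t u32;

/* Simplified userspace stand-ins for the kernel types (illustration only). */
struct demo_net_device { int ifindex; };
struct demo_iphdr { u8 tos; u8 protocol; };
struct demo_sk_buff {
	struct demo_net_device *dev;
	const struct demo_iphdr *iph;
	u32 mark;
};

/* Hypothetical stand-in for RT_TOS(); the mask value is illustrative. */
static u8 demo_rt_tos(u8 tos)
{
	return tos & 0x1c;
}

/*
 * Local declarations run from the longest line to the shortest,
 * with no blank line inside the declaration block.
 */
static int demo_flow_oif(const struct demo_sk_buff *skb)
{
	const struct demo_iphdr *iph = skb->iph;
	struct demo_net_device *master = NULL;
	u8 tos = demo_rt_tos(iph->tos);
	u8 prot = iph->protocol;
	u32 mark = skb->mark;
	int oif;

	/* The values are unused in this sketch; silence compiler warnings. */
	(void)master;
	(void)tos;
	(void)prot;
	(void)mark;

	oif = skb->dev->ifindex;
	return oif;
}
```

The initialized declarations stay in place; only `int oif;` moves to the end and the stray blank line disappears.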
Zhu Yanjun Jan. 6, 2016, 6:19 a.m. UTC | #2
IMHO, this should be fixed in the bonding driver, because the active
slave's MTU should be the same as the master's.
If the bonding master's MTU is changed to the path MTU, then the slave
device's MTU should be changed, too.

Zhu Yanjun
On 01/06/2016 01:49 PM, Wengang Wang wrote:
> A problem is found that we are looking for route basing a bonding device and
> deal with path MTU there: The path MTU is set to the active slave device, not
> the bonding master.
>
> The patch tries to fix the issue by letting build_skb_flow_key() take care
> of the transition of device index from bonding slave to the master.
>
> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
> ---
>   net/ipv4/route.c | 11 ++++++++++-
>   1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 85f184e..3053f10 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -523,11 +523,20 @@ static void build_skb_flow_key(struct flowi4 *fl4, const struct sk_buff *skb,
>   			       const struct sock *sk)
>   {
>   	const struct iphdr *iph = ip_hdr(skb);
> -	int oif = skb->dev->ifindex;
> +	int oif;
> +	struct net_device *master = NULL;
> +
>   	u8 tos = RT_TOS(iph->tos);
>   	u8 prot = iph->protocol;
>   	u32 mark = skb->mark;
>   
> +	if (skb->dev->flags & IFF_SLAVE)
> +		master = netdev_master_upper_dev_get(skb->dev);
> +	if (master)
> +		oif = master->ifindex;
> +	else
> +		oif = skb->dev->ifindex;
> +
>   	__build_flow_key(fl4, sk, iph, oif, tos, prot, mark, 0);
>   }
>   
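
The selection logic in the quoted hunk can be sketched in userspace roughly as follows. This is an illustration under stated assumptions, not the kernel code: `struct demo_net_device`, its `master` pointer and `demo_master_get()` are hypothetical stand-ins for `struct net_device` and `netdev_master_upper_dev_get()`, and `DEMO_IFF_SLAVE` mirrors the kernel's `IFF_SLAVE` bit.

```c
#include <stddef.h>

/* Mirrors the kernel's IFF_SLAVE flag bit (1 << 11). */
#define DEMO_IFF_SLAVE 0x800u

/* Simplified stand-in for struct net_device (illustration only). */
struct demo_net_device {
	int ifindex;
	unsigned int flags;
	/* Hypothetical field replacing the kernel's upper-device lookup. */
	struct demo_net_device *master;
};

/* Hypothetical stand-in for netdev_master_upper_dev_get(). */
static struct demo_net_device *
demo_master_get(const struct demo_net_device *dev)
{
	return dev->master;
}

/*
 * Mirrors the patch: use the master's ifindex when the device is
 * enslaved and a master is found, otherwise the device's own ifindex.
 */
static int demo_select_oif(const struct demo_net_device *dev)
{
	struct demo_net_device *master = NULL;

	if (dev->flags & DEMO_IFF_SLAVE)
		master = demo_master_get(dev);

	return master ? master->ifindex : dev->ifindex;
}
```

With a slave whose master has ifindex 5, `demo_select_oif()` returns 5; for a device without the slave flag it returns the device's own ifindex.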

Wengang Wang Jan. 6, 2016, 6:32 a.m. UTC | #3
On 2016-01-06 14:18, David Miller wrote:
> From: Wengang Wang <wen.gang.wang@oracle.com>
> Date: Wed,  6 Jan 2016 13:49:28 +0800
>
>> @@ -523,11 +523,20 @@ static void build_skb_flow_key(struct flowi4 *fl4, const struct sk_buff *skb,
>>   			       const struct sock *sk)
>>   {
>>   	const struct iphdr *iph = ip_hdr(skb);
>> -	int oif = skb->dev->ifindex;
>> +	int oif;
>> +	struct net_device *master = NULL;
>> +
>>   	u8 tos = RT_TOS(iph->tos);
>>   	u8 prot = iph->protocol;
>>   	u32 mark = skb->mark;
>>   
> Please fix the style of these variable declarations:
>
> 1) Order them from longest line to shortest line, also known
>     as "reverse christmas tree" layout.
>
> 2) Do not add an empty line in the middle of these variable
>     declarations.
OK, will do in second drop.

thanks,
wengang
Zhu Yanjun Jan. 6, 2016, 6:44 a.m. UTC | #4
IMHO, the following comments will help us all.

         case NETDEV_CHANGEMTU:
                 /* TODO: Should slaves be allowed to
                  * independently alter their MTU?  For
                  * an active-backup bond, slaves need
                  * not be the same type of device, so
                  * MTUs may vary.  For other modes,
                  * slaves arguably should have the
                  * same MTUs. To do this, we'd need to
                  * take over the slave's change_mtu
                  * function for the duration of their
                  * servitude.
                  */
                 break;

Best Regards!
Zhu Yanjun

Wengang Wang Jan. 6, 2016, 7:06 a.m. UTC | #5
Hi Yanjun,

Thanks for your review.
The master's MTU is the same as the slaves'.
Fixing this in the bonding driver may be a good idea, but I can't find a
good place to do that.
Let's go through the simplified flow:

...
1) Fragmentation.
    -- This is done against the bonding master device (device MTU
       and path MTU).
2) bond_start_xmit
3) ipoib_start_xmit (the slaves are IPoIB interfaces)

For the first send:
1) The fragment size is 7000 (in my case).
2) bond_start_xmit itself is fine.
3) ipoib_start_xmit sees that the packet size, 7000, is larger than the
   internal limit of 2044, drops the packet and tries to update the PMTU.
   Without the patch, it tries to update the PMTU on the slave device
   (no change to the master).

When the second send comes, nothing has changed on the bonding master
(PMTU), so the fragment size is still 7000 and the behavior is the same
as for the first send.

With the patch, the bonding master's PMTU is changed to 2044 after the
first send (hopefully), so for the second send the fragment size is set
to 2044.
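
The two cases can be illustrated with a toy userspace model. This is only a sketch: `struct demo_bond`, `demo_send()` and the `update_targets_master` switch are hypothetical names, and the model reduces the whole stack to a single PMTU value cached for the master. The numbers 7000 and 2044 are the ones from my case.

```c
#include <stdbool.h>

#define DEMO_SLAVE_LIMIT 2044	/* IPoIB internal limit from the walk-through */

/* Toy model: fragmentation only looks at the PMTU cached for the master. */
struct demo_bond {
	int master_pmtu;
};

/*
 * One send attempt. Returns true if the fragment fits the slave limit.
 * On a drop, the PMTU update reaches the master only when
 * update_targets_master is true (the behavior the patch aims for).
 */
static bool demo_send(struct demo_bond *bond, bool update_targets_master)
{
	int frag = bond->master_pmtu;	/* fragment size follows the master */

	if (frag <= DEMO_SLAVE_LIMIT)
		return true;

	/* Too large: the packet is dropped and a PMTU update is attempted. */
	if (update_targets_master)
		bond->master_pmtu = DEMO_SLAVE_LIMIT;
	/* Otherwise the update targets the slave; the master is unchanged. */

	return false;
}
```

Starting from a master PMTU of 7000, calling `demo_send()` with `update_targets_master` set to false fails on every attempt, while with it set to true the first call fails and the second succeeds with a fragment size of 2044.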

I can't find where we could fix this in the bonding code.

thanks,
wengang

Wengang Wang Jan. 6, 2016, 7:11 a.m. UTC | #6
Hmm, we are not changing the device MTU but the PMTU...

thanks,
wengang

On 2016-01-06 14:44, zhuyj wrote:
> IMHO, the following comments will help us all.
>
>         case NETDEV_CHANGEMTU:
>                 /* TODO: Should slaves be allowed to
>                  * independently alter their MTU?  For
>                  * an active-backup bond, slaves need
>                  * not be the same type of device, so
>                  * MTUs may vary.  For other modes,
>                  * slaves arguably should have the
>                  * same MTUs. To do this, we'd need to
>                  * take over the slave's change_mtu
>                  * function for the duration of their
>                  * servitude.
>                  */
>                 break;
Zhu Yanjun Jan. 6, 2016, 7:35 a.m. UTC | #7
IMHO, "The path MTU is set to the active slave device, not the bonding 
master."
Can we set PMTU to bonding master when path MTU is set to the active 
slave device?

If not appropriate, you can ignore it.

Best Regards!
Zhu Yanjun

Wengang Wang Jan. 6, 2016, 7:45 a.m. UTC | #8
On 2016-01-06 15:35, zhuyj wrote:
> IMHO, "The path MTU is set to the active slave device, not the bonding 
> master."
> Can we set PMTU to bonding master when path MTU is set to the active 
> slave device?
>
Actually the route is set on the bonding master, not on any slave, so
the trying to set PMTU to the active slave failed (it didn't find a route
on the slave).
So "set PMTU to bonding master when path MTU is set to the active slave
device" has no basis.

thanks,
wengang

Zhu Yanjun Jan. 6, 2016, 8 a.m. UTC | #9
On 01/06/2016 03:45 PM, Wengang Wang wrote:
>
>
> On 2016-01-06 15:35, zhuyj wrote:
>> IMHO, "The path MTU is set to the active slave device, not the 
>> bonding master."
>> Can we set PMTU to bonding master when path MTU is set to the active 
>> slave device?
>>
> Actually the route is set on bonding master, not on any slave, the 
> trying to set PMTU to the active slave failed(it didn't found a route 
> on slave).
In your commit log:

"
A problem is found that we are looking for route basing a bonding device and
deal with path MTU there: The path MTU is set to the active slave 
device, not
the bonding master.
"

and

"the trying to set PMTU to the active slave failed"

I am confused.

Zhu Yanjun

Wengang Wang Jan. 6, 2016, 8:14 a.m. UTC | #10
On 2016-01-06 16:00, zhuyj wrote:
> On 01/06/2016 03:45 PM, Wengang Wang wrote:
>>
>>
>> On 2016-01-06 15:35, zhuyj wrote:
>>> IMHO, "The path MTU is set to the active slave device, not the 
>>> bonding master."
>>> Can we set PMTU to bonding master when path MTU is set to the active 
>>> slave device?
>>>
>> Actually the route is set on bonding master, not on any slave, the 
>> trying to set PMTU to the active slave failed(it didn't found a route 
>> on slave).
> In your commit log:
>
> "
> A problem is found that we are looking for route basing a bonding 
> device and
> deal with path MTU there: The path MTU is set to the active slave 
> device, not
> the bonding master.
> "
>
> and
>
> "the trying to set PMTU to the active slave failed"
>
> I am confused.
>

Maybe changing "The path MTU is set to the active slave device, not the 
bonding master" to
"The path MTU is tried to set to the active slave device, not to the 
bonding master" is better.

It tried to change the PMTU on the slave. Whether that succeeds depends
on the route: if a route exists on the slave, the update succeeds; if no
route is found, it fails.
For the no-route case, your suggestion breaks.

thanks,
wengang

Zhu Yanjun Jan. 6, 2016, 8:18 a.m. UTC | #11
On 01/06/2016 04:14 PM, Wengang Wang wrote:
>
>
> On 2016-01-06 16:00, zhuyj wrote:
>> On 01/06/2016 03:45 PM, Wengang Wang wrote:
>>>
>>>
>>> On 2016-01-06 15:35, zhuyj wrote:
>>>> IMHO, "The path MTU is set to the active slave device, not the 
>>>> bonding master."
>>>> Can we set PMTU to bonding master when path MTU is set to the 
>>>> active slave device?
>>>>
>>> Actually the route is set on bonding master, not on any slave, the 
>>> trying to set PMTU to the active slave failed(it didn't found a 
>>> route on slave).
>> In your commit log:
>>
>> "
>> A problem is found that we are looking for route basing a bonding 
>> device and
>> deal with path MTU there: The path MTU is set to the active slave 
>> device, not
>> the bonding master.
>> "
>>
>> and
>>
>> "the trying to set PMTU to the active slave failed"
>>
>> I am confused.
>>
>
> Maybe changing "The path MTU is set to the active slave device, not 
> the bonding master" to
> "The path MTU is tried to set to the active slave device, not to the 
> bonding master" is better.
Maybe you should explain your problem clearly.

Zhu Yanjun
>
> It tried to change the PMTU to the slave. Whether the setting can 
> succeed depends:
> if the route is there(on slave), it goes successfully; if not no route 
> found, it goes unsuccessfully.
> For the no-route case, your suggestion breaks.
>
> thanks,
> wengang
>
>> Zhu Yanjun
>>
>>> So "set PMTU to bonding master when path MTU is set to the active 
>>> slave device" is lacking a base.
>>>
>>> thanks,
>>> wengang
>>>
>>>> If not appropriate, you can ignore it.
>>>>
>>>> Best Regards!
>>>> Zhu Yanjun
>>>>
>>>> On 01/06/2016 03:06 PM, Wengang Wang wrote:
>>>>> Hi Yanjun,
>>>>>
>>>>> Thanks for your review.
>>>>> Master MTU is same as that for slaves.
>>>>> Maybe fixing in bonding driver is a good idea, but I don't find a 
>>>>> good place to do that.
>>>>> Let's go through the simplified follow:
>>>>>
>>>>> ...
>>>>> 1) Fragmentation.
>>>>>    --This is is done is against the bonding master device(device 
>>>>> MTU and path MTU)
>>>>> 2) bond_start_xmit
>>>>> 3) ipoib_start_xmit(slaves are IPoIB interfaces)
>>>>>
>>>>> For the first send
>>>>> 1) fragment size is 7000(in my case)
>>>>> 2) bond_start_xmit its self is fine
>>>>> 3) ipoib_start_xmit sees the packet size 7000 is larger than the 
>>>>> internal limit 2044, drops the packet and try to update PMTU.
>>>>>     without the patch, it tried update PMTU on slave device(no 
>>>>> changes to master).
>>>>>
>>>>> the seconds send comes, since no changes happen on bonding 
>>>>> master(PMTU), the fragment size is still 7000 and the behavior is 
>>>>> just the same as the first send.
>>>>>
>>>>> With the patch, the bonding master PMTU is changed to 2044 after 
>>>>> the first send(hopefully), for the seconds send the fragment size 
>>>>> is set to 2044.
>>>>>
>>>>> To fix in bonding code, I don't find where we can.
>>>>>
>>>>> thanks,
>>>>> wengang
>>>>>
>>>>> 在 2016年01月06日 14:19, zhuyj 写道:
>>>>>>
>>>>>> IMHO, this should fix in bonding driver because the active slave 
>>>>>> mtu should be the same with the master.
>>>>>> bonding master's mtu is changed to path MTU, then slave dev's MTU 
>>>>>> should be changed, too.
>>>>>>
>>>>>> Zhu Yanjun
>>>>>> On 01/06/2016 01:49 PM, Wengang Wang wrote:
>>>>>>> A problem is found that we are looking for route basing a 
>>>>>>> bonding device and
>>>>>>> deal with path MTU there: The path MTU is set to the active 
>>>>>>> slave device, not
>>>>>>> the bonding master.
>>>>>>>
>>>>>>> The patch tries to fix the issue by letting build_skb_flow_key() 
>>>>>>> take care
>>>>>>> of the transition of device index from bonding slave to the master.
>>>>>>>
>>>>>>> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
>>>>>>> ---
>>>>>>>   net/ipv4/route.c | 11 ++++++++++-
>>>>>>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
>>>>>>> index 85f184e..3053f10 100644
>>>>>>> --- a/net/ipv4/route.c
>>>>>>> +++ b/net/ipv4/route.c
>>>>>>> @@ -523,11 +523,20 @@ static void build_skb_flow_key(struct flowi4 *fl4, const struct sk_buff *skb,
>>>>>>>                      const struct sock *sk)
>>>>>>>   {
>>>>>>>       const struct iphdr *iph = ip_hdr(skb);
>>>>>>> -    int oif = skb->dev->ifindex;
>>>>>>> +    int oif;
>>>>>>> +    struct net_device *master = NULL;
>>>>>>> +
>>>>>>>       u8 tos = RT_TOS(iph->tos);
>>>>>>>       u8 prot = iph->protocol;
>>>>>>>       u32 mark = skb->mark;
>>>>>>> +    if (skb->dev->flags & IFF_SLAVE)
>>>>>>> +        master = netdev_master_upper_dev_get(skb->dev);
>>>>>>> +    if (master)
>>>>>>> +        oif = master->ifindex;
>>>>>>> +    else
>>>>>>> +        oif = skb->dev->ifindex;
>>>>>>> +
>>>>>>>       __build_flow_key(fl4, sk, iph, oif, tos, prot, mark, 0);
>>>>>>>   }
>>>>>>
>>>>>
>>>>
>>>
>>
>

Wengang Wang Jan. 6, 2016, 9:04 a.m. UTC | #12
Please see if the V2 is clear.

thanks,
wengang

On 2016-01-06 16:18, zhuyj wrote:
> On 01/06/2016 04:14 PM, Wengang Wang wrote:
>>
>>
>> On 2016-01-06 16:00, zhuyj wrote:
>>> On 01/06/2016 03:45 PM, Wengang Wang wrote:
>>>>
>>>>
>>>> On 2016-01-06 15:35, zhuyj wrote:
>>>>> IMHO, "The path MTU is set to the active slave device, not the 
>>>>> bonding master."
>>>>> Can we set PMTU to bonding master when path MTU is set to the 
>>>>> active slave device?
>>>>>
>>>> Actually the route is set on the bonding master, not on any slave; the
>>>> trying to set PMTU to the active slave failed (it didn't find a
>>>> route on the slave).
>>> In your commit log:
>>>
>>> "
>>> A problem is found that we are looking for route basing a bonding 
>>> device and
>>> deal with path MTU there: The path MTU is set to the active slave 
>>> device, not
>>> the bonding master.
>>> "
>>>
>>> and
>>>
>>> "the trying to set PMTU to the active slave failed"
>>>
>>> I am confused.
>>>
>>
>> Maybe changing "The path MTU is set to the active slave device, not 
>> the bonding master" to
>> "The path MTU is tried to set to the active slave device, not to the 
>> bonding master" is better.
> Maybe you should explain your problem clearly.
>
> Zhu Yanjun
>>
>> It tried to change the PMTU on the slave. Whether the update succeeds
>> depends on the route:
>> if a route exists on the slave, it succeeds; if no route is found,
>> it fails.
>> For the no-route case, your suggestion breaks.
>>
>> thanks,
>> wengang
>>
>>> Zhu Yanjun
>>>
>>>> So "set PMTU to bonding master when path MTU is set to the active
>>>> slave device" has no basis.
>>>>
>>>> thanks,
>>>> wengang
>>>>
>>>>> If not appropriate, you can ignore it.
>>>>>
>>>>> Best Regards!
>>>>> Zhu Yanjun
>>>>>
>>>>> On 01/06/2016 03:06 PM, Wengang Wang wrote:
>>>>>> Hi Yanjun,
>>>>>>
>>>>>> Thanks for your review.
>>>>>> The master MTU is the same as that of the slaves.
>>>>>> Maybe fixing this in the bonding driver is a good idea, but I can't
>>>>>> find a good place to do that.
>>>>>> Let's go through the simplified flow:
>>>>>>
>>>>>> ...
>>>>>> 1) Fragmentation.
>>>>>>    --This is done against the bonding master device (device
>>>>>> MTU and path MTU)
>>>>>> 2) bond_start_xmit
>>>>>> 3) ipoib_start_xmit(slaves are IPoIB interfaces)
>>>>>>
>>>>>> For the first send
>>>>>> 1) fragment size is 7000(in my case)
>>>>>> 2) bond_start_xmit itself is fine
>>>>>> 3) ipoib_start_xmit sees that the packet size 7000 is larger than the
>>>>>> internal limit 2044, drops the packet and tries to update the PMTU.
>>>>>>     Without the patch, it tries to update the PMTU on the slave device (no
>>>>>> changes to the master).
>>>>>>
>>>>>> When the second send comes, since nothing has changed on the bonding
>>>>>> master (PMTU), the fragment size is still 7000 and the behavior is
>>>>>> just the same as for the first send.
>>>>>>
>>>>>> With the patch, the bonding master PMTU is changed to 2044 after
>>>>>> the first send (hopefully), so for the second send the fragment size
>>>>>> is set to 2044.
>>>>>>
>>>>>> As for fixing this in the bonding code, I can't find where we could.
>>>>>>
>>>>>> thanks,
>>>>>> wengang
>>>>>>
>>>>>> On 2016-01-06 14:19, zhuyj wrote:
>>>>>>>
>>>>>>> IMHO, this should be fixed in the bonding driver because the active
>>>>>>> slave's MTU should be the same as the master's.
>>>>>>> If the bonding master's MTU is changed to the path MTU, then the slave
>>>>>>> device's MTU should be changed, too.
>>>>>>>
>>>>>>> Zhu Yanjun
>>>>>>> On 01/06/2016 01:49 PM, Wengang Wang wrote:
>>>>>>>> A problem is found that we are looking for route basing a 
>>>>>>>> bonding device and
>>>>>>>> deal with path MTU there: The path MTU is set to the active 
>>>>>>>> slave device, not
>>>>>>>> the bonding master.
>>>>>>>>
>>>>>>>> The patch tries to fix the issue by letting 
>>>>>>>> build_skb_flow_key() take care
>>>>>>>> of the transition of device index from bonding slave to the 
>>>>>>>> master.
>>>>>>>>
>>>>>>>> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
>>>>>>>> ---
>>>>>>>>   net/ipv4/route.c | 11 ++++++++++-
>>>>>>>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
>>>>>>>> index 85f184e..3053f10 100644
>>>>>>>> --- a/net/ipv4/route.c
>>>>>>>> +++ b/net/ipv4/route.c
>>>>>>>> @@ -523,11 +523,20 @@ static void build_skb_flow_key(struct flowi4 *fl4, const struct sk_buff *skb,
>>>>>>>>                      const struct sock *sk)
>>>>>>>>   {
>>>>>>>>       const struct iphdr *iph = ip_hdr(skb);
>>>>>>>> -    int oif = skb->dev->ifindex;
>>>>>>>> +    int oif;
>>>>>>>> +    struct net_device *master = NULL;
>>>>>>>> +
>>>>>>>>       u8 tos = RT_TOS(iph->tos);
>>>>>>>>       u8 prot = iph->protocol;
>>>>>>>>       u32 mark = skb->mark;
>>>>>>>> +    if (skb->dev->flags & IFF_SLAVE)
>>>>>>>> +        master = netdev_master_upper_dev_get(skb->dev);
>>>>>>>> +    if (master)
>>>>>>>> +        oif = master->ifindex;
>>>>>>>> +    else
>>>>>>>> +        oif = skb->dev->ifindex;
>>>>>>>> +
>>>>>>>>       __build_flow_key(fl4, sk, iph, oif, tos, prot, mark, 0);
>>>>>>>>   }
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Patch

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 85f184e..3053f10 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -523,11 +523,20 @@  static void build_skb_flow_key(struct flowi4 *fl4, const struct sk_buff *skb,
 			       const struct sock *sk)
 {
 	const struct iphdr *iph = ip_hdr(skb);
-	int oif = skb->dev->ifindex;
+	int oif;
+	struct net_device *master = NULL;
+
 	u8 tos = RT_TOS(iph->tos);
 	u8 prot = iph->protocol;
 	u32 mark = skb->mark;
 
+	if (skb->dev->flags & IFF_SLAVE)
+		master = netdev_master_upper_dev_get(skb->dev);
+	if (master)
+		oif = master->ifindex;
+	else
+		oif = skb->dev->ifindex;
+
 	__build_flow_key(fl4, sk, iph, oif, tos, prot, mark, 0);
 }
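
The send sequence described in the thread (the PMTU update is keyed on the slave's ifindex, finds no route there, and so the master keeps fragmenting at 7000) can be modelled in plain user-space C. This is only a sketch under stated assumptions, not kernel code: `sketch_dev`, `sketch_pmtu_table`, `sketch_flow_oif()` and `sketch_update_pmtu()` are hypothetical names invented for illustration, the `master` pointer stands in for what `netdev_master_upper_dev_get()` returns, and the per-ifindex table is a toy replacement for the real route lookup. Declarations are kept without blank lines in between and ordered longest line first where there is more than one, as requested in the review.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_IFF_SLAVE 0x800	/* stand-in for the kernel's IFF_SLAVE flag */
#define SKETCH_MAX_IF    16	/* size of the toy per-ifindex table */

/* Toy stand-in for struct net_device: only the fields the patch reads. */
struct sketch_dev {
	int ifindex;
	unsigned int flags;
	struct sketch_dev *master;	/* models netdev_master_upper_dev_get() */
};

/* Toy PMTU table keyed by ifindex; 0 means "no route on this device". */
struct sketch_pmtu_table {
	unsigned int pmtu[SKETCH_MAX_IF];
};

/* Mirrors the patched oif selection: a slave reports its master's ifindex. */
int sketch_flow_oif(const struct sketch_dev *dev)
{
	const struct sketch_dev *master = NULL;

	assert(dev != NULL);
	if (dev->flags & SKETCH_IFF_SLAVE)
		master = dev->master;

	return master ? master->ifindex : dev->ifindex;
}

/*
 * Lower the PMTU recorded for oif.
 * Returns 1 on success and 0 when oif has no route (nothing is changed).
 */
int sketch_update_pmtu(struct sketch_pmtu_table *t, int oif, unsigned int mtu)
{
	assert(t != NULL);
	if (oif < 0 || oif >= SKETCH_MAX_IF || t->pmtu[oif] == 0)
		return 0;
	if (mtu < t->pmtu[oif])
		t->pmtu[oif] = mtu;

	return 1;
}
```

With a route of PMTU 7000 recorded only for the bonding master, calling `sketch_update_pmtu()` with the slave's own `ifindex` returns 0 and leaves the master at 7000, which is the "second send behaves like the first" case. Passing `sketch_flow_oif(slave)` instead resolves to the master's index, so the entry drops to 2044.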