flow_dissector: Fix vlan header offset in __skb_flow_dissect

Message ID 20190619160132.38416-1-yuehaibing@huawei.com
State Changes Requested
Delegated to: David Miller

Commit Message

Yue Haibing June 19, 2019, 4:01 p.m. UTC
We build a vlan on top of a bonding interface whose vlan offload
is off; the bond mode is 802.3ad (LACP) and xmit_hash_policy is
BOND_XMIT_POLICY_ENCAP34.

__skb_flow_dissect() fails to get information from protocol headers
encapsulated within vlan, because 'nhoff' points to the IP header
instead of the vlan header, so bond hashing is based on layer 2 info
only, which fails to distribute packets across slaves.

Fixes: d5709f7ab776 ("flow_dissector: For stripped vlan, get vlan info from skb->vlan_tci")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 net/core/flow_dissector.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Stanislav Fomichev June 19, 2019, 6:39 p.m. UTC | #1
On 06/20, YueHaibing wrote:
> We build vlan on top of bonding interface, which vlan offload
> is off, bond mode is 802.3ad (LACP) and xmit_hash_policy is
> BOND_XMIT_POLICY_ENCAP34.
> 
> __skb_flow_dissect() fails to get information from protocol headers
> encapsulated within vlan, because 'nhoff' is points to IP header,
> so bond hashing is based on layer 2 info, which fails to distribute
> packets across slaves.
> 
> Fixes: d5709f7ab776 ("flow_dissector: For stripped vlan, get vlan info from skb->vlan_tci")
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> ---
>  net/core/flow_dissector.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> index 415b95f..2a52abb 100644
> --- a/net/core/flow_dissector.c
> +++ b/net/core/flow_dissector.c
> @@ -785,6 +785,9 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
>  		    skb && skb_vlan_tag_present(skb)) {
>  			proto = skb->protocol;
>  		} else {
> +			if (dissector_vlan == FLOW_DISSECTOR_KEY_MAX)
> +				nhoff -=  sizeof(*vlan);
> +
Should we instead fix the place where the skb is allocated to properly
pull the vlan (skb_vlan_untag)? I'm not sure this particular place is
supposed to work with an skb. Having an skb with nhoff pointing to the
IP header, but without skb_vlan_tag_present() while
proto==ETH_P_8021xx, seems weird.

>  			vlan = __skb_header_pointer(skb, nhoff, sizeof(_vlan),
>  						    data, hlen, &_vlan);
>  			if (!vlan) {
> -- 
> 2.7.0
> 
>
Jiri Pirko June 20, 2019, 7:20 a.m. UTC | #2
Wed, Jun 19, 2019 at 08:39:38PM CEST, sdf@fomichev.me wrote:
>On 06/20, YueHaibing wrote:
>> We build vlan on top of bonding interface, which vlan offload
>> is off, bond mode is 802.3ad (LACP) and xmit_hash_policy is
>> BOND_XMIT_POLICY_ENCAP34.
>> 
>> __skb_flow_dissect() fails to get information from protocol headers
>> encapsulated within vlan, because 'nhoff' is points to IP header,
>> so bond hashing is based on layer 2 info, which fails to distribute
>> packets across slaves.
>> 
>> Fixes: d5709f7ab776 ("flow_dissector: For stripped vlan, get vlan info from skb->vlan_tci")
>> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
>> ---
>>  net/core/flow_dissector.c | 3 +++
>>  1 file changed, 3 insertions(+)
>> 
>> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
>> index 415b95f..2a52abb 100644
>> --- a/net/core/flow_dissector.c
>> +++ b/net/core/flow_dissector.c
>> @@ -785,6 +785,9 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
>>  		    skb && skb_vlan_tag_present(skb)) {
>>  			proto = skb->protocol;
>>  		} else {
>> +			if (dissector_vlan == FLOW_DISSECTOR_KEY_MAX)
>> +				nhoff -=  sizeof(*vlan);
>> +
>Should we instead fix the place where the skb is allocated to properly
>pull vlan (skb_vlan_untag)? I'm not sure this particular place is

Yes.

>supposed to work with an skb. Having an skb with nhoff pointing to
>IP header but missing skb_vlan_tag_present() when with
>proto==ETH_P_8021xx seems weird.
>
>>  			vlan = __skb_header_pointer(skb, nhoff, sizeof(_vlan),
>>  						    data, hlen, &_vlan);
>>  			if (!vlan) {
>> -- 
>> 2.7.0
>> 
>>
Yue Haibing June 20, 2019, 10:02 a.m. UTC | #3
On 2019/6/20 2:39, Stanislav Fomichev wrote:
> On 06/20, YueHaibing wrote:
>> We build vlan on top of bonding interface, which vlan offload
>> is off, bond mode is 802.3ad (LACP) and xmit_hash_policy is
>> BOND_XMIT_POLICY_ENCAP34.
>>
>> __skb_flow_dissect() fails to get information from protocol headers
>> encapsulated within vlan, because 'nhoff' is points to IP header,
>> so bond hashing is based on layer 2 info, which fails to distribute
>> packets across slaves.
>>
>> Fixes: d5709f7ab776 ("flow_dissector: For stripped vlan, get vlan info from skb->vlan_tci")
>> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
>> ---
>>  net/core/flow_dissector.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
>> index 415b95f..2a52abb 100644
>> --- a/net/core/flow_dissector.c
>> +++ b/net/core/flow_dissector.c
>> @@ -785,6 +785,9 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
>>  		    skb && skb_vlan_tag_present(skb)) {
>>  			proto = skb->protocol;
>>  		} else {
>> +			if (dissector_vlan == FLOW_DISSECTOR_KEY_MAX)
>> +				nhoff -=  sizeof(*vlan);
>> +
> Should we instead fix the place where the skb is allocated to properly
> pull vlan (skb_vlan_untag)? I'm not sure this particular place is
> supposed to work with an skb. Having an skb with nhoff pointing to
> IP header but missing skb_vlan_tag_present() when with
> proto==ETH_P_8021xx seems weird.

The skb is a forwarded vxlan packet; it is sent through the vlan interface like this:

   vlan_dev_hard_start_xmit
    --> __vlan_hwaccel_put_tag // vlan_tci and VLAN_TAG_PRESENT are set
    --> dev_queue_xmit
        --> validate_xmit_skb
          --> validate_xmit_vlan // vlan_hw_offload_capable is false
             --> __vlan_hwaccel_push_inside // skb_push() of vlan_hlen here, then the tag in the skb is cleared

    --> bond_start_xmit
       --> bond_xmit_hash
         --> __skb_flow_dissect // nhoff points to the IP header
            -->  case htons(ETH_P_8021Q)
            // skb_vlan_tag_present is false, so
              vlan = __skb_header_pointer(skb, nhoff, sizeof(_vlan), // vlan wrongly points to the IP header

> 
>>  			vlan = __skb_header_pointer(skb, nhoff, sizeof(_vlan),
>>  						    data, hlen, &_vlan);
>>  			if (!vlan) {
>> -- 
>> 2.7.0
>>
>>
> 
> .
>
Stanislav Fomichev June 21, 2019, 12:33 a.m. UTC | #4
On 06/20, Yuehaibing wrote:
> On 2019/6/20 2:39, Stanislav Fomichev wrote:
> > On 06/20, YueHaibing wrote:
> >> We build vlan on top of bonding interface, which vlan offload
> >> is off, bond mode is 802.3ad (LACP) and xmit_hash_policy is
> >> BOND_XMIT_POLICY_ENCAP34.
> >>
> >> __skb_flow_dissect() fails to get information from protocol headers
> >> encapsulated within vlan, because 'nhoff' is points to IP header,
> >> so bond hashing is based on layer 2 info, which fails to distribute
> >> packets across slaves.
> >>
> >> Fixes: d5709f7ab776 ("flow_dissector: For stripped vlan, get vlan info from skb->vlan_tci")
> >> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> >> ---
> >>  net/core/flow_dissector.c | 3 +++
> >>  1 file changed, 3 insertions(+)
> >>
> >> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> >> index 415b95f..2a52abb 100644
> >> --- a/net/core/flow_dissector.c
> >> +++ b/net/core/flow_dissector.c
> >> @@ -785,6 +785,9 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
> >>  		    skb && skb_vlan_tag_present(skb)) {
> >>  			proto = skb->protocol;
> >>  		} else {
> >> +			if (dissector_vlan == FLOW_DISSECTOR_KEY_MAX)
> >> +				nhoff -=  sizeof(*vlan);
> >> +
> > Should we instead fix the place where the skb is allocated to properly
> > pull vlan (skb_vlan_untag)? I'm not sure this particular place is
> > supposed to work with an skb. Having an skb with nhoff pointing to
> > IP header but missing skb_vlan_tag_present() when with
> > proto==ETH_P_8021xx seems weird.
> 
> The skb is a forwarded vxlan packet, it send through vlan interface like this:
> 
>    vlan_dev_hard_start_xmit
>     --> __vlan_hwaccel_put_tag //vlan_tci and VLAN_TAG_PRESENT is set
>     --> dev_queue_xmit
>         --> validate_xmit_skb
>           --> validate_xmit_vlan // vlan_hw_offload_capable is false
>              --> __vlan_hwaccel_push_inside //here skb_push vlan_hlen, then clear skb->tci
> 
>     --> bond_start_xmit
>        --> bond_xmit_hash
>          --> __skb_flow_dissect // nhoff point to IP header
>             -->  case htons(ETH_P_8021Q)
>             // skb_vlan_tag_present is false, so
>               vlan = __skb_header_pointer(skb, nhoff, sizeof(_vlan), //vlan point to ip header wrongly
I see, so the bonding device propagates hw VLAN support from the slaves.
If one of the slaves doesn't have it, it's disabled for the bond device.
Any idea why we do that? Why not pass skbs to the slave devices
instead and let them handle the hw/sw vlan implementation?
I see the propagation was added in 278339a42a1b 10 years ago and
I don't see any rationale in the commit description.
Somebody with more context should probably chime in :-)
David Miller June 22, 2019, 11:19 p.m. UTC | #5
From: YueHaibing <yuehaibing@huawei.com>
Date: Thu, 20 Jun 2019 00:01:32 +0800

> @@ -785,6 +785,9 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
>  		    skb && skb_vlan_tag_present(skb)) {
>  			proto = skb->protocol;
>  		} else {
> +			if (dissector_vlan == FLOW_DISSECTOR_KEY_MAX)
> +				nhoff -=  sizeof(*vlan);

Even if this would have turned out to be the desired fix, you would need
to get rid of the extra spaces in that last statement.
Jiri Pirko June 24, 2019, 1:50 p.m. UTC | #6
Fri, Jun 21, 2019 at 02:33:17AM CEST, sdf@fomichev.me wrote:
>On 06/20, Yuehaibing wrote:
>> On 2019/6/20 2:39, Stanislav Fomichev wrote:
>> > On 06/20, YueHaibing wrote:
>> >> We build vlan on top of bonding interface, which vlan offload
>> >> is off, bond mode is 802.3ad (LACP) and xmit_hash_policy is
>> >> BOND_XMIT_POLICY_ENCAP34.
>> >>
>> >> __skb_flow_dissect() fails to get information from protocol headers
>> >> encapsulated within vlan, because 'nhoff' is points to IP header,
>> >> so bond hashing is based on layer 2 info, which fails to distribute
>> >> packets across slaves.
>> >>
>> >> Fixes: d5709f7ab776 ("flow_dissector: For stripped vlan, get vlan info from skb->vlan_tci")
>> >> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
>> >> ---
>> >>  net/core/flow_dissector.c | 3 +++
>> >>  1 file changed, 3 insertions(+)
>> >>
>> >> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
>> >> index 415b95f..2a52abb 100644
>> >> --- a/net/core/flow_dissector.c
>> >> +++ b/net/core/flow_dissector.c
>> >> @@ -785,6 +785,9 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
>> >>  		    skb && skb_vlan_tag_present(skb)) {
>> >>  			proto = skb->protocol;
>> >>  		} else {
>> >> +			if (dissector_vlan == FLOW_DISSECTOR_KEY_MAX)
>> >> +				nhoff -=  sizeof(*vlan);
>> >> +
>> > Should we instead fix the place where the skb is allocated to properly
>> > pull vlan (skb_vlan_untag)? I'm not sure this particular place is
>> > supposed to work with an skb. Having an skb with nhoff pointing to
>> > IP header but missing skb_vlan_tag_present() when with
>> > proto==ETH_P_8021xx seems weird.
>> 
>> The skb is a forwarded vxlan packet, it send through vlan interface like this:
>> 
>>    vlan_dev_hard_start_xmit
>>     --> __vlan_hwaccel_put_tag //vlan_tci and VLAN_TAG_PRESENT is set
>>     --> dev_queue_xmit
>>         --> validate_xmit_skb
>>           --> validate_xmit_vlan // vlan_hw_offload_capable is false
>>              --> __vlan_hwaccel_push_inside //here skb_push vlan_hlen, then clear skb->tci
>> 
>>     --> bond_start_xmit
>>        --> bond_xmit_hash
>>          --> __skb_flow_dissect // nhoff point to IP header
>>             -->  case htons(ETH_P_8021Q)
>>             // skb_vlan_tag_present is false, so
>>               vlan = __skb_header_pointer(skb, nhoff, sizeof(_vlan), //vlan point to ip header wrongly
>I see, so bonding device propagates hw VLAN support from the slaves.
>If one of the slaves doesn't have it, its disabled for the bond device.
>Any idea why we do that? Why not pass skbs to the slave devices
>instead and let them handle the hw/sw vlan implementation?

Probably due to historical reasons. There is indeed no need to push the
vlan header. We should rather keep vlan_tci filled all the way down
to the transmitting netdevice. So bonding/team should have the
NETIF_F_HW_VLAN_CTAG_TX and NETIF_F_HW_VLAN_STAG_TX flags always on.
That seems to be the correct fix to me.


>I see the propagation was added in 278339a42a1b 10 years ago and
>I don't see any rationale in the commit description.
>Somebody with more context should probably chime in :-)

Patch

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 415b95f..2a52abb 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -785,6 +785,9 @@  bool __skb_flow_dissect(const struct sk_buff *skb,
 		    skb && skb_vlan_tag_present(skb)) {
 			proto = skb->protocol;
 		} else {
+			if (dissector_vlan == FLOW_DISSECTOR_KEY_MAX)
+				nhoff -=  sizeof(*vlan);
+
 			vlan = __skb_header_pointer(skb, nhoff, sizeof(_vlan),
 						    data, hlen, &_vlan);
 			if (!vlan) {