diff mbox

flow-dissector: Fix alignment issue in __skb_flow_get_ports

Message ID 20141010035840.21428.359.stgit@ahduyck-workstation.home
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

Alexander H Duyck Oct. 10, 2014, 4:03 a.m. UTC
From: Alexander Duyck <alexander.h.duyck@redhat.com>

This patch addresses a kernel unaligned access bug seen on a sparc64 system
with an igb adapter.  Specifically the __skb_flow_get_ports was returning a
be32 pointer which was then having the value directly returned.

In order to keep the handling of the ports consistent with how we were
handling the IPv4 and IPv6 addresses I have instead replaced the assignment
with a memcpy to the flow key ports value.  This way it should stay a
memcpy on systems that cannot handle unaligned access, and will likely be
converted to a 32b assignment on the systems that can support it.

Reported-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 drivers/net/bonding/bond_main.c |    2 +-
 include/net/flow_keys.h         |   10 ++++++----
 net/core/flow_dissector.c       |   21 +++++++++++++--------
 3 files changed, 20 insertions(+), 13 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller Oct. 10, 2014, 4:47 a.m. UTC | #1
From: alexander.duyck@gmail.com
Date: Thu, 09 Oct 2014 21:03:28 -0700

> From: Alexander Duyck <alexander.h.duyck@redhat.com>
> 
> This patch addresses a kernel unaligned access bug seen on a sparc64 system
> with an igb adapter.  Specifically the __skb_flow_get_ports was returning a
> be32 pointer which was then having the value directly returned.
> 
> In order to keep the handling of the ports consistent with how we were
> handling the IPv4 and IPv6 addresses I have instead replaced the assignment
> with a memcpy to the flow key ports value.  This way it should stay a
> memcpy on systems that cannot handle unaligned access, and will likely be
> converted to a 32b assignment on the systems that can support it.
> 
> Reported-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>

Guess what the compiler will output for the memcpy()....

	*(u32 *)dest = *(u32 *)src;

Using memcpy() is never a valid way to avoid misaligned loads and
stores.

If the types have a given alignment, the compiler can legitimately
expand the memcpy() inline with suitably sized loads and stores.

Please see my other reply in the original thread, we have to use
the hardware when we can to manage this situation, by configuring
it to output two garbage bytes before the packet contents when
DMA'ing into power-of-2 aligned blocks of memory.

Thanks.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Duyck Oct. 10, 2014, 2:42 p.m. UTC | #2
On 10/09/2014 09:47 PM, David Miller wrote:
> From: alexander.duyck@gmail.com
> Date: Thu, 09 Oct 2014 21:03:28 -0700
>
>> From: Alexander Duyck <alexander.h.duyck@redhat.com>
>>
>> This patch addresses a kernel unaligned access bug seen on a sparc64 system
>> with an igb adapter.  Specifically the __skb_flow_get_ports was returning a
>> be32 pointer which was then having the value directly returned.
>>
>> In order to keep the handling of the ports consistent with how we were
>> handling the IPv4 and IPv6 addresses I have instead replaced the assignment
>> with a memcpy to the flow key ports value.  This way it should stay a
>> memcpy on systems that cannot handle unaligned access, and will likely be
>> converted to a 32b assignment on the systems that can support it.
>>
>> Reported-by: David S. Miller <davem@davemloft.net>
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
> Guess what the compiler will output for the memcpy()....
>
> 	*(u32 *)dest = *(u32 *)src;
>
> Using memcpy() is never a valid way to avoid misaligned loads and
> stores.

If needed we could convert ports to a __be16 I suppose.

> If the types have a given alignment, the compiler can legitimately
> expand the memcpy() inline with suitably sized loads and stores.

That is what I get for trying to come up with a fix at the end of the 
day.  Although it does leave me scratching my head why the IPv4 and IPv6 
addresses were not causing unaligned accesses or were they triggering 
them as well?

> Please see my other reply in the original thread, we have to use
> the hardware when we can to manage this situation, by configuring
> it to output two garbage bytes before the packet contents when
> DMA'ing into power-of-2 aligned blocks of memory.
>
> Thanks.

The problem is the igb / ixgbe / fm10k hardware doesn't have a means of 
inserting padding from its side.  The whole point of the xxx_get_headlen 
functions was to determine the size needed in order to generate the 
correct memcpy so that we could generate an aligned header.  It sounds 
like we can't do that with this function now because there are multiple 
instances where the data will be accessed unaligned if it isn't aligned 
in the first place.

The way I see it now there are two possible solutions.  The first one 
being to update the flow hash functions to work with unaligned accesses, 
and the second being to split the get_headlen code flow off and for now 
use my original code which already had all the alignment issues 
resolved.  I'm open to suggestions either way, but for now I will try to 
just address the items you have pointed out here with the current patch 
and submit a v2.

Thanks,

Alex


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Laight Oct. 10, 2014, 2:57 p.m. UTC | #3
From: Alexander Duyck
> On 10/09/2014 09:47 PM, David Miller wrote:
> > From: alexander.duyck@gmail.com
> > Date: Thu, 09 Oct 2014 21:03:28 -0700
> >
> >> From: Alexander Duyck <alexander.h.duyck@redhat.com>
> >>
> >> This patch addresses a kernel unaligned access bug seen on a sparc64 system
> >> with an igb adapter.  Specifically the __skb_flow_get_ports was returning a
> >> be32 pointer which was then having the value directly returned.
> >>
> >> In order to keep the handling of the ports consistent with how we were
> >> handling the IPv4 and IPv6 addresses I have instead replaced the assignment
> >> with a memcpy to the flow key ports value.  This way it should stay a
> >> memcpy on systems that cannot handle unaligned access, and will likely be
> >> converted to a 32b assignment on the systems that can support it.
> >>
> >> Reported-by: David S. Miller <davem@davemloft.net>
> >> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
> > Guess what the compiler will output for the memcpy()....
> >
> > 	*(u32 *)dest = *(u32 *)src;
> >
> > Using memcpy() is never a valid way to avoid misaligned loads and
> > stores.
> 
> If needed we could convert ports to a __be16 I suppose.

You would have to cast it somewhere where the compiler cannot
tell what the original type was.
This usually means you have to call a non-static function, which
might have to be in a different compilation unit.

...
> That is what I get for trying to come up with a fix at the end of the
> day.  Although it does leave me scratching my head why the IPv4 and IPv6
> addresses were not causing unaligned accesses or were they triggering
> them as well?

I think there is code to copy the IP and TCP headers to aligned memory
before they are parsed.

...
> 
> The problem is the igb / ixgbe / fm10k hardware doesn't have a means of
> inserting padding from its side...

Shoot the hardware engineers.

You aren't going to get the performance you expect from a 10Ge card
unless the rx buffers are 'correctly' aligned.

	David



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Duyck Oct. 10, 2014, 3:14 p.m. UTC | #4
On 10/10/2014 07:57 AM, David Laight wrote:
> From: Alexander Duyck
>> On 10/09/2014 09:47 PM, David Miller wrote:
>>> From: alexander.duyck@gmail.com
>>> Date: Thu, 09 Oct 2014 21:03:28 -0700
>>>
>>>> From: Alexander Duyck <alexander.h.duyck@redhat.com>
>>>>
>>>> This patch addresses a kernel unaligned access bug seen on a sparc64 system
>>>> with an igb adapter.  Specifically the __skb_flow_get_ports was returning a
>>>> be32 pointer which was then having the value directly returned.
>>>>
>>>> In order to keep the handling of the ports consistent with how we were
>>>> handling the IPv4 and IPv6 addresses I have instead replaced the assignment
>>>> with a memcpy to the flow key ports value.  This way it should stay a
>>>> memcpy on systems that cannot handle unaligned access, and will likely be
>>>> converted to a 32b assignment on the systems that can support it.
>>>>
>>>> Reported-by: David S. Miller <davem@davemloft.net>
>>>> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
>>> Guess what the compiler will output for the memcpy()....
>>>
>>> 	*(u32 *)dest = *(u32 *)src;
>>>
>>> Using memcpy() is never a valid way to avoid misaligned loads and
>>> stores.
>> If needed we could convert ports to a __be16 I suppose.
> You would have to cast it somewhere where the compiler cannot
> tell what the original type was.
> This usually means you have to call a non-static function, which
> might have to be in a different compilation unit.
>
> ...

The pointer itself gets assigned from skb->data + offset, since 
skb->data is an unsigned char I would think that it should be safe to 
cast it as a __be16 and have that stick.  It is only if we cast it as a 
__be32 that it should be an issue as we are casting a value that starts 
off only byte aligned.

>> That is what I get for trying to come up with a fix at the end of the
>> day.  Although it does leave me scratching my head why the IPv4 and IPv6
>> addresses were not causing unaligned accesses or were they triggering
>> them as well?
> I think there is code to copy the IP and TCP headers to aligned memory
> before they are parsed.
>
> ...

The code I am talking about is run before we actually get into the 
header parsing.  So for example if you take a look at 
iph_to_flow_copy_addrs that should be getting called before we even get 
to the code that is supposedly triggering this and it is doing 32b 
aligned accesses for the source and destination IP addresses.

>> The problem is the igb / ixgbe / fm10k hardware doesn't have a means of
>> inserting padding from its side...
> Shoot the hardware engineers.
>
> You aren't going to get the performance you expect from a 10Ge card
> unless the rx buffers are 'correctly' aligned.
>
> 	David
>

I think you might be coming to this a little late.  The igb and ixgbe 
drivers had been working this way for a long time.  We did a memcpy to 
move the headers from the page and into the skb->data at an aligned 
offset.  In order to determine the length to memcpy we had a function 
that could walk through the DMA aligned data to get the header length.  
The function for that was replaced with the __skb_flow_dissect as it was 
considered a duplication of code with the flow_dissection functions.  
However that is obviously not the case now that we are hitting these 
alignment issues.

The question I have in all this is do I push forward and make 
__skb_flow_dissect work with unaligned accesses, or do I back off and 
put something equivilent to igb/ixgbe_get_headlen functions in the 
kernel in order to deal with the unaligned accesses as they had no 
issues with them since they were only concerned with getting the header 
length and kept all accesses 16b aligned.

Thanks,

Alex


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Oct. 10, 2014, 3:29 p.m. UTC | #5
On Fri, 2014-10-10 at 08:14 -0700, Alexander Duyck wrote:

> I think you might be coming to this a little late.  The igb and ixgbe 
> drivers had been working this way for a long time.  We did a memcpy to 
> move the headers from the page and into the skb->data at an aligned 
> offset.  In order to determine the length to memcpy we had a function 
> that could walk through the DMA aligned data to get the header length.  
> The function for that was replaced with the __skb_flow_dissect as it was 
> considered a duplication of code with the flow_dissection functions.  
> However that is obviously not the case now that we are hitting these 
> alignment issues.
> 
> The question I have in all this is do I push forward and make 
> __skb_flow_dissect work with unaligned accesses, or do I back off and 
> put something equivilent to igb/ixgbe_get_headlen functions in the 
> kernel in order to deal with the unaligned accesses as they had no 
> issues with them since they were only concerned with getting the header 
> length and kept all accesses 16b aligned.
> 

I see nothing wrong dealing with unaligned accesses, as these helpers
are nop on x86 or CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y arches.




--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Oct. 10, 2014, 3:33 p.m. UTC | #6
On Fri, 2014-10-10 at 14:57 +0000, David Laight wrote:

> I think there is code to copy the IP and TCP headers to aligned memory
> before they are parsed.

There is no such thing. You are here on netdev list, please read the
code before doing such claims.

> ...
> > 
> > The problem is the igb / ixgbe / fm10k hardware doesn't have a means of
> > inserting padding from its side...
> 
> Shoot the hardware engineers.
> 
> You aren't going to get the performance you expect from a 10Ge card
> unless the rx buffers are 'correctly' aligned.

That is simply not true on current x86 cpus. They simply dont care at
all.

You cannot blame Intel for other arches.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Laight Oct. 10, 2014, 4:30 p.m. UTC | #7
From: Eric Dumazet [mailto:eric.dumazet@gmail.com]

> Sent: 10 October 2014 16:34

> To: David Laight

> Cc: 'alexander.h.duyck@redhat.com'; David Miller; alexander.duyck@gmail.com; netdev@vger.kernel.org

> Subject: Re: [PATCH] flow-dissector: Fix alignment issue in __skb_flow_get_ports

> 

> On Fri, 2014-10-10 at 14:57 +0000, David Laight wrote:

> 

> > I think there is code to copy the IP and TCP headers to aligned memory

> > before they are parsed.

> 

> There is no such thing. You are here on netdev list, please read the

> code before doing such claims.


I did say 'I think'...
I must be thinking of some similar code somewhere else.
Possibly just the code that ensures the header isn't fragmented.

> > > The problem is the igb / ixgbe / fm10k hardware doesn't have a means of

> > > inserting padding from its side...

> >

> > Shoot the hardware engineers.

> >

> > You aren't going to get the performance you expect from a 10Ge card

> > unless the rx buffers are 'correctly' aligned.

> 

> That is simply not true on current x86 cpus. They simply dont care at

> all.


I was referring to using them on sparc64, not x86.

I know that current intel x86 cpu have support for misaligned 'rep movsd',
but I thought there was still a small cost (maybe one clock) for
single word transfers.
So maybe they care 'just a little bit'.

> You cannot blame Intel for other arches.


True, but this does mean that you don't really want to use these adapters
on a system that can't to unaligned accesses.

	David
David Miller Oct. 10, 2014, 4:41 p.m. UTC | #8
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 10 Oct 2014 08:33:39 -0700

> On Fri, 2014-10-10 at 14:57 +0000, David Laight wrote:
> 
>> I think there is code to copy the IP and TCP headers to aligned memory
>> before they are parsed.
> 
> There is no such thing. You are here on netdev list, please read the
> code before doing such claims.

+1
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Duyck Oct. 10, 2014, 4:50 p.m. UTC | #9
On 10/10/2014 08:29 AM, Eric Dumazet wrote:
> On Fri, 2014-10-10 at 08:14 -0700, Alexander Duyck wrote:
>
>> I think you might be coming to this a little late.  The igb and ixgbe
>> drivers had been working this way for a long time.  We did a memcpy to
>> move the headers from the page and into the skb->data at an aligned
>> offset.  In order to determine the length to memcpy we had a function
>> that could walk through the DMA aligned data to get the header length.
>> The function for that was replaced with the __skb_flow_dissect as it was
>> considered a duplication of code with the flow_dissection functions.
>> However that is obviously not the case now that we are hitting these
>> alignment issues.
>>
>> The question I have in all this is do I push forward and make
>> __skb_flow_dissect work with unaligned accesses, or do I back off and
>> put something equivilent to igb/ixgbe_get_headlen functions in the
>> kernel in order to deal with the unaligned accesses as they had no
>> issues with them since they were only concerned with getting the header
>> length and kept all accesses 16b aligned.
>>
> I see nothing wrong dealing with unaligned accesses, as these helpers
> are nop on x86 or CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y arches.

Still it means possibly hurting performance on those platforms that 
don't have it defined.

If I just use get_unaligned that is pretty easy in terms of cleanup for 
the ports and IPv4 addresses, the IPv6 will still be a significant 
hurdle to overcome though.

Thanks,

Alex

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Oct. 10, 2014, 5:58 p.m. UTC | #10
From: Alexander Duyck <alexander.h.duyck@redhat.com>
Date: Fri, 10 Oct 2014 09:50:17 -0700

> If I just use get_unaligned that is pretty easy in terms of cleanup
> for the ports and IPv4 addresses, the IPv6 will still be a significant
> hurdle to overcome though.

Actually, it's not that simple.

When the compiler sees things like "th->doff" it will load the 32-bit
word that 4-bit field contains and extract the value using shifts and
masking.

So we might need to sprinkle a "attribute((packed))" here and there
to make it work.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander H Duyck Oct. 10, 2014, 6:02 p.m. UTC | #11
On 10/10/2014 10:58 AM, David Miller wrote:
> From: Alexander Duyck <alexander.h.duyck@redhat.com>
> Date: Fri, 10 Oct 2014 09:50:17 -0700
>
>> If I just use get_unaligned that is pretty easy in terms of cleanup
>> for the ports and IPv4 addresses, the IPv6 will still be a significant
>> hurdle to overcome though.
> Actually, it's not that simple.
>
> When the compiler sees things like "th->doff" it will load the 32-bit
> word that 4-bit field contains and extract the value using shifts and
> masking.
>
> So we might need to sprinkle a "attribute((packed))" here and there
> to make it work.

I'll do some digging.  I know I had this working in igb/ixgbe before so
I probably just need to add a few small tweaks to resolve the remaining
issues for v4 of the patch.

Thanks,

Alex
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Oct. 10, 2014, 6:14 p.m. UTC | #12
From: Alexander Duyck <alexander.duyck@gmail.com>
Date: Fri, 10 Oct 2014 11:02:04 -0700

> On 10/10/2014 10:58 AM, David Miller wrote:
>> From: Alexander Duyck <alexander.h.duyck@redhat.com>
>> Date: Fri, 10 Oct 2014 09:50:17 -0700
>>
>>> If I just use get_unaligned that is pretty easy in terms of cleanup
>>> for the ports and IPv4 addresses, the IPv6 will still be a significant
>>> hurdle to overcome though.
>> Actually, it's not that simple.
>>
>> When the compiler sees things like "th->doff" it will load the 32-bit
>> word that 4-bit field contains and extract the value using shifts and
>> masking.
>>
>> So we might need to sprinkle a "attribute((packed))" here and there
>> to make it work.
> 
> I'll do some digging.  I know I had this working in igb/ixgbe before so
> I probably just need to add a few small tweaks to resolve the remaining
> issues for v4 of the patch.

Ok, I think v3 + resolving the th->doff thing will get rid of everything.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Oct. 10, 2014, 6:15 p.m. UTC | #13
From: Alexander Duyck <alexander.duyck@gmail.com>
Date: Fri, 10 Oct 2014 11:02:04 -0700

> On 10/10/2014 10:58 AM, David Miller wrote:
>> From: Alexander Duyck <alexander.h.duyck@redhat.com>
>> Date: Fri, 10 Oct 2014 09:50:17 -0700
>>
>>> If I just use get_unaligned that is pretty easy in terms of cleanup
>>> for the ports and IPv4 addresses, the IPv6 will still be a significant
>>> hurdle to overcome though.
>> Actually, it's not that simple.
>>
>> When the compiler sees things like "th->doff" it will load the 32-bit
>> word that 4-bit field contains and extract the value using shifts and
>> masking.
>>
>> So we might need to sprinkle a "attribute((packed))" here and there
>> to make it work.
> 
> I'll do some digging.  I know I had this working in igb/ixgbe before so
> I probably just need to add a few small tweaks to resolve the remaining
> issues for v4 of the patch.

Your original code works because you do things like "byte[12] & 0xf0" to
extract these fields.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Oct. 10, 2014, 6:22 p.m. UTC | #14
From: David Miller <davem@davemloft.net>
Date: Fri, 10 Oct 2014 14:15:59 -0400 (EDT)

> Your original code works because you do things like "byte[12] & 0xf0" to
> extract these fields.

Changing that th->doff sequence to instead be:

		const u8 *bp;
		u8 buf[13];

		bp = __skb_header_pointer(skb, poff, sizeof(buf),
					  data, hlen, &buf);
		if (!bp)
			return poff;

		poff += max_t(u32, sizeof(struct tcphdr), (bp[12] & 0xf0) >> 2);
		break;

on top of your v3 patch works for me.

Please double-check my calculations.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Duyck Oct. 10, 2014, 6:53 p.m. UTC | #15
On 10/10/2014 11:22 AM, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Fri, 10 Oct 2014 14:15:59 -0400 (EDT)
>
>> Your original code works because you do things like "byte[12] & 0xf0" to
>> extract these fields.
> Changing that th->doff sequence to instead be:
>
> 		const u8 *bp;
> 		u8 buf[13];
>
> 		bp = __skb_header_pointer(skb, poff, sizeof(buf),
> 					  data, hlen, &buf);
> 		if (!bp)
> 			return poff;
>
> 		poff += max_t(u32, sizeof(struct tcphdr), (bp[12] & 0xf0) >> 2);
> 		break;
>
> on top of your v3 patch works for me.
>
> Please double-check my calculations.

Any reason why you are grabbing all 13 bytes instead of just the 1 we 
care about?  Seems like we could just use a u8 buf instead of the array 
since we are only grabbing doff.

Thanks,

Alex
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Oct. 10, 2014, 7:32 p.m. UTC | #16
From: Alexander Duyck <alexander.h.duyck@redhat.com>
Date: Fri, 10 Oct 2014 11:53:35 -0700

> On 10/10/2014 11:22 AM, David Miller wrote:
>> From: David Miller <davem@davemloft.net>
>> Date: Fri, 10 Oct 2014 14:15:59 -0400 (EDT)
>>
>>> Your original code works because you do things like "byte[12] & 0xf0"
>>> to
>>> extract these fields.
>> Changing that th->doff sequence to instead be:
>>
>> 		const u8 *bp;
>> 		u8 buf[13];
>>
>> 		bp = __skb_header_pointer(skb, poff, sizeof(buf),
>> 					  data, hlen, &buf);
>> 		if (!bp)
>> 			return poff;
>>
>> 		poff += max_t(u32, sizeof(struct tcphdr), (bp[12] & 0xf0) >> 2);
>> 		break;
>>
>> on top of your v3 patch works for me.
>>
>> Please double-check my calculations.
> 
> Any reason why you are grabbing all 13 bytes instead of just the 1 we
> care about?  Seems like we could just use a u8 buf instead of the
> array since we are only grabbing doff.

No reason, just a thinko, my brain implemented this as if skb_header_pointer
worked like skb_pull :-/
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Laight Oct. 13, 2014, 8:32 a.m. UTC | #17
From: David Miller
> From: Alexander Duyck <alexander.h.duyck@redhat.com>
> Date: Fri, 10 Oct 2014 09:50:17 -0700
> 
> > If I just use get_unaligned that is pretty easy in terms of cleanup
> > for the ports and IPv4 addresses, the IPv6 will still be a significant
> > hurdle to overcome though.
> 
> Actually, it's not that simple.
> 
> When the compiler sees things like "th->doff" it will load the 32-bit
> word that 4-bit field contains and extract the value using shifts and
> masking.
> 
> So we might need to sprinkle a "attribute((packed))" here and there
> to make it work.

Marking a structure 'packed' forces the compiler to generate byte accesses.
It is enough to mark the 32bit members with __attribute__((aligned(2)))
(or use a typedef for u32 that includes that attribute).
Then the compiler will use 16bit accesses for that field.

	David



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index c9ac06c..326eb3d 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2993,7 +2993,7 @@  static bool bond_flow_dissect(struct bonding *bond, struct sk_buff *skb,
 		return false;
 	}
 	if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER34 && proto >= 0)
-		fk->ports = skb_flow_get_ports(skb, noff, proto);
+		skb_flow_get_ports(fk, skb, noff, proto);
 
 	return true;
 }
diff --git a/include/net/flow_keys.h b/include/net/flow_keys.h
index 7ee2df0..66235f4 100644
--- a/include/net/flow_keys.h
+++ b/include/net/flow_keys.h
@@ -33,11 +33,13 @@  static inline bool skb_flow_dissect(const struct sk_buff *skb, struct flow_keys
 {
 	return __skb_flow_dissect(skb, flow, NULL, 0, 0, 0);
 }
-__be32 __skb_flow_get_ports(const struct sk_buff *skb, int thoff, u8 ip_proto,
-			    void *data, int hlen_proto);
-static inline __be32 skb_flow_get_ports(const struct sk_buff *skb, int thoff, u8 ip_proto)
+void __skb_flow_get_ports(struct flow_keys *flow, const struct sk_buff *skb,
+			  int thoff, u8 ip_proto, void *data, int hlen_proto);
+static inline void skb_flow_get_ports(struct flow_keys *flow,
+				      const struct sk_buff *skb, int thoff,
+				      u8 ip_proto)
 {
-	return __skb_flow_get_ports(skb, thoff, ip_proto, NULL, 0);
+	__skb_flow_get_ports(flow, skb, thoff, ip_proto, NULL, 0);
 }
 u32 flow_hash_from_keys(struct flow_keys *keys);
 unsigned int flow_get_hlen(const unsigned char *data, unsigned int max_len,
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 8560dea..baf8fe3 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -28,6 +28,7 @@  static void iph_to_flow_copy_addrs(struct flow_keys *flow, const struct iphdr *i
 
 /**
  * __skb_flow_get_ports - extract the upper layer ports and return them
+ * @flow: Flow key to place port informatin into
  * @skb: sk_buff to extract the ports from
  * @thoff: transport header offset
  * @ip_proto: protocol for which to get port offset
@@ -37,8 +38,8 @@  static void iph_to_flow_copy_addrs(struct flow_keys *flow, const struct iphdr *i
  * The function will try to retrieve the ports at offset thoff + poff where poff
  * is the protocol port offset returned from proto_ports_offset
  */
-__be32 __skb_flow_get_ports(const struct sk_buff *skb, int thoff, u8 ip_proto,
-			    void *data, int hlen)
+void __skb_flow_get_ports(struct flow_keys *flow, const struct sk_buff *skb,
+			  int thoff, u8 ip_proto, void *data, int hlen)
 {
 	int poff = proto_ports_offset(ip_proto);
 
@@ -48,15 +49,19 @@  __be32 __skb_flow_get_ports(const struct sk_buff *skb, int thoff, u8 ip_proto,
 	}
 
 	if (poff >= 0) {
-		__be32 *ports, _ports;
+		__be32 *ports;
 
 		ports = __skb_header_pointer(skb, thoff + poff,
-					     sizeof(_ports), data, hlen, &_ports);
-		if (ports)
-			return *ports;
+					     sizeof(flow->ports), data, hlen,
+					     &flow->ports);
+		if (ports) {
+			if (&flow->ports != ports)
+				memcpy(&flow->ports, ports, sizeof(flow->ports));
+			return;
+		}
 	}
 
-	return 0;
+	memset(&flow->ports, 0, sizeof(flow->ports));
 }
 EXPORT_SYMBOL(__skb_flow_get_ports);
 
@@ -231,7 +236,7 @@  ipv6:
 
 	flow->n_proto = proto;
 	flow->ip_proto = ip_proto;
-	flow->ports = __skb_flow_get_ports(skb, nhoff, ip_proto, data, hlen);
+	__skb_flow_get_ports(flow, skb, nhoff, ip_proto, data, hlen);
 	flow->thoff = (u16) nhoff;
 
 	return true;