diff mbox

openvswitch: Fix alignment of struct sw_flow_key.

Message ID 1378402887-17331-1-git-send-email-jesse@nicira.com
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

Jesse Gross Sept. 5, 2013, 5:41 p.m. UTC
sw_flow_key alignment was declared as " __aligned(__alignof__(long))".
However, this breaks on the m68k architecture where long is 32 bit in
size but 16 bit aligned by default. This aligns to 8 bytes to ensure
that long is always covered without reducing native alignment. It also
adds an additional build check to catch any reduction in alignment.

CC: Andy Zhou <azhou@nicira.com>
Reported-by: Fengguang Wu <fengguan.wu@intel.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jesse Gross <jesse@nicira.com>
---
 net/openvswitch/flow.c | 1 +
 net/openvswitch/flow.h | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

Comments

David Miller Sept. 5, 2013, 6:17 p.m. UTC | #1
From: Jesse Gross <jesse@nicira.com>
Date: Thu,  5 Sep 2013 10:41:27 -0700

> -} __aligned(__alignof__(long));
> +} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */

This kind of stuff drives me crazy.

If the issue is the type, therefore at least use an expression that
mentions the type explicitly.  And mention the actual type that
matters.  "long" isn't it.

These half-hearted attempts to fix this problem are just papering
around the issue and do not address the real issue directly.

I don't want to apply changes like this.

You don't want "8 byte alignment", you want the alignment necessary
for the types contained in the structure to be accessed without
generating exceptions.  You might also want the "quickest" alignment,
and there are ways to express that as well.

This structure doesn't even contain a "long".  "long" isn't even
64-bit on some architectures, and you certainly have 64-bit types in
this thing (__be64).

Let's start from the beginning, what are you actually trying to
accomplish here that isn't handled automatically by the compiler
already?

Is it "proper" alignment?  That is handled automatically by the
compiler.  Is it "fast" alignment?  In what context and in what ways
is that important in this specific case?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jesse Gross Sept. 5, 2013, 6:36 p.m. UTC | #2
On Thu, Sep 5, 2013 at 11:17 AM, David Miller <davem@davemloft.net> wrote:
> From: Jesse Gross <jesse@nicira.com>
> Date: Thu,  5 Sep 2013 10:41:27 -0700
>
>> -} __aligned(__alignof__(long));
>> +} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */
>
> This kind of stuff drives me crazy.
>
> If the issue is the type, therefore at least use an expression that
> mentions the type explicitly.  And mention the actual type that
> matters.  "long" isn't it.

'long' actually is the real type here.

When doing comparisons, this structure is being accessed as a byte
array in 'long' sized chunks, not by its members. Therefore, the
compiler's alignment does not necessarily correspond to anything for
this purpose. It could be a struct full of u16's and we would still
want to access it in chunks of 'long'.

To completely honest, I think the correct alignment should be
sizeof(long) because I know that 'long' is not always 8 bytes on all
architectures. However, you made the point before that this could
break the alignment of the 64-bit values on architectures where 'long'
is 32 bits wide, so 8 bytes is the generic solution.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Sept. 5, 2013, 6:40 p.m. UTC | #3
From: Jesse Gross <jesse@nicira.com>
Date: Thu, 5 Sep 2013 11:36:19 -0700

> On Thu, Sep 5, 2013 at 11:17 AM, David Miller <davem@davemloft.net> wrote:
>> From: Jesse Gross <jesse@nicira.com>
>> Date: Thu,  5 Sep 2013 10:41:27 -0700
>>
>>> -} __aligned(__alignof__(long));
>>> +} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */
>>
>> This kind of stuff drives me crazy.
>>
>> If the issue is the type, therefore at least use an expression that
>> mentions the type explicitly.  And mention the actual type that
>> matters.  "long" isn't it.
> 
> 'long' actually is the real type here.
> 
> When doing comparisons, this structure is being accessed as a byte
> array in 'long' sized chunks, not by its members. Therefore, the
> compiler's alignment does not necessarily correspond to anything for
> this purpose. It could be a struct full of u16's and we would still
> want to access it in chunks of 'long'.
> 
> To completely honest, I think the correct alignment should be
> sizeof(long) because I know that 'long' is not always 8 bytes on all
> architectures. However, you made the point before that this could
> break the alignment of the 64-bit values on architectures where 'long'
> is 32 bits wide, so 8 bytes is the generic solution.

Look at net/core/flow.c:flow_key_compare().

And then we annotate struct flowi with

} __attribute__((__aligned__(BITS_PER_LONG/8)));

Don't reinvent the wheel, either mimick how existing code does
this kind of thing or provide a justification for doing it differently
and update the existing cases to match and be consistent.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jesse Gross Sept. 5, 2013, 7:14 p.m. UTC | #4
On Thu, Sep 5, 2013 at 11:40 AM, David Miller <davem@davemloft.net> wrote:
> From: Jesse Gross <jesse@nicira.com>
> Date: Thu, 5 Sep 2013 11:36:19 -0700
>
>> On Thu, Sep 5, 2013 at 11:17 AM, David Miller <davem@davemloft.net> wrote:
>>> From: Jesse Gross <jesse@nicira.com>
>>> Date: Thu,  5 Sep 2013 10:41:27 -0700
>>>
>>>> -} __aligned(__alignof__(long));
>>>> +} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */
>>>
>>> This kind of stuff drives me crazy.
>>>
>>> If the issue is the type, therefore at least use an expression that
>>> mentions the type explicitly.  And mention the actual type that
>>> matters.  "long" isn't it.
>>
>> 'long' actually is the real type here.
>>
>> When doing comparisons, this structure is being accessed as a byte
>> array in 'long' sized chunks, not by its members. Therefore, the
>> compiler's alignment does not necessarily correspond to anything for
>> this purpose. It could be a struct full of u16's and we would still
>> want to access it in chunks of 'long'.
>>
>> To completely honest, I think the correct alignment should be
>> sizeof(long) because I know that 'long' is not always 8 bytes on all
>> architectures. However, you made the point before that this could
>> break the alignment of the 64-bit values on architectures where 'long'
>> is 32 bits wide, so 8 bytes is the generic solution.
>
> Look at net/core/flow.c:flow_key_compare().
>
> And then we annotate struct flowi with
>
> } __attribute__((__aligned__(BITS_PER_LONG/8)));
>
> Don't reinvent the wheel, either mimick how existing code does
> this kind of thing or provide a justification for doing it differently
> and update the existing cases to match and be consistent.

Sure, I'll send a new patch to use BITS_PER_LONG.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Geert Uytterhoeven Sept. 5, 2013, 7:20 p.m. UTC | #5
On Thu, Sep 5, 2013 at 8:40 PM, David Miller <davem@davemloft.net> wrote:
> When doing comparisons, this structure is being accessed as a byte
> array in 'long' sized chunks, not by its members. Therefore, the

Like ... an optimized memcmp() does? Which handles all (un)alignment
well?

>> To completely honest, I think the correct alignment should be
>> sizeof(long) because I know that 'long' is not always 8 bytes on all
>> architectures. However, you made the point before that this could
>> break the alignment of the 64-bit values on architectures where 'long'
>> is 32 bits wide, so 8 bytes is the generic solution.
>
> Look at net/core/flow.c:flow_key_compare().
>
> And then we annotate struct flowi with
>
> } __attribute__((__aligned__(BITS_PER_LONG/8)));
>
> Don't reinvent the wheel, either mimick how existing code does
> this kind of thing or provide a justification for doing it differently
> and update the existing cases to match and be consistent.

This may still break the alignment of 64-bit values on 32-bit architectures
where __alignof__(u64) == 8.

Now let's look at the comparison function:

static bool __cmp_key(const struct sw_flow_key *key1,
                const struct sw_flow_key *key2,  int key_start, int key_end)
{
        const long *cp1 = (long *)((u8 *)key1 + key_start);
        const long *cp2 = (long *)((u8 *)key2 + key_start);
        long diffs = 0;
        int i;

        for (i = key_start; i < key_end;  i += sizeof(long))
                diffs |= *cp1++ ^ *cp2++;

        return diffs == 0;
}

Why don't you abort the loop if a difference is found?
Or is this a security-related struct where you want to protect against
timing attacks?

Furthermore, as you compare the raw bytes, I hope you always
initialize all gaps in the struct to zero.
E.g. there's a 2-byte gap immediately after "ip", as the next member
is 32-bit (except op m68k, where the 32-bit member will be 2-byte aligned).

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jesse Gross Sept. 5, 2013, 7:25 p.m. UTC | #6
On Thu, Sep 5, 2013 at 12:20 PM, Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> Why don't you abort the loop if a difference is found?
> Or is this a security-related struct where you want to protect against
> timing attacks?

It's more expensive to test for a difference on every iteration in the
common case where the comparison succeeds.

> Furthermore, as you compare the raw bytes, I hope you always
> initialize all gaps in the struct to zero.
> E.g. there's a 2-byte gap immediately after "ip", as the next member
> is 32-bit (except op m68k, where the 32-bit member will be 2-byte aligned).

It's initialized to zero.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Joe Perches Sept. 5, 2013, 7:47 p.m. UTC | #7
On Thu, 2013-09-05 at 14:40 -0400, David Miller wrote:
> From: Jesse Gross <jesse@nicira.com>
> Date: Thu, 5 Sep 2013 11:36:19 -0700
> > On Thu, Sep 5, 2013 at 11:17 AM, David Miller <davem@davemloft.net> wrote:
> >> From: Jesse Gross <jesse@nicira.com>
> >> Date: Thu,  5 Sep 2013 10:41:27 -0700
> >>
> >>> -} __aligned(__alignof__(long));
> >>> +} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */
> >>
> >> This kind of stuff drives me crazy.
> >>
> >> If the issue is the type, therefore at least use an expression that
> >> mentions the type explicitly.  And mention the actual type that
> >> matters.  "long" isn't it.
> > 
> > 'long' actually is the real type here.
> > 
> > When doing comparisons, this structure is being accessed as a byte
> > array in 'long' sized chunks, not by its members. Therefore, the
> > compiler's alignment does not necessarily correspond to anything for
> > this purpose. It could be a struct full of u16's and we would still
> > want to access it in chunks of 'long'.
> > 
> > To completely honest, I think the correct alignment should be
> > sizeof(long) because I know that 'long' is not always 8 bytes on all
> > architectures. However, you made the point before that this could
> > break the alignment of the 64-bit values on architectures where 'long'
> > is 32 bits wide, so 8 bytes is the generic solution.
> 
> Look at net/core/flow.c:flow_key_compare().
> 
> And then we annotate struct flowi with
> 
> } __attribute__((__aligned__(BITS_PER_LONG/8)));
> 
> Don't reinvent the wheel, either mimick how existing code does
> this kind of thing or provide a justification for doing it differently
> and update the existing cases to match and be consistent.

I think using the BITS_PER_LONG #define is pretty odd
as it's set using a test for CONFIG_64BIT and that

	__aligned(__alignof__(long))
or
	__aligned(sizeof(long))

may be simpler to read. That first would allow
architectures that can read any long on an arbitrary
boundary to pack the structure as densely as possible.
That may not work if the entire structure is always
read with via long *.

If so, my choice would be to standardize using

} __aligned(sizeof(long));

Currently, what happens to a struct that isn't
actually a multiple of long bytes?  Are these
structs guaranteed to be zero padded right now?

Also, what happens for structs like:

struct foo {
	u8	bar[6];
	long	baz;
	u8	qux[2];
} __aligned(sizeof(long));

where the padding may or may not exist if
__packed is also specified?

btw:

all the __attribute__((__aligned__(foo)...)) and
__aligned(...) / __packed(...) uses could be rewritten
to a more standard style.

$ grep -rP --include=*.[ch] -oh "\b__aligned[^\(\n_]*\s*\([^;\n]+[;\n]" * | \
  sed -r -e 's/\s//g' -e 's/\)[^\)]*$/\)/' | sort | uniq -c
      3 __aligned(1)
      3 __aligned(16)
      1 __aligned(1<<WORK_STRUCT_FLAG_BITS)
     12 __aligned(2)
      7 __aligned(32)
     32 __aligned(4)
      8 __aligned(64)
     15 __aligned(8)
      1 __aligned(__alignof__(long))
      3 __aligned(__CACHE_WRITEBACK_GRANULE)
      1 __aligned((IOR_DMA_OFFSET_BITS/IOR_BPC))
      1 __aligned((IOR_PHYS_OFFSET_BITS/IOR_BPC))
      1 __aligned(LOG_ALIGN)
      2 __aligned(NETDEV_ALIGN)
     10 __aligned(PAGE_SIZE)
      2 __aligned(sizeof(s64))
      4 __aligned(sizeof(u16))
      2 __aligned(sizeof(u32))
      7 __aligned(sizeof(void*))

$ grep -rP --include=*.[ch] -oh "\b__attribute.*aligned[^\(\n]*\s*\([^;\n]+[;\n]" * | \
  sed -r -e 's/\s//g' -e 's/\)[^\)]*$/\)/' | sort | uniq -c
      2 __attribute__((aligned(0x100)))
      1 __attribute__((aligned(0x2000)))
      2 __attribute__((aligned(0x20000)))
      1 __attribute__((__aligned__(0x400)))
      3 __attribute__((aligned(0x80)))
      1 __attribute__((aligned(1024),packed))
      2 __attribute__((__aligned__(128)))
      9 __attribute__((__aligned__(16)))
     30 __attribute__((aligned(16)))
      2 __attribute__((aligned(1),packed))
      1 __attribute__((aligned(2)))
      1 __attribute__((aligned(2048)))
     14 __attribute__((aligned(256)))
      1 __attribute__((aligned(256),packed))
      5 __attribute__((aligned(2),packed))
      1 __attribute__((aligned(2*sizeof(long))))
      1 __attribute((aligned(2*sizeof(unsignedlong))))
      3 __attribute__((__aligned__(32)))
     53 __attribute__((aligned(32)))
      2 __attribute__((aligned(32),packed))
      2 __attribute__((__aligned__(4)))
     22 __attribute__((aligned(4)))
      2 __attribute__((aligned(4096)))
      1 __attribute__((aligned(4<<10)))
      1 __attribute__((aligned(4),packed))
     18 __attribute__((aligned(64)))
      1 __attribute__((aligned(65536)))
     15 __attribute__((__aligned__(8)))
     89 __attribute__((aligned(8)))
      2 __attribute__((aligned(a)))
      5 __attribute__((aligned(__alignof__(structebt_replace))))
      1 __attribute__((aligned(__alignof__(structhash_alg_common))))
      1 __attribute__((aligned(__alignof__(u32))))
      1 __attribute__((aligned(ATOMIC_HASH_L2_SIZE*4)))
      4 __attribute__((__aligned__(BITS_PER_LONG/8)))
      1 __attribute__((aligned(BUFFER_SIZE)))
      2 __attribute__((aligned(CHIP_L2_LINE_SIZE())))
      1 __attribute__((aligned(CPPI_DESCRIPTOR_ALIGN)))
      1 __attribute__((aligned(DM_IO_MAX_REGIONS)))
      1 __attribute__((aligned(HV_PAGE_TABLE_ALIGN)))
      1 __attribute__((aligned(_K_SS_ALIGNSIZE)))
      1 __attribute__((aligned(L1_CACHE_BYTES)))
      3 __attribute__((aligned(L2_CACHE_BYTES)))
      1 __attribute__((__aligned__(NETDEV_ALIGN)))
      3 __attribute__((__aligned__(PADLOCK_ALIGNMENT)))
      4 __attribute__((__aligned__(PAGE_SIZE)))
      7 __attribute__((aligned(PAGE_SIZE)))
      1 __attribute__((aligned(PS3_BMP_MINALIGN)))
      2 __attribute__((aligned(sizeof(Elf##size##_Word))))
      1 __attribute__((aligned(sizeof(int))))
      1 __attribute__((aligned(sizeof(__kernel_size_t))))
      1 __attribute__((aligned(sizeof(kernel_ulong_t))))
      7 __attribute__((aligned(sizeof(long))))
      1 __attribute__((aligned(sizeof(s64))))
      1 __attribute__((__aligned__(sizeof(structxencomm_mini))))
      1 __attribute__((aligned(sizeof(__u64))))
      8 __attribute__((aligned(sizeof(uint64_t))))
      6 __attribute__((aligned(sizeof(unsignedlong))))
      1 __attribute__((__aligned__(sizeof(void*))))
      1 __attribute__((aligned(sizeof(void*))))
      7 __attribute__((__aligned__(SMP_CACHE_BYTES)))
      6 __attribute__((aligned(SMP_CACHE_BYTES)))
      1 __attribute__((aligned(stackalign)))
      1 __attribute__((aligned(THREAD_SIZE)))
      1 __attribute__((aligned(TSB_ENTRY_ALIGNMENT)))
      1 __attribute__((aligned(XCHAL_CP0_SA_ALIGN)))
      1 __attribute__((aligned(XCHAL_CP1_SA_ALIGN)))
      1 __attribute__((aligned(XCHAL_CP2_SA_ALIGN)))
      1 __attribute__((aligned(XCHAL_CP3_SA_ALIGN)))
      1 __attribute__((aligned(XCHAL_CP4_SA_ALIGN)))
      1 __attribute__((aligned(XCHAL_CP5_SA_ALIGN)))
      1 __attribute__((aligned(XCHAL_CP6_SA_ALIGN)))
      1 __attribute__((aligned(XCHAL_CP7_SA_ALIGN)))
      2 __attribute__((aligned(XCHAL_NCP_SA_ALIGN)))
      1 __attribute__((packed,aligned(1024)))
      1 __attribute__((packed,__aligned__(16)))
      4 __attribute__((packed,aligned(16)))
      7 __attribute__((packed,aligned(2)))
      1 __attribute__((packed,aligned(2048)))
      4 __attribute__((packed,aligned(256)))
    177 __attribute__((packed,aligned(4)))
      2 __attribute__((packed,aligned(4096)))
      2 __attribute__((packed,aligned(64)))
     47 __attribute__((packed,aligned(8)))
      2 __attribute__((packed,aligned(PAGE_SIZE)))
      1 __attribute__((packed,aligned(PMCRAID_IOADL_ALIGNMENT)))
      2 __attribute__((packed,aligned(PMCRAID_IOARCB_ALIGNMENT)))
      1 __attribute__((__section__(".data..vm0.pgd"),aligned(PAGE_SIZE)))
      1 __attribute__((__section__(".data..vm0.pmd"),aligned(PAGE_SIZE)))
      1 __attribute__((__section__(".data..vm0.pte"),aligned(PAGE_SIZE)))


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Sept. 5, 2013, 7:49 p.m. UTC | #8
From: Geert Uytterhoeven <geert@linux-m68k.org>
Date: Thu, 5 Sep 2013 21:20:16 +0200

> Why don't you abort the loop if a difference is found?

We are optimizing for a match.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Laight Sept. 6, 2013, 10:18 a.m. UTC | #9
> -} __aligned(__alignof__(long));
> +} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */

Don't you really want __aligned(MAX(__alignof__(__be64), sizeof (long))) ?

	David



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index ad1aeeb..fb36f85 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -1981,6 +1981,7 @@  nla_put_failure:
  * Returns zero if successful or a negative error code. */
 int ovs_flow_init(void)
 {
+	BUILD_BUG_ON(__alignof__(struct sw_flow_key) % __alignof__(long));
 	BUILD_BUG_ON(sizeof(struct sw_flow_key) % sizeof(long));
 
 	flow_cache = kmem_cache_create("sw_flow", sizeof(struct sw_flow), 0,
diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
index b65f885..3dc3b52 100644
--- a/net/openvswitch/flow.h
+++ b/net/openvswitch/flow.h
@@ -125,7 +125,7 @@  struct sw_flow_key {
 			} nd;
 		} ipv6;
 	};
-} __aligned(__alignof__(long));
+} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */
 
 struct sw_flow {
 	struct rcu_head rcu;