Message ID | 1378402887-17331-1-git-send-email-jesse@nicira.com |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
From: Jesse Gross <jesse@nicira.com> Date: Thu, 5 Sep 2013 10:41:27 -0700 > -} __aligned(__alignof__(long)); > +} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */ This kind of stuff drives me crazy. If the issue is the type, therefore at least use an expression that mentions the type explicitly. And mention the actual type that matters. "long" isn't it. These half-hearted attempts to fix this problem are just papering around the issue and do not address the real issue directly. I don't want to apply changes like this. You don't want "8 byte alignment", you want the alignment necessary for the types contained in the structure to be accessed without generating exceptions. You might also want the "quickest" alignment, and there are ways to express that as well. This structure doesn't even contain a "long". "long" isn't even 64-bit on some architectures, and you certainly have 64-bit types in this thing (__be64). Let's start from the beginning, what are you actually trying to accomplish here that isn't handled automatically by the compiler already? Is it "proper" alignment? That is handled automatically by the compiler. Is it "fast" alignment? In what context and in what ways is that important in this specific case? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 5, 2013 at 11:17 AM, David Miller <davem@davemloft.net> wrote: > From: Jesse Gross <jesse@nicira.com> > Date: Thu, 5 Sep 2013 10:41:27 -0700 > >> -} __aligned(__alignof__(long)); >> +} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */ > > This kind of stuff drives me crazy. > > If the issue is the type, therefore at least use an expression that > mentions the type explicitly. And mention the actual type that > matters. "long" isn't it. 'long' actually is the real type here. When doing comparisons, this structure is being accessed as a byte array in 'long' sized chunks, not by its members. Therefore, the compiler's alignment does not necessarily correspond to anything for this purpose. It could be a struct full of u16's and we would still want to access it in chunks of 'long'. To completely honest, I think the correct alignment should be sizeof(long) because I know that 'long' is not always 8 bytes on all architectures. However, you made the point before that this could break the alignment of the 64-bit values on architectures where 'long' is 32 bits wide, so 8 bytes is the generic solution. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Jesse Gross <jesse@nicira.com> Date: Thu, 5 Sep 2013 11:36:19 -0700 > On Thu, Sep 5, 2013 at 11:17 AM, David Miller <davem@davemloft.net> wrote: >> From: Jesse Gross <jesse@nicira.com> >> Date: Thu, 5 Sep 2013 10:41:27 -0700 >> >>> -} __aligned(__alignof__(long)); >>> +} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */ >> >> This kind of stuff drives me crazy. >> >> If the issue is the type, therefore at least use an expression that >> mentions the type explicitly. And mention the actual type that >> matters. "long" isn't it. > > 'long' actually is the real type here. > > When doing comparisons, this structure is being accessed as a byte > array in 'long' sized chunks, not by its members. Therefore, the > compiler's alignment does not necessarily correspond to anything for > this purpose. It could be a struct full of u16's and we would still > want to access it in chunks of 'long'. > > To completely honest, I think the correct alignment should be > sizeof(long) because I know that 'long' is not always 8 bytes on all > architectures. However, you made the point before that this could > break the alignment of the 64-bit values on architectures where 'long' > is 32 bits wide, so 8 bytes is the generic solution. Look at net/core/flow.c:flow_key_compare(). And then we annotate struct flowi with } __attribute__((__aligned__(BITS_PER_LONG/8))); Don't reinvent the wheel, either mimick how existing code does this kind of thing or provide a justification for doing it differently and update the existing cases to match and be consistent. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 5, 2013 at 11:40 AM, David Miller <davem@davemloft.net> wrote: > From: Jesse Gross <jesse@nicira.com> > Date: Thu, 5 Sep 2013 11:36:19 -0700 > >> On Thu, Sep 5, 2013 at 11:17 AM, David Miller <davem@davemloft.net> wrote: >>> From: Jesse Gross <jesse@nicira.com> >>> Date: Thu, 5 Sep 2013 10:41:27 -0700 >>> >>>> -} __aligned(__alignof__(long)); >>>> +} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */ >>> >>> This kind of stuff drives me crazy. >>> >>> If the issue is the type, therefore at least use an expression that >>> mentions the type explicitly. And mention the actual type that >>> matters. "long" isn't it. >> >> 'long' actually is the real type here. >> >> When doing comparisons, this structure is being accessed as a byte >> array in 'long' sized chunks, not by its members. Therefore, the >> compiler's alignment does not necessarily correspond to anything for >> this purpose. It could be a struct full of u16's and we would still >> want to access it in chunks of 'long'. >> >> To completely honest, I think the correct alignment should be >> sizeof(long) because I know that 'long' is not always 8 bytes on all >> architectures. However, you made the point before that this could >> break the alignment of the 64-bit values on architectures where 'long' >> is 32 bits wide, so 8 bytes is the generic solution. > > Look at net/core/flow.c:flow_key_compare(). > > And then we annotate struct flowi with > > } __attribute__((__aligned__(BITS_PER_LONG/8))); > > Don't reinvent the wheel, either mimick how existing code does > this kind of thing or provide a justification for doing it differently > and update the existing cases to match and be consistent. Sure, I'll send a new patch to use BITS_PER_LONG. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 5, 2013 at 8:40 PM, David Miller <davem@davemloft.net> wrote: > When doing comparisons, this structure is being accessed as a byte > array in 'long' sized chunks, not by its members. Therefore, the Like ... an optimized memcmp() does? Which handles all (un)alignment well? >> To completely honest, I think the correct alignment should be >> sizeof(long) because I know that 'long' is not always 8 bytes on all >> architectures. However, you made the point before that this could >> break the alignment of the 64-bit values on architectures where 'long' >> is 32 bits wide, so 8 bytes is the generic solution. > > Look at net/core/flow.c:flow_key_compare(). > > And then we annotate struct flowi with > > } __attribute__((__aligned__(BITS_PER_LONG/8))); > > Don't reinvent the wheel, either mimick how existing code does > this kind of thing or provide a justification for doing it differently > and update the existing cases to match and be consistent. This may still break the alignment of 64-bit values on 32-bit architectures where __alignof__(u64) == 8. Now let's look at the comparison function: static bool __cmp_key(const struct sw_flow_key *key1, const struct sw_flow_key *key2, int key_start, int key_end) { const long *cp1 = (long *)((u8 *)key1 + key_start); const long *cp2 = (long *)((u8 *)key2 + key_start); long diffs = 0; int i; for (i = key_start; i < key_end; i += sizeof(long)) diffs |= *cp1++ ^ *cp2++; return diffs == 0; } Why don't you abort the loop if a difference is found? Or is this a security-related struct where you want to protect against timing attacks? Furthermore, as you compare the raw bytes, I hope you always initialize all gaps in the struct to zero. E.g. there's a 2-byte gap immediately after "ip", as the next member is 32-bit (except op m68k, where the 32-bit member will be 2-byte aligned). Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 5, 2013 at 12:20 PM, Geert Uytterhoeven <geert@linux-m68k.org> wrote: > Why don't you abort the loop if a difference is found? > Or is this a security-related struct where you want to protect against > timing attacks? It's more expensive to test for a difference on every iteration in the common case where the comparison succeeds. > Furthermore, as you compare the raw bytes, I hope you always > initialize all gaps in the struct to zero. > E.g. there's a 2-byte gap immediately after "ip", as the next member > is 32-bit (except op m68k, where the 32-bit member will be 2-byte aligned). It's initialized to zero. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2013-09-05 at 14:40 -0400, David Miller wrote: > From: Jesse Gross <jesse@nicira.com> > Date: Thu, 5 Sep 2013 11:36:19 -0700 > > On Thu, Sep 5, 2013 at 11:17 AM, David Miller <davem@davemloft.net> wrote: > >> From: Jesse Gross <jesse@nicira.com> > >> Date: Thu, 5 Sep 2013 10:41:27 -0700 > >> > >>> -} __aligned(__alignof__(long)); > >>> +} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */ > >> > >> This kind of stuff drives me crazy. > >> > >> If the issue is the type, therefore at least use an expression that > >> mentions the type explicitly. And mention the actual type that > >> matters. "long" isn't it. > > > > 'long' actually is the real type here. > > > > When doing comparisons, this structure is being accessed as a byte > > array in 'long' sized chunks, not by its members. Therefore, the > > compiler's alignment does not necessarily correspond to anything for > > this purpose. It could be a struct full of u16's and we would still > > want to access it in chunks of 'long'. > > > > To completely honest, I think the correct alignment should be > > sizeof(long) because I know that 'long' is not always 8 bytes on all > > architectures. However, you made the point before that this could > > break the alignment of the 64-bit values on architectures where 'long' > > is 32 bits wide, so 8 bytes is the generic solution. > > Look at net/core/flow.c:flow_key_compare(). > > And then we annotate struct flowi with > > } __attribute__((__aligned__(BITS_PER_LONG/8))); > > Don't reinvent the wheel, either mimick how existing code does > this kind of thing or provide a justification for doing it differently > and update the existing cases to match and be consistent. I think using the BITS_PER_LONG #define is pretty odd as it's set using a test for CONFIG_64BIT and that __aligned(__alignof__(long)) or __aligned(sizeof(long)) may be simpler to read. That first would allow architectures that can read any long on an arbitrary boundary to pack the structure as densely as possible. That may not work if the entire structure is always read with via long *. If so, my choice would be to standardize using } __aligned(sizeof(long)); Currently, what happens to a struct that isn't actually a multiple of long bytes? Are these structs guaranteed to be zero padded right now? Also, what happens for structs like: struct foo { u8 bar[6]; long baz; u8 qux[2]; } __aligned(sizeof(long)); where the padding may or may not exist if __packed is also specified? btw: all the __attribute__((__aligned__(foo)...)) and __aligned(...) / __packed(...) uses could be rewritten to a more standard style. $ grep -rP --include=*.[ch] -oh "\b__aligned[^\(\n_]*\s*\([^;\n]+[;\n]" * | \ sed -r -e 's/\s//g' -e 's/\)[^\)]*$/\)/' | sort | uniq -c 3 __aligned(1) 3 __aligned(16) 1 __aligned(1<<WORK_STRUCT_FLAG_BITS) 12 __aligned(2) 7 __aligned(32) 32 __aligned(4) 8 __aligned(64) 15 __aligned(8) 1 __aligned(__alignof__(long)) 3 __aligned(__CACHE_WRITEBACK_GRANULE) 1 __aligned((IOR_DMA_OFFSET_BITS/IOR_BPC)) 1 __aligned((IOR_PHYS_OFFSET_BITS/IOR_BPC)) 1 __aligned(LOG_ALIGN) 2 __aligned(NETDEV_ALIGN) 10 __aligned(PAGE_SIZE) 2 __aligned(sizeof(s64)) 4 __aligned(sizeof(u16)) 2 __aligned(sizeof(u32)) 7 __aligned(sizeof(void*)) $ grep -rP --include=*.[ch] -oh "\b__attribute.*aligned[^\(\n]*\s*\([^;\n]+[;\n]" * | \ sed -r -e 's/\s//g' -e 's/\)[^\)]*$/\)/' | sort | uniq -c 2 __attribute__((aligned(0x100))) 1 __attribute__((aligned(0x2000))) 2 __attribute__((aligned(0x20000))) 1 __attribute__((__aligned__(0x400))) 3 __attribute__((aligned(0x80))) 1 __attribute__((aligned(1024),packed)) 2 __attribute__((__aligned__(128))) 9 __attribute__((__aligned__(16))) 30 __attribute__((aligned(16))) 2 __attribute__((aligned(1),packed)) 1 __attribute__((aligned(2))) 1 __attribute__((aligned(2048))) 14 __attribute__((aligned(256))) 1 __attribute__((aligned(256),packed)) 5 __attribute__((aligned(2),packed)) 1 __attribute__((aligned(2*sizeof(long)))) 1 __attribute((aligned(2*sizeof(unsignedlong)))) 3 __attribute__((__aligned__(32))) 53 __attribute__((aligned(32))) 2 __attribute__((aligned(32),packed)) 2 __attribute__((__aligned__(4))) 22 __attribute__((aligned(4))) 2 __attribute__((aligned(4096))) 1 __attribute__((aligned(4<<10))) 1 __attribute__((aligned(4),packed)) 18 __attribute__((aligned(64))) 1 __attribute__((aligned(65536))) 15 __attribute__((__aligned__(8))) 89 __attribute__((aligned(8))) 2 __attribute__((aligned(a))) 5 __attribute__((aligned(__alignof__(structebt_replace)))) 1 __attribute__((aligned(__alignof__(structhash_alg_common)))) 1 __attribute__((aligned(__alignof__(u32)))) 1 __attribute__((aligned(ATOMIC_HASH_L2_SIZE*4))) 4 __attribute__((__aligned__(BITS_PER_LONG/8))) 1 __attribute__((aligned(BUFFER_SIZE))) 2 __attribute__((aligned(CHIP_L2_LINE_SIZE()))) 1 __attribute__((aligned(CPPI_DESCRIPTOR_ALIGN))) 1 __attribute__((aligned(DM_IO_MAX_REGIONS))) 1 __attribute__((aligned(HV_PAGE_TABLE_ALIGN))) 1 __attribute__((aligned(_K_SS_ALIGNSIZE))) 1 __attribute__((aligned(L1_CACHE_BYTES))) 3 __attribute__((aligned(L2_CACHE_BYTES))) 1 __attribute__((__aligned__(NETDEV_ALIGN))) 3 __attribute__((__aligned__(PADLOCK_ALIGNMENT))) 4 __attribute__((__aligned__(PAGE_SIZE))) 7 __attribute__((aligned(PAGE_SIZE))) 1 __attribute__((aligned(PS3_BMP_MINALIGN))) 2 __attribute__((aligned(sizeof(Elf##size##_Word)))) 1 __attribute__((aligned(sizeof(int)))) 1 __attribute__((aligned(sizeof(__kernel_size_t)))) 1 __attribute__((aligned(sizeof(kernel_ulong_t)))) 7 __attribute__((aligned(sizeof(long)))) 1 __attribute__((aligned(sizeof(s64)))) 1 __attribute__((__aligned__(sizeof(structxencomm_mini)))) 1 __attribute__((aligned(sizeof(__u64)))) 8 __attribute__((aligned(sizeof(uint64_t)))) 6 __attribute__((aligned(sizeof(unsignedlong)))) 1 __attribute__((__aligned__(sizeof(void*)))) 1 __attribute__((aligned(sizeof(void*)))) 7 __attribute__((__aligned__(SMP_CACHE_BYTES))) 6 __attribute__((aligned(SMP_CACHE_BYTES))) 1 __attribute__((aligned(stackalign))) 1 __attribute__((aligned(THREAD_SIZE))) 1 __attribute__((aligned(TSB_ENTRY_ALIGNMENT))) 1 __attribute__((aligned(XCHAL_CP0_SA_ALIGN))) 1 __attribute__((aligned(XCHAL_CP1_SA_ALIGN))) 1 __attribute__((aligned(XCHAL_CP2_SA_ALIGN))) 1 __attribute__((aligned(XCHAL_CP3_SA_ALIGN))) 1 __attribute__((aligned(XCHAL_CP4_SA_ALIGN))) 1 __attribute__((aligned(XCHAL_CP5_SA_ALIGN))) 1 __attribute__((aligned(XCHAL_CP6_SA_ALIGN))) 1 __attribute__((aligned(XCHAL_CP7_SA_ALIGN))) 2 __attribute__((aligned(XCHAL_NCP_SA_ALIGN))) 1 __attribute__((packed,aligned(1024))) 1 __attribute__((packed,__aligned__(16))) 4 __attribute__((packed,aligned(16))) 7 __attribute__((packed,aligned(2))) 1 __attribute__((packed,aligned(2048))) 4 __attribute__((packed,aligned(256))) 177 __attribute__((packed,aligned(4))) 2 __attribute__((packed,aligned(4096))) 2 __attribute__((packed,aligned(64))) 47 __attribute__((packed,aligned(8))) 2 __attribute__((packed,aligned(PAGE_SIZE))) 1 __attribute__((packed,aligned(PMCRAID_IOADL_ALIGNMENT))) 2 __attribute__((packed,aligned(PMCRAID_IOARCB_ALIGNMENT))) 1 __attribute__((__section__(".data..vm0.pgd"),aligned(PAGE_SIZE))) 1 __attribute__((__section__(".data..vm0.pmd"),aligned(PAGE_SIZE))) 1 __attribute__((__section__(".data..vm0.pte"),aligned(PAGE_SIZE))) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Geert Uytterhoeven <geert@linux-m68k.org> Date: Thu, 5 Sep 2013 21:20:16 +0200 > Why don't you abort the loop if a difference is found? We are optimizing for a match. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -} __aligned(__alignof__(long)); > +} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */ Don't you really want __aligned(MAX(__alignof__(__be64), sizeof (long))) ? David -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c index ad1aeeb..fb36f85 100644 --- a/net/openvswitch/flow.c +++ b/net/openvswitch/flow.c @@ -1981,6 +1981,7 @@ nla_put_failure: * Returns zero if successful or a negative error code. */ int ovs_flow_init(void) { + BUILD_BUG_ON(__alignof__(struct sw_flow_key) % __alignof__(long)); BUILD_BUG_ON(sizeof(struct sw_flow_key) % sizeof(long)); flow_cache = kmem_cache_create("sw_flow", sizeof(struct sw_flow), 0, diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h index b65f885..3dc3b52 100644 --- a/net/openvswitch/flow.h +++ b/net/openvswitch/flow.h @@ -125,7 +125,7 @@ struct sw_flow_key { } nd; } ipv6; }; -} __aligned(__alignof__(long)); +} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */ struct sw_flow { struct rcu_head rcu;
sw_flow_key alignment was declared as " __aligned(__alignof__(long))". However, this breaks on the m68k architecture where long is 32 bit in size but 16 bit aligned by default. This aligns to 8 bytes to ensure that long is always covered without reducing native alignment. It also adds an additional build check to catch any reduction in alignment. CC: Andy Zhou <azhou@nicira.com> Reported-by: Fengguang Wu <fengguan.wu@intel.com> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Jesse Gross <jesse@nicira.com> --- net/openvswitch/flow.c | 1 + net/openvswitch/flow.h | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-)