diff mbox

net: Add back alignment for size for __alloc_skb

Message ID 20111102224317.GM31337@atomide.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Tony Lindgren Nov. 2, 2011, 10:43 p.m. UTC
Commit 87fb4b7b533073eeeaed0b6bf7c2328995f6c075 (net: more
accurate skb truesize) changed the alignment of size. This
can cause problems at least on some machines with NFS root:

Unhandled fault: alignment exception (0x801) at 0xc183a43a
Internal error: : 801 [#1] PREEMPT
Modules linked in:
CPU: 0    Not tainted  (3.1.0-08784-g5eeee4a #733)
pc : [<c02fbba0>]    lr : [<c02fbb9c>]    psr: 60000013
sp : c180fef8  ip : 00000000  fp : c181f580
r10: 00000000  r9 : c044b28c  r8 : 00000001
r7 : c183a3a0  r6 : c1835be0  r5 : c183a412  r4 : 000001f2
r3 : 00000000  r2 : 00000000  r1 : ffffffe6  r0 : c183a43a
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 0005317f  Table: 10004000  DAC: 00000017
Process swapper (pid: 1, stack limit = 0xc180e270)
Stack: (0xc180fef8 to 0xc1810000)
fee0:                                                       00000024 00000000
ff00: 00000000 c183b9c0 c183b8e0 c044b28c c0507ccc c019dfc4 c180ff2c c0503cf8
ff20: c180ff4c c180ff4c 00000000 c1835420 c182c740 c18349c0 c05233c0 00000000
ff40: 00000000 c00e6bb8 c180e000 00000000 c04dd82c c0507e7c c050cc18 c183b9c0
ff60: c05233c0 00000000 00000000 c01f34f4 c0430d70 c019d364 c04dd898 c04dd898
ff80: c04dd82c c0507e7c c180e000 00000000 c04c584c c01f4918 c04dd898 c04dd82c
ffa0: c04ddd28 c180e000 00000000 c0008758 c181fa60 3231d82c 00000037 00000000
ffc0: 00000000 c04dd898 c04dd82c c04ddd28 00000013 00000000 00000000 00000000
ffe0: 00000000 c04b2224 00000000 c04b21a0 c001056c c001056c 00000000 00000000
Function entered at [<c02fbba0>] from [<c019dfc4>]
Function entered at [<c019dfc4>] from [<c01f34f4>]
Function entered at [<c01f34f4>] from [<c01f4918>]
Function entered at [<c01f4918>] from [<c0008758>]
Function entered at [<c0008758>] from [<c04b2224>]
Function entered at [<c04b2224>] from [<c001056c>]
Code: e1a00005 e3a01028 ebfa7cb0 e35a0000 (e5858028)

Here PC is at __alloc_skb and &shinfo->dataref is unaligned because
skb->end can be unaligned without this patch.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eric Dumazet Nov. 2, 2011, 10:55 p.m. UTC | #1
On 02/11/2011 23:43, Tony Lindgren wrote:

> Commit 87fb4b7b533073eeeaed0b6bf7c2328995f6c075 (net: more
> accurate skb truesize) changed the alignment of size. This
> can cause problems at least on some machines with NFS root:
> 
> Unhandled fault: alignment exception (0x801) at 0xc183a43a
> Internal error: : 801 [#1] PREEMPT
> Modules linked in:
> CPU: 0    Not tainted  (3.1.0-08784-g5eeee4a #733)
> pc : [<c02fbba0>]    lr : [<c02fbb9c>]    psr: 60000013
> sp : c180fef8  ip : 00000000  fp : c181f580
> r10: 00000000  r9 : c044b28c  r8 : 00000001
> r7 : c183a3a0  r6 : c1835be0  r5 : c183a412  r4 : 000001f2
> r3 : 00000000  r2 : 00000000  r1 : ffffffe6  r0 : c183a43a
> Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
> Control: 0005317f  Table: 10004000  DAC: 00000017
> Process swapper (pid: 1, stack limit = 0xc180e270)
> Stack: (0xc180fef8 to 0xc1810000)
> fee0:                                                       00000024 00000000
> ff00: 00000000 c183b9c0 c183b8e0 c044b28c c0507ccc c019dfc4 c180ff2c c0503cf8
> ff20: c180ff4c c180ff4c 00000000 c1835420 c182c740 c18349c0 c05233c0 00000000
> ff40: 00000000 c00e6bb8 c180e000 00000000 c04dd82c c0507e7c c050cc18 c183b9c0
> ff60: c05233c0 00000000 00000000 c01f34f4 c0430d70 c019d364 c04dd898 c04dd898
> ff80: c04dd82c c0507e7c c180e000 00000000 c04c584c c01f4918 c04dd898 c04dd82c
> ffa0: c04ddd28 c180e000 00000000 c0008758 c181fa60 3231d82c 00000037 00000000
> ffc0: 00000000 c04dd898 c04dd82c c04ddd28 00000013 00000000 00000000 00000000
> ffe0: 00000000 c04b2224 00000000 c04b21a0 c001056c c001056c 00000000 00000000
> Function entered at [<c02fbba0>] from [<c019dfc4>]
> Function entered at [<c019dfc4>] from [<c01f34f4>]
> Function entered at [<c01f34f4>] from [<c01f4918>]
> Function entered at [<c01f4918>] from [<c0008758>]
> Function entered at [<c0008758>] from [<c04b2224>]
> Function entered at [<c04b2224>] from [<c001056c>]
> Code: e1a00005 e3a01028 ebfa7cb0 e35a0000 (e5858028)
> 
> Here PC is at __alloc_skb and &shinfo->dataref is unaligned because
> skb->end can be unaligned without this patch.
> 
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Tony Lindgren <tony@atomide.com>
> 
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -189,6 +189,7 @@ struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
>  	 * aligned memory blocks, unless SLUB/SLAB debug is enabled.
>  	 * Both skb->head and skb_shared_info are cache line aligned.
>  	 */
> +	size = SKB_DATA_ALIGN(size);
>  	size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>  	data = kmalloc_node_track_caller(size, gfp_mask, node);
>  	if (!data)



Hmm, I dont think this is the right way to fix the bug.

There is a problem with your kmalloc() or alignments on your architecture.

What is the SMP_CACHE_BYTES value ?

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Nov. 2, 2011, 11:09 p.m. UTC | #2
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 02 Nov 2011 23:55:11 +0100

> There is a problem with your kmalloc() or alignments on your architecture.
> 
> What is the SMP_CACHE_BYTES value ?

kmalloc() behavior doesn't have anything to do with this bug.

The issue is calculation of skb->end, which is based upon calculated
'size' variable.

skb->end determines alignment of skb_shared_info, which is where the
alignment problem is occuring for Tony.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Nov. 2, 2011, 11:13 p.m. UTC | #3
From: David Miller <davem@davemloft.net>
Date: Wed, 02 Nov 2011 19:09:55 -0400 (EDT)

> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 02 Nov 2011 23:55:11 +0100
> 
>> There is a problem with your kmalloc() or alignments on your architecture.
>> 
>> What is the SMP_CACHE_BYTES value ?
> 
> kmalloc() behavior doesn't have anything to do with this bug.

Sorry, I take that back, I see now how ksize() and friends are being
used here.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Nov. 2, 2011, 11:14 p.m. UTC | #4
On 03/11/2011 00:09, David Miller wrote:

> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 02 Nov 2011 23:55:11 +0100
> 
>> There is a problem with your kmalloc() or alignments on your architecture.
>>
>> What is the SMP_CACHE_BYTES value ?
> 
> kmalloc() behavior doesn't have anything to do with this bug.
> 
> The issue is calculation of skb->end, which is based upon calculated
> 'size' variable.
> 
> skb->end determines alignment of skb_shared_info, which is where the
> alignment problem is occuring for Tony.
> 


I understood that David, but check the code, and you can see that exact
skb->end depends also on ksize()

So maybe the right fix is to make sure skb->end is properly aligned, say
with SLOB

So a more generic fix is welcomed.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Nov. 2, 2011, 11:17 p.m. UTC | #5
c> 
> The issue is calculation of skb->end, which is based upon calculated
> 'size' variable.
> 
> skb->end determines alignment of skb_shared_info, which is where the
> alignment problem is occuring for Tony.

Right, and SMP_CACHE_BYTES setting should save us in any case.

For ARM, SMP_CACHE_BYTES seems to be set to L1_CACHE_BYTES which in
turn is set via ARM_L1_CACHE_SHIFT which can be set seemingly to any
value but the defaults are 5 and 6 which should be OK.

So unless Tony is using a non-standard setting of ARM_L1_CACHE_SHIFT,
this report is a bit mysterious.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren Nov. 2, 2011, 11:19 p.m. UTC | #6
* David Miller <davem@davemloft.net> [111102 15:42]:
> c> 
> > The issue is calculation of skb->end, which is based upon calculated
> > 'size' variable.
> > 
> > skb->end determines alignment of skb_shared_info, which is where the
> > alignment problem is occuring for Tony.
> 
> Right, and SMP_CACHE_BYTES setting should save us in any case.
> 
> For ARM, SMP_CACHE_BYTES seems to be set to L1_CACHE_BYTES which in
> turn is set via ARM_L1_CACHE_SHIFT which can be set seemingly to any
> value but the defaults are 5 and 6 which should be OK.
> 
> So unless Tony is using a non-standard setting of ARM_L1_CACHE_SHIFT,
> this report is a bit mysterious.

This is happening at least with omap1_defconfig. In that case we have
ARM_L1_CACHE_SHIFT=5.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Nov. 2, 2011, 11:20 p.m. UTC | #7
On 03/11/2011 00:19, Tony Lindgren wrote:

> * David Miller <davem@davemloft.net> [111102 15:42]:
>> c> 
>>> The issue is calculation of skb->end, which is based upon calculated
>>> 'size' variable.
>>>
>>> skb->end determines alignment of skb_shared_info, which is where the
>>> alignment problem is occuring for Tony.
>>
>> Right, and SMP_CACHE_BYTES setting should save us in any case.
>>
>> For ARM, SMP_CACHE_BYTES seems to be set to L1_CACHE_BYTES which in
>> turn is set via ARM_L1_CACHE_SHIFT which can be set seemingly to any
>> value but the defaults are 5 and 6 which should be OK.
>>
>> So unless Tony is using a non-standard setting of ARM_L1_CACHE_SHIFT,
>> this report is a bit mysterious.
> 
> This is happening at least with omap1_defconfig. In that case we have
> ARM_L1_CACHE_SHIFT=5.
> 
> Regards,
> 
> Tony


Do you use SLOB, SLUB or SLAB ?

Thanks

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Nov. 2, 2011, 11:22 p.m. UTC | #8
From: Tony Lindgren <tony@atomide.com>
Date: Wed, 2 Nov 2011 16:19:17 -0700

> * David Miller <davem@davemloft.net> [111102 15:42]:
>> c> 
>> > The issue is calculation of skb->end, which is based upon calculated
>> > 'size' variable.
>> > 
>> > skb->end determines alignment of skb_shared_info, which is where the
>> > alignment problem is occuring for Tony.
>> 
>> Right, and SMP_CACHE_BYTES setting should save us in any case.
>> 
>> For ARM, SMP_CACHE_BYTES seems to be set to L1_CACHE_BYTES which in
>> turn is set via ARM_L1_CACHE_SHIFT which can be set seemingly to any
>> value but the defaults are 5 and 6 which should be OK.
>> 
>> So unless Tony is using a non-standard setting of ARM_L1_CACHE_SHIFT,
>> this report is a bit mysterious.
> 
> This is happening at least with omap1_defconfig. In that case we have
> ARM_L1_CACHE_SHIFT=5.

Ok, so ksize(x) gives an odd value for whichever allocator you are
using.  Which is the point Eric was trying to make.  It becomes
clearernow.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren Nov. 2, 2011, 11:24 p.m. UTC | #9
* Eric Dumazet <eric.dumazet@gmail.com> [111102 15:46]:
> On 03/11/2011 00:19, Tony Lindgren wrote:
> 
> > * David Miller <davem@davemloft.net> [111102 15:42]:
> >> c> 
> >>> The issue is calculation of skb->end, which is based upon calculated
> >>> 'size' variable.
> >>>
> >>> skb->end determines alignment of skb_shared_info, which is where the
> >>> alignment problem is occuring for Tony.
> >>
> >> Right, and SMP_CACHE_BYTES setting should save us in any case.
> >>
> >> For ARM, SMP_CACHE_BYTES seems to be set to L1_CACHE_BYTES which in
> >> turn is set via ARM_L1_CACHE_SHIFT which can be set seemingly to any
> >> value but the defaults are 5 and 6 which should be OK.
> >>
> >> So unless Tony is using a non-standard setting of ARM_L1_CACHE_SHIFT,
> >> this report is a bit mysterious.
> > 
> > This is happening at least with omap1_defconfig. In that case we have
> > ARM_L1_CACHE_SHIFT=5.
> > 
> > Regards,
> > 
> > Tony
> 
> 
> Do you use SLOB, SLUB or SLAB ?

Seems to be SLOB for omap1_defconfig.

Tony
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Nov. 2, 2011, 11:31 p.m. UTC | #10
On 03/11/2011 00:24, Tony Lindgren wrote:


> 
> Seems to be SLOB for omap1_defconfig.
> 
> Tony


OK this makes sense now

Your patch is absolutely needed, I completely forgot about SLOB :(

since, kmalloc(386) on SLOB gives exactly ksize=386 bytes, not nearest
power of two.

[   60.305763] malloc(size=385)->ffff880112c11e38 ksize=386 -> nsize=2
[   60.305921] malloc(size=385)->ffff88007c92ce28 ksize=386 -> nsize=2
[   60.306898] malloc(size=656)->ffff88007c44ad28 ksize=656 -> nsize=272
[   60.325385] malloc(size=656)->ffff88007c575868 ksize=656 -> nsize=272
[   60.325531] malloc(size=656)->ffff88011c777230 ksize=656 -> nsize=272
[   60.325701] malloc(size=656)->ffff880114011008 ksize=656 -> nsize=272
[   60.346716] malloc(size=385)->ffff880114142008 ksize=386 -> nsize=2
[   60.346900] malloc(size=385)->ffff88011c777690 ksize=386 -> nsize=2

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

Thanks !

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -189,6 +189,7 @@  struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
 	 * aligned memory blocks, unless SLUB/SLAB debug is enabled.
 	 * Both skb->head and skb_shared_info are cache line aligned.
 	 */
+	size = SKB_DATA_ALIGN(size);
 	size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 	data = kmalloc_node_track_caller(size, gfp_mask, node);
 	if (!data)