[net-next] sctp: fix kfree static array pointer in sctp_sysctl_net_unregister

Message ID 536B349A.7020306@huawei.com
State Changes Requested, archived
Delegated to: David Miller

Commit Message

wangweidong May 8, 2014, 7:39 a.m. UTC
Since commit efb842c45e ("sctp: optimize the sctp_sysctl_net_register"),
we no longer kmemdup a sysctl table for init_net, so
init_net->sctp.sysctl_header->ctl_table_arg points to sctp_net_table,
which is a static array. When sctp_sysctl_net_unregister runs for init_net,
it passes sctp_net_table to kfree, and we get a NULL pointer dereference
like this:

[  262.948220] BUG: unable to handle kernel NULL pointer dereference at 000000000000006c
[  262.948232] IP: [<ffffffff81144b70>] kfree+0x80/0x420
[  262.948260] PGD db80a067 PUD dae12067 PMD 0
[  262.948268] Oops: 0000 [#1] SMP
[  262.948273] Modules linked in: sctp(-) crc32c_generic libcrc32c
...
[  262.948338] task: ffff8800db830190 ti: ffff8800dad00000 task.ti: ffff8800dad00000
[  262.948344] RIP: 0010:[<ffffffff81144b70>]  [<ffffffff81144b70>] kfree+0x80/0x420
[  262.948353] RSP: 0018:ffff8800dad01d88  EFLAGS: 00010046
[  262.948358] RAX: 0100000000000000 RBX: ffffffffa0227940 RCX: ffffea0000707888
[  262.948363] RDX: ffffea0000707888 RSI: 0000000000000001 RDI: ffffffffa0227940
[  262.948369] RBP: ffff8800dad01de8 R08: 0000000000000000 R09: ffff8800d9e983a9
[  262.948374] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0227940
[  262.948380] R13: ffffffff8187cfc0 R14: 0000000000000000 R15: ffffffff8187da10
[  262.948386] FS:  00007fa2a2658700(0000) GS:ffff880112800000(0000) knlGS:0000000000000000
[  262.948394] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  262.948400] CR2: 000000000000006c CR3: 00000000cddc0000 CR4: 00000000000006e0
[  262.948410] Stack:
[  262.948413]  ffff8800dad01da8 0000000000000286 0000000020227940 ffffffffa0227940
[  262.948422]  ffff8800dad01dd8 ffffffff811b7fa1 ffffffffa0227940 ffffffffa0227940
[  262.948431]  ffffffff8187d960 ffffffff8187cfc0 ffffffff8187d960 ffffffff8187da10
[  262.948440] Call Trace:
[  262.948457]  [<ffffffff811b7fa1>] ? unregister_sysctl_table+0x51/0xa0
[  262.948476]  [<ffffffffa020d1a1>] sctp_sysctl_net_unregister+0x21/0x30 [sctp]
[  262.948490]  [<ffffffffa020ef6d>] sctp_net_exit+0x12d/0x150 [sctp]
[  262.948512]  [<ffffffff81394f49>] ops_exit_list+0x39/0x60
[  262.948522]  [<ffffffff813951ed>] unregister_pernet_operations+0x3d/0x70
[  262.948530]  [<ffffffff81395292>] unregister_pernet_subsys+0x22/0x40
[  262.948544]  [<ffffffffa020efcc>] sctp_exit+0x3c/0x12d [sctp]
[  262.948562]  [<ffffffff810c5e04>] SyS_delete_module+0x194/0x210
[  262.948577]  [<ffffffff81240fde>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  262.948587]  [<ffffffff815217a2>] system_call_fastpath+0x16/0x1b

So check whether the net namespace is init_net before calling kfree on the
sysctl table.

Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
---
 net/sctp/sysctl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
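
For readers following along, here is a minimal userspace sketch of the pattern the commit message describes. It is not the kernel code: struct entry, struct ns, ns_register and ns_unregister are hypothetical stand-ins for struct ctl_table, struct net and the sctp sysctl register/unregister helpers, and malloc/free stand in for kmemdup/kfree. The point is only that unregister must not free a table that register never allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct ctl_table; not the kernel definition. */
struct entry {
	const char *name;
	int value;
};

/* Static template, playing the role of sctp_net_table. */
static struct entry template_table[] = {
	{ "example_a", 1 },
	{ "example_b", 2 },
	{ NULL, 0 },
};

/* Hypothetical namespace; is_init marks the init_net analogue. */
struct ns {
	bool is_init;
	struct entry *table;
};

/* The initial namespace borrows the static table; others get a heap copy. */
int ns_register(struct ns *ns)
{
	if (ns->is_init) {
		ns->table = template_table;
		return 0;
	}
	ns->table = malloc(sizeof(template_table));
	if (!ns->table)
		return -1;
	memcpy(ns->table, template_table, sizeof(template_table));
	return 0;
}

/* Free only what ns_register allocated; returns true if memory was freed. */
bool ns_unregister(struct ns *ns)
{
	struct entry *table = ns->table;

	assert(table != NULL);
	ns->table = NULL;
	if (ns->is_init)
		return false;	/* static storage: must not be freed */
	free(table);
	return true;
}
```

Without the is_init check in ns_unregister, the static array would be handed to free(), which is the userspace analogue of the kfree call in the trace above.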

Comments

Neil Horman May 8, 2014, 11:10 a.m. UTC | #1
On Thu, May 08, 2014 at 03:39:06PM +0800, Wang Weidong wrote:
> diff --git a/net/sctp/sysctl.c b/net/sctp/sysctl.c
> index c82fdc1..844d2b0 100644
> --- a/net/sctp/sysctl.c
> +++ b/net/sctp/sysctl.c
> @@ -459,7 +459,8 @@ void sctp_sysctl_net_unregister(struct net *net)
>  
>  	table = net->sctp.sysctl_header->ctl_table_arg;
>  	unregister_net_sysctl_table(net->sctp.sysctl_header);
> -	kfree(table);
> +	if (!net_eq(net, &init_net))
> +		kfree(table);
>  }
>  
>  static struct ctl_table_header *sctp_sysctl_header;
The size of the sysctl table is 1.6k at the moment. Is it really worth having
to special case this in two locations just to save that space?  It almost seems
better just to revert the original commit.

Neil

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
wangweidong May 8, 2014, 11:26 a.m. UTC | #2
On 2014/5/8 19:10, Neil Horman wrote:
> The size of the sysctl table is 1.6k at the moment. Is it really worth having
> to special case this in two locations just to save that space?  It almost seems
> better just to revert the original commit.
> 
I am not sure about that. In the kernel, some code special-cases init_net like
this while other code doesn't. Adding the special case also avoids the kmemdup
and the reinitialization.

Regards
Wang

Neil Horman May 8, 2014, 11:47 a.m. UTC | #3
On Thu, May 08, 2014 at 07:26:31PM +0800, Wang Weidong wrote:
> I am not sure about that. In the kernel, some code special-cases init_net like
> this while other code doesn't. Adding the special case also avoids the kmemdup
> and the reinitialization.
> 
What?  No, the size of a ctl_table entry is about 64 bytes, and there are 25
entries in the table; that's 1.6k per kmemdup.  Your original patch avoids
kmemdup for init_net, saving us 1.6k, but creates the need to special case it
in at least 3 places now.  I'm arguing that it's maybe better to just accept the
additional 1.6k, and not have to worry about the special casing.
Neil
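
The arithmetic behind the 1.6k figure can be written out as a small sketch. The two constants are the approximate numbers quoted in the message above, not values measured from a kernel build, and table_copy_bytes is a hypothetical helper name used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Figures as quoted in the thread; assumptions, not measurements. */
#define CTL_ENTRY_BYTES    64u	/* approximate size of one ctl_table entry */
#define SCTP_TABLE_ENTRIES 25u	/* entries in the sctp table per the thread */

/* Bytes consumed by one duplicated table of the given shape. */
size_t table_copy_bytes(size_t entry_bytes, size_t entries)
{
	/* Guard against multiplication overflow for arbitrary inputs. */
	assert(entries == 0 || entry_bytes <= (size_t)-1 / entries);
	return entry_bytes * entries;
}
```

Calling table_copy_bytes(CTL_ENTRY_BYTES, SCTP_TABLE_ENTRIES) yields 64 * 25 = 1600 bytes, which is the one extra copy that init_net would carry if the special case were dropped.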

wangweidong May 8, 2014, 11:55 a.m. UTC | #4
On 2014/5/8 19:47, Neil Horman wrote:
> What?  No, the size of a ctl_table entry is about 64 bytes, and there are 25
> entries in the table; that's 1.6k per kmemdup.  Your original patch avoids
> kmemdup for init_net, saving us 1.6k, but creates the need to special case it
> in at least 3 places now.  I'm arguing that it's maybe better to just accept the
> additional 1.6k, and not have to worry about the special casing.
> Neil
> 

OK, got it.
About my remark that some kernel code does this while other code doesn't: I
meant that ipv4_sysctl_init_net, xfrm4_net_init and so on take the init_net
case into account.

I will send the revert patch.

Thanks
Wang
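
For comparison, the alternative agreed on above (every namespace, including the initial one, gets its own copy, so unregister can free unconditionally) might look like the following userspace sketch. As before, struct entry, struct ns, ns_register and ns_unregister are hypothetical stand-ins, and malloc/free stand in for kmemdup/kfree; this is not the revert patch itself.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct ctl_table; not the kernel definition. */
struct entry {
	const char *name;
	int value;
};

/* Static template, never handed out directly. */
static const struct entry template_table[] = {
	{ "example_a", 1 },
	{ "example_b", 2 },
	{ NULL, 0 },
};

/* Hypothetical namespace; no flag for the initial namespace is needed. */
struct ns {
	struct entry *table;
};

/* Every namespace gets a private heap copy of the template. */
int ns_register(struct ns *ns)
{
	ns->table = malloc(sizeof(template_table));
	if (!ns->table)
		return -1;
	memcpy(ns->table, template_table, sizeof(template_table));
	return 0;
}

/* Ownership is uniform, so the table is always freed. */
void ns_unregister(struct ns *ns)
{
	free(ns->table);
	ns->table = NULL;
}
```

The cost is one extra copy of the table for the initial namespace; in exchange neither function needs to know which namespace it was called for.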


Patch

diff --git a/net/sctp/sysctl.c b/net/sctp/sysctl.c
index c82fdc1..844d2b0 100644
--- a/net/sctp/sysctl.c
+++ b/net/sctp/sysctl.c
@@ -459,7 +459,8 @@  void sctp_sysctl_net_unregister(struct net *net)
 
 	table = net->sctp.sysctl_header->ctl_table_arg;
 	unregister_net_sysctl_table(net->sctp.sysctl_header);
-	kfree(table);
+	if (!net_eq(net, &init_net))
+		kfree(table);
 }
 
 static struct ctl_table_header *sctp_sysctl_header;