diff mbox

[RFC,net-next] ipvs: add missing lock in ip_vs_ftp_init_conn()

Message ID 1340890587-8169-1-git-send-email-xtfeng@gmail.com
State Not Applicable
Headers show

Commit Message

Xiaotian Feng June 28, 2012, 1:36 p.m. UTC
We met a kernel panic in 2.6.32.43 kernel:

[2680191.848044] IPVS: ip_vs_conn_hash(): request for already hashed, called from run_timer_softirq+0x175/0x1d0
<snip>
[2680311.849009] general protection fault: 0000 [#1] SMP
[2680311.853001] RIP: 0010:[<ffffffff815f155c>]  [<ffffffff815f155c>] ip_vs_conn_expire+0xdc/0x2f0
[2680311.853001] RSP: 0018:ffff880028303e70  EFLAGS: 00010202
[2680311.853001] RAX: dead000000200200 RBX: ffff8801aad00b80 RCX: 0000000000001d90
[2680311.853001] RDX: dead000000100100 RSI: 000000004fd59800 RDI: ffff8801aad00c08
<snip>
[2680311.853001] Call Trace:
[2680311.853001]  <IRQ>
[2680311.853001]  [<ffffffff815f1480>] ? ip_vs_conn_expire+0x0/0x2f0
[2680311.853001]  [<ffffffff8104e2a5>] run_timer_softirq+0x175/0x1d0
[2680311.853001]  [<ffffffff81021a48>] ? lapic_next_event+0x18/0x20
[2680311.853001]  [<ffffffff81049a13>] __do_softirq+0xb3/0x150
[2680311.853001]  [<ffffffff8100cc5c>] call_softirq+0x1c/0x30
[2680311.853001]  [<ffffffff8100ea9a>] do_softirq+0x4a/0x80
[2680311.853001]  [<ffffffff81049957>] irq_exit+0x77/0x80
[2680311.853001]  [<ffffffff81021f2c>] smp_apic_timer_interrupt+0x6c/0xa0
[2680311.853001]  [<ffffffff8100c633>] apic_timer_interrupt+0x13/0x20
[2680311.853001]  <EOI>
[2680311.853001]  [<ffffffff81013b52>] ? mwait_idle+0x52/0x70
[2680311.853001]  [<ffffffff8100a7b0>] ? enter_idle+0x20/0x30
[2680311.853001]  [<ffffffff8100ac62>] ? cpu_idle+0x52/0x80
[2680311.853001]  [<ffffffff816d504d>] ? start_secondary+0x19d/0x280

rax and rdx is LIST_POISON1 and LIST_POISON2, so kernel is list_del() on an already deleted
connection and result the general protect fault.

The "request for already hashed" warning, told us someone might change the connection flags
incorrectly, like described in commit aea9d711, it changes the connection flags, but doesn't
put the connection back to the list. So ip_vs_conn_hash() throw a warning and return.
Later, when ip_vs_conn_expire fire again, ip_vs_conn_unhash() will find the HASHED connection
and list_del() it, then kernel panic happened.

After code review, the only chance that kernel change connection flag without protection is
in ip_vs_ftp_init_conn().

Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
Cc: Wensong Zhang <wensong@linux-vs.org>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: "David S. Miller" <davem@davemloft.net> 
---
 net/netfilter/ipvs/ip_vs_ftp.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

Comments

Julian Anastasov June 29, 2012, 12:17 a.m. UTC | #1
Hello,

On Thu, 28 Jun 2012, Xiaotian Feng wrote:

> We met a kernel panic in 2.6.32.43 kernel:
> 
> [2680191.848044] IPVS: ip_vs_conn_hash(): request for already hashed, called from run_timer_softirq+0x175/0x1d0
> <snip>
> [2680311.849009] general protection fault: 0000 [#1] SMP
> [2680311.853001] RIP: 0010:[<ffffffff815f155c>]  [<ffffffff815f155c>] ip_vs_conn_expire+0xdc/0x2f0
> [2680311.853001] RSP: 0018:ffff880028303e70  EFLAGS: 00010202
> [2680311.853001] RAX: dead000000200200 RBX: ffff8801aad00b80 RCX: 0000000000001d90
> [2680311.853001] RDX: dead000000100100 RSI: 000000004fd59800 RDI: ffff8801aad00c08
> <snip>
> [2680311.853001] Call Trace:
> [2680311.853001]  <IRQ>
> [2680311.853001]  [<ffffffff815f1480>] ? ip_vs_conn_expire+0x0/0x2f0
> [2680311.853001]  [<ffffffff8104e2a5>] run_timer_softirq+0x175/0x1d0
> [2680311.853001]  [<ffffffff81021a48>] ? lapic_next_event+0x18/0x20
> [2680311.853001]  [<ffffffff81049a13>] __do_softirq+0xb3/0x150
> [2680311.853001]  [<ffffffff8100cc5c>] call_softirq+0x1c/0x30
> [2680311.853001]  [<ffffffff8100ea9a>] do_softirq+0x4a/0x80
> [2680311.853001]  [<ffffffff81049957>] irq_exit+0x77/0x80
> [2680311.853001]  [<ffffffff81021f2c>] smp_apic_timer_interrupt+0x6c/0xa0
> [2680311.853001]  [<ffffffff8100c633>] apic_timer_interrupt+0x13/0x20
> [2680311.853001]  <EOI>
> [2680311.853001]  [<ffffffff81013b52>] ? mwait_idle+0x52/0x70
> [2680311.853001]  [<ffffffff8100a7b0>] ? enter_idle+0x20/0x30
> [2680311.853001]  [<ffffffff8100ac62>] ? cpu_idle+0x52/0x80
> [2680311.853001]  [<ffffffff816d504d>] ? start_secondary+0x19d/0x280
> 
> rax and rdx is LIST_POISON1 and LIST_POISON2, so kernel is list_del() on an already deleted
> connection and result the general protect fault.
> 
> The "request for already hashed" warning, told us someone might change the connection flags
> incorrectly, like described in commit aea9d711, it changes the connection flags, but doesn't
> put the connection back to the list. So ip_vs_conn_hash() throw a warning and return.
> Later, when ip_vs_conn_expire fire again, ip_vs_conn_unhash() will find the HASHED connection
> and list_del() it, then kernel panic happened.
> 
> After code review, the only chance that kernel change connection flag without protection is
> in ip_vs_ftp_init_conn().

	Hm, ip_vs_ftp_init_conn is called before 1st hashing,
from ip_vs_bind_app() in ip_vs_conn_new() before
ip_vs_conn_hash(). It should be another problem with
the flags. How different is IPVS in 2.6.32.43 compared to
recent kernels? If commit aea9d711 is present, I'm not
aware of other similar problems.

	May be you can post some details for your setup on
lvs-devel@vger.kernel.org. I assume FTP is used,
what about master-backup sync? Can you confirm that
this fix solves the problem or it happens in rare cases?

> Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
> Cc: Wensong Zhang <wensong@linux-vs.org>
> Cc: Simon Horman <horms@verge.net.au>
> Cc: Julian Anastasov <ja@ssi.bg>
> Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: "David S. Miller" <davem@davemloft.net> 
> ---
>  net/netfilter/ipvs/ip_vs_ftp.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
> index b20b29c..c2bc264 100644
> --- a/net/netfilter/ipvs/ip_vs_ftp.c
> +++ b/net/netfilter/ipvs/ip_vs_ftp.c
> @@ -65,8 +65,10 @@ static int ip_vs_ftp_pasv;
>  static int
>  ip_vs_ftp_init_conn(struct ip_vs_app *app, struct ip_vs_conn *cp)
>  {
> +	spin_lock(&cp->lock);
>  	/* We use connection tracking for the command connection */
>  	cp->flags |= IP_VS_CONN_F_NFCT;
> +	spin_unlock(&cp->lock);
>  	return 0;
>  }
>  
> -- 
> 1.7.1

Regards

--
Julian Anastasov <ja@ssi.bg>
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Simon Horman June 29, 2012, 12:34 a.m. UTC | #2
On Thu, Jun 28, 2012 at 09:36:27PM +0800, Xiaotian Feng wrote:
> We met a kernel panic in 2.6.32.43 kernel:
> 
> [2680191.848044] IPVS: ip_vs_conn_hash(): request for already hashed, called from run_timer_softirq+0x175/0x1d0
> <snip>
> [2680311.849009] general protection fault: 0000 [#1] SMP
> [2680311.853001] RIP: 0010:[<ffffffff815f155c>]  [<ffffffff815f155c>] ip_vs_conn_expire+0xdc/0x2f0
> [2680311.853001] RSP: 0018:ffff880028303e70  EFLAGS: 00010202
> [2680311.853001] RAX: dead000000200200 RBX: ffff8801aad00b80 RCX: 0000000000001d90
> [2680311.853001] RDX: dead000000100100 RSI: 000000004fd59800 RDI: ffff8801aad00c08
> <snip>
> [2680311.853001] Call Trace:
> [2680311.853001]  <IRQ>
> [2680311.853001]  [<ffffffff815f1480>] ? ip_vs_conn_expire+0x0/0x2f0
> [2680311.853001]  [<ffffffff8104e2a5>] run_timer_softirq+0x175/0x1d0
> [2680311.853001]  [<ffffffff81021a48>] ? lapic_next_event+0x18/0x20
> [2680311.853001]  [<ffffffff81049a13>] __do_softirq+0xb3/0x150
> [2680311.853001]  [<ffffffff8100cc5c>] call_softirq+0x1c/0x30
> [2680311.853001]  [<ffffffff8100ea9a>] do_softirq+0x4a/0x80
> [2680311.853001]  [<ffffffff81049957>] irq_exit+0x77/0x80
> [2680311.853001]  [<ffffffff81021f2c>] smp_apic_timer_interrupt+0x6c/0xa0
> [2680311.853001]  [<ffffffff8100c633>] apic_timer_interrupt+0x13/0x20
> [2680311.853001]  <EOI>
> [2680311.853001]  [<ffffffff81013b52>] ? mwait_idle+0x52/0x70
> [2680311.853001]  [<ffffffff8100a7b0>] ? enter_idle+0x20/0x30
> [2680311.853001]  [<ffffffff8100ac62>] ? cpu_idle+0x52/0x80
> [2680311.853001]  [<ffffffff816d504d>] ? start_secondary+0x19d/0x280
> 
> rax and rdx is LIST_POISON1 and LIST_POISON2, so kernel is list_del() on an already deleted
> connection and result the general protect fault.
> 
> The "request for already hashed" warning, told us someone might change the connection flags
> incorrectly, like described in commit aea9d711, it changes the connection flags, but doesn't
> put the connection back to the list. So ip_vs_conn_hash() throw a warning and return.
> Later, when ip_vs_conn_expire fire again, ip_vs_conn_unhash() will find the HASHED connection
> and list_del() it, then kernel panic happened.
> 
> After code review, the only chance that kernel change connection flag without protection is
> in ip_vs_ftp_init_conn().

Thanks for being thorough.

> Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
> Cc: Wensong Zhang <wensong@linux-vs.org>
> Cc: Simon Horman <horms@verge.net.au>
> Cc: Julian Anastasov <ja@ssi.bg>
> Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: "David S. Miller" <davem@davemloft.net> 

Acked-by: Simon Horman <horms@verge.net.au>

Pablo, can you handle this directly through your tree?
I think it also needs to go to -stable.

> ---
>  net/netfilter/ipvs/ip_vs_ftp.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
> index b20b29c..c2bc264 100644
> --- a/net/netfilter/ipvs/ip_vs_ftp.c
> +++ b/net/netfilter/ipvs/ip_vs_ftp.c
> @@ -65,8 +65,10 @@ static int ip_vs_ftp_pasv;
>  static int
>  ip_vs_ftp_init_conn(struct ip_vs_app *app, struct ip_vs_conn *cp)
>  {
> +	spin_lock(&cp->lock);
>  	/* We use connection tracking for the command connection */
>  	cp->flags |= IP_VS_CONN_F_NFCT;
> +	spin_unlock(&cp->lock);
>  	return 0;
>  }
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Xiaotian Feng June 29, 2012, 1:50 a.m. UTC | #3
On Fri, Jun 29, 2012 at 8:17 AM, Julian Anastasov <ja@ssi.bg> wrote:
>
>        Hello,
>
> On Thu, 28 Jun 2012, Xiaotian Feng wrote:
>
>> We met a kernel panic in 2.6.32.43 kernel:
>>
>> [2680191.848044] IPVS: ip_vs_conn_hash(): request for already hashed, called from run_timer_softirq+0x175/0x1d0
>> <snip>
>> [2680311.849009] general protection fault: 0000 [#1] SMP
>> [2680311.853001] RIP: 0010:[<ffffffff815f155c>]  [<ffffffff815f155c>] ip_vs_conn_expire+0xdc/0x2f0
>> [2680311.853001] RSP: 0018:ffff880028303e70  EFLAGS: 00010202
>> [2680311.853001] RAX: dead000000200200 RBX: ffff8801aad00b80 RCX: 0000000000001d90
>> [2680311.853001] RDX: dead000000100100 RSI: 000000004fd59800 RDI: ffff8801aad00c08
>> <snip>
>> [2680311.853001] Call Trace:
>> [2680311.853001]  <IRQ>
>> [2680311.853001]  [<ffffffff815f1480>] ? ip_vs_conn_expire+0x0/0x2f0
>> [2680311.853001]  [<ffffffff8104e2a5>] run_timer_softirq+0x175/0x1d0
>> [2680311.853001]  [<ffffffff81021a48>] ? lapic_next_event+0x18/0x20
>> [2680311.853001]  [<ffffffff81049a13>] __do_softirq+0xb3/0x150
>> [2680311.853001]  [<ffffffff8100cc5c>] call_softirq+0x1c/0x30
>> [2680311.853001]  [<ffffffff8100ea9a>] do_softirq+0x4a/0x80
>> [2680311.853001]  [<ffffffff81049957>] irq_exit+0x77/0x80
>> [2680311.853001]  [<ffffffff81021f2c>] smp_apic_timer_interrupt+0x6c/0xa0
>> [2680311.853001]  [<ffffffff8100c633>] apic_timer_interrupt+0x13/0x20
>> [2680311.853001]  <EOI>
>> [2680311.853001]  [<ffffffff81013b52>] ? mwait_idle+0x52/0x70
>> [2680311.853001]  [<ffffffff8100a7b0>] ? enter_idle+0x20/0x30
>> [2680311.853001]  [<ffffffff8100ac62>] ? cpu_idle+0x52/0x80
>> [2680311.853001]  [<ffffffff816d504d>] ? start_secondary+0x19d/0x280
>>
>> rax and rdx is LIST_POISON1 and LIST_POISON2, so kernel is list_del() on an already deleted
>> connection and result the general protect fault.
>>
>> The "request for already hashed" warning, told us someone might change the connection flags
>> incorrectly, like described in commit aea9d711, it changes the connection flags, but doesn't
>> put the connection back to the list. So ip_vs_conn_hash() throw a warning and return.
>> Later, when ip_vs_conn_expire fire again, ip_vs_conn_unhash() will find the HASHED connection
>> and list_del() it, then kernel panic happened.
>>
>> After code review, the only chance that kernel change connection flag without protection is
>> in ip_vs_ftp_init_conn().
>
>        Hm, ip_vs_ftp_init_conn is called before 1st hashing,
> from ip_vs_bind_app() in ip_vs_conn_new() before
> ip_vs_conn_hash(). It should be another problem with
> the flags. How different is IPVS in 2.6.32.43 compared to
> recent kernels? If commit aea9d711 is present, I'm not
> aware of other similar problems.

ip_vs_bind_app() is also called by ip_vs_try_bind_dest(), which can be
traced to ip_vs_proc_conn().
I've checked the changes in upstream, but nothing helps since aea9d711
has been taken into 2.6.32.28 kernel.

>
>        May be you can post some details for your setup on
> lvs-devel@vger.kernel.org. I assume FTP is used,
> what about master-backup sync? Can you confirm that
> this fix solves the problem or it happens in rare cases?

Yes, ftp and WRR scheduler is used. Unfortunately, the oops
is unreproducible :(

>
>> Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
>> Cc: Wensong Zhang <wensong@linux-vs.org>
>> Cc: Simon Horman <horms@verge.net.au>
>> Cc: Julian Anastasov <ja@ssi.bg>
>> Cc: Pablo Neira Ayuso <pablo@netfilter.org>
>> Cc: Patrick McHardy <kaber@trash.net>
>> Cc: "David S. Miller" <davem@davemloft.net>
>> ---
>>  net/netfilter/ipvs/ip_vs_ftp.c |    2 ++
>>  1 files changed, 2 insertions(+), 0 deletions(-)
>>
>> diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
>> index b20b29c..c2bc264 100644
>> --- a/net/netfilter/ipvs/ip_vs_ftp.c
>> +++ b/net/netfilter/ipvs/ip_vs_ftp.c
>> @@ -65,8 +65,10 @@ static int ip_vs_ftp_pasv;
>>  static int
>>  ip_vs_ftp_init_conn(struct ip_vs_app *app, struct ip_vs_conn *cp)
>>  {
>> +     spin_lock(&cp->lock);
>>       /* We use connection tracking for the command connection */
>>       cp->flags |= IP_VS_CONN_F_NFCT;
>> +     spin_unlock(&cp->lock);
>>       return 0;
>>  }
>>
>> --
>> 1.7.1
>
> Regards
>
> --
> Julian Anastasov <ja@ssi.bg>
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Julian Anastasov June 29, 2012, 9:04 a.m. UTC | #4
Hello,

On Fri, 29 Jun 2012, Xiaotian Feng wrote:

> > On Thu, 28 Jun 2012, Xiaotian Feng wrote:
> >
> >> We met a kernel panic in 2.6.32.43 kernel:
> >>
> >> [2680191.848044] IPVS: ip_vs_conn_hash(): request for already hashed, called from run_timer_softirq+0x175/0x1d0
> >> <snip>
> >> [2680311.849009] general protection fault: 0000 [#1] SMP

	What we see here is 120 seconds between 2680191 and
2680311. It can mean 2 things:

- some state timeout, it depends on your forwarding method.
What is it? NAT? DR?

- 60 seconds for ip_vs_conn_expire retries

> >> After code review, the only chance that kernel change connection flag without protection is
> >> in ip_vs_ftp_init_conn().
> >
> >        Hm, ip_vs_ftp_init_conn is called before 1st hashing,
> > from ip_vs_bind_app() in ip_vs_conn_new() before
> > ip_vs_conn_hash(). It should be another problem with
> > the flags. How different is IPVS in 2.6.32.43 compared to
> > recent kernels? If commit aea9d711 is present, I'm not
> > aware of other similar problems.
> 
> ip_vs_bind_app() is also called by ip_vs_try_bind_dest(), which can be
> traced to ip_vs_proc_conn().
> I've checked the changes in upstream, but nothing helps since aea9d711
> has been taken into 2.6.32.28 kernel.

	OK, this fix should make it safe for master-backup
sync and it should be applied but I suspect you are not
using sync, right? And then this fix will not solve the oops.

	There are no many places that rehash conn:

ip_vs_conn_fill_cport
	- used for FTP

ip_vs_check_template:
	- do you have persistence configured?

	After you provide details for the used forwarding
method, persistence and sync we should think how such races
with rehashing can lead to double hlist_del. May be
you can modify the debug message in ip_vs_conn_hash, so
that we can see cp->flags and ntohs of cp->cport, cp->dport
and cp->vport when oops happens again.

Regards

--
Julian Anastasov <ja@ssi.bg>
Xiaotian Feng July 2, 2012, 10:30 a.m. UTC | #5
On Fri, Jun 29, 2012 at 5:04 PM, Julian Anastasov <ja@ssi.bg> wrote:
>
>         Hello,
>
> On Fri, 29 Jun 2012, Xiaotian Feng wrote:
>
>> > On Thu, 28 Jun 2012, Xiaotian Feng wrote:
>> >
>> >> We met a kernel panic in 2.6.32.43 kernel:
>> >>
>> >> [2680191.848044] IPVS: ip_vs_conn_hash(): request for already hashed, called from run_timer_softirq+0x175/0x1d0
>> >> <snip>
>> >> [2680311.849009] general protection fault: 0000 [#1] SMP
>
>         What we see here is 120 seconds between 2680191 and
> 2680311. It can mean 2 things:
>
> - some state timeout, it depends on your forwarding method.
> What is it? NAT? DR?
>
> - 60 seconds for ip_vs_conn_expire retries
>
>> >> After code review, the only chance that kernel change connection flag without protection is
>> >> in ip_vs_ftp_init_conn().
>> >
>> >        Hm, ip_vs_ftp_init_conn is called before 1st hashing,
>> > from ip_vs_bind_app() in ip_vs_conn_new() before
>> > ip_vs_conn_hash(). It should be another problem with
>> > the flags. How different is IPVS in 2.6.32.43 compared to
>> > recent kernels? If commit aea9d711 is present, I'm not
>> > aware of other similar problems.
>>
>> ip_vs_bind_app() is also called by ip_vs_try_bind_dest(), which can be
>> traced to ip_vs_proc_conn().
>> I've checked the changes in upstream, but nothing helps since aea9d711
>> has been taken into 2.6.32.28 kernel.
>
>         OK, this fix should make it safe for master-backup
> sync and it should be applied but I suspect you are not
> using sync, right? And then this fix will not solve the oops.
>

We're using sync.

>         There are no many places that rehash conn:
>
> ip_vs_conn_fill_cport
>         - used for FTP
>
> ip_vs_check_template:
>         - do you have persistence configured?

No.

>
>         After you provide details for the used forwarding
> method, persistence and sync we should think how such races
> with rehashing can lead to double hlist_del. May be
> you can modify the debug message in ip_vs_conn_hash, so
> that we can see cp->flags and ntohs of cp->cport, cp->dport
> and cp->vport when oops happens again.

I just found 2.6.32.34 kernel differ from upstream kernel,  2.6.32
kernel doesn't have ip_vs_try_bind_dest(), but ip_vs_process_message()
kernel might change conn flags without lock protection. This is fixed in
commit f73181c, following changes:

@@ -834,6 +843,7 @@ static void ip_vs_proc_conn(struct net *net,
struct ip_vs_conn_param *param,
                kfree(param->pe_data);

                dest = cp->dest;
+               spin_lock(&cp->lock);
                if ((cp->flags ^ flags) & IP_VS_CONN_F_INACTIVE &&
                    !(flags & IP_VS_CONN_F_TEMPLATE) && dest) {
                        if (flags & IP_VS_CONN_F_INACTIVE) {
@@ -847,6 +857,7 @@ static void ip_vs_proc_conn(struct net *net,
struct ip_vs_conn_param *param,
                flags &= IP_VS_CONN_F_BACKUP_UPD_MASK;
                flags |= cp->flags & ~IP_VS_CONN_F_BACKUP_UPD_MASK;
                cp->flags = flags;
+               spin_unlock(&cp->lock);
                if (!dest) {
                        dest = ip_vs_try_bind_dest(cp);
                        if (dest)

So I took this part into 2.6.32 kernel. But I still think the patch I
posted is required for upstream kernel. Even though there are no many
places that rehash conn, this is potential race as cp->flags is not
protected.

Thanks.

>
> Regards
>
> --
> Julian Anastasov <ja@ssi.bg>
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Julian Anastasov July 3, 2012, 7:12 a.m. UTC | #6
Hello,

On Thu, 28 Jun 2012, Xiaotian Feng wrote:

> We met a kernel panic in 2.6.32.43 kernel:
> 
> [2680191.848044] IPVS: ip_vs_conn_hash(): request for already hashed, called from run_timer_softirq+0x175/0x1d0
> <snip>
> [2680311.849009] general protection fault: 0000 [#1] SMP
> [2680311.853001] RIP: 0010:[<ffffffff815f155c>]  [<ffffffff815f155c>] ip_vs_conn_expire+0xdc/0x2f0
> [2680311.853001] RSP: 0018:ffff880028303e70  EFLAGS: 00010202
> [2680311.853001] RAX: dead000000200200 RBX: ffff8801aad00b80 RCX: 0000000000001d90
> [2680311.853001] RDX: dead000000100100 RSI: 000000004fd59800 RDI: ffff8801aad00c08
> <snip>
> [2680311.853001] Call Trace:
> [2680311.853001]  <IRQ>
> [2680311.853001]  [<ffffffff815f1480>] ? ip_vs_conn_expire+0x0/0x2f0
> [2680311.853001]  [<ffffffff8104e2a5>] run_timer_softirq+0x175/0x1d0
> [2680311.853001]  [<ffffffff81021a48>] ? lapic_next_event+0x18/0x20
> [2680311.853001]  [<ffffffff81049a13>] __do_softirq+0xb3/0x150
> [2680311.853001]  [<ffffffff8100cc5c>] call_softirq+0x1c/0x30
> [2680311.853001]  [<ffffffff8100ea9a>] do_softirq+0x4a/0x80
> [2680311.853001]  [<ffffffff81049957>] irq_exit+0x77/0x80
> [2680311.853001]  [<ffffffff81021f2c>] smp_apic_timer_interrupt+0x6c/0xa0
> [2680311.853001]  [<ffffffff8100c633>] apic_timer_interrupt+0x13/0x20
> [2680311.853001]  <EOI>
> [2680311.853001]  [<ffffffff81013b52>] ? mwait_idle+0x52/0x70
> [2680311.853001]  [<ffffffff8100a7b0>] ? enter_idle+0x20/0x30
> [2680311.853001]  [<ffffffff8100ac62>] ? cpu_idle+0x52/0x80
> [2680311.853001]  [<ffffffff816d504d>] ? start_secondary+0x19d/0x280
> 
> rax and rdx is LIST_POISON1 and LIST_POISON2, so kernel is list_del() on an already deleted
> connection and result the general protect fault.
> 
> The "request for already hashed" warning, told us someone might change the connection flags
> incorrectly, like described in commit aea9d711, it changes the connection flags, but doesn't
> put the connection back to the list. So ip_vs_conn_hash() throw a warning and return.
> Later, when ip_vs_conn_expire fire again, ip_vs_conn_unhash() will find the HASHED connection
> and list_del() it, then kernel panic happened.
> 
> After code review, the only chance that kernel change connection flag without protection is
> in ip_vs_ftp_init_conn().
> 
> Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
> Cc: Wensong Zhang <wensong@linux-vs.org>
> Cc: Simon Horman <horms@verge.net.au>
> Cc: Julian Anastasov <ja@ssi.bg>
> Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: "David S. Miller" <davem@davemloft.net> 

	For the fix below:

Acked-by: Julian Anastasov <ja@ssi.bg>

	Simon, the change looks ok. ip_vs_ftp_init_conn is called
from context where cp->lock is not locked (no double lock), so it
should be safe for the backup.

	Only that the comment is not specifying that we
fix a problem in the backup server.

> ---
>  net/netfilter/ipvs/ip_vs_ftp.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
> index b20b29c..c2bc264 100644
> --- a/net/netfilter/ipvs/ip_vs_ftp.c
> +++ b/net/netfilter/ipvs/ip_vs_ftp.c
> @@ -65,8 +65,10 @@ static int ip_vs_ftp_pasv;
>  static int
>  ip_vs_ftp_init_conn(struct ip_vs_app *app, struct ip_vs_conn *cp)
>  {
> +	spin_lock(&cp->lock);
>  	/* We use connection tracking for the command connection */
>  	cp->flags |= IP_VS_CONN_F_NFCT;
> +	spin_unlock(&cp->lock);
>  	return 0;
>  }
>  
> -- 
> 1.7.1

Regards

--
Julian Anastasov <ja@ssi.bg>
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Simon Horman July 10, 2012, 9:05 a.m. UTC | #7
On Tue, Jul 03, 2012 at 10:12:41AM +0300, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Thu, 28 Jun 2012, Xiaotian Feng wrote:
> 
> > We met a kernel panic in 2.6.32.43 kernel:
> > 
> > [2680191.848044] IPVS: ip_vs_conn_hash(): request for already hashed, called from run_timer_softirq+0x175/0x1d0
> > <snip>
> > [2680311.849009] general protection fault: 0000 [#1] SMP
> > [2680311.853001] RIP: 0010:[<ffffffff815f155c>]  [<ffffffff815f155c>] ip_vs_conn_expire+0xdc/0x2f0
> > [2680311.853001] RSP: 0018:ffff880028303e70  EFLAGS: 00010202
> > [2680311.853001] RAX: dead000000200200 RBX: ffff8801aad00b80 RCX: 0000000000001d90
> > [2680311.853001] RDX: dead000000100100 RSI: 000000004fd59800 RDI: ffff8801aad00c08
> > <snip>
> > [2680311.853001] Call Trace:
> > [2680311.853001]  <IRQ>
> > [2680311.853001]  [<ffffffff815f1480>] ? ip_vs_conn_expire+0x0/0x2f0
> > [2680311.853001]  [<ffffffff8104e2a5>] run_timer_softirq+0x175/0x1d0
> > [2680311.853001]  [<ffffffff81021a48>] ? lapic_next_event+0x18/0x20
> > [2680311.853001]  [<ffffffff81049a13>] __do_softirq+0xb3/0x150
> > [2680311.853001]  [<ffffffff8100cc5c>] call_softirq+0x1c/0x30
> > [2680311.853001]  [<ffffffff8100ea9a>] do_softirq+0x4a/0x80
> > [2680311.853001]  [<ffffffff81049957>] irq_exit+0x77/0x80
> > [2680311.853001]  [<ffffffff81021f2c>] smp_apic_timer_interrupt+0x6c/0xa0
> > [2680311.853001]  [<ffffffff8100c633>] apic_timer_interrupt+0x13/0x20
> > [2680311.853001]  <EOI>
> > [2680311.853001]  [<ffffffff81013b52>] ? mwait_idle+0x52/0x70
> > [2680311.853001]  [<ffffffff8100a7b0>] ? enter_idle+0x20/0x30
> > [2680311.853001]  [<ffffffff8100ac62>] ? cpu_idle+0x52/0x80
> > [2680311.853001]  [<ffffffff816d504d>] ? start_secondary+0x19d/0x280
> > 
> > rax and rdx is LIST_POISON1 and LIST_POISON2, so kernel is list_del() on an already deleted
> > connection and result the general protect fault.
> > 
> > The "request for already hashed" warning, told us someone might change the connection flags
> > incorrectly, like described in commit aea9d711, it changes the connection flags, but doesn't
> > put the connection back to the list. So ip_vs_conn_hash() throw a warning and return.
> > Later, when ip_vs_conn_expire fire again, ip_vs_conn_unhash() will find the HASHED connection
> > and list_del() it, then kernel panic happened.
> > 
> > After code review, the only chance that kernel change connection flag without protection is
> > in ip_vs_ftp_init_conn().
> > 
> > Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
> > Cc: Wensong Zhang <wensong@linux-vs.org>
> > Cc: Simon Horman <horms@verge.net.au>
> > Cc: Julian Anastasov <ja@ssi.bg>
> > Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> > Cc: Patrick McHardy <kaber@trash.net>
> > Cc: "David S. Miller" <davem@davemloft.net> 
> 
> 	For the fix below:
> 
> Acked-by: Julian Anastasov <ja@ssi.bg>
> 
> 	Simon, the change looks ok. ip_vs_ftp_init_conn is called
> from context where cp->lock is not locked (no double lock), so it
> should be safe for the backup.
> 
> 	Only that the comment is not specifying that we
> fix a problem in the backup server.

Thanks.

I have pushed this to my ipvs branch and will see about getting it included in 3.5.

It appears that this problem has been present since (at least) 2.6.37 and
my feeling is that it is -stable material.

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
index b20b29c..c2bc264 100644
--- a/net/netfilter/ipvs/ip_vs_ftp.c
+++ b/net/netfilter/ipvs/ip_vs_ftp.c
@@ -65,8 +65,10 @@  static int ip_vs_ftp_pasv;
 static int
 ip_vs_ftp_init_conn(struct ip_vs_app *app, struct ip_vs_conn *cp)
 {
+	spin_lock(&cp->lock);
 	/* We use connection tracking for the command connection */
 	cp->flags |= IP_VS_CONN_F_NFCT;
+	spin_unlock(&cp->lock);
 	return 0;
 }