diff mbox series

[net] ipvs: fix stats update from local clients

Message ID 20180503190218.3023-1-ja@ssi.bg
State Accepted
Delegated to: Pablo Neira
Headers show
Series [net] ipvs: fix stats update from local clients | expand

Commit Message

Julian Anastasov May 3, 2018, 7:02 p.m. UTC
Local clients are not properly synchronized on 32-bit CPUs when
updating stats (3.10+). Now it is possible estimation_timer (timer),
a stats reader, to interrupt the local client in the middle of
write_seqcount_{begin,end} sequence leading to loop (DEADLOCK).
The same interrupt can happen from received packet (SoftIRQ)
which updates the same per-CPU stats.

Fix it by disabling BH while updating stats.

Found with debug:

WARNING: inconsistent lock state
4.17.0-rc2-00105-g35cb6d7-dirty #2 Not tainted
--------------------------------
inconsistent {IN-SOFTIRQ-R} -> {SOFTIRQ-ON-W} usage.
ftp/2545 [HC0[0]:SC0[0]:HE1:SE1] takes:
86845479 (&syncp->seq#6){+.+-}, at: ip_vs_schedule+0x1c5/0x59e [ip_vs]
{IN-SOFTIRQ-R} state was registered at:
 lock_acquire+0x44/0x5b
 estimation_timer+0x1b3/0x341 [ip_vs]
 call_timer_fn+0x54/0xcd
 run_timer_softirq+0x10c/0x12b
 __do_softirq+0xc1/0x1a9
 do_softirq_own_stack+0x1d/0x23
 irq_exit+0x4a/0x64
 smp_apic_timer_interrupt+0x63/0x71
 apic_timer_interrupt+0x3a/0x40
 default_idle+0xa/0xc
 arch_cpu_idle+0x9/0xb
 default_idle_call+0x21/0x23
 do_idle+0xa0/0x167
 cpu_startup_entry+0x19/0x1b
 start_secondary+0x133/0x182
 startup_32_smp+0x164/0x168
irq event stamp: 42213

other info that might help us debug this:
Possible unsafe locking scenario:

      CPU0
      ----
 lock(&syncp->seq#6);
 <Interrupt>
   lock(&syncp->seq#6);

*** DEADLOCK ***

Fixes: ac69269a45e8 ("ipvs: do not disable bh for long time")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
---
 net/netfilter/ipvs/ip_vs_core.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Simon Horman May 7, 2018, 11:18 a.m. UTC | #1
On Thu, May 03, 2018 at 10:02:18PM +0300, Julian Anastasov wrote:
> Local clients are not properly synchronized on 32-bit CPUs when
> updating stats (3.10+). Now it is possible estimation_timer (timer),
> a stats reader, to interrupt the local client in the middle of
> write_seqcount_{begin,end} sequence leading to loop (DEADLOCK).
> The same interrupt can happen from received packet (SoftIRQ)
> which updates the same per-CPU stats.
> 
> Fix it by disabling BH while updating stats.
> 
> Found with debug:
> 
> WARNING: inconsistent lock state
> 4.17.0-rc2-00105-g35cb6d7-dirty #2 Not tainted
> --------------------------------
> inconsistent {IN-SOFTIRQ-R} -> {SOFTIRQ-ON-W} usage.
> ftp/2545 [HC0[0]:SC0[0]:HE1:SE1] takes:
> 86845479 (&syncp->seq#6){+.+-}, at: ip_vs_schedule+0x1c5/0x59e [ip_vs]
> {IN-SOFTIRQ-R} state was registered at:
>  lock_acquire+0x44/0x5b
>  estimation_timer+0x1b3/0x341 [ip_vs]
>  call_timer_fn+0x54/0xcd
>  run_timer_softirq+0x10c/0x12b
>  __do_softirq+0xc1/0x1a9
>  do_softirq_own_stack+0x1d/0x23
>  irq_exit+0x4a/0x64
>  smp_apic_timer_interrupt+0x63/0x71
>  apic_timer_interrupt+0x3a/0x40
>  default_idle+0xa/0xc
>  arch_cpu_idle+0x9/0xb
>  default_idle_call+0x21/0x23
>  do_idle+0xa0/0x167
>  cpu_startup_entry+0x19/0x1b
>  start_secondary+0x133/0x182
>  startup_32_smp+0x164/0x168
> irq event stamp: 42213
> 
> other info that might help us debug this:
> Possible unsafe locking scenario:
> 
>       CPU0
>       ----
>  lock(&syncp->seq#6);
>  <Interrupt>
>    lock(&syncp->seq#6);
> 
> *** DEADLOCK ***
> 
> Fixes: ac69269a45e8 ("ipvs: do not disable bh for long time")
> Signed-off-by: Julian Anastasov <ja@ssi.bg>

Acked-by: Simon Horman <horms@verge.net.au>

Pablo, can you take this into nf?

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso May 8, 2018, 12:16 p.m. UTC | #2
On Mon, May 07, 2018 at 01:18:26PM +0200, Simon Horman wrote:
> On Thu, May 03, 2018 at 10:02:18PM +0300, Julian Anastasov wrote:
> > Local clients are not properly synchronized on 32-bit CPUs when
> > updating stats (3.10+). Now it is possible estimation_timer (timer),
> > a stats reader, to interrupt the local client in the middle of
> > write_seqcount_{begin,end} sequence leading to loop (DEADLOCK).
> > The same interrupt can happen from received packet (SoftIRQ)
> > which updates the same per-CPU stats.
> > 
> > Fix it by disabling BH while updating stats.
> > 
> > Found with debug:
> > 
> > WARNING: inconsistent lock state
> > 4.17.0-rc2-00105-g35cb6d7-dirty #2 Not tainted
> > --------------------------------
> > inconsistent {IN-SOFTIRQ-R} -> {SOFTIRQ-ON-W} usage.
> > ftp/2545 [HC0[0]:SC0[0]:HE1:SE1] takes:
> > 86845479 (&syncp->seq#6){+.+-}, at: ip_vs_schedule+0x1c5/0x59e [ip_vs]
> > {IN-SOFTIRQ-R} state was registered at:
> >  lock_acquire+0x44/0x5b
> >  estimation_timer+0x1b3/0x341 [ip_vs]
> >  call_timer_fn+0x54/0xcd
> >  run_timer_softirq+0x10c/0x12b
> >  __do_softirq+0xc1/0x1a9
> >  do_softirq_own_stack+0x1d/0x23
> >  irq_exit+0x4a/0x64
> >  smp_apic_timer_interrupt+0x63/0x71
> >  apic_timer_interrupt+0x3a/0x40
> >  default_idle+0xa/0xc
> >  arch_cpu_idle+0x9/0xb
> >  default_idle_call+0x21/0x23
> >  do_idle+0xa0/0x167
> >  cpu_startup_entry+0x19/0x1b
> >  start_secondary+0x133/0x182
> >  startup_32_smp+0x164/0x168
> > irq event stamp: 42213
> > 
> > other info that might help us debug this:
> > Possible unsafe locking scenario:
> > 
> >       CPU0
> >       ----
> >  lock(&syncp->seq#6);
> >  <Interrupt>
> >    lock(&syncp->seq#6);
> > 
> > *** DEADLOCK ***
> > 
> > Fixes: ac69269a45e8 ("ipvs: do not disable bh for long time")
> > Signed-off-by: Julian Anastasov <ja@ssi.bg>
> 
> Acked-by: Simon Horman <horms@verge.net.au>
> 
> Pablo, can you take this into nf?

Applied, thanks.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox series

Patch

diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index 5f6f73c..0679dd1 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -119,6 +119,8 @@  ip_vs_in_stats(struct ip_vs_conn *cp, struct sk_buff *skb)
 		struct ip_vs_cpu_stats *s;
 		struct ip_vs_service *svc;
 
+		local_bh_disable();
+
 		s = this_cpu_ptr(dest->stats.cpustats);
 		u64_stats_update_begin(&s->syncp);
 		s->cnt.inpkts++;
@@ -137,6 +139,8 @@  ip_vs_in_stats(struct ip_vs_conn *cp, struct sk_buff *skb)
 		s->cnt.inpkts++;
 		s->cnt.inbytes += skb->len;
 		u64_stats_update_end(&s->syncp);
+
+		local_bh_enable();
 	}
 }
 
@@ -151,6 +155,8 @@  ip_vs_out_stats(struct ip_vs_conn *cp, struct sk_buff *skb)
 		struct ip_vs_cpu_stats *s;
 		struct ip_vs_service *svc;
 
+		local_bh_disable();
+
 		s = this_cpu_ptr(dest->stats.cpustats);
 		u64_stats_update_begin(&s->syncp);
 		s->cnt.outpkts++;
@@ -169,6 +175,8 @@  ip_vs_out_stats(struct ip_vs_conn *cp, struct sk_buff *skb)
 		s->cnt.outpkts++;
 		s->cnt.outbytes += skb->len;
 		u64_stats_update_end(&s->syncp);
+
+		local_bh_enable();
 	}
 }
 
@@ -179,6 +187,8 @@  ip_vs_conn_stats(struct ip_vs_conn *cp, struct ip_vs_service *svc)
 	struct netns_ipvs *ipvs = svc->ipvs;
 	struct ip_vs_cpu_stats *s;
 
+	local_bh_disable();
+
 	s = this_cpu_ptr(cp->dest->stats.cpustats);
 	u64_stats_update_begin(&s->syncp);
 	s->cnt.conns++;
@@ -193,6 +203,8 @@  ip_vs_conn_stats(struct ip_vs_conn *cp, struct ip_vs_service *svc)
 	u64_stats_update_begin(&s->syncp);
 	s->cnt.conns++;
 	u64_stats_update_end(&s->syncp);
+
+	local_bh_enable();
 }