Message ID:   1382519724-3953-4-git-send-email-pablo@netfilter.org
State:        Accepted, archived
Delegated to: David Miller
> Subject: [PATCH 3/3] netfilter: x_tables: fix ordering of jumpstack allocation and table update
...
> Meanwhile, CPU0 is handling the network receive path and ends up in
> ipt_do_table, resulting in:
>
> 	private = table->private;
>
> 	[...]
>
> 	jumpstack = (struct ipt_entry **)private->jumpstack[cpu];
>
> On weakly ordered memory architectures, the writes to table->private
> and newinfo->jumpstack from CPU1 can be observed out of order by CPU0.
> Furthermore, on architectures which don't respect ordering of address
> dependencies (i.e. Alpha), the reads from CPU0 can also be re-ordered.

Which reads might be out of order? AFAICT they are strongly sequenced,
because the second depends on the value read by the first, so I don't
see why the read barrier is needed. I presume the above code is tied to
a single cpu.

...

> > -	table->private = newinfo;
> > 	newinfo->initial_entries = private->initial_entries;
> > +	/*
> > +	 * Ensure contents of newinfo are visible before assigning to
> > +	 * private.
> > +	 */
> > +	smp_wmb();
> > +	table->private = newinfo;

Those writes were in the wrong order on all systems. Also gcc needs to
be told not to reorder the writes even on non-smp systems (if the code
might be pre-empted), so an asm volatile (:::"memory") is needed there
even if no specific synchronisation instruction is needed.

	David

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2013-10-23 at 10:45 +0100, David Laight wrote:
> Those writes were in the wrong order on all systems.
> Also gcc needs to be told not to reorder the writes even on non-smp
> systems (if the code might be pre-empted).
> So an asm volatile (:::"memory") is needed there even if no specific
> synchronisation instruction is needed.

smp_wmb() contains this compiler barrier already.
Hi David,

On Wed, Oct 23, 2013 at 10:45:04AM +0100, David Laight wrote:
> > Subject: [PATCH 3/3] netfilter: x_tables: fix ordering of jumpstack allocation and table update
> ...
> > Meanwhile, CPU0 is handling the network receive path and ends up in
> > ipt_do_table, resulting in:
> >
> > 	private = table->private;
> >
> > 	[...]
> >
> > 	jumpstack = (struct ipt_entry **)private->jumpstack[cpu];
> >
> > On weakly ordered memory architectures, the writes to table->private
> > and newinfo->jumpstack from CPU1 can be observed out of order by CPU0.
> > Furthermore, on architectures which don't respect ordering of address
> > dependencies (i.e. Alpha), the reads from CPU0 can also be re-ordered.
>
> Which reads might be out of order?
> AFAICT they are strongly sequenced because the second depends on the
> value read by the first.
> So I don't see why the read barrier is needed.

That is why this is a dependent read barrier. Some architectures
(e.g. Alpha) *do* allow dependent reads to be observed out of order, so
you can effectively load the data pointed to by a pointer before you
load the pointer itself! Take a look at Paul's paper about memory
ordering if you're curious:

  http://www.rdrop.com/users/paulmck/scalability/paper/whymb.2009.04.05a.pdf

> > -	table->private = newinfo;
> > 	newinfo->initial_entries = private->initial_entries;
> > +	/*
> > +	 * Ensure contents of newinfo are visible before assigning to
> > +	 * private.
> > +	 */
> > +	smp_wmb();
> > +	table->private = newinfo;
>
> Those writes were in the wrong order on all systems.
> Also gcc needs to be told not to reorder the writes even on non-smp
> systems (if the code might be pre-empted).
> So an asm volatile (:::"memory") is needed there even if no specific
> synchronisation instruction is needed.

The smp_* barriers expand to barrier() when !CONFIG_SMP, which gives
you the memory clobber you want.

What I'm *not* 100% sure about is the table freeing path. There is a
mutex there for removing the table from a list, but I'm not sure how we
ensure that there are no parallel readers at that point.

Will
On Wed, 2013-10-23 at 17:37 +0100, Will Deacon wrote:
> What I'm *not* 100% sure about is the table freeing path. There is a
> mutex there for removing the table from a list, but I'm not sure how we
> ensure that there are no parallel readers at that point.

The sequence is:

	xt_replace_table();
	get_counters();
	xt_free_table_info();

get_counters() is the way we ensure no cpu is using the old copy of the
table before freeing it.
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 85a4f21..59da7cd 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -271,6 +271,11 @@ unsigned int arpt_do_table(struct sk_buff *skb,
 	local_bh_disable();
 	addend = xt_write_recseq_begin();
 	private = table->private;
+	/*
+	 * Ensure we load private-> members after we've fetched the base
+	 * pointer.
+	 */
+	smp_read_barrier_depends();
 	table_base = private->entries[smp_processor_id()];
 
 	e = get_entry(table_base, private->hook_entry[hook]);
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index d23118d..718dfbd 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -327,6 +327,11 @@ ipt_do_table(struct sk_buff *skb,
 	addend = xt_write_recseq_begin();
 	private = table->private;
 	cpu        = smp_processor_id();
+	/*
+	 * Ensure we load private-> members after we've fetched the base
+	 * pointer.
+	 */
+	smp_read_barrier_depends();
 	table_base = private->entries[cpu];
 	jumpstack  = (struct ipt_entry **)private->jumpstack[cpu];
 	stackptr   = per_cpu_ptr(private->stackptr, cpu);
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 44400c2..710238f 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -349,6 +349,11 @@ ip6t_do_table(struct sk_buff *skb,
 	local_bh_disable();
 	addend = xt_write_recseq_begin();
 	private = table->private;
+	/*
+	 * Ensure we load private-> members after we've fetched the base
+	 * pointer.
+	 */
+	smp_read_barrier_depends();
 	cpu        = smp_processor_id();
 	table_base = private->entries[cpu];
 	jumpstack  = (struct ip6t_entry **)private->jumpstack[cpu];
diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index 8b03028..227aa11 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -845,8 +845,13 @@ xt_replace_table(struct xt_table *table,
 		return NULL;
 	}
 
-	table->private = newinfo;
 	newinfo->initial_entries = private->initial_entries;
+	/*
+	 * Ensure contents of newinfo are visible before assigning to
+	 * private.
+	 */
+	smp_wmb();
+	table->private = newinfo;
 
 	/*
 	 * Even though table entries have now been swapped, other CPU's