diff mbox

mpls: Add missing RCU-bh read side critical section locking in output path

Message ID 20160620180527.GU20238@wantstofly.org
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Lennert Buytenhek June 20, 2016, 6:05 p.m. UTC
From: David Barroso <dbarroso@fastly.com>

When locally originated IP traffic hits a route that says to push
MPLS labels, we'll get a call chain dst_output() -> lwtunnel_output()
-> mpls_output() -> neigh_xmit() -> ___neigh_lookup_noref() where the
last function in this chain accesses an RCU-bh protected struct
neigh_table pointer without us ever having declared an RCU-bh read
side critical section.

Because locally originated IP traffic is sent from process context,
with softirqs enabled, we can be interrupted by a softirq at any time.
RCU-bh treats the completion of a softirq as signaling the end of any
pending read-side critical sections, so if we do get a softirq here,
we can end up with an unexpected RCU grace period and all the
nastiness that comes with that.

This patch makes neigh_xmit() take rcu_read_{,un}lock_bh() around the
code that expects to be treated as an RCU-bh read side critical section.

Signed-off-by: David Barroso <dbarroso@fastly.com>
Signed-off-by: Lennert Buytenhek <lbuytenhek@fastly.com>
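
The control flow described above can be sketched in userspace. This is only a mock, not kernel code: the `mock_` functions, `bh_read_depth` and `struct mock_neigh` are hypothetical stand-ins that merely count lock nesting, and a NULL return stands in for the kernel's IS_ERR()/PTR_ERR() error pointers. The point it tries to show is that once the lock is taken inside the function, every exit path, including the early error exit, has to drop it again.

```c
/* Userspace mock of the locking pattern; NOT kernel code. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the RCU-bh reader nesting depth. */
static int bh_read_depth;

static void mock_rcu_read_lock_bh(void)
{
	bh_read_depth++;
}

static void mock_rcu_read_unlock_bh(void)
{
	assert(bh_read_depth > 0);	/* catch unbalanced unlocks */
	bh_read_depth--;
}

struct mock_neigh {
	int (*output)(struct mock_neigh *n);
};

/* The output callback reports an error if run outside the section. */
static int mock_output(struct mock_neigh *n)
{
	(void)n;
	return bh_read_depth > 0 ? 0 : -EINVAL;
}

static struct mock_neigh the_neigh = { .output = mock_output };

/* Hypothetical lookup: NULL simulates a failed lookup/create. */
static struct mock_neigh *mock_lookup(int fail)
{
	return fail ? NULL : &the_neigh;
}

/* Mirrors the patched flow: lock, look up, output, unlock on all paths. */
static int mock_neigh_xmit(int fail_lookup)
{
	struct mock_neigh *neigh;
	int err;

	mock_rcu_read_lock_bh();
	neigh = mock_lookup(fail_lookup);
	if (!neigh) {
		err = -ENOMEM;
		mock_rcu_read_unlock_bh();	/* error path unlocks first */
		goto out;
	}
	err = neigh->output(neigh);
	mock_rcu_read_unlock_bh();
out:
	return err;
}
```

Calling `mock_neigh_xmit(0)` and `mock_neigh_xmit(1)` should both leave `bh_read_depth` at zero, which is the balance property the extra unlock on the error path is there to preserve.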

Comments

Robert Shearman June 21, 2016, 3:15 p.m. UTC | #1
On 20/06/16 19:05, Lennert Buytenhek wrote:
> From: David Barroso <dbarroso@fastly.com>
>
> When locally originated IP traffic hits a route that says to push
> MPLS labels, we'll get a call chain dst_output() -> lwtunnel_output()
> -> mpls_output() -> neigh_xmit() -> ___neigh_lookup_noref() where the
> last function in this chain accesses a RCU-bh protected struct
> neigh_table pointer without us ever having declared an RCU-bh read
> side critical section.
>
> As in case of locally originated IP traffic we'll be running in process
> context, with softirqs enabled, we can be preempted by a softirq at any
> time, and RCU-bh considers the completion of a softirq as signaling
> the end of any pending read-side critical sections, so if we do get a
> softirq here, we can end up with an unexpected RCU grace period and
> all the nastiness that that comes with.
>
> This patch makes neigh_xmit() take rcu_read_{,un}lock_bh() around the
> code that expects to be treated as an RCU-bh read side critical section.
>
> Signed-off-by: David Barroso <dbarroso@fastly.com>
> Signed-off-by: Lennert Buytenhek <lbuytenhek@fastly.com>

LGTM too.

Acked-by: Robert Shearman <rshearma@brocade.com>

David Miller June 23, 2016, 4 p.m. UTC | #2
From: Lennert Buytenhek <buytenh@wantstofly.org>
Date: Mon, 20 Jun 2016 21:05:27 +0300

> From: David Barroso <dbarroso@fastly.com>
> 
> When locally originated IP traffic hits a route that says to push
> MPLS labels, we'll get a call chain dst_output() -> lwtunnel_output()
> -> mpls_output() -> neigh_xmit() -> ___neigh_lookup_noref() where the
> last function in this chain accesses a RCU-bh protected struct
> neigh_table pointer without us ever having declared an RCU-bh read
> side critical section.
> 
> As in case of locally originated IP traffic we'll be running in process
> context, with softirqs enabled, we can be preempted by a softirq at any
> time, and RCU-bh considers the completion of a softirq as signaling
> the end of any pending read-side critical sections, so if we do get a
> softirq here, we can end up with an unexpected RCU grace period and
> all the nastiness that that comes with.
> 
> This patch makes neigh_xmit() take rcu_read_{,un}lock_bh() around the
> code that expects to be treated as an RCU-bh read side critical section.
> 
> Signed-off-by: David Barroso <dbarroso@fastly.com>
> Signed-off-by: Lennert Buytenhek <lbuytenhek@fastly.com>

Whilst the case that was used to discover this problem was MPLS, that
is not the subsystem where the bug exists and is being fixed.

Therefore please fix your Subject line.

Thanks.

Lennert Buytenhek June 28, 2016, 8:14 a.m. UTC | #3
On Thu, Jun 23, 2016 at 12:00:55PM -0400, David Miller wrote:

> > From: David Barroso <dbarroso@fastly.com>
> > 
> > When locally originated IP traffic hits a route that says to push
> > MPLS labels, we'll get a call chain dst_output() -> lwtunnel_output()
> > -> mpls_output() -> neigh_xmit() -> ___neigh_lookup_noref() where the
> > last function in this chain accesses a RCU-bh protected struct
> > neigh_table pointer without us ever having declared an RCU-bh read
> > side critical section.
> > 
> > As in case of locally originated IP traffic we'll be running in process
> > context, with softirqs enabled, we can be preempted by a softirq at any
> > time, and RCU-bh considers the completion of a softirq as signaling
> > the end of any pending read-side critical sections, so if we do get a
> > softirq here, we can end up with an unexpected RCU grace period and
> > all the nastiness that that comes with.
> > 
> > This patch makes neigh_xmit() take rcu_read_{,un}lock_bh() around the
> > code that expects to be treated as an RCU-bh read side critical section.
> > 
> > Signed-off-by: David Barroso <dbarroso@fastly.com>
> > Signed-off-by: Lennert Buytenhek <lbuytenhek@fastly.com>
> 
> Whilst the case that was used to discover this problem was MPLS, that
> is not the subsystem where the bug exists and is being fixed.
> 
> Therefore please fix your Subject line.
> 
> Thanks.

I'd say that the bug _is_ in the MPLS code, but that we're just fixing
it in a helper function that lives elsewhere (and which is only used by
MPLS), but yeah, the subject line and the patch body don't match up. :(
I've resubmitted the patch with the commit message below; I hope
that'll do.

Thanks!


===

[PATCH] neigh: Explicitly declare RCU-bh read side critical section in neigh_xmit()

From: David Barroso <dbarroso@fastly.com>

neigh_xmit() expects to be called inside an RCU-bh read side critical
section, and while one of its two current callers gets this right, the
other one doesn't.

More specifically, neigh_xmit() has two callers, mpls_forward() and
mpls_output(), and while both callers call neigh_xmit() under
rcu_read_lock(), this provides sufficient protection for neigh_xmit()
only in the case of mpls_forward(), as that is always called from
softirq context and therefore doesn't need explicit BH protection,
while mpls_output() can be called from process context with softirqs
enabled.

When mpls_output() is called from process context, with softirqs
enabled, we can be preempted by a softirq at any time, and RCU-bh
considers the completion of a softirq as signaling the end of any
pending read-side critical sections, so if we do get a softirq
while we are in the part of neigh_xmit() that expects to be run inside
an RCU-bh read side critical section, we can end up with an unexpected
RCU grace period running right in the middle of that critical section,
making things go boom.

This patch fixes this impedance mismatch in the callee, by making
neigh_xmit() always take rcu_read_{,un}lock_bh() around the code that
expects to be treated as an RCU-bh read side critical section, as this
seems a safer option than fixing it in the callers.

Fixes: 4fd3d7d9e868f ("neigh: Add helper function neigh_xmit")
Signed-off-by: David Barroso <dbarroso@fastly.com>
Signed-off-by: Lennert Buytenhek <lbuytenhek@fastly.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Robert Shearman <rshearma@brocade.com>
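
The difference between the two callers can be illustrated with a toy userspace model. It is a sketch under simplifying assumptions, not kernel code: `struct ctx`, its fields and the `model_` helpers are hypothetical names, and the model only encodes the rule stated in the commit message, namely that an RCU-bh reader is safe as long as no softirq can run to completion underneath it.

```c
/* Toy model of execution contexts; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Illustrative context description; field names are hypothetical. */
struct ctx {
	bool in_softirq;	/* already running inside a softirq handler */
	int bh_disable;		/* nesting count of BH-disable style calls */
};

/* A softirq can start (and complete) here only if neither condition holds. */
static bool softirq_can_run(const struct ctx *c)
{
	return !c->in_softirq && c->bh_disable == 0;
}

/* Per the rule above: the reader is safe iff no softirq can complete. */
static bool rcu_bh_reader_safe(const struct ctx *c)
{
	return !softirq_can_run(c);
}

/* Plain rcu_read_lock() does not touch BH state, so it is a no-op here. */
static void model_rcu_read_lock(struct ctx *c)
{
	(void)c;
}

/* The _bh variant disables softirq processing for its duration. */
static void model_rcu_read_lock_bh(struct ctx *c)
{
	c->bh_disable++;
}

static void model_rcu_read_unlock_bh(struct ctx *c)
{
	assert(c->bh_disable > 0);
	c->bh_disable--;
}
```

In this model a context with `in_softirq` set (the mpls_forward() case) is already safe, while a process context with softirqs enabled (the mpls_output() case) stays unsafe after `model_rcu_read_lock()` and becomes safe only between `model_rcu_read_lock_bh()` and `model_rcu_read_unlock_bh()`.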

Patch

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index f18ae91..769cece 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -2467,13 +2467,17 @@  int neigh_xmit(int index, struct net_device *dev,
 		tbl = neigh_tables[index];
 		if (!tbl)
 			goto out;
+		rcu_read_lock_bh();
 		neigh = __neigh_lookup_noref(tbl, addr, dev);
 		if (!neigh)
 			neigh = __neigh_create(tbl, addr, dev, false);
 		err = PTR_ERR(neigh);
-		if (IS_ERR(neigh))
+		if (IS_ERR(neigh)) {
+			rcu_read_unlock_bh();
 			goto out_kfree_skb;
+		}
 		err = neigh->output(neigh, skb);
+		rcu_read_unlock_bh();
 	}
 	else if (index == NEIGH_LINK_TABLE) {
 		err = dev_hard_header(skb, dev, ntohs(skb->protocol),