
[v2,1/1] ppp: Fix one deadlock issue of PPP when sending frames

Message ID 1471347218-24472-1-git-send-email-fgao@ikuai8.com
State Changes Requested, archived
Delegated to: David Miller

Commit Message

高峰 Aug. 16, 2016, 11:33 a.m. UTC
From: Gao Feng <fgao@ikuai8.com>

The PPP channel holds a spinlock before sending a frame. But the skb
may be routed back to the same PPP channel by a wrong routing policy.
As a result, the skb re-enters the same channel path and tries to take
the spinlock that is already held. Bang, the deadlock comes out.

Now add a lock owner to avoid this, like the xmit_lock_owner of
netdev_queue. Check the lock owner before trying to take the spinlock.
If the current CPU is already the owner, there is no need to lock
again. When the PPP channel takes the spinlock for the first time, it
sets the owner to the current CPU ID.

The following is the panic stack from kernel 3.3.8, but the same issue
should exist upstream too.

[<ffffffff81568131>] ? _raw_spin_lock_bh+0x11/0x40
[<ffffffffa006a2b7>] ppp_unregister_channel+0x1347/0x2170 [ppp_generic]
[<ffffffff810a2827>] ? kmem_cache_free+0xa7/0xc0
[<ffffffffa006ad27>] ppp_unregister_channel+0x1db7/0x2170 [ppp_generic]
[<ffffffffa006afd5>] ppp_unregister_channel+0x2065/0x2170 [ppp_generic]
[<ffffffff8148f1dd>] dev_hard_start_xmit+0x4cd/0x620
[<ffffffff814a6254>] sch_direct_xmit+0x74/0x1d0
[<ffffffff8148f88d>] dev_queue_xmit+0x1d/0x30
[<ffffffff81496a4c>] neigh_direct_output+0xc/0x10
[<ffffffff814d9dae>] ip_finish_output+0x25e/0x2b0
[<ffffffff814da688>] ip_output+0x88/0x90
[<ffffffff814d9e9f>] ? __ip_local_out+0x9f/0xb0
[<ffffffff814d9ed4>] ip_local_out+0x24/0x30
[<ffffffffa00b9745>] 0xffffffffa00b9744
[<ffffffffa006b068>] ppp_unregister_channel+0x20f8/0x2170 [ppp_generic]
[<ffffffffa006b202>] ppp_output_wakeup+0x122/0x11d0 [ppp_generic]
[<ffffffff810a7978>] vfs_write+0xb8/0x160
[<ffffffff810a7c55>] sys_write+0x45/0x90
[<ffffffff815689e2>] system_call_fastpath+0x16/0x1b

The call flow is like this:
ppp_write->ppp_channel_push->start_xmit->select inappropriate route
.... -> dev_hard_start_xmit->ppp_start_xmit->ppp_xmit_process->
ppp_push. Now ppp_push tries to take the same spinlock that is already
held in ppp_channel_push.

Although this PPP deadlock is caused by an inappropriate routing
policy with L2TP, it is not acceptable for the PPP module to deadlock
the kernel because of a wrong routing policy.

Signed-off-by: Gao Feng <fgao@ikuai8.com>
---
 v2: Add lock_cnt to avoid unlocking multiple times when the lock is taken recursively
 v1: Initial patch
 drivers/net/ppp/ppp_generic.c | 57 +++++++++++++++++++++++++++++++++----------
 1 file changed, 44 insertions(+), 13 deletions(-)

Comments

Guillaume Nault Aug. 17, 2016, 5:42 p.m. UTC | #1
On Tue, Aug 16, 2016 at 07:33:38PM +0800, fgao@ikuai8.com wrote:
> From: Gao Feng <fgao@ikuai8.com>
> 
> The PPP channel holds a spinlock before sending a frame. But the skb
> may be routed back to the same PPP channel by a wrong routing policy.
> As a result, the skb re-enters the same channel path and tries to take
> the spinlock that is already held. Bang, the deadlock comes out.
> 
Unless I misunderstood the problem you're trying to solve, this patch
doesn't really help: deadlock still occurs if the same IP is used for
L2TP and PPP's peer address.

> Now add a lock owner to avoid this, like the xmit_lock_owner of
> netdev_queue. Check the lock owner before trying to take the spinlock.
> If the current CPU is already the owner, there is no need to lock
> again. When the PPP channel takes the spinlock for the first time, it
> sets the owner to the current CPU ID.
>
I think you should forbid lock recursion entirely, and drop the packet
if the owner tries to re-acquire the channel lock. Otherwise you just
move the deadlock down the stack (l2tp_xmit_skb() can't be called
recursively).
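
For illustration, a minimal sketch of this drop-on-recursion variant
could look like the following (not the submitted patch:
PPP_CHANNEL_TRYLOCK_BH and the locked flag are invented for this
sketch, and cl.owner is assumed to be an int initialised to -1, as
suggested further below):

#define PPP_CHANNEL_TRYLOCK_BH(cl, locked)			\
	do {							\
		local_bh_disable();				\
		if ((cl).owner == smp_processor_id()) {		\
			/* Recursive send: refuse the lock. */	\
			local_bh_enable();			\
			(locked) = false;			\
		} else {					\
			spin_lock(&(cl).lock);			\
			(cl).owner = smp_processor_id();	\
			(locked) = true;			\
		}						\
	} while (0)

A caller like ppp_channel_push() would then kfree_skb() the packet and
return when locked comes back false, instead of recursing or spinning
on a lock it already holds.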

> diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
> index 70cfa06..6909ab1 100644
> --- a/drivers/net/ppp/ppp_generic.c
> +++ b/drivers/net/ppp/ppp_generic.c
> @@ -162,6 +162,37 @@ struct ppp {
>  			 |SC_MULTILINK|SC_MP_SHORTSEQ|SC_MP_XSHORTSEQ \
>  			 |SC_COMP_TCP|SC_REJ_COMP_TCP|SC_MUST_COMP)
>  
> +struct chennel_lock {
            ^
I guess you meant 'channel_lock'.

> +	spinlock_t lock;
> +	u32 owner;
This field's default value is -1. So why not declare it as 'int'?
Feng Gao Aug. 18, 2016, 12:58 a.m. UTC | #2
Answers inline.

On Thu, Aug 18, 2016 at 1:42 AM, Guillaume Nault <g.nault@alphalink.fr> wrote:
> On Tue, Aug 16, 2016 at 07:33:38PM +0800, fgao@ikuai8.com wrote:
>> From: Gao Feng <fgao@ikuai8.com>
>>
>> The PPP channel holds a spinlock before sending a frame. But the skb
>> may be routed back to the same PPP channel by a wrong routing policy.
>> As a result, the skb re-enters the same channel path and tries to take
>> the spinlock that is already held. Bang, the deadlock comes out.
>>
> Unless I misunderstood the problem you're trying to solve, this patch
> doesn't really help: deadlock still occurs if the same IP is used for
> L2TP and PPP's peer address.
>

The deadlock happens because the same CPU tries to take a spinlock
which it already holds.
Now PPP_CHANNEL_LOCK_BH sets the lock owner after taking the lock, so
when the same CPU invokes PPP_CHANNEL_LOCK_BH again, cl.owner equals
the current CPU ID and it only increments lock_cnt without trying to
take the lock again.
So it avoids the deadlock.
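
To make that concrete, an assumed walk-through of the recursive case
with the v2 macros (illustration only, not output from a real run):

  cpu0: ppp_channel_push()
          PPP_CHANNEL_LOCK_BH(pch->downl)      -> takes lock, owner = cpu0, lock_cnt = 1
          start_xmit() ... skb routed back to the same channel ...
            ppp_push()
              PPP_CHANNEL_LOCK_BH(pch->downl)   -> owner == cpu0, so only lock_cnt = 2
              PPP_CHANNEL_UNLOCK_BH(pch->downl) -> lock_cnt = 1, lock still held
          PPP_CHANNEL_UNLOCK_BH(pch->downl)     -> lock_cnt = 0, owner = -1, unlock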

>> Now add a lock owner to avoid this, like the xmit_lock_owner of
>> netdev_queue. Check the lock owner before trying to take the spinlock.
>> If the current CPU is already the owner, there is no need to lock
>> again. When the PPP channel takes the spinlock for the first time, it
>> sets the owner to the current CPU ID.
>>
> I think you should forbid lock recursion entirely, and drop the packet
> if the owner tries to re-acquire the channel lock. Otherwise you just
> move the deadlock down the stack (l2tp_xmit_skb() can't be called
> recursively).

The reason for fixing it in PPP is that there are other layers on top
of the PPP module. If we resolve it in the PPP module, it avoids all
similar potential issues at once.

>
>> diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
>> index 70cfa06..6909ab1 100644
>> --- a/drivers/net/ppp/ppp_generic.c
>> +++ b/drivers/net/ppp/ppp_generic.c
>> @@ -162,6 +162,37 @@ struct ppp {
>>                        |SC_MULTILINK|SC_MP_SHORTSEQ|SC_MP_XSHORTSEQ \
>>                        |SC_COMP_TCP|SC_REJ_COMP_TCP|SC_MUST_COMP)
>>
>> +struct chennel_lock {
>             ^
> I guess you meant 'channel_lock'.

Yes. It is a typo. Thanks.

>
>> +     spinlock_t lock;
>> +     u32 owner;
> This field's default value is -1. So why not declare it as 'int'?

OK. I will fix it.

Best Regards
Feng
Guillaume Nault Aug. 18, 2016, 4:13 p.m. UTC | #3
On Thu, Aug 18, 2016 at 08:58:31AM +0800, Feng Gao wrote:
> On Thu, Aug 18, 2016 at 1:42 AM, Guillaume Nault <g.nault@alphalink.fr> wrote:
> > On Tue, Aug 16, 2016 at 07:33:38PM +0800, fgao@ikuai8.com wrote:
> >> From: Gao Feng <fgao@ikuai8.com>
> >>
> >> The PPP channel holds a spinlock before sending a frame. But the skb
> >> may be routed back to the same PPP channel by a wrong routing policy.
> >> As a result, the skb re-enters the same channel path and tries to take
> >> the spinlock that is already held. Bang, the deadlock comes out.
> >>
> > Unless I misunderstood the problem you're trying to solve, this patch
> > doesn't really help: deadlock still occurs if the same IP is used for
> > L2TP and PPP's peer address.
> >
> 
> The deadlock happens because the same CPU tries to take a spinlock
> which it already holds.
> Now PPP_CHANNEL_LOCK_BH sets the lock owner after taking the lock, so
> when the same CPU invokes PPP_CHANNEL_LOCK_BH again, cl.owner equals
> the current CPU ID and it only increments lock_cnt without trying to
> take the lock again.
> So it avoids the deadlock.
> 
I'm sorry but, again, it just _moves_ the deadlock down to L2TP. The
kernel still oopses because, now, l2tp_xmit_skb() is called recursively
while its tunnel socket is still locked.

> >> Now add a lock owner to avoid this, like the xmit_lock_owner of
> >> netdev_queue. Check the lock owner before trying to take the spinlock.
> >> If the current CPU is already the owner, there is no need to lock
> >> again. When the PPP channel takes the spinlock for the first time, it
> >> sets the owner to the current CPU ID.
> >>
> > I think you should forbid lock recursion entirely, and drop the packet
> > if the owner tries to re-acquire the channel lock. Otherwise you just
> > move the deadlock down the stack (l2tp_xmit_skb() can't be called
> > recursively).
> 
> The reason for fixing it in PPP is that there are other layers on top
> of the PPP module. If we resolve it in the PPP module, it avoids all
> similar potential issues at once.
>
Not sure if I understand your point here.
The xmit path of PPP and its sub-layers hasn't been designed to be
reentrant. Allowing recursive sends thus requires reviewing the full
path.

Beyond the L2TP issue discussed above, just consider the locking
dependencies used in PPP: ppp->wlock has to be held before
channel->downl. Sending a packet directly on a channel will lock
channel->downl. If this packet is routed back to the parent unit then
ppp_xmit_process() will lock ppp->wlock, effectively leading to lock
inversion and potential deadlock.
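
To make the inversion concrete, an assumed interleaving (simplified;
the real functions take more locks than shown here):

  CPU0 (xmit on the unit)              CPU1 (direct send on the channel)
  ppp_xmit_process()                   ppp_channel_push()
    lock ppp->wlock                      lock channel->downl
    ppp_push()                           ... skb routed back to the unit ...
      lock channel->downl  [waits]       ppp_xmit_process()
                                           lock ppp->wlock  [waits]

Each CPU now waits for the lock the other holds: a classic AB-BA
deadlock.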

So we have two options: adapt the whole xmit path to handle recursive
sends or forbid recursion entirely. Unfortunately none of these options
looks easy to achieve:

  * Making PPP xmit path reentrant will be hard and error prone because
    of all the locking dependencies. Looks like simplifying PPP's
    locking scheme will be required first.

  * I can't see any way to reliably prevent settings where a packet sent
    on a given channel would be routed back to the parent unit.


OTOH, we can try to limit the impact of recursive sends for simple
cases:

  * Following your approach, either adapt the lower layers
    (like l2tp_xmit_skb() for L2TP), or drop the packet when
    cl.owner == smp_processor_id(). This is very limited in scope and
    doesn't address issues like locking inversions. But it may let the
    system survive long enough for the PPP to time out.

  * In the lower layer, check where the packet is going to be enqueued
    and drop it if it's the parent device (see the sketch after this
    list). That should reliably handle simple and common cases. However
    this requires updating at least L2TP and PPTP, and getting a way to
    access the parent device. Also, it doesn't prevent recursion with
    stacked interfaces.
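
A hedged sketch of that lower-layer check, assuming the lower layer
somehow gets hold of its parent PPP device (which, as noted above, is
exactly the missing piece; l2tp_would_loop() and parent_dev are
invented for this sketch):

static bool l2tp_would_loop(const struct sk_buff *skb,
			    const struct net_device *parent_dev)
{
	const struct dst_entry *dst = skb_dst(skb);

	/* Drop before taking the tunnel lock if the skb's route
	 * points back at the PPP device this channel belongs to.
	 */
	return dst && dst->dev == parent_dev;
}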
Feng Gao Aug. 18, 2016, 10:52 p.m. UTC | #4
Hi Guillaume,

Thanks for your detailed analysis.
Now I think the better solution is to just drop the packet and print an
error log, instead of my original solution that supports reentrancy.
This solution will not bring any side effects.

I will send an update based on this new solution.


Regards
Feng
Guillaume Nault Aug. 19, 2016, 9:57 p.m. UTC | #5
On Fri, Aug 19, 2016 at 06:52:58AM +0800, Feng Gao wrote:
> Hi Guillaume,
> 
> Thanks for your detailed analysis.
> Now I think the better solution is to just drop the packet and print an
> error log, instead of my original solution that supports reentrancy.
> This solution will not bring any side effects.
> 
At least it should improve the situation.
I'm still uncomfortable with this approach because it only fixes part
of the problem. But I have nothing better to propose for now.

Patch

diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
index 70cfa06..6909ab1 100644
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -162,6 +162,37 @@  struct ppp {
 			 |SC_MULTILINK|SC_MP_SHORTSEQ|SC_MP_XSHORTSEQ \
 			 |SC_COMP_TCP|SC_REJ_COMP_TCP|SC_MUST_COMP)
 
+struct chennel_lock {
+	spinlock_t lock;
+	u32 owner;
+	u32 lock_cnt;
+};
+
+#define PPP_CHANNEL_LOCK_INIT(cl) \
+	cl.owner = -1; \
+	cl.lock_cnt = 0; \
+	spin_lock_init(&cl.lock)
+
+#define PPP_CHANNEL_LOCK_BH(cl) \
+	do { \
+		local_bh_disable(); \
+		if (cl.owner != smp_processor_id()) { \
+			spin_lock(&cl.lock); \
+			cl.owner = smp_processor_id(); \
+		} \
+		cl.lock_cnt++; \
+	} while (0)
+
+#define PPP_CHANNEL_UNLOCK_BH(cl) \
+	do { \
+		cl.lock_cnt--; \
+		if (cl.lock_cnt == 0) { \
+			cl.owner = -1; \
+			spin_unlock(&cl.lock); \
+		} \
+		local_bh_enable(); \
+	} while (0)
+
 /*
  * Private data structure for each channel.
  * This includes the data structure used for multilink.
@@ -171,7 +202,7 @@  struct channel {
 	struct list_head list;		/* link in all/new_channels list */
 	struct ppp_channel *chan;	/* public channel data structure */
 	struct rw_semaphore chan_sem;	/* protects `chan' during chan ioctl */
-	spinlock_t	downl;		/* protects `chan', file.xq dequeue */
+	struct chennel_lock downl;	/* protects `chan', file.xq dequeue */
 	struct ppp	*ppp;		/* ppp unit we're connected to */
 	struct net	*chan_net;	/* the net channel belongs to */
 	struct list_head clist;		/* link in list of channels per unit */
@@ -1597,7 +1628,7 @@  ppp_push(struct ppp *ppp)
 		list = list->next;
 		pch = list_entry(list, struct channel, clist);
 
-		spin_lock_bh(&pch->downl);
+		PPP_CHANNEL_LOCK_BH(pch->downl);
 		if (pch->chan) {
 			if (pch->chan->ops->start_xmit(pch->chan, skb))
 				ppp->xmit_pending = NULL;
@@ -1606,7 +1637,7 @@  ppp_push(struct ppp *ppp)
 			kfree_skb(skb);
 			ppp->xmit_pending = NULL;
 		}
-		spin_unlock_bh(&pch->downl);
+		PPP_CHANNEL_UNLOCK_BH(pch->downl);
 		return;
 	}
 
@@ -1736,7 +1767,7 @@  static int ppp_mp_explode(struct ppp *ppp, struct sk_buff *skb)
 		}
 
 		/* check the channel's mtu and whether it is still attached. */
-		spin_lock_bh(&pch->downl);
+		PPP_CHANNEL_LOCK_BH(pch->downl);
 		if (pch->chan == NULL) {
 			/* can't use this channel, it's being deregistered */
 			if (pch->speed == 0)
@@ -1744,7 +1775,7 @@  static int ppp_mp_explode(struct ppp *ppp, struct sk_buff *skb)
 			else
 				totspeed -= pch->speed;
 
-			spin_unlock_bh(&pch->downl);
+			PPP_CHANNEL_UNLOCK_BH(pch->downl);
 			pch->avail = 0;
 			totlen = len;
 			totfree--;
@@ -1795,7 +1826,7 @@  static int ppp_mp_explode(struct ppp *ppp, struct sk_buff *skb)
 		 */
 		if (flen <= 0) {
 			pch->avail = 2;
-			spin_unlock_bh(&pch->downl);
+			PPP_CHANNEL_UNLOCK_BH(pch->downl);
 			continue;
 		}
 
@@ -1840,14 +1871,14 @@  static int ppp_mp_explode(struct ppp *ppp, struct sk_buff *skb)
 		len -= flen;
 		++ppp->nxseq;
 		bits = 0;
-		spin_unlock_bh(&pch->downl);
+		PPP_CHANNEL_UNLOCK_BH(pch->downl);
 	}
 	ppp->nxchan = i;
 
 	return 1;
 
  noskb:
-	spin_unlock_bh(&pch->downl);
+	PPP_CHANNEL_UNLOCK_BH(pch->downl);
 	if (ppp->debug & 1)
 		netdev_err(ppp->dev, "PPP: no memory (fragment)\n");
 	++ppp->dev->stats.tx_errors;
@@ -1865,7 +1896,7 @@  ppp_channel_push(struct channel *pch)
 	struct sk_buff *skb;
 	struct ppp *ppp;
 
-	spin_lock_bh(&pch->downl);
+	PPP_CHANNEL_LOCK_BH(pch->downl);
 	if (pch->chan) {
 		while (!skb_queue_empty(&pch->file.xq)) {
 			skb = skb_dequeue(&pch->file.xq);
@@ -1879,7 +1910,7 @@  ppp_channel_push(struct channel *pch)
 		/* channel got deregistered */
 		skb_queue_purge(&pch->file.xq);
 	}
-	spin_unlock_bh(&pch->downl);
+	PPP_CHANNEL_UNLOCK_BH(pch->downl);
 	/* see if there is anything from the attached unit to be sent */
 	if (skb_queue_empty(&pch->file.xq)) {
 		read_lock_bh(&pch->upl);
@@ -2520,7 +2551,7 @@  int ppp_register_net_channel(struct net *net, struct ppp_channel *chan)
 	pch->lastseq = -1;
 #endif /* CONFIG_PPP_MULTILINK */
 	init_rwsem(&pch->chan_sem);
-	spin_lock_init(&pch->downl);
+	PPP_CHANNEL_LOCK_INIT(pch->downl);
 	rwlock_init(&pch->upl);
 
 	spin_lock_bh(&pn->all_channels_lock);
@@ -2599,9 +2630,9 @@  ppp_unregister_channel(struct ppp_channel *chan)
 	 * the channel's start_xmit or ioctl routine before we proceed.
 	 */
 	down_write(&pch->chan_sem);
-	spin_lock_bh(&pch->downl);
+	PPP_CHANNEL_LOCK_BH(pch->downl);
 	pch->chan = NULL;
-	spin_unlock_bh(&pch->downl);
+	PPP_CHANNEL_UNLOCK_BH(pch->downl);
 	up_write(&pch->chan_sem);
 	ppp_disconnect_channel(pch);