diff mbox

[1/2] xfrm: Force a dst refcount before entering the xfrm type handlers

Message ID 20110316070828.GQ31402@secunet.com
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Steffen Klassert March 16, 2011, 7:08 a.m. UTC
Crypto requests might return asynchronous. In this case we leave
the rcu protected region, so force a refcount on the skb's
destination entry before we enter the xfrm type input/output
handlers.

This fixes a crash when a route is deleted whilst sending IPsec
data that is transformed by an asynchronous algorithm.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/xfrm/xfrm_input.c  |    2 ++
 net/xfrm/xfrm_output.c |    2 ++
 2 files changed, 4 insertions(+), 0 deletions(-)

Comments

David Miller March 16, 2011, 7:17 a.m. UTC | #1
Steffen we really need to find another way to fix these problems.

We already make way too many expensive atomic operations in these code
paths modified by your 3 patches, we should strive to not add new
ones.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steffen Klassert March 16, 2011, 7:43 a.m. UTC | #2
On Wed, Mar 16, 2011 at 12:17:32AM -0700, David Miller wrote:
> 
> Steffen we really need to find another way to fix these problems.
> 
> We already make way too many expensive atomic operations in these code
> paths modified by your 3 patches, we should strive to not add new
> ones.
> 

I know that it is exspensive, but we have to take a refcount if
the crypto layer returns asyncronous. Unfortunately it is too
late to take the refcount when the crypto layer notifies us about
that as the skb might be already gone.

The second patch just moves the refcount from xfrm_output_one
to skb_dst_pop. As xfrm_output_one is the only user of
skb_dst_pop, we take the refcont just a bit realier.

The third one makes the socket policy case consistent to the
case where we get the destination entry from the flow cache
where we take a reference. We can either return the dst entry
with or without a refcont in both cases. But we can't return
sometimes with and somtimes without a refcount.

I'd be happy to see all these refcounts gone too of course,
but that's way beyond a simple bug fix.

Steffen
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller March 16, 2011, 5:41 p.m. UTC | #3
From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Wed, 16 Mar 2011 08:43:06 +0100

> On Wed, Mar 16, 2011 at 12:17:32AM -0700, David Miller wrote:
>> 
>> Steffen we really need to find another way to fix these problems.
>> 
>> We already make way too many expensive atomic operations in these code
>> paths modified by your 3 patches, we should strive to not add new
>> ones.
>> 
> 
> I know that it is exspensive, but we have to take a refcount if
> the crypto layer returns asyncronous. Unfortunately it is too
> late to take the refcount when the crypto layer notifies us about
> that as the skb might be already gone.

We can pass around an atomic_t for the crypto layer to bump when
it decides to take an async path, bump it on entry and upon event
wakeup decrement it.

Actually, a plain atomic_t is a bad idea because we might have to
release the object.  So callbacks with private void pointer arg is
better.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 872065c..341cd11 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -190,6 +190,8 @@  int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
 		XFRM_SKB_CB(skb)->seq.input.low = seq;
 		XFRM_SKB_CB(skb)->seq.input.hi = seq_hi;
 
+		skb_dst_force(skb);
+
 		nexthdr = x->type->input(x, skb);
 
 		if (nexthdr == -EINPROGRESS)
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index 1aba03f..8f3f0ee 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -78,6 +78,8 @@  static int xfrm_output_one(struct sk_buff *skb, int err)
 
 		spin_unlock_bh(&x->lock);
 
+		skb_dst_force(skb);
+
 		err = x->type->output(x, skb);
 		if (err == -EINPROGRESS)
 			goto out_exit;