diff mbox

[RFC] xfrm: fix perpetual bundles

Message ID 20100225104029.GB3927@gauss.dd.secunet.de
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Steffen Klassert Feb. 25, 2010, 10:40 a.m. UTC
On Wed, Feb 24, 2010 at 08:19:24AM -0500, jamal wrote:
> 
> 1)In the connect() stage, in the slow path a route cache is
> created with the rth->fl.fl4_src of 0.0.0.0...
> ==> policy->bundles is empty, so we do a lookup, fail, create
> one.. (remember rth->fl.fl4_src of 0.0.0.0 at this stage and
> thats what we end storing in the bundle/xdst for later comparison
> instead of the skb's fl)
> 
> 2)ping sends a packet (does a sendmsg) 
> ==> xfrm_find_bundle() ends up comparing skb's->fl (non-zero
> fl->fl4_src) with what we stored from #1b. Fails.
> ==> we create a new bundle at attach the old one at the end of it.
> ...and now policy->bundles has two xdst entries
> 
> 3) Repeat #2, and now we have 3 xdsts in policy bundles
> 
> 4) Repeat #2, and now we have 4 xdsts in policy bundles..
> 
> 5) Send 7 more pings and look at slabinfo and youll see
> 10 object all of which are active..
> 
> Essentially this seems to go on and on and i can cache a huge
> number of xdsts..
> 

Do you have CONFIG_XFRM_SUB_POLICY enabled?

I observed the same behaviour recently when I had CONFIG_XFRM_SUB_POLICY
enabled. The problem in my case is, that we do a route lookup based on a flow
with a source address of 0.0.0.0 in ip_route_output_flow() if we send a ping
packet. Then we update the flow's source address based on the routing
informations we got from __ip_route_output_key(). Now the actual flow does
not match the the flow information in the routing table anymore. As a result,
we generate a new xfrm bundle entry with every ping packet, as you pointed out.

I solved this by rerunning __ip_route_output_key() if we change the source or
destination address of the flow (patch below). I have not send the patch so
far because I'm not that familiar with the routing code and I'm still not sure
whether this is the right way to fix it, so I wanted to do some further
analysis first.

Interestingly this does not happen if CONFIG_XFRM_SUB_POLICY is disabled.

When ping is started, it opens a udp socket. This triggers a xfrm_lookup()
and a xfrm bundle entry is generated. In the standard case, the flow of the
ping packets matching the flow informations from the bundle entry generated 
by the opening of the udp socket, because we don't care for the upper layer
flow information here. Unlike the standard case, if CONFIG_XFRM_SUB_POLICY is
enabled we do upper layer information matching with flow_cache_uli_match().
Now the flow of the ping packets does, of course, not match the flow
informations of the bundle entry and we generate a new bundle entry with
every packet...


---
 net/ipv4/route.c |   15 +++++++++++++--
 1 files changed, 13 insertions(+), 2 deletions(-)

Comments

jamal Feb. 25, 2010, 1:19 p.m. UTC | #1
Hi Steffen,

On Thu, 2010-02-25 at 11:40 +0100, Steffen Klassert wrote:

> 
> Do you have CONFIG_XFRM_SUB_POLICY enabled?

I tried with and without. If in xfrm_bundle_create()
after the call to xfrm_fill_dst() somewhere i "fixed" the
xdst->u.rt.fl.fl4_src, as in:
---
                err = xfrm_fill_dst(xdst, dev);
                if (err)
                        goto free_dst;
 

+               if (!xdst->u.rt.fl.fl4_src) {
+                       xdst->u.rt.fl.fl4_src = fl->fl4_src;
+		}
----

Then this worked as long as i turned off CONFIG_XFRM_SUB_POLICY.
If i use the simple patch i posted - it worked with or without
CONFIG_XFRM_SUB_POLICY turned on.

> I observed the same behaviour recently when I had CONFIG_XFRM_SUB_POLICY
> enabled. The problem in my case is, that we do a route lookup based on a flow
> with a source address of 0.0.0.0 in ip_route_output_flow() if we send a ping
> packet. Then we update the flow's source address based on the routing
> informations we got from __ip_route_output_key(). Now the actual flow does
> not match the the flow information in the routing table anymore. As a result,
> we generate a new xfrm bundle entry with every ping packet, as you pointed out.
> 

nod.

> I solved this by rerunning __ip_route_output_key() if we change the source or
> destination address of the flow (patch below). I have not send the patch so
> far because I'm not that familiar with the routing code and I'm still not sure
> whether this is the right way to fix it, so I wanted to do some further
> analysis first.
> 
> Interestingly this does not happen if CONFIG_XFRM_SUB_POLICY is disabled.
>
> When ping is started, it opens a udp socket. This triggers a xfrm_lookup()
> and a xfrm bundle entry is generated. In the standard case, the flow of the
> ping packets matching the flow informations from the bundle entry generated 
> by the opening of the udp socket, because we don't care for the upper layer
> flow information here. Unlike the standard case, if CONFIG_XFRM_SUB_POLICY is
> enabled we do upper layer information matching with flow_cache_uli_match().

Precisely - i actually was failing at exactly the same spot with
CONFIG_XFRM_SUB_POLICY with the "fix" i described above. 
"Fixing it" at that path level you have below may have bigger effect - i
cant think of one right now; but that path is hit by all outgoing
packets...

Lets hear what other people have to say - but iam beginning to feel
probably the change i posted is not so unreasonable since 0.0.0.0
is INADDR_ANY.

cheers,
jamal

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
jamal Feb. 28, 2010, 2:07 p.m. UTC | #2
Steffen,

Did you try the simple patch i posted? After contemplating
in the background i think it is the right thing to do.
Ive also fixed IPV6 side the same way.

cheers,
jamal
On Thu, 2010-02-25 at 08:19 -0500, jamal wrote:


> 
> Lets hear what other people have to say - but iam beginning to feel
> probably the change i posted is not so unreasonable since 0.0.0.0
> is INADDR_ANY.
> 
> cheers,
> jamal

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steffen Klassert March 1, 2010, 3:33 p.m. UTC | #3
Hi Jamal.

On Sun, Feb 28, 2010 at 09:07:15AM -0500, jamal wrote:
> Steffen,
> 
> Did you try the simple patch i posted? After contemplating
> in the background i think it is the right thing to do.
> Ive also fixed IPV6 side the same way.
> 

Yes, I did. Your patch works fine. As your solution is less
invasive we probaply should take your patches if nobody else
has an opinion on this.

Steffen
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index d62b05d..3bf0b89 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2778,15 +2778,26 @@  int ip_route_output_flow(struct net *net, struct rtable **rp, struct flowi *flp,
 			 struct sock *sk, int flags)
 {
 	int err;
+	int update_route = 0;
 
 	if ((err = __ip_route_output_key(net, rp, flp)) != 0)
 		return err;
 
 	if (flp->proto) {
-		if (!flp->fl4_src)
+		if (!flp->fl4_src) {
 			flp->fl4_src = (*rp)->rt_src;
-		if (!flp->fl4_dst)
+			update_route = 1;
+		}
+		if (!flp->fl4_dst) {
 			flp->fl4_dst = (*rp)->rt_dst;
+			update_route = 1;
+		}
+		if (update_route) {
+			dst_release(&(*rp)->u.dst);
+			if ((err = __ip_route_output_key(net, rp, flp)) != 0)
+				return err;
+		}
+
 		err = __xfrm_lookup(net, (struct dst_entry **)rp, flp, sk,
 				    flags ? XFRM_LOOKUP_WAIT : 0);
 		if (err == -EREMOTE)