diff mbox

fix ip_gre lockless xmits

Message ID 1327610075-5833-1-git-send-email-willemb@google.com
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Willem de Bruijn Jan. 26, 2012, 8:34 p.m. UTC
Tunnel devices set NETIF_F_LLTX to bypass HARD_TX_LOCK.  Sit and
ipip set this unconditionally in ops->setup, but gre enables it
conditionally after parameter passing in ops->newlink. This is
not called during tunnel setup as below, however, so GRE tunnels are
still taking the lock.

modprobe ip_gre
ip tunnel add test0 mode gre remote 10.5.1.1 dev lo
ip link set test0 up
ip addr add 10.6.0.1 dev test0
 # cat /sys/class/net/test0/features
 # $DIR/test_tunnel_xmit 10 10.5.2.1
ip route add 10.5.2.0/24 dev test0
ip tunnel del test0

The newlink callback is only called in rtnl_netlink, and only if
the device is new, as it calls register_netdevice internally. Gre
tunnels are created at 'ip tunnel add' with ioctl SIOCADDTUNNEL,
which calls ipgre_tunnel_locate, which calls register_netdev.
rtnl_newlink is called at 'ip link set', but skips ops->newlink
and the device is up with locking still enabled. The equivalent
ipip tunnel works fine, btw (just substitute 'method gre' for
'method ipip').

On kernels before /sys/class/net/*/features was removed [1],
the first commented out line returns 0x6000 with method gre,
which indicates that NETIF_F_LLTX (0x1000) is not set. With ipip,
it reports 0x7000. This test cannot be used on recent kernels where
the sysfs file is removed (and ETHTOOL_GFEATURES does not currently
work for tunnel devices, because they lack dev->ethtool_ops).

The second commented out line calls a simple transmission test [2]
that sends on 24 cores at maximum rate. Results of a single run:

ipip:			19,372,306
gre before patch:	 4,839,753
gre after patch:	19,133,873

This patch replicates the condition check in ipgre_newlink to
ipgre_tunnel_locate. It works for me, both with oseq on and off.
This is the first time I looked at rtnetlink and iproute2 code,
though, so someone more knowledgeable should probably check the
patch. Thanks.

The tail of both functions is now identical, by the way. To avoid
code duplication, I'll be happy to rework this and merge the two.

[1] http://patchwork.ozlabs.org/patch/104610/
[2] http://kernel.googlecode.com/files/xmit_udp_parallel.c

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 net/ipv4/ip_gre.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

Comments

Eric Dumazet Jan. 26, 2012, 8:58 p.m. UTC | #1
Le jeudi 26 janvier 2012 à 15:34 -0500, Willem de Bruijn a écrit :
> Tunnel devices set NETIF_F_LLTX to bypass HARD_TX_LOCK.  Sit and
> ipip set this unconditionally in ops->setup, but gre enables it
> conditionally after parameter passing in ops->newlink. This is
> not called during tunnel setup as below, however, so GRE tunnels are
> still taking the lock.
> 
> modprobe ip_gre
> ip tunnel add test0 mode gre remote 10.5.1.1 dev lo
> ip link set test0 up
> ip addr add 10.6.0.1 dev test0
>  # cat /sys/class/net/test0/features
>  # $DIR/test_tunnel_xmit 10 10.5.2.1
> ip route add 10.5.2.0/24 dev test0
> ip tunnel del test0
> 

> Signed-off-by: Willem de Bruijn <willemb@google.com>

Sure !

When I did the original patch, I used following setup sequence.

ip link add gre34 type gre remote 1.2.3.4

I was not aware of the "ip tunnel add ..."

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Jan. 26, 2012, 9:36 p.m. UTC | #2
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 26 Jan 2012 21:58:11 +0100

> Le jeudi 26 janvier 2012 à 15:34 -0500, Willem de Bruijn a écrit :
>> Tunnel devices set NETIF_F_LLTX to bypass HARD_TX_LOCK.  Sit and
>> ipip set this unconditionally in ops->setup, but gre enables it
>> conditionally after parameter passing in ops->newlink. This is
>> not called during tunnel setup as below, however, so GRE tunnels are
>> still taking the lock.
>> 
>> modprobe ip_gre
>> ip tunnel add test0 mode gre remote 10.5.1.1 dev lo
>> ip link set test0 up
>> ip addr add 10.6.0.1 dev test0
>>  # cat /sys/class/net/test0/features
>>  # $DIR/test_tunnel_xmit 10 10.5.2.1
>> ip route add 10.5.2.0/24 dev test0
>> ip tunnel del test0
>> 
> 
>> Signed-off-by: Willem de Bruijn <willemb@google.com>
> 
> Sure !
> 
> When I did the original patch, I used following setup sequence.
> 
> ip link add gre34 type gre remote 1.2.3.4
> 
> I was not aware of the "ip tunnel add ..."
> 
> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Willem de Bruijn Jan. 26, 2012, 9:50 p.m. UTC | #3
> ip link add gre34 type gre remote 1.2.3.4
>
> I was not aware of the "ip tunnel add ..."

And I was unaware of that method. Interesting. That explains the
alternative codepath in rtnetlink.c. `ip link help` does not mention
gre or gretap on its list of supported types, even though they work.
Thanks for having a look!
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 2b53a1f..6b3ca5b 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -422,6 +422,10 @@  static struct ip_tunnel *ipgre_tunnel_locate(struct net *net,
 	if (register_netdevice(dev) < 0)
 		goto failed_free;
 
+	/* Can use a lockless transmit, unless we generate output sequences */
+	if (!(nt->parms.o_flags & GRE_SEQ))
+		dev->features |= NETIF_F_LLTX;
+
 	dev_hold(dev);
 	ipgre_tunnel_link(ign, nt);
 	return nt;