Patchwork [1/2] net: Allow to create links with given ifindex

login
register
mail settings
Submitter Eric Dumazet
Date Aug. 3, 2012, 5:45 a.m.
Message ID <1343972729.9299.596.camel@edumazet-glaptop>
Download mbox | patch
Permalink /patch/174889/
State RFC
Delegated to: David Miller
Headers show

Comments

Eric Dumazet - Aug. 3, 2012, 5:45 a.m.
On Thu, 2012-08-02 at 16:26 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Thu, 02 Aug 2012 12:28:30 +0200
> 
> > Strange because I see no false sharing on this ifindex location for
> > loopback device.
> 
> Are you sure netdev->rx_dropped isn't being incremented?  That appears
> as if it would land on the same cache line as netdev->ifindex.
> 

Yes I am sure (by the way my output device was dummy0, not lo, but its
the same for the case here)

offsetof(struct net_device, features)=0x80
... (all features are mostly read only)
offsetof(struct net_device, ifindex)=0xa0

struct net_device_stats stats; is untouched on dummy device

offsetof(struct net_device, rx_dropped)=0x160

So thats only the dereference done million times per second that show in
the profiles, even if cache lines are clean and in cpu cache.

I see that even more clear in the IN_DEV_ROUTE_LOCALNET(in_dev) macro in
ip_route_input_slow(), doing so many derefs :

Its actually faster to avoid it, if we already have the net pointer in
hand : IN_DEV_NET_ROUTE_LOCALNET(in_dev, net) 





--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - Aug. 3, 2012, 5:51 a.m.
On Fri, 2012-08-03 at 07:45 +0200, Eric Dumazet wrote:

> Yes I am sure (by the way my output device was dummy0, not lo, but its
> the same for the case here)

Of course, the ifindex was related to lo, sorry for the confusion,
-ENOCOFFEE_YET_THIS_MORNING


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Aug. 3, 2012, 11:56 p.m.
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 03 Aug 2012 07:45:29 +0200

> @@ -1587,13 +1587,11 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
>  	if (ipv4_is_zeronet(daddr))
>  		goto martian_destination;
>  
> -	if (likely(!IN_DEV_ROUTE_LOCALNET(in_dev))) {
> -		if (ipv4_is_loopback(daddr))
> -			goto martian_destination;
> +	if (ipv4_is_loopback(daddr) && !IN_DEV_NET_ROUTE_LOCALNET(in_dev, net))
> +		goto martian_destination;
>  
> -		if (ipv4_is_loopback(saddr))
> -			goto martian_source;
> -	}
> +	if (ipv4_is_loopback(saddr) && !IN_DEV_NET_ROUTE_LOCALNET(in_dev, net))
> +		goto martian_source;

Maybe clearer as:

	if ((ipv4_is_loopback(daddr) || ipv4_is_loopback(saddr)) &&
	    !IN_DEV_NET_ROUTE_LOCALNET(in_dev, net))
		goto martian_source;
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - Aug. 4, 2012, 7:10 a.m.
On Fri, 2012-08-03 at 16:56 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Fri, 03 Aug 2012 07:45:29 +0200
> 
> > @@ -1587,13 +1587,11 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
> >  	if (ipv4_is_zeronet(daddr))
> >  		goto martian_destination;
> >  
> > -	if (likely(!IN_DEV_ROUTE_LOCALNET(in_dev))) {
> > -		if (ipv4_is_loopback(daddr))
> > -			goto martian_destination;
> > +	if (ipv4_is_loopback(daddr) && !IN_DEV_NET_ROUTE_LOCALNET(in_dev, net))
> > +		goto martian_destination;
> >  
> > -		if (ipv4_is_loopback(saddr))
> > -			goto martian_source;
> > -	}
> > +	if (ipv4_is_loopback(saddr) && !IN_DEV_NET_ROUTE_LOCALNET(in_dev, net))
> > +		goto martian_source;
> 
> Maybe clearer as:
> 
> 	if ((ipv4_is_loopback(daddr) || ipv4_is_loopback(saddr)) &&
> 	    !IN_DEV_NET_ROUTE_LOCALNET(in_dev, net))
> 		goto martian_source;

Clearer, but handling of a martian destination is different of the
martian source ;)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Aug. 4, 2012, 8:25 a.m.
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 04 Aug 2012 09:10:40 +0200

> On Fri, 2012-08-03 at 16:56 -0700, David Miller wrote:
>> From: Eric Dumazet <eric.dumazet@gmail.com>
>> Date: Fri, 03 Aug 2012 07:45:29 +0200
>> 
>> > @@ -1587,13 +1587,11 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
>> >  	if (ipv4_is_zeronet(daddr))
>> >  		goto martian_destination;
>> >  
>> > -	if (likely(!IN_DEV_ROUTE_LOCALNET(in_dev))) {
>> > -		if (ipv4_is_loopback(daddr))
>> > -			goto martian_destination;
>> > +	if (ipv4_is_loopback(daddr) && !IN_DEV_NET_ROUTE_LOCALNET(in_dev, net))
>> > +		goto martian_destination;
>> >  
>> > -		if (ipv4_is_loopback(saddr))
>> > -			goto martian_source;
>> > -	}
>> > +	if (ipv4_is_loopback(saddr) && !IN_DEV_NET_ROUTE_LOCALNET(in_dev, net))
>> > +		goto martian_source;
>> 
>> Maybe clearer as:
>> 
>> 	if ((ipv4_is_loopback(daddr) || ipv4_is_loopback(saddr)) &&
>> 	    !IN_DEV_NET_ROUTE_LOCALNET(in_dev, net))
>> 		goto martian_source;
> 
> Clearer, but handling of a martian destination is different of the
> martian source ;)

Duh, I missed that, too many martians :-)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h
index 67f9dda..d032780 100644
--- a/include/linux/inetdevice.h
+++ b/include/linux/inetdevice.h
@@ -104,9 +104,14 @@  static inline void ipv4_devconf_setall(struct in_device *in_dev)
 #define IN_DEV_ANDCONF(in_dev, attr) \
 	(IPV4_DEVCONF_ALL(dev_net(in_dev->dev), attr) && \
 	 IN_DEV_CONF_GET((in_dev), attr))
-#define IN_DEV_ORCONF(in_dev, attr) \
-	(IPV4_DEVCONF_ALL(dev_net(in_dev->dev), attr) || \
+
+#define IN_DEV_NET_ORCONF(in_dev, net, attr) \
+	(IPV4_DEVCONF_ALL(net, attr) || \
 	 IN_DEV_CONF_GET((in_dev), attr))
+
+#define IN_DEV_ORCONF(in_dev, attr) \
+	IN_DEV_NET_ORCONF(in_dev, dev_net(in_dev->dev), attr)
+
 #define IN_DEV_MAXCONF(in_dev, attr) \
 	(max(IPV4_DEVCONF_ALL(dev_net(in_dev->dev), attr), \
 	     IN_DEV_CONF_GET((in_dev), attr)))
@@ -133,6 +138,8 @@  static inline void ipv4_devconf_setall(struct in_device *in_dev)
 					IN_DEV_ORCONF((in_dev), \
 						      PROMOTE_SECONDARIES)
 #define IN_DEV_ROUTE_LOCALNET(in_dev)	IN_DEV_ORCONF(in_dev, ROUTE_LOCALNET)
+#define IN_DEV_NET_ROUTE_LOCALNET(in_dev, net)	\
+	IN_DEV_NET_ORCONF(in_dev, net, ROUTE_LOCALNET)
 
 #define IN_DEV_RX_REDIRECTS(in_dev) \
 	((IN_DEV_FORWARD(in_dev) && \
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index e4ba974..5e88e3b 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1587,13 +1587,11 @@  static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 	if (ipv4_is_zeronet(daddr))
 		goto martian_destination;
 
-	if (likely(!IN_DEV_ROUTE_LOCALNET(in_dev))) {
-		if (ipv4_is_loopback(daddr))
-			goto martian_destination;
+	if (ipv4_is_loopback(daddr) && !IN_DEV_NET_ROUTE_LOCALNET(in_dev, net))
+		goto martian_destination;
 
-		if (ipv4_is_loopback(saddr))
-			goto martian_source;
-	}
+	if (ipv4_is_loopback(saddr) && !IN_DEV_NET_ROUTE_LOCALNET(in_dev, net))
+		goto martian_source;
 
 	/*
 	 *	Now we are ready to route packet.