diff mbox

xfrm: optimize ipv4 selector matching

Message ID 20111122144334.GA22824@p183.telecom.by
State Superseded, archived
Delegated to: David Miller
Headers show

Commit Message

Alexey Dobriyan Nov. 22, 2011, 2:43 p.m. UTC
Current addr_match() is errh, under-optimized.

Compiler doesn't know that memcmp() branch doesn't trigger for IPv4.
Pass addresses by value.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 include/net/xfrm.h     |   10 ++++++++++
 net/xfrm/xfrm_policy.c |    4 ++--
 2 files changed, 12 insertions(+), 2 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Laight Nov. 22, 2011, 2:50 p.m. UTC | #1
> +static inline int addr4_match(const __be32 a1, const __be32 
> a2, const u8 prefixlen)
> +{
> +	/* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
> +	if (prefixlen == 0) {
> +		/* Matching constants result in smaller assembly. */
> +		return 0xFFFFFFFFu;
> +	}
> +	return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
> +}
> +

It would probably be clearer to 'return 1' when prefixlen is zero.

If this is a common path, might be worth caching
    htonl(0xFFFFFFFFu << (32 - prefixlen))
in the enclosing structure.

	David


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexey Dobriyan Nov. 22, 2011, 2:59 p.m. UTC | #2
On Tue, Nov 22, 2011 at 02:50:59PM -0000, David Laight wrote:
>  
> > +static inline int addr4_match(const __be32 a1, const __be32 
> > a2, const u8 prefixlen)
> > +{
> > +	/* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
> > +	if (prefixlen == 0) {
> > +		/* Matching constants result in smaller assembly. */
> > +		return 0xFFFFFFFFu;
> > +	}
> > +	return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
> > +}
> > +
> 
> It would probably be clearer to 'return 1' when prefixlen is zero.

"return 1" results in bigger code.
This function used only in boolean context, so exact return value doesn't matter.

> If this is a common path, might be worth caching
>     htonl(0xFFFFFFFFu << (32 - prefixlen))
> in the enclosing structure.

This means one more branch, which wouldn't be a win compared
to current code.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Nov. 22, 2011, 3:10 p.m. UTC | #3
Le mardi 22 novembre 2011 à 17:59 +0300, Alexey Dobriyan a écrit :
> On Tue, Nov 22, 2011 at 02:50:59PM -0000, David Laight wrote:
> >  
> > > +static inline int addr4_match(const __be32 a1, const __be32 
> > > a2, const u8 prefixlen)
> > > +{
> > > +	/* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
> > > +	if (prefixlen == 0) {
> > > +		/* Matching constants result in smaller assembly. */
> > > +		return 0xFFFFFFFFu;
> > > +	}
> > > +	return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
> > > +}
> > > +
> > 
> > It would probably be clearer to 'return 1' when prefixlen is zero.
> 
> "return 1" results in bigger code.
> This function used only in boolean context, so exact return value doesn't matter.


Thats too ugly, sorry.

Also, using "const" attributes on integral types (not pointers) make
code less clear. Caller doesnt care if the implementation wants to
change a1 or a2 or prefixlen.

Please use :

+static inline bool addr4_match(__be32 a1, __be32 a2, u8 prefixlen)
+{
+       /* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
+       if (prefixlen == 0)
+               return true;
+       return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
+}


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Laight Nov. 22, 2011, 3:15 p.m. UTC | #4
> Please use :
> 
> +static inline bool addr4_match(__be32 a1, __be32 a2, u8 prefixlen)
> +{
> +       /* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
> +       if (prefixlen == 0)
> +               return true;
> +       return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
> +}

I'm not sure I'd agree about using 'u8'.
It may well cause an unnecessary mask with 0xff.

	David


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexey Dobriyan Nov. 22, 2011, 3:41 p.m. UTC | #5
On Tue, Nov 22, 2011 at 03:15:25PM -0000, David Laight wrote:
>  
> > Please use :
> > 
> > +static inline bool addr4_match(__be32 a1, __be32 a2, u8 prefixlen)
> > +{
> > +       /* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
> > +       if (prefixlen == 0)
> > +               return true;
> > +       return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
> > +}
> 
> I'm not sure I'd agree about using 'u8'.
> It may well cause an unnecessary mask with 0xff.

It's u8 in all other places.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexey Dobriyan Nov. 22, 2011, 3:47 p.m. UTC | #6
On Tue, Nov 22, 2011 at 04:10:52PM +0100, Eric Dumazet wrote:
> Le mardi 22 novembre 2011 à 17:59 +0300, Alexey Dobriyan a écrit :
> > On Tue, Nov 22, 2011 at 02:50:59PM -0000, David Laight wrote:
> > >  
> > > > +static inline int addr4_match(const __be32 a1, const __be32 
> > > > a2, const u8 prefixlen)
> > > > +{
> > > > +	/* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
> > > > +	if (prefixlen == 0) {
> > > > +		/* Matching constants result in smaller assembly. */
> > > > +		return 0xFFFFFFFFu;
> > > > +	}
> > > > +	return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
> > > > +}
> > > > +
> > > 
> > > It would probably be clearer to 'return 1' when prefixlen is zero.
> > 
> > "return 1" results in bigger code.
> > This function used only in boolean context, so exact return value doesn't matter.
> 
> 
> Thats too ugly, sorry.
> 
> Also, using "const" attributes on integral types (not pointers) make
> code less clear.

No. it doesn't.
It signals to programmer that functions doesn't change its arguments.

> Caller doesnt care if the implementation wants to
> change a1 or a2 or prefixlen.

const is never about caller.
Millions of const qualifiers were injected already, so who cares.

> Please use :
> 
> +static inline bool addr4_match(__be32 a1, __be32 a2, u8 prefixlen)
> +{
> +       /* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
> +       if (prefixlen == 0)
> +               return true;
> +       return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
> +}

hmm, bool results in smaller code for -Os, so I'll resend.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -827,6 +827,16 @@  static inline bool addr_match(const void *token1, const void *token2,
 	return true;
 }
 
+static inline int addr4_match(const __be32 a1, const __be32 a2, const u8 prefixlen)
+{
+	/* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
+	if (prefixlen == 0) {
+		/* Matching constants result in smaller assembly. */
+		return 0xFFFFFFFFu;
+	}
+	return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
+}
+
 static __inline__
 __be16 xfrm_flowi_sport(const struct flowi *fl, const union flowi_uli *uli)
 {
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -61,8 +61,8 @@  __xfrm4_selector_match(const struct xfrm_selector *sel, const struct flowi *fl)
 {
 	const struct flowi4 *fl4 = &fl->u.ip4;
 
-	return  addr_match(&fl4->daddr, &sel->daddr, sel->prefixlen_d) &&
-		addr_match(&fl4->saddr, &sel->saddr, sel->prefixlen_s) &&
+	return  addr4_match(fl4->daddr, sel->daddr.a4, sel->prefixlen_d) &&
+		addr4_match(fl4->saddr, sel->saddr.a4, sel->prefixlen_s) &&
 		!((xfrm_flowi_dport(fl, &fl4->uli) ^ sel->dport) & sel->dport_mask) &&
 		!((xfrm_flowi_sport(fl, &fl4->uli) ^ sel->sport) & sel->sport_mask) &&
 		(fl4->flowi4_proto == sel->proto || !sel->proto) &&