Patchwork [net-next,3/4] net: avoid false perf interpretations in frag code

login
register
mail settings
Submitter Jesper Dangaard Brouer
Date April 24, 2013, 3:48 p.m.
Message ID <20130424154836.16883.79599.stgit@dragon>
Download mbox | patch
Permalink /patch/239242/
State Changes Requested
Delegated to: David Miller
Headers show

Comments

Jesper Dangaard Brouer - April 24, 2013, 3:48 p.m.
The compiler make us misinterpret performance issues in the frag code,
because its auto inlining functions.  Lets instead do explicit
inlining to make this situation obvious to the programmer.

The function inet_frag_find() get the perf blame, because auto
inlining of the functions inet_frag_create(), inet_frag_alloc() and
inet_frag_intern().

My solution is to explicit inline inet_frag_alloc() and
inet_frag_intern(), but explicitly "noinline" inet_frag_create(),
in-order to make it explicit to the performance engineer, that
creation phase is a bottleneck. Then, when reading the code the
programmer should notice the inline, and see the bottleneck is really
located in inet_frag_intern().

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---

 net/ipv4/inet_fragment.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - April 24, 2013, 11:48 p.m.
On Wed, 2013-04-24 at 17:48 +0200, Jesper Dangaard Brouer wrote:
> The compiler make us misinterpret performance issues in the frag code,
> because its auto inlining functions.  Lets instead do explicit
> inlining to make this situation obvious to the programmer.
> 
> The function inet_frag_find() get the perf blame, because auto
> inlining of the functions inet_frag_create(), inet_frag_alloc() and
> inet_frag_intern().
> 
> My solution is to explicit inline inet_frag_alloc() and
> inet_frag_intern(), but explicitly "noinline" inet_frag_create(),
> in-order to make it explicit to the performance engineer, that
> creation phase is a bottleneck. Then, when reading the code the
> programmer should notice the inline, and see the bottleneck is really
> located in inet_frag_intern().
> 
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---

There is no way we add inline/noinline attributes to help developers to
use performance tools.

noinline can make sense when we want to avoid consuming too much stack
space, and in this case we use the explicit noinline_for_stack.

Another case would be when we know a particular function is called on
very rare occasions, and we want to avoid compiler being smart and
inline the function in the caller.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - April 24, 2013, 11:54 p.m.
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 24 Apr 2013 16:48:27 -0700

> noinline can make sense when we want to avoid consuming too much stack
> space, and in this case we use the explicit noinline_for_stack.
> 
> Another case would be when we know a particular function is called on
> very rare occasions, and we want to avoid compiler being smart and
> inline the function in the caller.

Yet another case is where we must force the function in a special
section, f.e. __kprobes.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jesper Dangaard Brouer - April 25, 2013, 10:57 a.m.
On Wed, 2013-04-24 at 16:48 -0700, Eric Dumazet wrote:
> On Wed, 2013-04-24 at 17:48 +0200, Jesper Dangaard Brouer wrote:
> > The compiler make us misinterpret performance issues in the frag code,
> > because its auto inlining functions.  Lets instead do explicit
> > inlining to make this situation obvious to the programmer.
> > 
> > The function inet_frag_find() get the perf blame, because auto
> > inlining of the functions inet_frag_create(), inet_frag_alloc() and
> > inet_frag_intern().
> > 
> > My solution is to explicit inline inet_frag_alloc() and
> > inet_frag_intern(), but explicitly "noinline" inet_frag_create(),
> > in-order to make it explicit to the performance engineer, that
> > creation phase is a bottleneck. Then, when reading the code the
> > programmer should notice the inline, and see the bottleneck is really
> > located in inet_frag_intern().
> > 
> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> > ---
> 
> There is no way we add inline/noinline attributes to help developers to
> use performance tools.

I take that as a NACK

Would we add the "inlines" only to the code, to make it clear what is
happening in the code?


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - April 25, 2013, 7:13 p.m.
From: Jesper Dangaard Brouer <brouer@redhat.com>
Date: Thu, 25 Apr 2013 12:57:08 +0200

> Would we add the "inlines" only to the code, to make it clear what is
> happening in the code?

Never add inline in *.c code, in *.h headers, let the compiler sort it out.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index cabe3d7..db30a01 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -240,7 +240,7 @@  int inet_frag_evictor(struct netns_frags *nf, struct inet_frags *f, bool force)
 }
 EXPORT_SYMBOL(inet_frag_evictor);
 
-static struct inet_frag_queue *inet_frag_intern(struct netns_frags *nf,
+static inline struct inet_frag_queue *inet_frag_intern(struct netns_frags *nf,
 		struct inet_frag_queue *qp_in, struct inet_frags *f,
 		void *arg)
 {
@@ -288,7 +288,7 @@  static struct inet_frag_queue *inet_frag_intern(struct netns_frags *nf,
 	return qp;
 }
 
-static struct inet_frag_queue *inet_frag_alloc(struct netns_frags *nf,
+static inline struct inet_frag_queue *inet_frag_alloc(struct netns_frags *nf,
 		struct inet_frags *f, void *arg)
 {
 	struct inet_frag_queue *q;
@@ -308,7 +308,7 @@  static struct inet_frag_queue *inet_frag_alloc(struct netns_frags *nf,
 	return q;
 }
 
-static struct inet_frag_queue *inet_frag_create(struct netns_frags *nf,
+static noinline struct inet_frag_queue *inet_frag_create(struct netns_frags *nf,
 		struct inet_frags *f, void *arg)
 {
 	struct inet_frag_queue *q;