Patchwork vsprintf.c: Use noinline_for_stack

login
register
mail settings
Submitter Joe Perches
Date March 7, 2010, 2:33 a.m.
Message ID <1267929215.1937.18.camel@Joe-Laptop.home>
Download mbox | patch
Permalink /patch/47065/
State Not Applicable
Delegated to: David Miller
Headers show

Comments

Joe Perches - March 7, 2010, 2:33 a.m.
On Sat, 2010-03-06 at 18:03 -0800, Linus Torvalds wrote:
> A few noinlines might be appropriate.

Mark static functions with noinline_for_stack

Signed-off-by: Joe Perches <joe@perches.com>
---
 lib/vsprintf.c |   67 +++++++++++++++++++++++++++++++++++---------------------
 1 files changed, 42 insertions(+), 25 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Joe Perches - March 8, 2010, 11:39 p.m.
On Sat, 2010-03-06 at 18:33 -0800, Joe Perches wrote:
> On Sat, 2010-03-06 at 18:03 -0800, Linus Torvalds wrote:
> > A few noinlines might be appropriate.
> Mark static functions with noinline_for_stack

It's fine by me that the vsnprintf recursion and
(pr|dev|netdev)_<level> text reduction patches didn't make .34.

I'd like to know what you think necessary to get them into .35.

Perhaps it'd be useful if they could go into -next as-is for a
bit of wider testing.

http://patchwork.kernel.org/patch/83940/
http://patchwork.kernel.org/patch/83724/
http://patchwork.kernel.org/patch/83726/
http://patchwork.kernel.org/patch/83725/


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrew Morton - March 13, 2010, 12:25 a.m.
On Sat, 06 Mar 2010 18:33:35 -0800
Joe Perches <joe@perches.com> wrote:

> On Sat, 2010-03-06 at 18:03 -0800, Linus Torvalds wrote:
> > A few noinlines might be appropriate.
> 
> Mark static functions with noinline_for_stack
> 

-ENOTESTINGRESULTS.



Before:

akpm:/usr/src/25> objdump -d lib/vsprintf.o | perl scripts/checkstack.pl
0x00000e82 pointer [vsprintf.o]:                        344
0x0000198c pointer [vsprintf.o]:                        344
0x000025d6 scnprintf [vsprintf.o]:                      216
0x00002648 scnprintf [vsprintf.o]:                      216
0x00002565 snprintf [vsprintf.o]:                       208
0x0000267c sprintf [vsprintf.o]:                        208
0x000030a3 bprintf [vsprintf.o]:                        208
0x00003b1e sscanf [vsprintf.o]:                         208
0x00000608 number [vsprintf.o]:                         136
0x00000937 number [vsprintf.o]:                         136

After:

akpm:/usr/src/25> objdump -d lib/vsprintf.o | perl scripts/checkstack.pl  
0x00000a7c symbol_string [vsprintf.o]:                  248
0x00000ae8 symbol_string [vsprintf.o]:                  248
0x00002310 scnprintf [vsprintf.o]:                      216
0x00002382 scnprintf [vsprintf.o]:                      216
0x0000229f snprintf [vsprintf.o]:                       208
0x000023b6 sprintf [vsprintf.o]:                        208
0x00002ddd bprintf [vsprintf.o]:                        208
0x00003858 sscanf [vsprintf.o]:                         208
0x00000625 number [vsprintf.o]:                         136
0x00000954 number [vsprintf.o]:                         136

nice.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linus Torvalds - March 13, 2010, 3:35 p.m.
On Fri, 12 Mar 2010, Andrew Morton wrote:
> 
> -ENOTESTINGRESULTS.
> 
> Before:
> 
> akpm:/usr/src/25> objdump -d lib/vsprintf.o | perl scripts/checkstack.pl
> 0x00000e82 pointer [vsprintf.o]:                        344
> 0x0000198c pointer [vsprintf.o]:                        344
> 0x000025d6 scnprintf [vsprintf.o]:                      216
> 0x00002648 scnprintf [vsprintf.o]:                      216
> 0x00002565 snprintf [vsprintf.o]:                       208
> 0x0000267c sprintf [vsprintf.o]:                        208
> 0x000030a3 bprintf [vsprintf.o]:                        208
> 0x00003b1e sscanf [vsprintf.o]:                         208
> 0x00000608 number [vsprintf.o]:                         136
> 0x00000937 number [vsprintf.o]:                         136
> 
> After:
> 
> akpm:/usr/src/25> objdump -d lib/vsprintf.o | perl scripts/checkstack.pl  
> 0x00000a7c symbol_string [vsprintf.o]:                  248
> 0x00000ae8 symbol_string [vsprintf.o]:                  248
> 0x00002310 scnprintf [vsprintf.o]:                      216
> 0x00002382 scnprintf [vsprintf.o]:                      216
> 0x0000229f snprintf [vsprintf.o]:                       208
> 0x000023b6 sprintf [vsprintf.o]:                        208
> 0x00002ddd bprintf [vsprintf.o]:                        208
> 0x00003858 sscanf [vsprintf.o]:                         208
> 0x00000625 number [vsprintf.o]:                         136
> 0x00000954 number [vsprintf.o]:                         136
> 
> nice.

Note that the fact that the numbers are smaller is to some degree less 
important than _where_ the numbers are.

In the "before" side, it's the "pointer()" function that has a big stack 
depth. And the recursion that is going to happen is very much about 
vsnprintf -> pointer -> vsnprintf, so that is bad.

Now it's the new non-inlined leaf functions that still have a big stack 
footprint, and that's much better, because they wouldn't be part of any 
recursive behavior.

Not that I think it's wonderful even now. Especially that whole 
'symbol_string()' thing is not only a big stack user, it ends up calling 
down a fair number of other functions. Non-recursively, but still.

That, in turn, is due to this:

 - include/linux/kallsyms.h:
	#define KSYM_NAME_LEN 128
	#define KSYM_SYMBOL_LEN (sizeof("%s+%#lx/%#lx [%s]") + (KSYM_NAME_LEN - 1) + \

 - symbol_string():
	char sym[KSYM_SYMBOL_LEN];

ie we "need" about 150 bytes for just that silly symbol expansion (rounded 
up etc). Which is ridiculous, since we could/should limit it to something 
sane. But the kallsyms_lookup()/sprint_symbol() functions don't take a 
length parameter, so we have to do the worst-case thing (which itself has 
tons of unnecessary padding).

Gaah. We do _not_ want a kmalloc() or something like that in this path, 
since its' very much used for oopses (which in turn may be due to various 
slab bugs etc).

		Linus
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Joe Perches - March 13, 2010, 5:44 p.m.
On Sat, 2010-03-13 at 07:35 -0800, Linus Torvalds wrote:
> On Fri, 12 Mar 2010, Andrew Morton wrote:
> > nice.
> But the kallsyms_lookup()/sprint_symbol() functions don't take a 
> length parameter, so we have to do the worst-case thing (which itself has 
> tons of unnecessary padding).

I sent a patch once about that using a struct
because I didn't like the unbounded sprint
http://lkml.org/lkml/2009/4/15/16

> Gaah. We do _not_ want a kmalloc() or something like that in this path, 
> since its' very much used for oopses (which in turn may be due to various 
> slab bugs etc).

Perhaps a new snprint_symbol function with the
other kallsyms... functions changed as necessary.

thoughts?

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 0d461c7..e9335a8 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -266,7 +266,8 @@  int strict_strtoll(const char *cp, unsigned int base, long long *res)
 }
 EXPORT_SYMBOL(strict_strtoll);
 
-static int skip_atoi(const char **s)
+static noinline_for_stack
+int skip_atoi(const char **s)
 {
 	int i = 0;
 
@@ -286,7 +287,8 @@  static int skip_atoi(const char **s)
 /* Formats correctly any integer in [0,99999].
  * Outputs from one to five digits depending on input.
  * On i386 gcc 4.1.2 -O2: ~250 bytes of code. */
-static char *put_dec_trunc(char *buf, unsigned q)
+static noinline_for_stack
+char *put_dec_trunc(char *buf, unsigned q)
 {
 	unsigned d3, d2, d1, d0;
 	d1 = (q>>4) & 0xf;
@@ -323,7 +325,8 @@  static char *put_dec_trunc(char *buf, unsigned q)
 	return buf;
 }
 /* Same with if's removed. Always emits five digits */
-static char *put_dec_full(char *buf, unsigned q)
+static noinline_for_stack
+char *put_dec_full(char *buf, unsigned q)
 {
 	/* BTW, if q is in [0,9999], 8-bit ints will be enough, */
 	/* but anyway, gcc produces better code with full-sized ints */
@@ -365,7 +368,8 @@  static char *put_dec_full(char *buf, unsigned q)
 	return buf;
 }
 /* No inlining helps gcc to use registers better */
-static noinline char *put_dec(char *buf, unsigned long long num)
+static noinline_for_stack
+char *put_dec(char *buf, unsigned long long num)
 {
 	while (1) {
 		unsigned rem;
@@ -416,8 +420,9 @@  struct printf_spec {
 	u8	qualifier;
 };
 
-static char *number(char *buf, char *end, unsigned long long num,
-			struct printf_spec spec)
+static noinline_for_stack
+char *number(char *buf, char *end, unsigned long long num,
+	     struct printf_spec spec)
 {
 	/* we are called with base 8, 10 or 16, only, thus don't need "G..."  */
 	static const char digits[16] = "0123456789ABCDEF"; /* "GHIJKLMNOPQRSTUVWXYZ"; */
@@ -536,7 +541,8 @@  static char *number(char *buf, char *end, unsigned long long num,
 	return buf;
 }
 
-static char *string(char *buf, char *end, const char *s, struct printf_spec spec)
+static noinline_for_stack
+char *string(char *buf, char *end, const char *s, struct printf_spec spec)
 {
 	int len, i;
 
@@ -566,8 +572,9 @@  static char *string(char *buf, char *end, const char *s, struct printf_spec spec
 	return buf;
 }
 
-static char *symbol_string(char *buf, char *end, void *ptr,
-				struct printf_spec spec, char ext)
+static noinline_for_stack
+char *symbol_string(char *buf, char *end, void *ptr,
+		    struct printf_spec spec, char ext)
 {
 	unsigned long value = (unsigned long) ptr;
 #ifdef CONFIG_KALLSYMS
@@ -587,8 +594,9 @@  static char *symbol_string(char *buf, char *end, void *ptr,
 #endif
 }
 
-static char *resource_string(char *buf, char *end, struct resource *res,
-				struct printf_spec spec, const char *fmt)
+static noinline_for_stack
+char *resource_string(char *buf, char *end, struct resource *res,
+		      struct printf_spec spec, const char *fmt)
 {
 #ifndef IO_RSRC_PRINTK_SIZE
 #define IO_RSRC_PRINTK_SIZE	6
@@ -678,8 +686,9 @@  static char *resource_string(char *buf, char *end, struct resource *res,
 	return string(buf, end, sym, spec);
 }
 
-static char *mac_address_string(char *buf, char *end, u8 *addr,
-				struct printf_spec spec, const char *fmt)
+static noinline_for_stack
+char *mac_address_string(char *buf, char *end, u8 *addr,
+			 struct printf_spec spec, const char *fmt)
 {
 	char mac_addr[sizeof("xx:xx:xx:xx:xx:xx")];
 	char *p = mac_addr;
@@ -702,7 +711,8 @@  static char *mac_address_string(char *buf, char *end, u8 *addr,
 	return string(buf, end, mac_addr, spec);
 }
 
-static char *ip4_string(char *p, const u8 *addr, const char *fmt)
+static noinline_for_stack
+char *ip4_string(char *p, const u8 *addr, const char *fmt)
 {
 	int i;
 	bool leading_zeros = (fmt[0] == 'i');
@@ -751,7 +761,8 @@  static char *ip4_string(char *p, const u8 *addr, const char *fmt)
 	return p;
 }
 
-static char *ip6_compressed_string(char *p, const char *addr)
+static noinline_for_stack
+char *ip6_compressed_string(char *p, const char *addr)
 {
 	int i, j, range;
 	unsigned char zerolength[8];
@@ -831,7 +842,8 @@  static char *ip6_compressed_string(char *p, const char *addr)
 	return p;
 }
 
-static char *ip6_string(char *p, const char *addr, const char *fmt)
+static noinline_for_stack
+char *ip6_string(char *p, const char *addr, const char *fmt)
 {
 	int i;
 
@@ -846,8 +858,9 @@  static char *ip6_string(char *p, const char *addr, const char *fmt)
 	return p;
 }
 
-static char *ip6_addr_string(char *buf, char *end, const u8 *addr,
-			     struct printf_spec spec, const char *fmt)
+static noinline_for_stack
+char *ip6_addr_string(char *buf, char *end, const u8 *addr,
+		      struct printf_spec spec, const char *fmt)
 {
 	char ip6_addr[sizeof("xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:255.255.255.255")];
 
@@ -859,8 +872,9 @@  static char *ip6_addr_string(char *buf, char *end, const u8 *addr,
 	return string(buf, end, ip6_addr, spec);
 }
 
-static char *ip4_addr_string(char *buf, char *end, const u8 *addr,
-			     struct printf_spec spec, const char *fmt)
+static noinline_for_stack
+char *ip4_addr_string(char *buf, char *end, const u8 *addr,
+		      struct printf_spec spec, const char *fmt)
 {
 	char ip4_addr[sizeof("255.255.255.255")];
 
@@ -869,8 +883,9 @@  static char *ip4_addr_string(char *buf, char *end, const u8 *addr,
 	return string(buf, end, ip4_addr, spec);
 }
 
-static char *uuid_string(char *buf, char *end, const u8 *addr,
-			 struct printf_spec spec, const char *fmt)
+static noinline_for_stack
+char *uuid_string(char *buf, char *end, const u8 *addr,
+		  struct printf_spec spec, const char *fmt)
 {
 	char uuid[sizeof("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")];
 	char *p = uuid;
@@ -958,8 +973,9 @@  static char *uuid_string(char *buf, char *end, const u8 *addr,
  * function pointers are really function descriptors, which contain a
  * pointer to the real address.
  */
-static char *pointer(const char *fmt, char *buf, char *end, void *ptr,
-			struct printf_spec spec)
+static noinline_for_stack
+char *pointer(const char *fmt, char *buf, char *end, void *ptr,
+	      struct printf_spec spec)
 {
 	if (!ptr)
 		return string(buf, end, "(null)", spec);
@@ -1028,7 +1044,8 @@  static char *pointer(const char *fmt, char *buf, char *end, void *ptr,
  * @precision: precision of a number
  * @qualifier: qualifier of a number (long, size_t, ...)
  */
-static int format_decode(const char *fmt, struct printf_spec *spec)
+static noinline_for_stack
+int format_decode(const char *fmt, struct printf_spec *spec)
 {
 	const char *start = fmt;