[U-Boot,v2] string: Provide a slimmed-down memset()

Message ID 4143670.kux4ZBcVD6@phil
State Superseded
Delegated to: Simon Glass

Commit Message

Heiko Stuebner March 30, 2017, 11:14 a.m. UTC
Most of the time the optimised memset() is what we want. For extreme
situations such as TPL it may be too large. For example on the 'rock'
board, using a simple loop saves a useful 48 bytes. With gcc 4.9 and
the rodata bug, this patch is enough to reduce the TPL image below the
limit.

Signed-off-by: Simon Glass <sjg@chromium.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
---
Hi Simon,

this is a bit of a bikeshed, but would it make more sense to structure
the options as below? That matches USE_ARCH_MEMSET and makes the intent
more visible, as you get
    USE_ARCH_MEMSET=y  = biggest but also fastest
    (nothing)          = default from libgeneric
    USE_TINY_MEMSET=y  = optimize for size over speed

It might also make defconfigs easier to read, as you would have
    CONFIG_USE_TINY_MEMSET=y
instead of
    # CONFIG_FAST_MEMSET is not set
when needing that option.
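
To make the size/speed trade-off concrete, here is a minimal standalone
sketch of the two generic behaviours. It is not the lib/string.c code:
the names tiny_memset(), word_memset() and variants_agree() are made up
for illustration, and the word-wise version only mirrors the general
shape of the libgeneric loop (fill whole words while the pointer is
aligned, then finish byte by byte).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "tiny" variant: one byte per iteration, smallest code. */
void *tiny_memset(void *s, int c, size_t count)
{
	unsigned char *s8 = s;

	while (count--)
		*s8++ = (unsigned char)c;

	return s;
}

/*
 * Hypothetical generic variant: write a word at a time when the start
 * address is word-aligned, then fill the remaining tail byte by byte.
 */
void *word_memset(void *s, int c, size_t count)
{
	unsigned long *sl = s;
	unsigned long cl = 0;
	unsigned char *s8;
	size_t i;

	if (((uintptr_t)s & (sizeof(*sl) - 1)) == 0) {
		/* replicate the fill byte into every byte of one word */
		for (i = 0; i < sizeof(*sl); i++) {
			cl <<= 8;
			cl |= c & 0xff;
		}
		while (count >= sizeof(*sl)) {
			*sl++ = cl;
			count -= sizeof(*sl);
		}
	}

	/* fill 8 bits at a time (the whole buffer if it was unaligned) */
	s8 = (unsigned char *)sl;
	while (count--)
		*s8++ = (unsigned char)c;

	return s;
}

/* Returns 1 if both variants leave identical buffer contents. */
int variants_agree(void)
{
	unsigned long a[5] = { 0 }, b[5] = { 0 };
	size_t len = sizeof(a) - 3;	/* forces a byte-wise tail */

	tiny_memset(a, 0xa5, len);
	word_memset(b, 0xa5, len);

	return memcmp(a, b, sizeof(a)) == 0;
}
```

Both functions produce the same bytes; the tiny one simply drops the
alignment check and the word loop, which is where the size saving would
come from.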

Anyway, I've now tested both variants on a live rk3188-rock and
everything still works, even when built with gcc-4.9, so for both
variants:
Tested-by: Heiko Stuebner <heiko@sntech.de>


Heiko


 lib/Kconfig  | 20 ++++++++++++++++++++
 lib/string.c |  5 ++++-
 2 files changed, 24 insertions(+), 1 deletion(-)

Patch

diff --git a/lib/Kconfig b/lib/Kconfig
index 65c01573e1..ab42413839 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -52,6 +52,26 @@  config LIB_RAND
 	help
 	  This library provides pseudo-random number generator functions.
 
+config USE_TINY_MEMSET
+	bool "Use a size-optimized memset()"
+	help
+	  This makes memset prefer code size over speed optimizations.
+	  The fastest memset() is the arch-specific one (if available) enabled
+	  by CONFIG_USE_ARCH_MEMSET. If that is not enabled, we can still get
+	  better performance by writing a word at a time at the cost of
+	  slightly bigger memset code, but in some special cases size might
+	  be more important than speed.
+
+config SPL_USE_TINY_MEMSET
+	bool "Use a size-optimized memset() in SPL"
+	help
+	  This makes memset prefer code size over speed optimizations.
+	  The fastest memset() is the arch-specific one (if available) enabled
+	  by CONFIG_USE_ARCH_MEMSET. If that is not enabled, we can still get
+	  better performance by writing a word at a time at the cost of
+	  slightly bigger memset code, but in some special cases size might
+	  be more important than speed.
+
 source lib/dhry/Kconfig
 
 source lib/rsa/Kconfig
diff --git a/lib/string.c b/lib/string.c
index 67d5f6a421..edae997fa6 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -437,8 +437,10 @@  char *strswab(const char *s)
 void * memset(void * s,int c,size_t count)
 {
 	unsigned long *sl = (unsigned long *) s;
-	unsigned long cl = 0;
 	char *s8;
+
+#if !CONFIG_IS_ENABLED(USE_TINY_MEMSET)
+	unsigned long cl = 0;
 	int i;
 
 	/* do it one word at a time (32 bits or 64 bits) while possible */
@@ -452,6 +454,7 @@  void * memset(void * s,int c,size_t count)
 			count -= sizeof(*sl);
 		}
 	}
+#endif
 	/* fill 8 bits at a time */
 	s8 = (char *)sl;
 	while (count--)