
[U-Boot,v2,1/5] string: Provide a slimmed-down memset()

Message ID 20170402155032.27473-2-sjg@chromium.org
State Accepted
Delegated to: Simon Glass

Commit Message

Simon Glass April 2, 2017, 3:50 p.m. UTC
Most of the time the optimised memset() is what we want. For extreme
situations such as TPL it may be too large. For example on the 'rock'
board, using a simple loop saves a useful 48 bytes. With gcc 4.9 and
the rodata bug, this patch is enough to reduce the TPL image below the
limit.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

Changes in v2:
- Adjust the option to be SPL-only
- Change the option to default to off (name it CONFIG_SPL_TINY_MEMSET)

 lib/Kconfig  | 8 ++++++++
 lib/string.c | 6 ++++--
 2 files changed, 12 insertions(+), 2 deletions(-)
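
A note on naming: the Kconfig symbol is SPL_TINY_MEMSET, while the code
tests CONFIG_IS_ENABLED(TINY_MEMSET). In U-Boot, CONFIG_IS_ENABLED() looks
up the SPL-prefixed symbol when building SPL and the unprefixed one
otherwise. The sketch below is a deliberately simplified model of that
behaviour, not the real macro; every SKETCH_* name is invented for
illustration.

```c
/*
 * Simplified, hypothetical model of U-Boot's CONFIG_IS_ENABLED().
 * All SKETCH_* names are assumptions made up for this illustration.
 */

/* Pretend this translation unit is being built for SPL ... */
#define SKETCH_SPL_BUILD 1
/* ... and that the board configuration turned the new option on. */
#define SKETCH_CONFIG_SPL_TINY_MEMSET 1

#ifdef SKETCH_SPL_BUILD
/* SPL build: look up the SPL-prefixed symbol */
#define SKETCH_IS_ENABLED(option) (SKETCH_CONFIG_SPL_##option)
#else
/* Full U-Boot build: look up the unprefixed symbol */
#define SKETCH_IS_ENABLED(option) (SKETCH_CONFIG_##option)
#endif

/* Returns 1 when only the byte loop would be compiled in, else 0. */
static int sketch_uses_tiny_memset(void)
{
#if !SKETCH_IS_ENABLED(TINY_MEMSET)
	return 0;	/* word-at-a-time path is kept */
#else
	return 1;	/* only the byte loop remains */
#endif
}
```

Since no unprefixed SKETCH_CONFIG_TINY_MEMSET exists in this model,
removing the SKETCH_SPL_BUILD define makes the test evaluate to 0 and the
word-at-a-time path stays, which mirrors the "SPL-only" change in v2.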

Comments

Heiko Stuebner April 4, 2017, 9:38 a.m. UTC | #1
On Sunday, 2 April 2017, 09:50:28 CEST, Simon Glass wrote:
> Most of the time the optimised memset() is what we want. For extreme
> situations such as TPL it may be too large. For example on the 'rock'
> board, using a simple loop saves a useful 48 bytes. With gcc 4.9 and
> the rodata bug, this patch is enough to reduce the TPL image below the
> limit.
> 
> Signed-off-by: Simon Glass <sjg@chromium.org>
> ---
> 
> Changes in v2:
> - Adjust the option to be SPL-only
> - Change the option to default to off (name it CONFIG_SPL_TINY_MEMSET)
> 
>  lib/Kconfig  | 8 ++++++++
>  lib/string.c | 6 ++++--
>  2 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/Kconfig b/lib/Kconfig
> index 65c01573e1..58b5717dcd 100644
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -52,6 +52,14 @@ config LIB_RAND
>  	help
>  	  This library provides pseudo-random number generator functions.
> 
> +config SPL_TINY_MEMSET
> +	bool "Use a very small memset() in SPL"
> +	help
> +	  The faster memset() is the arch-specific one (if available) enabled
> +	  by CONFIG_USE_ARCH_MEMSET. If that is not enabled, we can still get
> +	  better performance by write a word at a time. Enable this option
> +	  to reduce code size slightly at the cost of some speed.

Wording sounds off, I guess we could do something like

[...better performance by] writing a word at a time. In very size-constrained
environments even this may be too big though. [Enable this option...]

Otherwise
Reviewed-by: Heiko Stuebner <heiko@sntech.de>

> +
>  source lib/dhry/Kconfig
> 
>  source lib/rsa/Kconfig
> diff --git a/lib/string.c b/lib/string.c
> index 67d5f6a421..c1a28c14ce 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -437,8 +437,10 @@ char *strswab(const char *s)
>  void * memset(void * s,int c,size_t count)
>  {
>  	unsigned long *sl = (unsigned long *) s;
> -	unsigned long cl = 0;
>  	char *s8;
> +
> +#if !CONFIG_IS_ENABLED(TINY_MEMSET)
> +	unsigned long cl = 0;
>  	int i;
> 
>  	/* do it one word at a time (32 bits or 64 bits) while possible */
> @@ -452,7 +454,7 @@ void * memset(void * s,int c,size_t count)
>  			count -= sizeof(*sl);
>  		}
>  	}
> -	/* fill 8 bits at a time */
> +#endif	/* fill 8 bits at a time */
>  	s8 = (char *)sl;
>  	while (count--)
>  		*s8++ = c;

Simon Glass April 5, 2017, 1:05 a.m. UTC | #2
On 4 April 2017 at 03:38, Heiko Stübner <heiko@sntech.de> wrote:
>
> On Sunday, 2 April 2017, 09:50:28 CEST, Simon Glass wrote:
> > Most of the time the optimised memset() is what we want. For extreme
> > situations such as TPL it may be too large. For example on the 'rock'
> > board, using a simple loop saves a useful 48 bytes. With gcc 4.9 and
> > the rodata bug, this patch is enough to reduce the TPL image below the
> > limit.
> >
> > Signed-off-by: Simon Glass <sjg@chromium.org>
> > ---
> >
> > Changes in v2:
> > - Adjust the option to be SPL-only
> > - Change the option to default to off (name it CONFIG_SPL_TINY_MEMSET)
> >
> >  lib/Kconfig  | 8 ++++++++
> >  lib/string.c | 6 ++++--
> >  2 files changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/Kconfig b/lib/Kconfig
> > index 65c01573e1..58b5717dcd 100644
> > --- a/lib/Kconfig
> > +++ b/lib/Kconfig
> > @@ -52,6 +52,14 @@ config LIB_RAND
> >       help
> >         This library provides pseudo-random number generator functions.
> >
> > +config SPL_TINY_MEMSET
> > +     bool "Use a very small memset() in SPL"
> > +     help
> > +       The faster memset() is the arch-specific one (if available) enabled
> > +       by CONFIG_USE_ARCH_MEMSET. If that is not enabled, we can still get
> > +       better performance by write a word at a time. Enable this option
> > +       to reduce code size slightly at the cost of some speed.
>
> Wording sounds off, I guess we could do something like
>
> [...better performance by] writing a word at a time. In very size-constrained
> environments even this may be too big though. [Enable this option...]
>
> Otherwise
> Reviewed-by: Heiko Stuebner <heiko@sntech.de>

I am going to apply this one now and leave the rest of the series
until it has had a bit more review. But this one is needed for me to
enable the rock board.

Fixed this and:

Applied to u-boot-rockchip

Patch

diff --git a/lib/Kconfig b/lib/Kconfig
index 65c01573e1..58b5717dcd 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -52,6 +52,14 @@  config LIB_RAND
 	help
 	  This library provides pseudo-random number generator functions.
 
+config SPL_TINY_MEMSET
+	bool "Use a very small memset() in SPL"
+	help
+	  The faster memset() is the arch-specific one (if available) enabled
+	  by CONFIG_USE_ARCH_MEMSET. If that is not enabled, we can still get
+	  better performance by write a word at a time. Enable this option
+	  to reduce code size slightly at the cost of some speed.
+
 source lib/dhry/Kconfig
 
 source lib/rsa/Kconfig
diff --git a/lib/string.c b/lib/string.c
index 67d5f6a421..c1a28c14ce 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -437,8 +437,10 @@  char *strswab(const char *s)
 void * memset(void * s,int c,size_t count)
 {
 	unsigned long *sl = (unsigned long *) s;
-	unsigned long cl = 0;
 	char *s8;
+
+#if !CONFIG_IS_ENABLED(TINY_MEMSET)
+	unsigned long cl = 0;
 	int i;
 
 	/* do it one word at a time (32 bits or 64 bits) while possible */
@@ -452,7 +454,7 @@  void * memset(void * s,int c,size_t count)
 			count -= sizeof(*sl);
 		}
 	}
-	/* fill 8 bits at a time */
+#endif	/* fill 8 bits at a time */
 	s8 = (char *)sl;
 	while (count--)
 		*s8++ = c;
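
Read together, the two lib/string.c hunks leave memset() in one of two
shapes: the word-at-a-time fill followed by a byte loop, or the byte loop
alone. The sketch below shows both side by side as standalone functions.
The names memset_words() and memset_tiny() are hypothetical, and the
alignment check and word-building loop are a reconstruction of the part
the hunk elides, so treat them as an assumption rather than the exact
U-Boot code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the tiny variant: a plain byte loop. */
static void *memset_tiny(void *s, int c, size_t count)
{
	unsigned char *s8 = s;

	while (count--)
		*s8++ = (unsigned char)c;

	return s;
}

/*
 * Hypothetical stand-in for the default variant: whole words first,
 * then the remaining bytes. Like the original, it writes through an
 * unsigned long pointer, so it assumes a build where that is acceptable
 * for the buffer in question.
 */
static void *memset_words(void *s, int c, size_t count)
{
	unsigned long *sl = s;
	unsigned long cl = 0;
	unsigned char *s8;
	size_t i;

	/* Only take the word path when the start address is word-aligned */
	if (((uintptr_t)s & (sizeof(*sl) - 1)) == 0) {
		/* Replicate the fill byte into every byte of one word */
		for (i = 0; i < sizeof(*sl); i++) {
			cl <<= 8;
			cl |= c & 0xff;
		}
		while (count >= sizeof(*sl)) {
			*sl++ = cl;
			count -= sizeof(*sl);
		}
	}

	/* Fill whatever is left 8 bits at a time */
	s8 = (unsigned char *)sl;
	while (count--)
		*s8++ = (unsigned char)c;

	return s;
}
```

Both functions produce the same buffer contents; the tiny one simply drops
the alignment test, the word-building loop and the word store loop, which
is where the size saving comes from.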