diff mbox series

powerpc: reduce number of WATCHDOG_RESET calls from flush_cache

Message ID 20200604093039.26781-1-rasmus.villemoes@prevas.dk
State New
Delegated to: Priyanka Jain
Headers show
Series powerpc: reduce number of WATCHDOG_RESET calls from flush_cache | expand

Commit Message

Rasmus Villemoes June 4, 2020, 9:30 a.m. UTC
Calling WATCHDOG_RESET for each and every cache line is overkill.

In our case, the kernel image is a little over 7MB, and the almost
500000 calls of WATCHDOG_RESET() adds about one second to the
boottime.

I very highly doubt there's any real hardware where flushing 64K
from cache to memory takes more than a few milliseconds, so this
should be completely safe. Since it reduces the number of
WATCHDOG_RESET() calls by roughly a factor of 1000, the overhead from
those is practically eliminated. (Just in case the range flushed is so
small that it doesn't cross a 64K boundary, add a single
WATCHDOG_RESET() between the loops).

64K is chosen because that's also the default chunk size used by the
hashing algorithms, and when, say, a sha256 digest of a kernel image of
a few MB is being verified, that's almost guaranteed to be cache-cold,
so apart from the computations being done, the hashing is also bounded
by memory speed - so if 64K works for those cases, it should certainly
also work when memory access is the only thing being done.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 arch/powerpc/lib/cache.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

Stefan Roese June 4, 2020, 10:31 a.m. UTC | #1
On 04.06.20 11:30, Rasmus Villemoes wrote:
> Calling WATCHDOG_RESET for each and every cache line is overkill.
> 
> In our case, the kernel image is a little over 7MB, and the almost
> 500000 calls of WATCHDOG_RESET() adds about one second to the
> boottime.
> 
> I very highly doubt there's any real hardware where flushing 64K
> from cache to memory takes more than a few milliseconds, so this
> should be completely safe. Since it reduces the number of
> WATCHDOG_RESET() calls by roughly a factor of 1000, the overhead from
> those is practically eliminated. (Just in case the range flushed is so
> small that it doesn't cross a 64K boundary, add a single
> WATCHDOG_RESET() between the loops).
> 
> 64K is chosen because that's also the default chunk size used by the
> hashing algorithms, and when, say, a sha256 digest of a kernel image of
> a few MB is being verified, that's almost guaranteed to be cache-cold,
> so apart from the computations being done, the hashing is also bounded
> by memory speed - so if 64K works for those cases, it should certainly
> also work when memory access is the only thing being done.
> 
> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>

Reviewed-by: Stefan Roese <sr@denx.de>

Thanks,
Stefan

> ---
>   arch/powerpc/lib/cache.c | 8 ++++++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/lib/cache.c b/arch/powerpc/lib/cache.c
> index 528361e972..df2310f4e2 100644
> --- a/arch/powerpc/lib/cache.c
> +++ b/arch/powerpc/lib/cache.c
> @@ -8,6 +8,7 @@
>   #include <cpu_func.h>
>   #include <asm/cache.h>
>   #include <watchdog.h>
> +#include <linux/sizes.h>
>   
>   void flush_cache(ulong start_addr, ulong size)
>   {
> @@ -21,15 +22,18 @@ void flush_cache(ulong start_addr, ulong size)
>   	for (addr = start; (addr <= end) && (addr >= start);
>   			addr += CONFIG_SYS_CACHELINE_SIZE) {
>   		asm volatile("dcbst 0,%0" : : "r" (addr) : "memory");
> -		WATCHDOG_RESET();
> +		if ((addr & (SZ_64K - 1)) == 0)
> +			WATCHDOG_RESET();
>   	}
>   	/* wait for all dcbst to complete on bus */
>   	asm volatile("sync" : : : "memory");
> +	WATCHDOG_RESET();
>   
>   	for (addr = start; (addr <= end) && (addr >= start);
>   			addr += CONFIG_SYS_CACHELINE_SIZE) {
>   		asm volatile("icbi 0,%0" : : "r" (addr) : "memory");
> -		WATCHDOG_RESET();
> +		if ((addr & (SZ_64K - 1)) == 0)
> +			WATCHDOG_RESET();
>   	}
>   	asm volatile("sync" : : : "memory");
>   	/* flush prefetch queue */
>
diff mbox series

Patch

diff --git a/arch/powerpc/lib/cache.c b/arch/powerpc/lib/cache.c
index 528361e972..df2310f4e2 100644
--- a/arch/powerpc/lib/cache.c
+++ b/arch/powerpc/lib/cache.c
@@ -8,6 +8,7 @@ 
 #include <cpu_func.h>
 #include <asm/cache.h>
 #include <watchdog.h>
+#include <linux/sizes.h>
 
 void flush_cache(ulong start_addr, ulong size)
 {
@@ -21,15 +22,18 @@  void flush_cache(ulong start_addr, ulong size)
 	for (addr = start; (addr <= end) && (addr >= start);
 			addr += CONFIG_SYS_CACHELINE_SIZE) {
 		asm volatile("dcbst 0,%0" : : "r" (addr) : "memory");
-		WATCHDOG_RESET();
+		if ((addr & (SZ_64K - 1)) == 0)
+			WATCHDOG_RESET();
 	}
 	/* wait for all dcbst to complete on bus */
 	asm volatile("sync" : : : "memory");
+	WATCHDOG_RESET();
 
 	for (addr = start; (addr <= end) && (addr >= start);
 			addr += CONFIG_SYS_CACHELINE_SIZE) {
 		asm volatile("icbi 0,%0" : : "r" (addr) : "memory");
-		WATCHDOG_RESET();
+		if ((addr & (SZ_64K - 1)) == 0)
+			WATCHDOG_RESET();
 	}
 	asm volatile("sync" : : : "memory");
 	/* flush prefetch queue */