diff mbox

[U-Boot,v2,4/4] cmd: mem: Use memcpy for 'cp' command

Message ID 1481824813-14432-4-git-send-email-fabio.estevam@nxp.com
State Accepted
Commit c2538421b28424b9705865e838c5fba19c9dc651
Delegated to: Tom Rini
Headers show

Commit Message

Fabio Estevam Dec. 15, 2016, 6 p.m. UTC
Simplify the 'cp' command implementation by using the memcpy() function,
which brings the additional benefit of performance gain for those who have
CONFIG_USE_ARCH_MEMCPY selected.

Tested on a mx6qsabreauto board where a 5x gain in performance is seen
when reading 10MB from the parallel NOR memory.

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
---
Changes since v1:
- Always use memcpy()

 cmd/mem.c | 28 ++--------------------------
 1 file changed, 2 insertions(+), 26 deletions(-)

Comments

Tom Rini Dec. 17, 2016, 3:11 p.m. UTC | #1
On Thu, Dec 15, 2016 at 04:00:13PM -0200, Fabio Estevam wrote:

> Simplify the 'cp' command implementation by using the memcpy() function,
> which brings the additional benefit of performance gain for those who have
> CONFIG_USE_ARCH_MEMCPY selected.
> 
> Tested on a mx6qsabreauto board where a 5x gain in performance is seen
> when reading 10MB from the parallel NOR memory.
> 
> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>

Can you load up a few other platforms via QEMU and test it out there?
Thanks!
Tom Rini Dec. 19, 2016, 10:09 p.m. UTC | #2
On Thu, Dec 15, 2016 at 04:00:13PM -0200, Fabio Estevam wrote:

> Simplify the 'cp' command implementation by using the memcpy() function,
> which brings the additional benefit of performance gain for those who have
> CONFIG_USE_ARCH_MEMCPY selected.
> 
> Tested on a mx6qsabreauto board where a 5x gain in performance is seen
> when reading 10MB from the parallel NOR memory.
> 
> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
> ---
> Changes since v1:
> - Always use memcpy()

I feel like this is correct.  I also feel like it's too close to release
to do this, so I'll be applying this shortly after release, thanks!

I also found it odd, and might try and rectify, that we don't have a
'cp' test in test.py which would make being relatively sure it's OK on
x86/MIPS/PowerPC easy, slightly less easy (I'll be firing up qemu by
hand and maybe writing a test too).
Fabio Estevam Dec. 19, 2016, 10:11 p.m. UTC | #3
Hi Tom,

On Mon, Dec 19, 2016 at 8:09 PM, Tom Rini <trini@konsulko.com> wrote:

> I feel like this is correct.  I also feel like it's too close to release
> to do this, so I'll be applying this shortly after release, thanks!

Yes, agreed this is material for the next release.

> I also found it odd, and might try and rectify, that we don't have a
> 'cp' test in test.py which would make being relatively sure it's OK on
> x86/MIPS/PowerPC easy, slightly less easy (I'll be firing up qemu by
> hand and maybe writing a test too).

Thanks for doing this!
Fabio Estevam Jan. 11, 2017, 3:49 p.m. UTC | #4
Hi Tom,

On Mon, Dec 19, 2016 at 8:09 PM, Tom Rini <trini@konsulko.com> wrote:
> On Thu, Dec 15, 2016 at 04:00:13PM -0200, Fabio Estevam wrote:
>
>> Simplify the 'cp' command implementation by using the memcpy() function,
>> which brings the additional benefit of performance gain for those who have
>> CONFIG_USE_ARCH_MEMCPY selected.
>>
>> Tested on a mx6qsabreauto board where a 5x gain in performance is seen
>> when reading 10MB from the parallel NOR memory.
>>
>> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
>> ---
>> Changes since v1:
>> - Always use memcpy()
>
> I feel like this is correct.  I also feel like it's too close to release
> to do this, so I'll be applying this shortly after release, thanks!

Is it a good time to apply this one now?

Thanks
Tom Rini Jan. 11, 2017, 4:21 p.m. UTC | #5
On Wed, Jan 11, 2017 at 01:49:55PM -0200, Fabio Estevam wrote:
> Hi Tom,
> 
> On Mon, Dec 19, 2016 at 8:09 PM, Tom Rini <trini@konsulko.com> wrote:
> > On Thu, Dec 15, 2016 at 04:00:13PM -0200, Fabio Estevam wrote:
> >
> >> Simplify the 'cp' command implementation by using the memcpy() function,
> >> which brings the additional benefit of performance gain for those who have
> >> CONFIG_USE_ARCH_MEMCPY selected.
> >>
> >> Tested on a mx6qsabreauto board where a 5x gain in performance is seen
> >> when reading 10MB from the parallel NOR memory.
> >>
> >> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
> >> ---
> >> Changes since v1:
> >> - Always use memcpy()
> >
> > I feel like this is correct.  I also feel like it's too close to release
> > to do this, so I'll be applying this shortly after release, thanks!
> 
> Is it a good time to apply this one now?

Yes, yes it would, thanks for the reminder.  I'm kicking off a round of
testing in travis-ci now along with another pre-req of finally, really,
making arch/arm/lib/mem{set,cpy}.S be used by default in most cases.
Tom Rini Jan. 13, 2017, 12:51 a.m. UTC | #6
On Thu, Dec 15, 2016 at 04:00:13PM -0200, Fabio Estevam wrote:

> Simplify the 'cp' command implementation by using the memcpy() function,
> which brings the additional benefit of performance gain for those who have
> CONFIG_USE_ARCH_MEMCPY selected.
> 
> Tested on a mx6qsabreauto board where a 5x gain in performance is seen
> when reading 10MB from the parallel NOR memory.
> 
> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>

Applied to u-boot/master, thanks!
diff mbox

Patch

diff --git a/cmd/mem.c b/cmd/mem.c
index a690957..ff6a770 100644
--- a/cmd/mem.c
+++ b/cmd/mem.c
@@ -372,10 +372,8 @@  static int do_mem_cmp(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[])
 
 static int do_mem_cp(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[])
 {
-	ulong	addr, dest, count, bytes;
+	ulong	addr, dest, count;
 	int	size;
-	const void *src;
-	void *buf;
 
 	if (argc != 4)
 		return CMD_RET_USAGE;
@@ -465,29 +463,7 @@  static int do_mem_cp(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[])
 	}
 #endif
 
-	bytes = size * count;
-	buf = map_sysmem(dest, bytes);
-	src = map_sysmem(addr, bytes);
-	while (count-- > 0) {
-		if (size == 4)
-			*((u32 *)buf) = *((u32  *)src);
-#ifdef CONFIG_SYS_SUPPORT_64BIT_DATA
-		else if (size == 8)
-			*((u64 *)buf) = *((u64 *)src);
-#endif
-		else if (size == 2)
-			*((u16 *)buf) = *((u16 *)src);
-		else
-			*((u8 *)buf) = *((u8 *)src);
-		src += size;
-		buf += size;
-
-		/* reset watchdog from time to time */
-		if ((count % (64 << 10)) == 0)
-			WATCHDOG_RESET();
-	}
-	unmap_sysmem(buf);
-	unmap_sysmem(src);
+	memcpy((void *)dest, (void *)addr, count * size);
 
 	return 0;
 }