diff mbox

[U-Boot,3/4,V2] USB: Drop cache flush bloat in EHCI-HCD

Message ID 1333946851-27904-1-git-send-email-marex@denx.de
State Accepted
Commit b8adb12095814260d2b5edb23663ddf0ab97b877
Delegated to: Marek Vasut
Headers show

Commit Message

Marek Vasut April 9, 2012, 4:47 a.m. UTC
Discard the creepy cache flushing mechanisms in ehci-hcd.c and replace them with
more straightforward flushing. In the new approach, the flushing takes place
directly in ehci_submit_async() call instead of going through the QH list and
flushing all members and buffers. This discards a lot of weird bit operations
on the members of QH and qTD structures.

NOTE: Certainly, this flushes even qTDs which are possibly unused in some
transactions, though the overhead of the previous code was much higher than is
the overhead of flushing two more cache lines (which most probably aren't even
cached).

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Puneet Saxena <puneets@nvidia.com>
---
 drivers/usb/host/ehci-hcd.c |  127 +++++++++----------------------------------
 1 files changed, 27 insertions(+), 100 deletions(-)

V2: Also flush the data buffer.
    Fix typo in address alignment checking in ehci_td_buffer()

Comments

Anatolij Gustschin May 24, 2012, 6:57 a.m. UTC | #1
Hi Marek,

On Mon,  9 Apr 2012 06:47:31 +0200
Marek Vasut <marex@denx.de> wrote:

> Discard the creepy cache flushing mechanisms in ehci-hcd.c and replace them with
> more straightforward flushing. In the new approach, the flushing takes place
> directly in ehci_submit_async() call instead of going through the QH list and
> flushing all members and buffers. This discards a lot of weird bit operations
> on the members of QH and qTD structures.
> 
> NOTE: Certainly, this flushes even qTDs which are possibly unused in some
> transactions, though the overhead of the previous code was much higher than is
> the overhead of flushing two more cache lines (which most probably aren't even
> cached).
> 
> Signed-off-by: Marek Vasut <marex@denx.de>
> Cc: Puneet Saxena <puneets@nvidia.com>
> ---
>  drivers/usb/host/ehci-hcd.c |  127 +++++++++----------------------------------
>  1 files changed, 27 insertions(+), 100 deletions(-)

Unfortunately this patch breaks compiling for many powerpc boards, mpc512x,
mpc83xx, mpc85xx and QorIQ Px based with USB support enabled.

Thanks,
Anatolij
Marek Vasut May 24, 2012, 1:21 p.m. UTC | #2
Dear Anatolij Gustschin,

> Hi Marek,
> 
> On Mon,  9 Apr 2012 06:47:31 +0200
> 
> Marek Vasut <marex@denx.de> wrote:
> > Discard the creepy cache flushing mechanisms in ehci-hcd.c and replace
> > them with more straightforward flushing. In the new approach, the
> > flushing takes place directly in ehci_submit_async() call instead of
> > going through the QH list and flushing all members and buffers. This
> > discards a lot of weird bit operations on the members of QH and qTD
> > structures.
> > 
> > NOTE: Certainly, this flushes even qTDs which are possibly unused in some
> > transactions, though the overhead of the previous code was much higher
> > than is the overhead of flushing two more cache lines (which most
> > probably aren't even cached).
> > 
> > Signed-off-by: Marek Vasut <marex@denx.de>
> > Cc: Puneet Saxena <puneets@nvidia.com>
> > ---
> > 
> >  drivers/usb/host/ehci-hcd.c |  127
> >  +++++++++---------------------------------- 1 files changed, 27
> >  insertions(+), 100 deletions(-)
> 
> Unfortunately this patch breaks compiling for many powerpc boards, mpc512x,
> mpc83xx, mpc85xx and QorIQ Px based with USB support enabled.

Because they have broken cache implementation, right? I mean, they have their 
own snooping methods, so they don't need the cache flushing at all, but then, if 
they don't, these methods (dcache_flush() etc) should be optimized to empty 
functions. Maybe we should implement them for these CPUs then? btw. I thought 
these compiled before, hm...

> Thanks,
> Anatolij

Best regards,
Marek Vasut
Liu Gang May 29, 2012, 8:24 a.m. UTC | #3
Hi Marek,

On Thu, 2012-05-24 at 15:21 +0200, Marek Vasut wrote:
 
> > >  drivers/usb/host/ehci-hcd.c |  127
> > >  +++++++++---------------------------------- 1 files changed, 27
> > >  insertions(+), 100 deletions(-)
> > 
> > Unfortunately this patch breaks compiling for many powerpc boards, mpc512x,
> > mpc83xx, mpc85xx and QorIQ Px based with USB support enabled.
> 
> Because they have broken cache implementation, right? I mean, they have their 
> own snooping methods, so they don't need the cache flushing at all, but then, if 
> they don't, these methods (dcache_flush() etc) should be optimized to empty 
> functions. Maybe we should implement them for these CPUs then? btw. I thought 
> these compiled before, hm...

Now this patch has been applied at "master" branch, but the building for
powerpc boards as Anatolij mentioned will be failed.
So Which platforms this patch applies to? If they all can not flush
cache like the powerpc snooping methods?

Thanks!

Best Regards!

Liu Gang
Marek Vasut May 29, 2012, 8:36 a.m. UTC | #4
Dear Liu Gang,

> Hi Marek,
> 
> On Thu, 2012-05-24 at 15:21 +0200, Marek Vasut wrote:
> > > >  drivers/usb/host/ehci-hcd.c |  127
> > > >  +++++++++---------------------------------- 1 files changed, 27
> > > >  insertions(+), 100 deletions(-)
> > > 
> > > Unfortunately this patch breaks compiling for many powerpc boards,
> > > mpc512x, mpc83xx, mpc85xx and QorIQ Px based with USB support enabled.
> > 
> > Because they have broken cache implementation, right? I mean, they have
> > their own snooping methods, so they don't need the cache flushing at
> > all, but then, if they don't, these methods (dcache_flush() etc) should
> > be optimized to empty functions. Maybe we should implement them for
> > these CPUs then? btw. I thought these compiled before, hm...
> 
> Now this patch has been applied at "master" branch, but the building for
> powerpc boards as Anatolij mentioned will be failed.
> So Which platforms this patch applies to? If they all can not flush
> cache like the powerpc snooping methods?

I already submitted a patch for this issue

http://www.mail-archive.com/u-boot@lists.denx.de/msg84629.html

> Thanks!
> 
> Best Regards!
> 
> Liu Gang

Best regards,
Marek Vasut
diff mbox

Patch

diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index 8ac0b44..9f0991c 100644
--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -108,99 +108,6 @@  static struct descriptor {
 #define ehci_is_TDI()	(0)
 #endif
 
-#if defined(CONFIG_EHCI_DCACHE)
-/*
- * Routines to handle (flush/invalidate) the dcache for the QH and qTD
- * structures and data buffers. This is needed on platforms using this
- * EHCI support with dcache enabled.
- */
-static void flush_invalidate(u32 addr, int size, int flush)
-{
-	if (flush)
-		flush_dcache_range(addr, addr + size);
-	else
-		invalidate_dcache_range(addr, addr + size);
-}
-
-static void cache_qtd(struct qTD *qtd, int flush)
-{
-	u32 *ptr = (u32 *)qtd->qt_buffer[0];
-	int len = (qtd->qt_token & 0x7fff0000) >> 16;
-
-	flush_invalidate((u32)qtd, sizeof(struct qTD), flush);
-	if (ptr && len)
-		flush_invalidate((u32)ptr, len, flush);
-}
-
-
-static inline struct QH *qh_addr(struct QH *qh)
-{
-	return (struct QH *)((u32)qh & 0xffffffe0);
-}
-
-static void cache_qh(struct QH *qh, int flush)
-{
-	struct qTD *qtd;
-	struct qTD *next;
-	static struct qTD *first_qtd;
-
-	/*
-	 * Walk the QH list and flush/invalidate all entries
-	 */
-	while (1) {
-		flush_invalidate((u32)qh_addr(qh), sizeof(struct QH), flush);
-		if ((u32)qh & QH_LINK_TYPE_QH)
-			break;
-		qh = qh_addr(qh);
-		qh = (struct QH *)qh->qh_link;
-	}
-	qh = qh_addr(qh);
-
-	/*
-	 * Save first qTD pointer, needed for invalidating pass on this QH
-	 */
-	if (flush)
-		first_qtd = qtd = (struct qTD *)(*(u32 *)&qh->qh_overlay &
-						 0xffffffe0);
-	else
-		qtd = first_qtd;
-
-	/*
-	 * Walk the qTD list and flush/invalidate all entries
-	 */
-	while (1) {
-		if (qtd == NULL)
-			break;
-		cache_qtd(qtd, flush);
-		next = (struct qTD *)((u32)qtd->qt_next & 0xffffffe0);
-		if (next == qtd)
-			break;
-		qtd = next;
-	}
-}
-
-static inline void ehci_flush_dcache(struct QH *qh)
-{
-	cache_qh(qh, 1);
-}
-
-static inline void ehci_invalidate_dcache(struct QH *qh)
-{
-	cache_qh(qh, 0);
-}
-#else /* CONFIG_EHCI_DCACHE */
-/*
- *
- */
-static inline void ehci_flush_dcache(struct QH *qh)
-{
-}
-
-static inline void ehci_invalidate_dcache(struct QH *qh)
-{
-}
-#endif /* CONFIG_EHCI_DCACHE */
-
 void __ehci_powerup_fixup(uint32_t *status_reg, uint32_t *reg)
 {
 	mdelay(50);
@@ -263,12 +170,20 @@  out:
 
 static int ehci_td_buffer(struct qTD *td, void *buf, size_t sz)
 {
-	uint32_t addr, delta, next;
+	uint32_t delta, next;
+	uint32_t addr = (uint32_t)buf;
+	size_t rsz = roundup(sz, 32);
 	int idx;
 
-	addr = (uint32_t) buf;
+	if (sz != rsz)
+		debug("EHCI-HCD: Misaligned buffer size (%08x)\n", sz);
+
+	if (addr & 31)
+		debug("EHCI-HCD: Misaligned buffer address (%p)\n", buf);
+
 	idx = 0;
 	while (idx < 5) {
+		flush_dcache_range(addr, addr + rsz);
 		td->qt_buffer[idx] = cpu_to_hc32(addr);
 		td->qt_buffer_hi[idx] = 0;
 		next = (addr + 4096) & ~4095;
@@ -317,6 +232,8 @@  ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 	memset(&qh, 0, sizeof(struct QH));
 	memset(qtd, 0, sizeof(qtd));
 
+	toggle = usb_gettoggle(dev, usb_pipeendpoint(pipe), usb_pipeout(pipe));
+
 	qh.qh_link = cpu_to_hc32((uint32_t)&qh_list | QH_LINK_TYPE_QH);
 	c = (usb_pipespeed(pipe) != USB_SPEED_HIGH &&
 	     usb_pipeendpoint(pipe) == 0) ? 1 : 0;
@@ -337,9 +254,6 @@  ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 
 	tdp = &qh.qh_overlay.qt_next;
 
-	toggle =
-	    usb_gettoggle(dev, usb_pipeendpoint(pipe), usb_pipeout(pipe));
-
 	if (req != NULL) {
 		qtd[qtd_counter].qt_next = cpu_to_hc32(QT_NEXT_TERMINATE);
 		qtd[qtd_counter].qt_altnext = cpu_to_hc32(QT_NEXT_TERMINATE);
@@ -391,7 +305,10 @@  ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 	qh_list.qh_link = cpu_to_hc32((uint32_t)&qh | QH_LINK_TYPE_QH);
 
 	/* Flush dcache */
-	ehci_flush_dcache(&qh_list);
+	flush_dcache_range((uint32_t)&qh_list,
+		(uint32_t)&qh_list + sizeof(struct QH));
+	flush_dcache_range((uint32_t)&qh, (uint32_t)&qh + sizeof(struct QH));
+	flush_dcache_range((uint32_t)qtd, (uint32_t)qtd + sizeof(qtd));
 
 	usbsts = ehci_readl(&hcor->or_usbsts);
 	ehci_writel(&hcor->or_usbsts, (usbsts & 0x3f));
@@ -414,13 +331,23 @@  ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 	timeout = USB_TIMEOUT_MS(pipe);
 	do {
 		/* Invalidate dcache */
-		ehci_invalidate_dcache(&qh_list);
+		invalidate_dcache_range((uint32_t)&qh_list,
+			(uint32_t)&qh_list + sizeof(struct QH));
+		invalidate_dcache_range((uint32_t)&qh,
+			(uint32_t)&qh + sizeof(struct QH));
+		invalidate_dcache_range((uint32_t)qtd,
+			(uint32_t)qtd + sizeof(qtd));
+
 		token = hc32_to_cpu(vtd->qt_token);
 		if (!(token & 0x80))
 			break;
 		WATCHDOG_RESET();
 	} while (get_timer(ts) < timeout);
 
+	/* Invalidate the memory area occupied by buffer */
+	invalidate_dcache_range(((uint32_t)buffer & ~31),
+		((uint32_t)buffer & ~31) + roundup(length, 32));
+
 	/* Check that the TD processing happened */
 	if (token & 0x80) {
 		printf("EHCI timed out on TD - token=%#x\n", token);