[U-Boot,v3,4/4] usb: dwc2: invalidate dcache before starting DMA

Message ID 1467797663-16276-5-git-send-email-xzy.xu@rock-chips.com
State Accepted
Commit 9424f1418377bfb0c95ed36a8854e7fe53436229
Delegated to: Simon Glass
Headers show

Commit Message

Xu Ziyuan July 6, 2016, 9:34 a.m.
From: Xu Ziyuan <xzy.xu@rock-chips.com>

Invalidate dcache before starting the DMA to ensure coherency. In case
there are any dirty lines from the DMA buffer in the cache, subsequent
cache-line replacements may corrupt the buffer in memory while the DMA
is still going on. Cache-line replacement can happen if the CPU tries to
bring some other memory locations into the cache while the DMA is going
on.

Signed-off-by: Ziyuan Xu <xzy.xu@rock-chips.com>
---

Changes in v3:
- New commit since v3 to fix the coherence issue between memory and
cache

Changes in v2: None

 drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Simon Glass July 12, 2016, 12:59 p.m. | #1
Hi Ziyuan,

On 6 July 2016 at 03:34, Ziyuan Xu <xzy.xu@rock-chips.com> wrote:
> From: Xu Ziyuan <xzy.xu@rock-chips.com>
>
> Invalidate dcache before starting the DMA to ensure coherency. In case
> there are any dirty lines from the DMA buffer in the cache, subsequent
> cache-line replacements may corrupt the buffer in memory while the DMA
> is still going on. Cache-line replacement can happen if the CPU tries to
> bring some other memory locations into the cache while the DMA is going
> on.
>
> Signed-off-by: Ziyuan Xu <xzy.xu@rock-chips.com>
> ---
>
> Changes in v3:
> - New commit since v3 to fix the coherence issue between memory and
> cache
>
> Changes in v2: None
>
>  drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
> index 12f5c85..0d6d2fb 100644
> --- a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
> +++ b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
> @@ -110,6 +110,9 @@ static int setdma_rx(struct dwc2_ep *ep, struct dwc2_request *req)
>
>         ctrl =  readl(&reg->out_endp[ep_num].doepctl);
>
> +       invalidate_dcache_range((unsigned long) ep->dma_buf,
> +                               (unsigned long) ep->dma_buf + ep->len);

There is an invalidate in complete_rx() which is one of the callers
for this function. It seems to me that we should not have this in two
places. Why do we have this problem? Is it because the other calls to
setdma_rx() don't invalidate?

I think the invalidate should happen just before reading the data. Can
you please check if the other invalidate is needed? Also see how it
cache-aligns the end address.

> +
>         writel((unsigned int) ep->dma_buf, &reg->out_endp[ep_num].doepdma);
>         writel(DOEPT_SIZ_PKT_CNT(pktcnt) | DOEPT_SIZ_XFER_SIZE(length),
>                &reg->out_endp[ep_num].doeptsiz);
> --
> 1.9.1
>
>

Regards,
Simon
Xu Ziyuan July 13, 2016, 1:06 a.m. | #2
On 2016年07月12日 20:59, Simon Glass wrote:
> Hi Ziyuan,
>
> On 6 July 2016 at 03:34, Ziyuan Xu <xzy.xu@rock-chips.com> wrote:
>> From: Xu Ziyuan <xzy.xu@rock-chips.com>
>>
>> Invalidate dcache before starting the DMA to ensure coherency. In case
>> there are any dirty lines from the DMA buffer in the cache, subsequent
>> cache-line replacements may corrupt the buffer in memory while the DMA
>> is still going on. Cache-line replacement can happen if the CPU tries to
>> bring some other memory locations into the cache while the DMA is going
>> on.
>>
>> Signed-off-by: Ziyuan Xu <xzy.xu@rock-chips.com>
>> ---
>>
>> Changes in v3:
>> - New commit since v3 to fix the coherence issue between memory and
>> cache
>>
>> Changes in v2: None
>>
>>   drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>> index 12f5c85..0d6d2fb 100644
>> --- a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>> +++ b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>> @@ -110,6 +110,9 @@ static int setdma_rx(struct dwc2_ep *ep, struct dwc2_request *req)
>>
>>          ctrl =  readl(&reg->out_endp[ep_num].doepctl);
>>
>> +       invalidate_dcache_range((unsigned long) ep->dma_buf,
>> +                               (unsigned long) ep->dma_buf + ep->len);
> There is an invalidate in complete_rx() which is one of the callers
> for this function. It seems to me that we should not have this in two
> places. Why do we have this problem? Is it because the other calls to
> setdma_rx() don't invalidate?

Yup, invoke invalidate method twice during one complete transmission:
1- invalidate in setdma_rx() in case of  the write back cache, present 
from cache-line replacement after DMA transmission complete.
i.e.
1) dma_buf = "123456789abcd";
2) didn't invalidate in setdma_rx()
3) DMA complete interrupt coming
4) invalidate in complete_rx()
5) read dma_buf, it's "123456789abcd"

If invalidate in step 2, we will achieve correct data.
I think it's necessary to invalidate before starting DMA, and 
doc/README.arm-caches describe  details.
2- invalidate in complete_rx(), cpu read the dma_buf from memory directly.

> I think the invalidate should happen just before reading the data. Can
> you please check if the other invalidate is needed? Also see how it
> cache-aligns the end address.
memalign(CONFIG_SYS_CACHELINE_SIZE, EP_BUFFER_SIZE);
cache-aligns: 64 bytes.
dma_buffer size: 4096

I had check cache-aligins and end address, rightful.
Furthermore, everything works fine with noncached_alloc().
>
>> +
>>          writel((unsigned int) ep->dma_buf, &reg->out_endp[ep_num].doepdma);
>>          writel(DOEPT_SIZ_PKT_CNT(pktcnt) | DOEPT_SIZ_XFER_SIZE(length),
>>                 &reg->out_endp[ep_num].doeptsiz);
>> --
>> 1.9.1
>>
>>
> Regards,
> Simon
>
>
>
Simon Glass July 14, 2016, 3 p.m. | #3
On 12 July 2016 at 19:06, Ziyuan Xu <xzy.xu@rock-chips.com> wrote:
>
>
> On 2016年07月12日 20:59, Simon Glass wrote:
>>
>> Hi Ziyuan,
>>
>> On 6 July 2016 at 03:34, Ziyuan Xu <xzy.xu@rock-chips.com> wrote:
>>>
>>> From: Xu Ziyuan <xzy.xu@rock-chips.com>
>>>
>>> Invalidate dcache before starting the DMA to ensure coherency. In case
>>> there are any dirty lines from the DMA buffer in the cache, subsequent
>>> cache-line replacements may corrupt the buffer in memory while the DMA
>>> is still going on. Cache-line replacement can happen if the CPU tries to
>>> bring some other memory locations into the cache while the DMA is going
>>> on.
>>>
>>> Signed-off-by: Ziyuan Xu <xzy.xu@rock-chips.com>
>>> ---
>>>
>>> Changes in v3:
>>> - New commit since v3 to fix the coherence issue between memory and
>>> cache
>>>
>>> Changes in v2: None
>>>
>>>   drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>> b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>> index 12f5c85..0d6d2fb 100644
>>> --- a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>> +++ b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>> @@ -110,6 +110,9 @@ static int setdma_rx(struct dwc2_ep *ep, struct
>>> dwc2_request *req)
>>>
>>>          ctrl =  readl(&reg->out_endp[ep_num].doepctl);
>>>
>>> +       invalidate_dcache_range((unsigned long) ep->dma_buf,
>>> +                               (unsigned long) ep->dma_buf + ep->len);
>>
>> There is an invalidate in complete_rx() which is one of the callers
>> for this function. It seems to me that we should not have this in two
>> places. Why do we have this problem? Is it because the other calls to
>> setdma_rx() don't invalidate?
>
>
> Yup, invoke invalidate method twice during one complete transmission:
> 1- invalidate in setdma_rx() in case of  the write back cache, present from
> cache-line replacement after DMA transmission complete.
> i.e.
> 1) dma_buf = "123456789abcd";
> 2) didn't invalidate in setdma_rx()
> 3) DMA complete interrupt coming
> 4) invalidate in complete_rx()
> 5) read dma_buf, it's "123456789abcd"
>
> If invalidate in step 2, we will achieve correct data.
> I think it's necessary to invalidate before starting DMA, and
> doc/README.arm-caches describe  details.
> 2- invalidate in complete_rx(), cpu read the dma_buf from memory directly.
>
>> I think the invalidate should happen just before reading the data. Can
>> you please check if the other invalidate is needed? Also see how it
>> cache-aligns the end address.
>
> memalign(CONFIG_SYS_CACHELINE_SIZE, EP_BUFFER_SIZE);
> cache-aligns: 64 bytes.
> dma_buffer size: 4096
>
> I had check cache-aligins and end address, rightful.
> Furthermore, everything works fine with noncached_alloc().
>

OK, thanks for the details. Can the invalidate in (4) be dropped? We
should only need one invalidate.

Reviewed-by: Simon Glass <sjg@chromium.org>

>>
>>> +
>>>          writel((unsigned int) ep->dma_buf,
>>> &reg->out_endp[ep_num].doepdma);
>>>          writel(DOEPT_SIZ_PKT_CNT(pktcnt) | DOEPT_SIZ_XFER_SIZE(length),
>>>                 &reg->out_endp[ep_num].doeptsiz);
>>> --
>>> 1.9.1
>>>
>>>
>> Regards,
>> Simon
>>
>>
>>
>
>
Xu Ziyuan July 14, 2016, 3:43 p.m. | #4
Hi Simon,

On 2016年07月14日 23:00, Simon Glass wrote:
> On 12 July 2016 at 19:06, Ziyuan Xu <xzy.xu@rock-chips.com> wrote:
>>
>> On 2016年07月12日 20:59, Simon Glass wrote:
>>> Hi Ziyuan,
>>>
>>> On 6 July 2016 at 03:34, Ziyuan Xu <xzy.xu@rock-chips.com> wrote:
>>>> From: Xu Ziyuan <xzy.xu@rock-chips.com>
>>>>
>>>> Invalidate dcache before starting the DMA to ensure coherency. In case
>>>> there are any dirty lines from the DMA buffer in the cache, subsequent
>>>> cache-line replacements may corrupt the buffer in memory while the DMA
>>>> is still going on. Cache-line replacement can happen if the CPU tries to
>>>> bring some other memory locations into the cache while the DMA is going
>>>> on.
>>>>
>>>> Signed-off-by: Ziyuan Xu <xzy.xu@rock-chips.com>
>>>> ---
>>>>
>>>> Changes in v3:
>>>> - New commit since v3 to fix the coherence issue between memory and
>>>> cache
>>>>
>>>> Changes in v2: None
>>>>
>>>>    drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c | 3 +++
>>>>    1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>>> b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>>> index 12f5c85..0d6d2fb 100644
>>>> --- a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>>> +++ b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>>> @@ -110,6 +110,9 @@ static int setdma_rx(struct dwc2_ep *ep, struct
>>>> dwc2_request *req)
>>>>
>>>>           ctrl =  readl(&reg->out_endp[ep_num].doepctl);
>>>>
>>>> +       invalidate_dcache_range((unsigned long) ep->dma_buf,
>>>> +                               (unsigned long) ep->dma_buf + ep->len);
>>> There is an invalidate in complete_rx() which is one of the callers
>>> for this function. It seems to me that we should not have this in two
>>> places. Why do we have this problem? Is it because the other calls to
>>> setdma_rx() don't invalidate?
>>
>> Yup, invoke invalidate method twice during one complete transmission:
>> 1- invalidate in setdma_rx() in case of  the write back cache, present from
>> cache-line replacement after DMA transmission complete.
>> i.e.
>> 1) dma_buf = "123456789abcd";
>> 2) didn't invalidate in setdma_rx()
>> 3) DMA complete interrupt coming
>> 4) invalidate in complete_rx()
>> 5) read dma_buf, it's "123456789abcd"
>>
>> If invalidate in step 2, we will achieve correct data.
>> I think it's necessary to invalidate before starting DMA, and
>> doc/README.arm-caches describe  details.
>> 2- invalidate in complete_rx(), cpu read the dma_buf from memory directly.
>>
>>> I think the invalidate should happen just before reading the data. Can
>>> you please check if the other invalidate is needed? Also see how it
>>> cache-aligns the end address.
>> memalign(CONFIG_SYS_CACHELINE_SIZE, EP_BUFFER_SIZE);
>> cache-aligns: 64 bytes.
>> dma_buffer size: 4096
>>
>> I had check cache-aligins and end address, rightful.
>> Furthermore, everything works fine with noncached_alloc().
>>
> OK, thanks for the details. Can the invalidate in (4) be dropped? We
> should only need one invalidate.
We also need invalidate in after  DMA transfer completed, because in 
some processors memory contents can spontaneouslycome to the cache due 
to speculative memory access by the CPU. If this happens with the DMA 
buffer while DMA is going on we have a coherency problem.
Thanks for your review!

>
> Reviewed-by: Simon Glass <sjg@chromium.org>
>
>>>> +
>>>>           writel((unsigned int) ep->dma_buf,
>>>> &reg->out_endp[ep_num].doepdma);
>>>>           writel(DOEPT_SIZ_PKT_CNT(pktcnt) | DOEPT_SIZ_XFER_SIZE(length),
>>>>                  &reg->out_endp[ep_num].doeptsiz);
>>>> --
>>>> 1.9.1
>>>>
>>>>
>>> Regards,
>>> Simon
>>>
>>>
>>>
>>
>
>
Simon Glass July 15, 2016, 3:20 a.m. | #5
Hi Ziyuan,

On 14 July 2016 at 09:43, Ziyuan Xu <xzy.xu@rock-chips.com> wrote:
> Hi Simon,
>
>
> On 2016年07月14日 23:00, Simon Glass wrote:
>>
>> On 12 July 2016 at 19:06, Ziyuan Xu <xzy.xu@rock-chips.com> wrote:
>>>
>>>
>>> On 2016年07月12日 20:59, Simon Glass wrote:
>>>>
>>>> Hi Ziyuan,
>>>>
>>>> On 6 July 2016 at 03:34, Ziyuan Xu <xzy.xu@rock-chips.com> wrote:
>>>>>
>>>>> From: Xu Ziyuan <xzy.xu@rock-chips.com>
>>>>>
>>>>> Invalidate dcache before starting the DMA to ensure coherency. In case
>>>>> there are any dirty lines from the DMA buffer in the cache, subsequent
>>>>> cache-line replacements may corrupt the buffer in memory while the DMA
>>>>> is still going on. Cache-line replacement can happen if the CPU tries
>>>>> to
>>>>> bring some other memory locations into the cache while the DMA is going
>>>>> on.
>>>>>
>>>>> Signed-off-by: Ziyuan Xu <xzy.xu@rock-chips.com>
>>>>> ---
>>>>>
>>>>> Changes in v3:
>>>>> - New commit since v3 to fix the coherence issue between memory and
>>>>> cache
>>>>>
>>>>> Changes in v2: None
>>>>>
>>>>>    drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c | 3 +++
>>>>>    1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>>>> b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>>>> index 12f5c85..0d6d2fb 100644
>>>>> --- a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>>>> +++ b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>>>> @@ -110,6 +110,9 @@ static int setdma_rx(struct dwc2_ep *ep, struct
>>>>> dwc2_request *req)
>>>>>
>>>>>           ctrl =  readl(&reg->out_endp[ep_num].doepctl);
>>>>>
>>>>> +       invalidate_dcache_range((unsigned long) ep->dma_buf,
>>>>> +                               (unsigned long) ep->dma_buf + ep->len);
>>>>
>>>> There is an invalidate in complete_rx() which is one of the callers
>>>> for this function. It seems to me that we should not have this in two
>>>> places. Why do we have this problem? Is it because the other calls to
>>>> setdma_rx() don't invalidate?
>>>
>>>
>>> Yup, invoke invalidate method twice during one complete transmission:
>>> 1- invalidate in setdma_rx() in case of  the write back cache, present
>>> from
>>> cache-line replacement after DMA transmission complete.
>>> i.e.
>>> 1) dma_buf = "123456789abcd";
>>> 2) didn't invalidate in setdma_rx()
>>> 3) DMA complete interrupt coming
>>> 4) invalidate in complete_rx()
>>> 5) read dma_buf, it's "123456789abcd"
>>>
>>> If invalidate in step 2, we will achieve correct data.
>>> I think it's necessary to invalidate before starting DMA, and
>>> doc/README.arm-caches describe  details.
>>> 2- invalidate in complete_rx(), cpu read the dma_buf from memory
>>> directly.
>>>
>>>> I think the invalidate should happen just before reading the data. Can
>>>> you please check if the other invalidate is needed? Also see how it
>>>> cache-aligns the end address.
>>>
>>> memalign(CONFIG_SYS_CACHELINE_SIZE, EP_BUFFER_SIZE);
>>> cache-aligns: 64 bytes.
>>> dma_buffer size: 4096
>>>
>>> I had check cache-aligins and end address, rightful.
>>> Furthermore, everything works fine with noncached_alloc().
>>>
>> OK, thanks for the details. Can the invalidate in (4) be dropped? We
>> should only need one invalidate.
>
> We also need invalidate in after  DMA transfer completed, because in some
> processors memory contents can spontaneouslycome to the cache due to
> speculative memory access by the CPU. If this happens with the DMA buffer
> while DMA is going on we have a coherency problem.

Gosh that is horrible.

> Thanks for your review!
>
>
>>
>> Reviewed-by: Simon Glass <sjg@chromium.org>
>>
>>>>> +
>>>>>           writel((unsigned int) ep->dma_buf,
>>>>> &reg->out_endp[ep_num].doepdma);
>>>>>           writel(DOEPT_SIZ_PKT_CNT(pktcnt) |
>>>>> DOEPT_SIZ_XFER_SIZE(length),
>>>>>                  &reg->out_endp[ep_num].doeptsiz);
>>>>> --
>>>>> 1.9.1
>>>>>
>>>>>
>>>> Regards,
>>>> Simon
>>>>
>>>>
>>>>
>>>
>>
>>
>
>

Regards,
Simon
Simon Glass July 15, 2016, 3:56 a.m. | #6
On 14 July 2016 at 21:20, Simon Glass <sjg@chromium.org> wrote:
> Hi Ziyuan,
>
> On 14 July 2016 at 09:43, Ziyuan Xu <xzy.xu@rock-chips.com> wrote:
>> Hi Simon,
>>
>>
>> On 2016年07月14日 23:00, Simon Glass wrote:
>>>
>>> On 12 July 2016 at 19:06, Ziyuan Xu <xzy.xu@rock-chips.com> wrote:
>>>>
>>>>
>>>> On 2016年07月12日 20:59, Simon Glass wrote:
>>>>>
>>>>> Hi Ziyuan,
>>>>>
>>>>> On 6 July 2016 at 03:34, Ziyuan Xu <xzy.xu@rock-chips.com> wrote:
>>>>>>
>>>>>> From: Xu Ziyuan <xzy.xu@rock-chips.com>
>>>>>>
>>>>>> Invalidate dcache before starting the DMA to ensure coherency. In case
>>>>>> there are any dirty lines from the DMA buffer in the cache, subsequent
>>>>>> cache-line replacements may corrupt the buffer in memory while the DMA
>>>>>> is still going on. Cache-line replacement can happen if the CPU tries
>>>>>> to
>>>>>> bring some other memory locations into the cache while the DMA is going
>>>>>> on.
>>>>>>
>>>>>> Signed-off-by: Ziyuan Xu <xzy.xu@rock-chips.com>
>>>>>> ---
>>>>>>
>>>>>> Changes in v3:
>>>>>> - New commit since v3 to fix the coherence issue between memory and
>>>>>> cache
>>>>>>
>>>>>> Changes in v2: None
>>>>>>
>>>>>>    drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c | 3 +++
>>>>>>    1 file changed, 3 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>>>>> b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>>>>> index 12f5c85..0d6d2fb 100644
>>>>>> --- a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>>>>> +++ b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
>>>>>> @@ -110,6 +110,9 @@ static int setdma_rx(struct dwc2_ep *ep, struct
>>>>>> dwc2_request *req)
>>>>>>
>>>>>>           ctrl =  readl(&reg->out_endp[ep_num].doepctl);
>>>>>>
>>>>>> +       invalidate_dcache_range((unsigned long) ep->dma_buf,
>>>>>> +                               (unsigned long) ep->dma_buf + ep->len);
>>>>>
>>>>> There is an invalidate in complete_rx() which is one of the callers
>>>>> for this function. It seems to me that we should not have this in two
>>>>> places. Why do we have this problem? Is it because the other calls to
>>>>> setdma_rx() don't invalidate?
>>>>
>>>>
>>>> Yup, invoke invalidate method twice during one complete transmission:
>>>> 1- invalidate in setdma_rx() in case of  the write back cache, present
>>>> from
>>>> cache-line replacement after DMA transmission complete.
>>>> i.e.
>>>> 1) dma_buf = "123456789abcd";
>>>> 2) didn't invalidate in setdma_rx()
>>>> 3) DMA complete interrupt coming
>>>> 4) invalidate in complete_rx()
>>>> 5) read dma_buf, it's "123456789abcd"
>>>>
>>>> If invalidate in step 2, we will achieve correct data.
>>>> I think it's necessary to invalidate before starting DMA, and
>>>> doc/README.arm-caches describe  details.
>>>> 2- invalidate in complete_rx(), cpu read the dma_buf from memory
>>>> directly.
>>>>
>>>>> I think the invalidate should happen just before reading the data. Can
>>>>> you please check if the other invalidate is needed? Also see how it
>>>>> cache-aligns the end address.
>>>>
>>>> memalign(CONFIG_SYS_CACHELINE_SIZE, EP_BUFFER_SIZE);
>>>> cache-aligns: 64 bytes.
>>>> dma_buffer size: 4096
>>>>
>>>> I had check cache-aligins and end address, rightful.
>>>> Furthermore, everything works fine with noncached_alloc().
>>>>
>>> OK, thanks for the details. Can the invalidate in (4) be dropped? We
>>> should only need one invalidate.
>>
>> We also need invalidate in after  DMA transfer completed, because in some
>> processors memory contents can spontaneouslycome to the cache due to
>> speculative memory access by the CPU. If this happens with the DMA buffer
>> while DMA is going on we have a coherency problem.
>
> Gosh that is horrible.
>
>> Thanks for your review!
>>

Applied to u-boot-rockchip, thanks!

Patch

diff --git a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
index 12f5c85..0d6d2fb 100644
--- a/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
+++ b/drivers/usb/gadget/dwc2_udc_otg_xfer_dma.c
@@ -110,6 +110,9 @@  static int setdma_rx(struct dwc2_ep *ep, struct dwc2_request *req)
 
 	ctrl =  readl(&reg->out_endp[ep_num].doepctl);
 
+	invalidate_dcache_range((unsigned long) ep->dma_buf,
+				(unsigned long) ep->dma_buf + ep->len);
+
 	writel((unsigned int) ep->dma_buf, &reg->out_endp[ep_num].doepdma);
 	writel(DOEPT_SIZ_PKT_CNT(pktcnt) | DOEPT_SIZ_XFER_SIZE(length),
 	       &reg->out_endp[ep_num].doeptsiz);