
sparc32: Page align size in arch_dma_alloc

Message ID 20210908074822.16793-1-andreas@gaisler.com
State New

Commit Message

Andreas Larsson Sept. 8, 2021, 7:48 a.m. UTC
Commit 53b7670e5735 ("sparc: factor the dma coherent mapping into
helper") lost the page align for the calls to dma_make_coherent and
srmmu_unmapiorange. The latter cannot handle a non page aligned len
argument.

Signed-off-by: Andreas Larsson <andreas@gaisler.com>
---
 arch/sparc/kernel/ioport.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
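
For context on why the alignment matters: srmmu_unmapiorange consumes its
length argument in whole-page steps, so a len that is not a multiple of
PAGE_SIZE never reaches zero and the unsigned subtraction wraps around.
A rough sketch of that loop (paraphrased from the sparc32 code, not the
verbatim kernel source):

void srmmu_unmapiorange(unsigned long virt_addr, unsigned int len)
{
	while (len != 0) {
		/* A non-page-aligned len eventually wraps here instead
		 * of reaching zero. */
		len -= PAGE_SIZE;
		srmmu_unmapioaddr(virt_addr);
		virt_addr += PAGE_SIZE;
	}
	flush_tlb_all();
}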

Comments

Christoph Hellwig Sept. 9, 2021, 6:07 a.m. UTC | #1
On Wed, Sep 08, 2021 at 09:48:22AM +0200, Andreas Larsson wrote:
> Commit 53b7670e5735 ("sparc: factor the dma coherent mapping into
> helper") lost the page align for the calls to dma_make_coherent and
> srmmu_unmapiorange. The latter cannot handle a non page aligned len
> argument.
> 
> Signed-off-by: Andreas Larsson <andreas@gaisler.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

Andreas - while I've got your attention:  I've been looking into fully
converting sparc32 to the generic DMA code.  Do you have any
documentation for the Leon cache handling in dma_make_coherent,
and more importantly how that applies to the dma coherent handling?
I could see how a flush might be required for the streaming DMA mappings,
that is mapping normal cached memory for I/O.  But for the coherent
allocations which can be accessed from the device and the cpu without
another DMA mapping call this seems really strange.
Andreas Larsson Sept. 13, 2021, 1:18 p.m. UTC | #2
On 2021-09-09 08:07, Christoph Hellwig wrote:
> Andreas - while I've got your attention:  I've been looking into fully
> converting sparc32 to the generic DMA code.  Do you have any
> documentation for the Leon cache handling in dma_make_coherent,
> and more importantly how that applies to the dma coherent handling?
> I could see how a flush might be required for the streaming DMA mappings,
> that is mapping normal cached memory for I/O.  But for the coherent
> allocations which can be accessed from the device and the cpu without
> another DMA mapping call this seems really strange.

As long as the area passed to arch_dma_free is mapped by
arch_dma_alloc, I don't see why the call to dma_make_coherent in
arch_dma_free should be needed. I am not sure whether there are any
current (or historical) paths where we nevertheless have a cacheable
mapping when we reach arch_dma_free (or the historical
pci32_free_coherent).

The usual case for LEON systems is that cache snooping on the CPU side
invalidates cache lines matching DMA that the CPU sees on the bus. Under
the assumption that DMA accesses are seen on the processor bus, this is
the reason for only flushing if snooping is not enabled in
dma_make_coherent.
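
For reference, the LEON-specific part of dma_make_coherent boils down to
flushing the whole data cache only when snooping is disabled, roughly
like this (paraphrased from arch/sparc/kernel/ioport.c, so the exact
code may differ):

static void dma_make_coherent(unsigned long pa, unsigned long len)
{
	if (sparc_cpu_model == sparc_leon) {
		/* With snooping enabled the hardware already invalidates
		 * matching cache lines, so no flush is needed. */
		if (!sparc_leon3_snooping_enabled())
			leon_flush_dcache_all();
	}
}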
Christoph Hellwig Sept. 14, 2021, 6:12 a.m. UTC | #3
On Wed, Sep 08, 2021 at 06:04:54PM +0200, Sam Ravnborg wrote:
> Hi Andreas,
> 
> On Wed, Sep 08, 2021 at 09:48:22AM +0200, Andreas Larsson wrote:
> > Commit 53b7670e5735 ("sparc: factor the dma coherent mapping into
> > helper") lost the page align for the calls to dma_make_coherent and
> > srmmu_unmapiorange. The latter cannot handle a non page aligned len
> > argument.
> 
> I wonder how you managed to track this down - well done.
> > 
> > Signed-off-by: Andreas Larsson <andreas@gaisler.com>
> Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
> 
> I assume David or Christoph picks it up.

I'll happily pick it up if that is ok.  Dave?
Christoph Hellwig Sept. 14, 2021, 6:17 a.m. UTC | #4
On Mon, Sep 13, 2021 at 03:18:38PM +0200, Andreas Larsson wrote:
>> Andreas - while I've got your attention:  I've been looking into fully
>> converting sparc32 to the generic DMA code.  Do you have any
>> documentation for the Leon cache handling in dma_make_coherent,
>> and more importantly how that applies to the dma coherent handling?
>> I could see how a flush might be required for the streaming DMA mappings,
>> that is mapping normal cached memory for I/O.  But for the coherent
>> allocations which can be accessed from the device and the cpu without
>> another DMA mapping call this seems really strange.
>
> As long as the area passed to arch_dma_free is mapped by
> arch_dma_alloc, I don't see why the call to dma_make_coherent in
> arch_dma_free should be needed. I am not sure whether there are any
> current (or historical) paths where we nevertheless have a cacheable
> mapping when we reach arch_dma_free (or the historical
> pci32_free_coherent).

Note that the cacheable mapping in the kernel map still exists, but it
is not used for any access.

> The usual case for LEON systems is that cache snooping on the CPU side
> invalidates cache lines matching DMA that the CPU sees on the bus. Under
> the assumption that DMA accesses are seen on the processor bus, this is
> the reason for only flushing if snooping is not enabled in
> dma_make_coherent.

Thanks.  Can you take a look and test the two patches below on top of
your fix?  A git tree is also available here:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/sparc32-generic-dma
Andreas Larsson Sept. 14, 2021, 8:51 a.m. UTC | #5
On 2021-09-14 08:17, Christoph Hellwig wrote:
> Thanks.  Can you take a look and test the two patches below on top of
> your fix?  A git tree is also available here:
> 
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/sparc32-generic-dma
> 

In a quick test, this seems to work on LEON for the code paths that
previously went through arch_dma_alloc and arch_dma_free. However, it
means that setting up these DMA mappings no longer goes through
sparc_dma_alloc_resource, and it seems important that it does on Sun
systems. Hopefully someone with more knowledge about that can chime in
here.

The added pgprot_dmacoherent is problematic as it sets SRMMU_PRIV, which
allows kernel access only. That was fine for arch_dma_alloc, which only
sets up kernel mappings, but for user space DMA mmap it would make the
mappings accessible to the kernel only. Having no sparc-specific
pgprot_dmacoherent, so that it defaults to pgprot_noncached, is probably
better.
Christoph Hellwig Sept. 14, 2021, 10:42 a.m. UTC | #6
On Tue, Sep 14, 2021 at 10:51:51AM +0200, Andreas Larsson wrote:
> On 2021-09-14 08:17, Christoph Hellwig wrote:
>> Thanks.  Can you take a look and test the two patches below on top of
>> your fix?  A git tree is also available here:
>>
>> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/sparc32-generic-dma
>>
>
> In a quick test, this seems to work on LEON for the code paths that
> previously went through arch_dma_alloc and arch_dma_free. However, it
> means that setting up these DMA mappings no longer goes through
> sparc_dma_alloc_resource, and it seems important that it does on Sun
> systems. Hopefully someone with more knowledge about that can chime in
> here.

Does the hardware actually care about it?  The only thing it does is
to force allocating from a specific virtual address range, but how
would that have a special meaning?

> The added pgprot_dmacoherent is problematic as it sets SRMMU_PRIV, which
> allows kernel access only. That was fine for arch_dma_alloc, which only
> sets up kernel mappings, but for user space DMA mmap it would make the
> mappings accessible to the kernel only. Having no sparc-specific
> pgprot_dmacoherent, so that it defaults to pgprot_noncached, is probably
> better.

I've just tried to keep the existing attributes.  If SRMMU_PRIV does
indeed mean that the page can't also be mapped into userspace page tables
it would be good to remove it in an incremental patch.  If OTOH it only
means that this PTE is a kernel mapping it should not affect a userspace
mapping as that will always use separate PTEs.

Andreas Larsson Sept. 14, 2021, 11:16 a.m. UTC | #7
On 2021-09-14 12:42, Christoph Hellwig wrote:

>> The added pgprot_dmacoherent is problematic as it sets SRMMU_PRIV, which
>> allows kernel access only. That was fine for arch_dma_alloc, which only
>> sets up kernel mappings, but for user space DMA mmap it would make the
>> mappings accessible to the kernel only. Having no sparc-specific
>> pgprot_dmacoherent, so that it defaults to pgprot_noncached, is probably
>> better.
> 
> I've just tried to keep the existing attributes.  If SRMMU_PRIV does
> indeed mean that the page can't also be mapped into userspace page tables
> it would be good to remove it in an incremental patch.  If OTOH it only
> means that this PTE is a kernel mapping it should not affect a userspace
> mapping as that will always use separate PTEs.

Before the patch, arch_dma_alloc set up pages with SRMMU_PRIV via
srmmu_mapiorange, which is all fine as it sets up kernel buffers. With
your patch we get PAGE_KERNEL as the argument to dma_pgprot in the
corresponding call path that earlier led to arch_dma_alloc. PAGE_KERNEL
already includes SRMMU_PRIV, so adding it again should not be necessary.

The problem I am pointing to is that adding a pgprot_dmacoherent that
sets SRMMU_PRIV changes the behaviour of other call paths that call
dma_pgprot but are not mapping kernel pages.

I have not confirmed this by running it, but from following the code it
seems that e.g. this call path, which maps DMA pages so that they are
accessible from user space:

dma_mmap_attrs -> dma_direct_mmap -> dma_pgprot -> pgprot_dmacoherent

goes from merely making the mapping uncacheable with the default

#ifndef pgprot_dmacoherent
#define pgprot_dmacoherent(prot)	pgprot_noncached(prot)
#endif

to also making it inaccessible from user space if we change to this
pgprot_dmacoherent:

#define pgprot_dmacoherent pgprot_dmacoherent
static inline pgprot_t pgprot_dmacoherent(pgprot_t prot)
{
	pgprot_val(prot) &= ~pgprot_val(__pgprot(SRMMU_CACHE));
	pgprot_val(prot) |= pgprot_val(__pgprot(SRMMU_PRIV));
	return prot;
}
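
For comparison, sparc32's pgprot_noncached, which the default above
falls back to, only clears the cache bit as far as I can tell from
pgtable_32.h, roughly:

static inline pgprot_t pgprot_noncached(pgprot_t prot)
{
	/* Drop cacheability but leave the access permissions alone. */
	pgprot_val(prot) &= ~pgprot_val(__pgprot(SRMMU_CACHE));
	return prot;
}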
Christoph Hellwig Sept. 14, 2021, 11:26 a.m. UTC | #8
On Tue, Sep 14, 2021 at 01:16:16PM +0200, Andreas Larsson wrote:
> Before the patch, arch_dma_alloc did via srmmu_mapiorange set up pages with 
> SRMMU_PRIV, which is all fine as it sets up kernel buffers. With your patch 
> we get PAGE_KERNEL as an argument to dma_pgprot in the corresponding call 
> path that earlier lead to arch_dma_alloc. PAGE_KERNEL already includes 
> SRMMU_PRIV so adding it again should not be necessary.

You're right, I missed that PAGE_KERNEL already includes SRMMU_PRIV.
David Miller Sept. 14, 2021, 11:39 a.m. UTC | #9
From: Christoph Hellwig <hch@lst.de>
Date: Tue, 14 Sep 2021 08:12:22 +0200

> On Wed, Sep 08, 2021 at 06:04:54PM +0200, Sam Ravnborg wrote:
>> Hi Andreas,
>> 
>> On Wed, Sep 08, 2021 at 09:48:22AM +0200, Andreas Larsson wrote:
>> > Commit 53b7670e5735 ("sparc: factor the dma coherent mapping into
>> > helper") lost the page align for the calls to dma_make_coherent and
>> > srmmu_unmapiorange. The latter cannot handle a non page aligned len
>> > argument.
>> 
>> I wonder how you managed to track this down - well done.
>> > 
>> > Signed-off-by: Andreas Larsson <andreas@gaisler.com>
>> Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
>> 
>> I assume David or Christoph picks it up.
> 
> I'll happily pick it up if that is ok.  Dave?

Yep, it's ok.

Patch

diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
index 8e1d72a16759..7ceae24b0ca9 100644
--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -356,7 +356,9 @@  void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 void arch_dma_free(struct device *dev, size_t size, void *cpu_addr,
 		dma_addr_t dma_addr, unsigned long attrs)
 {
-	if (!sparc_dma_free_resource(cpu_addr, PAGE_ALIGN(size)))
+	size = PAGE_ALIGN(size);
+
+	if (!sparc_dma_free_resource(cpu_addr, size))
 		return;
 
 	dma_make_coherent(dma_addr, size);