Message ID | 20180730163824.10064-3-hch@lst.de (mailing list archive)
---|---
State | Not Applicable
Series | [01/20] kernel/dma/direct: take DMA offset into account in dma_direct_supported

Context | Check | Description
---|---|---
snowpatch_ozlabs/apply_patch | success | next/apply_patch Successfully applied
snowpatch_ozlabs/checkpatch | warning | Test checkpatch on branch next
On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote:
> We need to take the DMA offset and encryption bit into account when
> selecting a zone. Add a helper that takes those into account and use it.

That whole "encryption" stuff seems to be completely specific to the
way x86 does memory encryption, or am I mistaken? It's not clear to me
what that does in practice and how it relates to DMA mappings.

I'm also not sure about that whole business with ZONE_DMA and
ARCH_ZONE_DMA_BITS...

On ppc64, unless you enable swiotlb (which we only do currently on
some embedded platforms), you have all of memory in ZONE_DMA.

[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000000000-0x0000001fffffffff]
[    0.000000]   DMA32    empty
[    0.000000]   Normal   empty
[    0.000000]   Device   empty

I'm not sure how this will work with that dma direct code. I also see
a number of tests against a 64-bit mask rather than the top of
memory...

Ben.

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  kernel/dma/direct.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index d32d4f0d2c0c..c2c1df8827f2 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -58,6 +58,14 @@ static bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size)
>  	return addr + size - 1 <= dev->coherent_dma_mask;
>  }
>
> +static bool dma_coherent_below(struct device *dev, u64 mask)
> +{
> +	dma_addr_t addr = force_dma_unencrypted() ?
> +		__phys_to_dma(dev, mask) : phys_to_dma(dev, mask);
> +
> +	return dev->coherent_dma_mask <= addr;
> +}
> +
>  void *dma_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
>  		gfp_t gfp, unsigned long attrs)
>  {
> @@ -70,9 +78,9 @@ void *dma_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
>  	gfp &= ~__GFP_ZERO;
>
>  	/* GFP_DMA32 and GFP_DMA are no ops without the corresponding zones: */
> -	if (dev->coherent_dma_mask <= DMA_BIT_MASK(ARCH_ZONE_DMA_BITS))
> +	if (dma_coherent_below(dev, DMA_BIT_MASK(ARCH_ZONE_DMA_BITS)))
>  		gfp |= GFP_DMA;
> -	if (dev->coherent_dma_mask <= DMA_BIT_MASK(32) && !(gfp & GFP_DMA))
> +	if (dma_coherent_below(dev, DMA_BIT_MASK(32)) && !(gfp & GFP_DMA))
>  		gfp |= GFP_DMA32;
>
> again:
> @@ -92,14 +100,14 @@ void *dma_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
>  		page = NULL;
>
>  		if (IS_ENABLED(CONFIG_ZONE_DMA32) &&
> -		    dev->coherent_dma_mask < DMA_BIT_MASK(64) &&
> +		    dma_coherent_below(dev, DMA_BIT_MASK(64)) &&
>  		    !(gfp & (GFP_DMA32 | GFP_DMA))) {
>  			gfp |= GFP_DMA32;
>  			goto again;
>  		}
>
>  		if (IS_ENABLED(CONFIG_ZONE_DMA) &&
> -		    dev->coherent_dma_mask < DMA_BIT_MASK(32) &&
> +		    dma_coherent_below(dev, DMA_BIT_MASK(32)) &&
>  		    !(gfp & GFP_DMA)) {
>  			gfp = (gfp & ~GFP_DMA32) | GFP_DMA;
>  			goto again;
On Thu, Aug 09, 2018 at 09:54:33AM +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote:
> > We need to take the DMA offset and encryption bit into account when
> > selecting a zone. Add a helper that takes those into account and use it.
>
> That whole "encryption" stuff seems to be completely specific to the
> way x86 does memory encryption, or am I mistaken? It's not clear to me
> what that does in practice and how it relates to DMA mappings.

Not even all of x86, but AMD in particular; Intel does it yet another
way. But it is still easier to take this into the core with a few
overrides than to duplicate all the code.

> I'm also not sure about that whole business with ZONE_DMA and
> ARCH_ZONE_DMA_BITS...

ZONE_DMA usually (but not always) maps to 24 bits of address space; if
it doesn't (I am mostly thinking of s390 with its odd 31 bits), the
architecture can override it if it cares.

> On ppc64, unless you enable swiotlb (which we only do currently on
> some embedded platforms), you have all of memory in ZONE_DMA.
>
> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000000000000000-0x0000001fffffffff]
> [    0.000000]   DMA32    empty
> [    0.000000]   Normal   empty
> [    0.000000]   Device   empty

This is really weird. Why would you wire up ZONE_DMA like this?

The general scheme that architectures should implement is:

ZONE_DMA:	Any memory below a magic threshold that is lower than
		32-bit. Only enabled if actually required (usually
		either 24-bit for ISA, or some other weird architecture
		specific value like 32-bit for S/390).

ZONE_DMA32:	Memory <= 32-bit if the architecture supports more than
		32 bits worth of physical address space. Should generally
		be enabled on all 64-bit architectures unless you have
		a very good reason not to.

ZONE_NORMAL:	Everything above 32-bit not falling into HIGHMEM or
		MOVABLE.
On Wed, 2018-08-22 at 08:58 +0200, Christoph Hellwig wrote:
> On Thu, Aug 09, 2018 at 09:54:33AM +1000, Benjamin Herrenschmidt wrote:
> > On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote:
> > > We need to take the DMA offset and encryption bit into account when
> > > selecting a zone. Add a helper that takes those into account and use it.
> >
> > That whole "encryption" stuff seems to be completely specific to the
> > way x86 does memory encryption, or am I mistaken? It's not clear to me
> > what that does in practice and how it relates to DMA mappings.
>
> Not even all of x86, but AMD in particular; Intel does it yet another
> way. But it is still easier to take this into the core with a few
> overrides than to duplicate all the code.
>
> > I'm also not sure about that whole business with ZONE_DMA and
> > ARCH_ZONE_DMA_BITS...
>
> ZONE_DMA usually (but not always) maps to 24 bits of address space; if
> it doesn't (I am mostly thinking of s390 with its odd 31 bits), the
> architecture can override it if it cares.
>
> > On ppc64, unless you enable swiotlb (which we only do currently on
> > some embedded platforms), you have all of memory in ZONE_DMA.
> >
> > [    0.000000] Zone ranges:
> > [    0.000000]   DMA      [mem 0x0000000000000000-0x0000001fffffffff]
> > [    0.000000]   DMA32    empty
> > [    0.000000]   Normal   empty
> > [    0.000000]   Device   empty
>
> This is really weird. Why would you wire up ZONE_DMA like this?

We always did :-) It predates my involvement and I think it predates
even Paul's. It's quite silly actually, since the first powerpc
machines actually had ISA devices in them, but that's how it's been
forever. I suppose we could change it, but that would mean digging out
some old stuff to test.

> The general scheme that architectures should implement is:
>
> ZONE_DMA:	Any memory below a magic threshold that is lower than
> 		32-bit. Only enabled if actually required (usually
> 		either 24-bit for ISA, or some other weird architecture
> 		specific value like 32-bit for S/390).

It should have been ZONE_ISA_DMA :-)

> ZONE_DMA32:	Memory <= 32-bit if the architecture supports more than
> 		32 bits worth of physical address space. Should generally
> 		be enabled on all 64-bit architectures unless you have
> 		a very good reason not to.

Yeah, so we sort-of enable the config option but only populate the zone
on platforms using swiotlb (freescale stuff). It's a bit messy at the
moment, I must admit.

> ZONE_NORMAL:	Everything above 32-bit not falling into HIGHMEM or
> 		MOVABLE.

Cheers,
Ben.
On Thu, Aug 23, 2018 at 10:01:45AM +1000, Benjamin Herrenschmidt wrote:
> > The general scheme that architectures should implement is:
> >
> > ZONE_DMA:	Any memory below a magic threshold that is lower than
> > 		32-bit. Only enabled if actually required (usually
> > 		either 24-bit for ISA, or some other weird architecture
> > 		specific value like 32-bit for S/390).
>
> It should have been ZONE_ISA_DMA :-)

For most of these use cases it should have been indeed, and that would
avoid a lot of confusion where people use GFP_DMA just because they do
DMA.

Anyway, switching powerpc to this scheme would be great, but I don't
think it is required - GFP_KERNEL will silently fall back to ZONE_DMA,
so except for an additional GFP_DMA fallback allocation when the
GFP_KERNEL one fails, the code should just work.
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index d32d4f0d2c0c..c2c1df8827f2 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -58,6 +58,14 @@ static bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size)
 	return addr + size - 1 <= dev->coherent_dma_mask;
 }
 
+static bool dma_coherent_below(struct device *dev, u64 mask)
+{
+	dma_addr_t addr = force_dma_unencrypted() ?
+		__phys_to_dma(dev, mask) : phys_to_dma(dev, mask);
+
+	return dev->coherent_dma_mask <= addr;
+}
+
 void *dma_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 		gfp_t gfp, unsigned long attrs)
 {
@@ -70,9 +78,9 @@ void *dma_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 	gfp &= ~__GFP_ZERO;
 
 	/* GFP_DMA32 and GFP_DMA are no ops without the corresponding zones: */
-	if (dev->coherent_dma_mask <= DMA_BIT_MASK(ARCH_ZONE_DMA_BITS))
+	if (dma_coherent_below(dev, DMA_BIT_MASK(ARCH_ZONE_DMA_BITS)))
 		gfp |= GFP_DMA;
-	if (dev->coherent_dma_mask <= DMA_BIT_MASK(32) && !(gfp & GFP_DMA))
+	if (dma_coherent_below(dev, DMA_BIT_MASK(32)) && !(gfp & GFP_DMA))
 		gfp |= GFP_DMA32;
 
 again:
@@ -92,14 +100,14 @@ void *dma_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 		page = NULL;
 
 		if (IS_ENABLED(CONFIG_ZONE_DMA32) &&
-		    dev->coherent_dma_mask < DMA_BIT_MASK(64) &&
+		    dma_coherent_below(dev, DMA_BIT_MASK(64)) &&
 		    !(gfp & (GFP_DMA32 | GFP_DMA))) {
 			gfp |= GFP_DMA32;
 			goto again;
 		}
 
 		if (IS_ENABLED(CONFIG_ZONE_DMA) &&
-		    dev->coherent_dma_mask < DMA_BIT_MASK(32) &&
+		    dma_coherent_below(dev, DMA_BIT_MASK(32)) &&
 		    !(gfp & GFP_DMA)) {
 			gfp = (gfp & ~GFP_DMA32) | GFP_DMA;
 			goto again;
We need to take the DMA offset and encryption bit into account when
selecting a zone. Add a helper that takes those into account and use it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 kernel/dma/direct.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)