Patchwork [RFC,01/13] Generic DMA memory access interface

Submitter Eduard - Gabriel Munteanu
Date June 1, 2011, 1:38 a.m.
Message ID <1306892315-7306-2-git-send-email-eduard.munteanu@linux360.ro>
Permalink /patch/98115/
State New

Comments

Eduard - Gabriel Munteanu - June 1, 2011, 1:38 a.m.
This introduces replacements for memory access functions like
cpu_physical_memory_read(). The new interface can handle address
translation and access checking through an IOMMU.

Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 Makefile.target |    2 +-
 hw/dma_rw.c     |  155 +++++++++++++++++++++++++++++++++++++++
 hw/dma_rw.h     |  217 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 373 insertions(+), 1 deletions(-)
 create mode 100644 hw/dma_rw.c
 create mode 100644 hw/dma_rw.h
Richard Henderson - June 1, 2011, 2:01 p.m.
On 05/31/2011 06:38 PM, Eduard - Gabriel Munteanu wrote:
> +static inline void dma_memory_rw(DMADevice *dev,
> +                                 dma_addr_t addr,
> +                                 void *buf,
> +                                 dma_addr_t len,
> +                                 int is_write)

I don't think this needs to be inline...

> +{
> +    /*
> +     * Fast-path non-iommu.
> +     * More importantly, makes it obvious what this function does.
> +     */
> +    if (!dev || !dev->mmu) {
> +        cpu_physical_memory_rw(addr, buf, len, is_write);
> +        return;
> +    }

... because you'll never be able to eliminate the if or the calls.
You might as well make the overall code smaller by taking the
entire function out of line.

> +#define DEFINE_DMA_LD(prefix, suffix, devtype, dmafield, size)            \
> +static inline uint##size##_t                                              \
> +dma_ld##suffix(DMADevice *dev, dma_addr_t addr)                           \
> +{                                                                         \
> +    int err;                                                              \
> +    dma_addr_t paddr, plen;                                               \
> +                                                                          \
> +    if (!dev || !dev->mmu) {                                              \
> +        return ld##suffix##_phys(addr);                                   \
> +    }                                                                     \

Similarly for all the ld/st functions.

> +#define DEFINE_DMA_MEMORY_RW(prefix, devtype, dmafield)
> +#define DEFINE_DMA_MEMORY_READ(prefix, devtype, dmafield)
> +#define DEFINE_DMA_MEMORY_WRITE(prefix, devtype, dmafield)
> +
> +#define DEFINE_DMA_OPS(prefix, devtype, dmafield)          \

I think this is a bit over the top, really.

> +        err = dev->mmu->translate(dev, addr, &paddr, &plen, is_write);

I see you didn't take my suggestion for using an opaque callback pointer.
Really and truly, I won't be able to use this as-is for Alpha.


r~
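
For readers following the thread, the "opaque callback pointer" being asked for can be sketched roughly as below. This is only an illustration, not code from the patch: the `sketch_*` names, the offset-based IOMMU state, and the callback signature are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;

/* Hypothetical callback type: the first argument is an opaque state pointer. */
typedef int (*sketch_translate_fn)(void *opaque, dma_addr_t addr,
                                   dma_addr_t *paddr, dma_addr_t *len,
                                   int is_write);

struct sketch_mmu {
    sketch_translate_fn translate;
    void *opaque;               /* stored once, when the IOMMU registers */
};

/* Hypothetical IOMMU state: every bus address is shifted by a fixed offset. */
struct sketch_iommu_state {
    dma_addr_t offset;
};

static int sketch_translate(void *opaque, dma_addr_t addr,
                            dma_addr_t *paddr, dma_addr_t *len, int is_write)
{
    struct sketch_iommu_state *s = opaque;  /* the only conversion needed */

    (void)len;
    (void)is_write;
    *paddr = addr + s->offset;
    return 0;
}

/* The caller hands back whatever pointer was registered with the callback. */
static int sketch_mmu_translate(struct sketch_mmu *mmu, dma_addr_t addr,
                                dma_addr_t *paddr, dma_addr_t *len,
                                int is_write)
{
    return mmu->translate(mmu->opaque, addr, paddr, len, is_write);
}
```

The state pointer is fixed at the place where `translate` and `opaque` are assigned together, so a reader can check the pairing by looking at that one initializer and the first line of the callback.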
Avi Kivity - June 1, 2011, 2:29 p.m.
On 06/01/2011 05:01 PM, Richard Henderson wrote:
> >  +        err = dev->mmu->translate(dev, addr,&paddr,&plen, is_write);
>
> I see you didn't take my suggestion for using an opaque callback pointer.
> Really and truly, I won't be able to use this as-is for Alpha.
>

Rather than opaques, please pass the DMA engine itself and use 
container_of().

We should be removing opaques, not adding them.
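
As a rough illustration of the container_of() approach suggested here: if the DMA engine is embedded in the structure that owns the translation state, the callback can recover that structure from the engine pointer alone. The structure and field names below are hypothetical, and the macro is a plain offsetof-based version written for this sketch, not necessarily the definition QEMU uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain offsetof-based macro, defined here only for the sketch. */
#define sketch_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical DMA engine object handed to the translate callback. */
struct sketch_dma_engine {
    int id;
};

/* Hypothetical host bridge that embeds its DMA engine. */
struct sketch_host_bridge {
    uint64_t window_base;
    struct sketch_dma_engine dma;
};

/* Recover the enclosing bridge from a pointer to its embedded engine. */
static uint64_t sketch_window_base(struct sketch_dma_engine *dma)
{
    struct sketch_host_bridge *hb =
        sketch_container_of(dma, struct sketch_host_bridge, dma);

    return hb->window_base;
}
```

This only works when the implementation is able to embed the engine in its own structure, which is the constraint raised later in the thread.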
Eduard - Gabriel Munteanu - June 1, 2011, 2:52 p.m.
On Wed, Jun 01, 2011 at 07:01:42AM -0700, Richard Henderson wrote:
> On 05/31/2011 06:38 PM, Eduard - Gabriel Munteanu wrote:
> > +static inline void dma_memory_rw(DMADevice *dev,
> > +                                 dma_addr_t addr,
> > +                                 void *buf,
> > +                                 dma_addr_t len,
> > +                                 int is_write)
> 
> I don't think this needs to be inline...
> 
> > +{
> > +    /*
> > +     * Fast-path non-iommu.
> > +     * More importantly, makes it obvious what this function does.
> > +     */
> > +    if (!dev || !dev->mmu) {
> > +        cpu_physical_memory_rw(addr, buf, len, is_write);
> > +        return;
> > +    }
> 
> ... because you'll never be able to eliminate the if or the calls.
> You might as well make the overall code smaller by taking the
> entire function out of line.
> 
> > +#define DEFINE_DMA_LD(prefix, suffix, devtype, dmafield, size)            \
> > +static inline uint##size##_t                                              \
> > +dma_ld##suffix(DMADevice *dev, dma_addr_t addr)                           \
> > +{                                                                         \
> > +    int err;                                                              \
> > +    dma_addr_t paddr, plen;                                               \
> > +                                                                          \
> > +    if (!dev || !dev->mmu) {                                              \
> > +        return ld##suffix##_phys(addr);                                   \
> > +    }                                                                     \
> 
> Similarly for all the ld/st functions.
> 

The idea was to get to the fastpath as soon as possible. I'm not really
concerned about the case where there's an IOMMU present, since
translation/checking does a lot more work. But other people might be
worried about that additional function call when there's no IOMMU.

And these functions are quite small anyway.

Thoughts, anybody else?

> > +#define DEFINE_DMA_MEMORY_RW(prefix, devtype, dmafield)
> > +#define DEFINE_DMA_MEMORY_READ(prefix, devtype, dmafield)
> > +#define DEFINE_DMA_MEMORY_WRITE(prefix, devtype, dmafield)
> > +
> > +#define DEFINE_DMA_OPS(prefix, devtype, dmafield)          \
> 
> I think this is a bit over the top, really.
> 

Yeah, it's a bit unconventional, but why do you think that?

The main selling point is there are more chances to screw up if every
bus layer implements these manually. And it's really convenient,
especially if we get to add another ld/st.

I do have one concern about it, though: it might increase compile time
due to additional preprocessing work. I haven't done any benchmarks on
that. But apart from this, are there any other objections?

> > +        err = dev->mmu->translate(dev, addr, &paddr, &plen, is_write);
> 
> I see you didn't take my suggestion for using an opaque callback pointer.
> Really and truly, I won't be able to use this as-is for Alpha.
> 

If I understand correctly you need some sort of shared state between
IOMMUs or units residing on different buses. Then you should be able to
get to it even with this API, just like I do with my AMD IOMMU state by
upcasting. It doesn't seem to matter whether you've got an opaque, that
opaque could very well be reachable by upcasting.

Did I get this wrong?


	Eduard

> 
> r~
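
The layout under discussion, an inline wrapper that tests for an IOMMU and an out-of-line function for the translated case, can be sketched as follows. Everything here is a stand-in: `sketch_ram` replaces guest memory, the slow path performs an identity "translation", and the call counter exists only to make the chosen path observable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_mmu { int unused; };
struct sketch_dev { struct sketch_mmu *mmu; };

static unsigned char sketch_ram[64];   /* stand-in for guest memory */
static int sketch_slow_calls;          /* counts slow-path entries */

/*
 * Slow path. In the layout being debated this would live in a .c file;
 * it is kept in one unit here so the sketch compiles on its own.
 */
static void sketch_read_iommu(struct sketch_dev *dev, size_t addr,
                              void *buf, size_t len)
{
    (void)dev;
    sketch_slow_calls++;
    memcpy(buf, &sketch_ram[addr], len);   /* identity translation */
}

/* Inline wrapper: no-IOMMU devices never reach the slow-path function. */
static inline void sketch_read(struct sketch_dev *dev, size_t addr,
                               void *buf, size_t len)
{
    if (!dev || !dev->mmu) {
        memcpy(buf, &sketch_ram[addr], len);
        return;
    }
    sketch_read_iommu(dev, addr, buf, len);
}
```

Both branches still end in a call that does the real work, which is the reviewer's point; what the inline wrapper saves in the no-IOMMU case is one extra level of call.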
Richard Henderson - June 1, 2011, 3:09 p.m.
On 06/01/2011 07:52 AM, Eduard - Gabriel Munteanu wrote:
> The main selling point is there are more chances to screw up if every
> bus layer implements these manually. And it's really convenient,
> especially if we get to add another ld/st.

If we drop the ld/st, we're talking about 5 lines for every bus layer.

If I recall, there was just the one driver that actually uses the ld/st
interface; most used the read/write interface.

> If I understand correctly you need some sort of shared state between
> IOMMUs or units residing on different buses. Then you should be able to
> get to it even with this API, just like I do with my AMD IOMMU state by
> upcasting. It doesn't seem to matter whether you've got an opaque, that
> opaque could very well be reachable by upcasting.
> 
> Did I get this wrong?

Can you honestly tell me that 

> +static int amd_iommu_translate(DMADevice *dev,
> +                               dma_addr_t addr,
> +                               dma_addr_t *paddr,
> +                               dma_addr_t *len,
> +                               int is_write)
> +{
> +    PCIDevice *pci_dev = container_of(dev, PCIDevice, dma);
> +    PCIDevice *iommu_dev = DO_UPCAST(PCIDevice, qdev, dev->mmu->iommu);
> +    AMDIOMMUState *s = DO_UPCAST(AMDIOMMUState, dev, iommu_dev);

THREE (3) upcasts is a sane way to write maintainable software?
The margin for error here is absolutely enormous.

If you had just passed in that AMDIOMMUState* as the opaque
value, it would be trivial to look at the initialization
statement and the callback function to verify that the right
value is being passed.


r~
Richard Henderson - June 1, 2011, 3:16 p.m.
On 06/01/2011 07:29 AM, Avi Kivity wrote:
> On 06/01/2011 05:01 PM, Richard Henderson wrote:
>> >  +        err = dev->mmu->translate(dev, addr,&paddr,&plen, is_write);
>>
>> I see you didn't take my suggestion for using an opaque callback pointer.
>> Really and truly, I won't be able to use this as-is for Alpha.
>>
> 
> Rather than opaques, please pass the DMA engine itself and use container_of().

The dma engine object is currently sitting in the PCIBus structure.
Which is private, and can't be extended by a host bridge implementation.

The entire code could be re-arranged, true, but please suggest something
reasonable.

> We should be removing opaques, not adding them.

See my followup elsewhere.  Opaques *can* be cleaner than upcasting,
particularly if there are too many hoops through which to jump.


r~
Eduard - Gabriel Munteanu - June 1, 2011, 3:35 p.m.
On Wed, Jun 01, 2011 at 08:09:29AM -0700, Richard Henderson wrote:
> On 06/01/2011 07:52 AM, Eduard - Gabriel Munteanu wrote:
> > The main selling point is there are more chances to screw up if every
> > bus layer implements these manually. And it's really convenient,
> > especially if we get to add another ld/st.
> 
> If we drop the ld/st, we're talking about 5 lines for every bus layer.
> 
> If I recall, there was just the one driver that actually uses the ld/st
> interface; most used the read/write interface.

Hm, indeed there seem to be far fewer uses of those now; in fact, my
patches don't seem to use them.

What do you guys think? Will these go away completely?

> > If I understand correctly you need some sort of shared state between
> > IOMMUs or units residing on different buses. Then you should be able to
> > get to it even with this API, just like I do with my AMD IOMMU state by
> > upcasting. It doesn't seem to matter whether you've got an opaque, that
> > opaque could very well be reachable by upcasting.
> > 
> > Did I get this wrong?
> 
> Can you honestly tell me that 
> 
> > +static int amd_iommu_translate(DMADevice *dev,
> > +                               dma_addr_t addr,
> > +                               dma_addr_t *paddr,
> > +                               dma_addr_t *len,
> > +                               int is_write)
> > +{
> > +    PCIDevice *pci_dev = container_of(dev, PCIDevice, dma);
> > +    PCIDevice *iommu_dev = DO_UPCAST(PCIDevice, qdev, dev->mmu->iommu);
> > +    AMDIOMMUState *s = DO_UPCAST(AMDIOMMUState, dev, iommu_dev);
> 
> THREE (3) upcasts is a sane way to write maintainable software?
> The margin for error here is absolutely enormous.
> 
> If you had just passed in that AMDIOMMUState* as the opaque
> value, it would be trivial to look at the initialization
> statement and the callback function to verify that the right
> value is being passed.

Maybe it's not nice, but you're missing the fact upcasting gives you
some type safety. With opaques you have none. Plus you also get the PCI
device that made the call while you're at it.


	Eduard

> r~
Richard Henderson - June 1, 2011, 3:45 p.m.
On 06/01/2011 08:35 AM, Eduard - Gabriel Munteanu wrote:
> Maybe it's not nice, but you're missing the fact upcasting gives you
> some type safety. With opaques you have none.

Lol.  Do you understand what container_of does?
This is not dynamic_cast<> with RTTI.

You can put any type name in there that you like,
so long as it has a field name to match.  The type
of the field you give doesn't even have to match
the type of the pointer that you pass in.

Type safety this is not.


r~
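
The objection can be illustrated with a plain offsetof-based macro. The macro and structures below are assumptions written for this sketch (it does not reproduce QEMU's own definition): with no check on the member's type, naming the wrong field compiles without complaint and yields a pointer that is not the containing object.

```c
#include <assert.h>
#include <stddef.h>

/* Unchecked variant: nothing ties the type of ptr to the named member. */
#define unchecked_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct sketch_qdev { int id; };
struct sketch_dma  { int id; };

/* Hypothetical device with two embedded sub-objects. */
struct sketch_pci_dev {
    struct sketch_qdev qdev;    /* offset 0 */
    struct sketch_dma  dma;     /* non-zero offset */
};

/* Correct use: the pointer really is the address of the dma member. */
static struct sketch_pci_dev *from_dma(struct sketch_dma *dma)
{
    return unchecked_container_of(dma, struct sketch_pci_dev, dma);
}

/*
 * Mistake: a dma pointer is passed but the qdev member is named.
 * The macro accepts it, and the result is off by offsetof(dma).
 */
static struct sketch_pci_dev *from_dma_wrong_member(struct sketch_dma *dma)
{
    return unchecked_container_of(dma, struct sketch_pci_dev, qdev);
}
```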
David Gibson - June 2, 2011, 9:38 a.m.
On Wed, Jun 01, 2011 at 08:45:56AM -0700, Richard Henderson wrote:
> On 06/01/2011 08:35 AM, Eduard - Gabriel Munteanu wrote:
> > Maybe it's not nice, but you're missing the fact upcasting gives you
> > some type safety. With opaques you have none.
> 
> Lol.  Do you understand what container_of does?
> This is not dynamic_cast<> with RTTI.
> 
> You can put any type name in there that you like,
> so long as it has a field name to match.  The type
> of the field you give doesn't even have to match
> the type of the pointer that you pass in.

Uh, if that's true, that's a bug in the container_of implementation.
The ccan container_of implementation, for example, certainly does
check that the given field has type matching the pointer.
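
A type-checked variant of the kind described here can be sketched with a common GNU C idiom. This is not ccan's actual implementation; it relies on the `__typeof__` and statement-expression extensions available in GCC and Clang, and the structure names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Checked variant: the temporary has the member's own type, so passing a
 * pointer of any other type makes the compiler diagnose an incompatible
 * pointer initialization.
 */
#define checked_container_of(ptr, type, member) ({                  \
    const __typeof__(((type *)0)->member) *mptr_ = (ptr);            \
    (type *)((char *)mptr_ - offsetof(type, member)); })

struct sketch_qdev { int id; };
struct sketch_dma  { int id; };

struct sketch_pci_dev {
    struct sketch_qdev qdev;
    struct sketch_dma  dma;
};

static struct sketch_pci_dev *from_dma(struct sketch_dma *dma)
{
    /* Naming qdev here instead of dma would be flagged at compile time. */
    return checked_container_of(dma, struct sketch_pci_dev, dma);
}
```

The check covers the member's type only. Two members of the same type in one structure can still be confused, so it narrows the margin for error rather than removing it.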
David Gibson - June 2, 2011, 10:22 a.m.
On Wed, Jun 01, 2011 at 08:16:44AM -0700, Richard Henderson wrote:
> On 06/01/2011 07:29 AM, Avi Kivity wrote:
> > On 06/01/2011 05:01 PM, Richard Henderson wrote:
> >> >  +        err = dev->mmu->translate(dev, addr,&paddr,&plen, is_write);
> >>
> >> I see you didn't take my suggestion for using an opaque callback pointer.
> >> Really and truly, I won't be able to use this as-is for Alpha.
> >>
> > 
> > Rather than opaques, please pass the DMA engine itself and use container_of().
> 
> The dma engine object is currently sitting in the PCIBus structure.
> Which is private, and can't be extended by a host bridge implementation.
> 
> The entire code could be re-arranged, true, but please suggest something
> reasonable.
> 
> > We should be removing opaques, not adding them.
> 
> See my followup elsewhere.  Opaques *can* be cleaner than upcasting,
> particularly if there are too many hoops through which to jump.

So, in the meantime, I've also done a version of Eduard's earlier
patches, with added support for the PAPR hypervisor managed IOMMU.

I have also significantly reworked how the structure lookup works,
partly because in my case I'm looking at IOMMU translation for non-PCI
devices, but I think it may also address your concerns.  I'm still
using upcasts, but there are fewer steps from the device to the IOMMU
state.

I've been sick and haven't had a chance to merge my stuff with
Eduard's changes.  I'll post them anyway, as another discussion
point.

Patch

diff --git a/Makefile.target b/Makefile.target
index 21f864a..ee0c80d 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -224,7 +224,7 @@  obj-i386-y += cirrus_vga.o apic.o ioapic.o piix_pci.o
 obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
-obj-i386-y += pc_piix.o kvmclock.o
+obj-i386-y += pc_piix.o kvmclock.o dma_rw.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
diff --git a/hw/dma_rw.c b/hw/dma_rw.c
new file mode 100644
index 0000000..824db83
--- /dev/null
+++ b/hw/dma_rw.c
@@ -0,0 +1,155 @@ 
+/*
+ * Generic DMA memory access interface.
+ *
+ * Copyright (c) 2011 Eduard - Gabriel Munteanu
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "dma_rw.h"
+#include "range.h"
+
+static void dma_register_memory_map(DMADevice *dev,
+                                    void *buffer,
+                                    dma_addr_t addr,
+                                    dma_addr_t len,
+                                    DMAInvalidateMapFunc *invalidate,
+                                    void *invalidate_opaque)
+{
+    DMAMemoryMap *map;
+
+    map = qemu_malloc(sizeof(DMAMemoryMap));
+    map->buffer             = buffer;
+    map->addr               = addr;
+    map->len                = len;
+    map->invalidate         = invalidate;
+    map->invalidate_opaque  = invalidate_opaque;
+
+    QLIST_INSERT_HEAD(&dev->mmu->memory_maps, map, list);
+}
+
+static void dma_unregister_memory_map(DMADevice *dev,
+                                      void *buffer,
+                                      dma_addr_t len)
+{
+    DMAMemoryMap *map, *next;
+
+    QLIST_FOREACH_SAFE(map, &dev->mmu->memory_maps, list, next) {
+        if (map->buffer == buffer && map->len == len) {
+            QLIST_REMOVE(map, list);
+            qemu_free(map);
+        }
+    }
+}
+
+void dma_invalidate_memory_range(DMADevice *dev,
+                                 dma_addr_t addr,
+                                 dma_addr_t len)
+{
+    DMAMemoryMap *map, *next;
+
+    QLIST_FOREACH_SAFE(map, &dev->mmu->memory_maps, list, next) {
+        if (ranges_overlap(addr, len, map->addr, map->len)) {
+            map->invalidate(map->invalidate_opaque);
+            QLIST_REMOVE(map, list);
+            qemu_free(map);
+        }
+    }
+}
+
+void *dma_memory_map(DMADevice *dev,
+                     DMAInvalidateMapFunc *cb,
+                     void *opaque,
+                     dma_addr_t addr,
+                     dma_addr_t *len,
+                     int is_write)
+{
+    int err;
+    target_phys_addr_t paddr, plen;
+    void *buf;
+
+    if (!dev || !dev->mmu) {
+        return cpu_physical_memory_map(addr, len, is_write);
+    }
+
+    plen = *len;
+    err = dev->mmu->translate(dev, addr, &paddr, &plen, is_write);
+    if (err) {
+        return NULL;
+    }
+
+    /*
+     * If this is true, the virtual region is contiguous,
+     * but the translated physical region isn't. We just
+     * clamp *len, much like cpu_physical_memory_map() does.
+     */
+    if (plen < *len) {
+        *len = plen;
+    }
+
+    buf = cpu_physical_memory_map(paddr, len, is_write);
+
+    /* We treat maps as remote TLBs to cope with stuff like AIO. */
+    if (cb) {
+        dma_register_memory_map(dev, buf, addr, *len, cb, opaque);
+    }
+
+    return buf;
+}
+
+void dma_memory_unmap(DMADevice *dev,
+                      void *buffer,
+                      dma_addr_t len,
+                      int is_write,
+                      dma_addr_t access_len)
+{
+    cpu_physical_memory_unmap(buffer, len, is_write, access_len);
+    if (dev && dev->mmu) {
+        dma_unregister_memory_map(dev, buffer, len);
+    }
+}
+
+void dma_memory_rw_iommu(DMADevice *dev,
+                         dma_addr_t addr,
+                         void *buf,
+                         dma_addr_t len,
+                         int is_write)
+{
+    dma_addr_t paddr, plen;
+    int err;
+
+    while (len) {
+        err = dev->mmu->translate(dev, addr, &paddr, &plen, is_write);
+        if (err) {
+            return;
+        }
+
+        /* The translation might be valid for larger regions. */
+        if (plen > len) {
+            plen = len;
+        }
+
+        cpu_physical_memory_rw(paddr, buf, plen, is_write);
+
+        len -= plen;
+        addr += plen;
+        buf += plen;
+    }
+}
+
diff --git a/hw/dma_rw.h b/hw/dma_rw.h
new file mode 100644
index 0000000..39482cb
--- /dev/null
+++ b/hw/dma_rw.h
@@ -0,0 +1,217 @@ 
+#ifndef DMA_RW_H
+#define DMA_RW_H
+
+#include "qemu-common.h"
+
+typedef uint64_t dma_addr_t;
+
+typedef struct DMAMmu DMAMmu;
+typedef struct DMADevice DMADevice;
+typedef struct DMAMemoryMap DMAMemoryMap;
+
+typedef int DMATranslateFunc(DMADevice *dev,
+                             dma_addr_t addr,
+                             dma_addr_t *paddr,
+                             dma_addr_t *len,
+                             int is_write);
+
+typedef void DMAInvalidateMapFunc(void *);
+
+struct DMAMmu {
+    DeviceState *iommu;
+    DMATranslateFunc *translate;
+    QLIST_HEAD(memory_maps, DMAMemoryMap) memory_maps;
+};
+
+struct DMADevice {
+    DMAMmu *mmu;
+};
+
+struct DMAMemoryMap {
+    void                    *buffer;
+    dma_addr_t              addr;
+    dma_addr_t              len;
+    DMAInvalidateMapFunc    *invalidate;
+    void                    *invalidate_opaque;
+
+    QLIST_ENTRY(DMAMemoryMap) list;
+};
+
+void dma_memory_rw_iommu(DMADevice *dev,
+                         dma_addr_t addr,
+                         void *buf,
+                         dma_addr_t len,
+                         int is_write);
+
+static inline void dma_memory_rw(DMADevice *dev,
+                                 dma_addr_t addr,
+                                 void *buf,
+                                 dma_addr_t len,
+                                 int is_write)
+{
+    /*
+     * Fast-path non-iommu.
+     * More importantly, makes it obvious what this function does.
+     */
+    if (!dev || !dev->mmu) {
+        cpu_physical_memory_rw(addr, buf, len, is_write);
+        return;
+    }
+
+    dma_memory_rw_iommu(dev, addr, buf, len, is_write);
+}
+
+static inline void dma_memory_read(DMADevice *dev,
+                                   dma_addr_t addr,
+                                   void *buf,
+                                   dma_addr_t len)
+{
+    dma_memory_rw(dev, addr, buf, len, 0);
+}
+
+static inline void dma_memory_write(DMADevice *dev,
+                                    dma_addr_t addr,
+                                    const void *buf,
+                                    dma_addr_t len)
+{
+    dma_memory_rw(dev, addr, (void *) buf, len, 1);
+}
+
+void *dma_memory_map(DMADevice *dev,
+                     DMAInvalidateMapFunc *cb,
+                     void *opaque,
+                     dma_addr_t addr,
+                     dma_addr_t *len,
+                     int is_write);
+void dma_memory_unmap(DMADevice *dev,
+                      void *buffer,
+                      dma_addr_t len,
+                      int is_write,
+                      dma_addr_t access_len);
+
+void dma_invalidate_memory_range(DMADevice *dev,
+                                 dma_addr_t addr,
+                                 dma_addr_t len);
+
+
+/*
+ * All the following macro magic tries to do is
+ * achieve some type safety and avoid duplication.
+ */
+
+#define DEFINE_DMA_LD(prefix, suffix, devtype, dmafield, size)            \
+static inline uint##size##_t                                              \
+dma_ld##suffix(DMADevice *dev, dma_addr_t addr)                           \
+{                                                                         \
+    int err;                                                              \
+    dma_addr_t paddr, plen;                                               \
+                                                                          \
+    if (!dev || !dev->mmu) {                                              \
+        return ld##suffix##_phys(addr);                                   \
+    }                                                                     \
+                                                                          \
+    err = dev->mmu->translate(dev, addr, &paddr, &plen, 0);               \
+    if (err || (plen < size / 8)) {                                       \
+        return 0;                                                         \
+    }                                                                     \
+                                                                          \
+    return ld##suffix##_phys(paddr);                                      \
+}
+
+#define DEFINE_DMA_ST(prefix, suffix, devtype, dmafield, size)            \
+static inline void                                                        \
+dma_st##suffix(DMADevice *dev, dma_addr_t addr, uint##size##_t val)       \
+{                                                                         \
+    int err;                                                              \
+    target_phys_addr_t paddr, plen;                                       \
+                                                                          \
+    if (!dev || !dev->mmu) {                                              \
+        st##suffix##_phys(addr, val);                                     \
+        return;                                                           \
+    }                                                                     \
+    err = dev->mmu->translate(dev, addr, &paddr, &plen, 1);               \
+    if (err || (plen < size / 8)) {                                       \
+        return;                                                           \
+    }                                                                     \
+                                                                          \
+    st##suffix##_phys(paddr, val);                                        \
+}
+
+#define DEFINE_DMA_MEMORY_RW(prefix, devtype, dmafield)
+#define DEFINE_DMA_MEMORY_READ(prefix, devtype, dmafield)
+#define DEFINE_DMA_MEMORY_WRITE(prefix, devtype, dmafield)
+
+#define DEFINE_DMA_OPS(prefix, devtype, dmafield)          \
+    /*                                                     \
+     * FIXME: find a way to handle these:                  \
+     * DEFINE_DMA_LD(prefix, ub, devtype, dmafield, 8)     \
+     * DEFINE_DMA_LD(prefix, uw, devtype, dmafield, 16)    \
+     */                                                    \
+    DEFINE_DMA_LD(prefix, l, devtype, dmafield, 32)        \
+    DEFINE_DMA_LD(prefix, q, devtype, dmafield, 64)        \
+                                                           \
+    DEFINE_DMA_ST(prefix, b, devtype, dmafield, 8)         \
+    DEFINE_DMA_ST(prefix, w, devtype, dmafield, 16)        \
+    DEFINE_DMA_ST(prefix, l, devtype, dmafield, 32)        \
+    DEFINE_DMA_ST(prefix, q, devtype, dmafield, 64)        \
+                                                           \
+    DEFINE_DMA_MEMORY_RW(prefix, devtype, dmafield)        \
+    DEFINE_DMA_MEMORY_READ(prefix, devtype, dmafield)      \
+    DEFINE_DMA_MEMORY_WRITE(prefix, devtype, dmafield)
+
+DEFINE_DMA_OPS(UNUSED, UNUSED, UNUSED)
+
+/*
+ * From here on, various bus interfaces can use DEFINE_DMA_OPS
+ * to summon their own personalized clone of the DMA interface.
+ */
+
+#undef DEFINE_DMA_LD
+#undef DEFINE_DMA_ST
+#undef DEFINE_DMA_MEMORY_RW
+#undef DEFINE_DMA_MEMORY_READ
+#undef DEFINE_DMA_MEMORY_WRITE
+
+#define DEFINE_DMA_LD(prefix, suffix, devtype, dma_field, size)         \
+static inline uint##size##_t                                            \
+prefix##_ld##suffix(devtype *dev, dma_addr_t addr)                      \
+{                                                                       \
+    return dma_ld##suffix(&dev->dma_field, addr);                       \
+}
+
+#define DEFINE_DMA_ST(prefix, suffix, devtype, dma_field, size)         \
+static inline void                                                      \
+prefix##_st##suffix(devtype *dev, dma_addr_t addr, uint##size##_t val)  \
+{                                                                       \
+    dma_st##suffix(&dev->dma_field, addr, val);                         \
+}
+
+#define DEFINE_DMA_MEMORY_RW(prefix, devtype, dmafield)                 \
+static inline void prefix##_memory_rw(devtype *dev,                     \
+                                      dma_addr_t addr,                  \
+                                      void *buf,                        \
+                                      dma_addr_t len,                   \
+                                      int is_write)                     \
+{                                                                       \
+    dma_memory_rw(&dev->dmafield, addr, buf, len, is_write);            \
+}
+
+#define DEFINE_DMA_MEMORY_READ(prefix, devtype, dmafield)               \
+static inline void prefix##_memory_read(devtype *dev,                   \
+                                        dma_addr_t addr,                \
+                                        void *buf,                      \
+                                        dma_addr_t len)                 \
+{                                                                       \
+    dma_memory_read(&dev->dmafield, addr, buf, len);                    \
+}
+
+#define DEFINE_DMA_MEMORY_WRITE(prefix, devtype, dmafield)              \
+static inline void prefix##_memory_write(devtype *dev,                  \
+                                         dma_addr_t addr,               \
+                                         const void *buf,               \
+                                         dma_addr_t len)                \
+{                                                                       \
+    dma_memory_write(&dev->dmafield, addr, buf, len);                   \
+}
+
+#endif
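
The chunked loop in dma_memory_rw_iommu() can be tried in isolation with a toy model. The sketch below is not the patch's code: the 16-byte pages, the `sketch_map` page table, and the `sketch_phys` array standing in for guest memory are all assumptions. Unlike the patch, which returns void and stops silently when translation fails, this version returns an error code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t dma_addr_t;

#define SKETCH_PAGE_SIZE 16u
#define SKETCH_NPAGES    4u

/* Stand-in for physical memory. */
static uint8_t sketch_phys[SKETCH_PAGE_SIZE * SKETCH_NPAGES];

/* Hypothetical page table: bus page i maps to physical page sketch_map[i]. */
static const unsigned sketch_map[SKETCH_NPAGES] = { 2, 0, 3, 1 };

/* Translate one address; *len is the number of bytes left in that page. */
static int sketch_translate(dma_addr_t addr, dma_addr_t *paddr,
                            dma_addr_t *len)
{
    dma_addr_t page = addr / SKETCH_PAGE_SIZE;
    dma_addr_t off = addr % SKETCH_PAGE_SIZE;

    if (page >= SKETCH_NPAGES) {
        return -1;
    }
    *paddr = (dma_addr_t)sketch_map[page] * SKETCH_PAGE_SIZE + off;
    *len = SKETCH_PAGE_SIZE - off;
    return 0;
}

/* Copy in chunks, re-translating each time a chunk is used up. */
static int sketch_dma_rw(dma_addr_t addr, void *buf, dma_addr_t len,
                         int is_write)
{
    uint8_t *p = buf;

    while (len) {
        dma_addr_t paddr, plen;

        if (sketch_translate(addr, &paddr, &plen)) {
            return -1;
        }
        /* The translation may be valid for more than is still needed. */
        if (plen > len) {
            plen = len;
        }
        if (is_write) {
            memcpy(&sketch_phys[paddr], p, plen);
        } else {
            memcpy(p, &sketch_phys[paddr], plen);
        }
        len -= plen;
        addr += plen;
        p += plen;
    }
    return 0;
}
```

A transfer that is contiguous in bus address space can land in scattered physical pages, which is why the loop translates again at every chunk boundary. The sketch advances a `uint8_t *` cursor rather than doing arithmetic on a `void *`, which is a GNU extension.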