Patchwork [4/8] Add universal DMA helper functions

login
register
mail settings
Submitter David Gibson
Date Feb. 24, 2012, 3:27 a.m.
Message ID <1330054063-16860-5-git-send-email-david@gibson.dropbear.id.au>
Download mbox | patch
Permalink /patch/142737/
State New
Headers show

Comments

David Gibson - Feb. 24, 2012, 3:27 a.m.
Not that long ago, every device implementation using DMA directly
accessed guest memory using cpu_physical_memory_*().  This meant that
adding support for a guest visible IOMMU would require changing every
one of these devices to go through IOMMU translation.

Shortly before qemu 1.0, I made a start on fixing this by providing
helper functions for PCI DMA.  These are currently just stubs which
call the direct access functions, but mean that an IOMMU can be
implemented in one place, rather than for every PCI device.

Clearly, this doesn't help for non PCI devices, which could also be
IOMMU translated on some platforms.  It is also problematic for the
devices which have both PCI and non-PCI version (e.g. OHCI, AHCI) - we
cannot use the the pci_dma_*() functions, because they assume the
presence of a PCIDevice, but we don't want to have to check between
pci_dma_*() and cpu_physical_memory_*() every time we do a DMA in the
device code.

This patch makes the first step on addressing both these problems, by
introducing new (stub) dma helper functions which can be used for any
DMA capable device.

These dma functions take a DMAContext *, a new (currently empty)
variable describing the DMA address space in which the operation is to
take place.  NULL indicates untranslated DMA directly into guest
physical address space.  The intention is that in future non-NULL
values will given information about any necessary IOMMU translation.

DMA using devices must obtain a DMAContext (or, potentially, contexts)
from their bus or platform.  For now this patch just converts the PCI
wrappers to be implemented in terms of the universal wrappers,
converting other drivers can take place over time.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Joerg Rodel <Joerg.Rodel@amd.com>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Richard Henderson <rth@twiddle.net>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 dma.h    |   87 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/pci.h |   21 ++++++++------
 2 files changed, 99 insertions(+), 9 deletions(-)
Michael S. Tsirkin - Feb. 26, 2012, 10:04 a.m.
On Fri, Feb 24, 2012 at 02:27:39PM +1100, David Gibson wrote:
> Not that long ago, every device implementation using DMA directly
> accessed guest memory using cpu_physical_memory_*().  This meant that
> adding support for a guest visible IOMMU would require changing every
> one of these devices to go through IOMMU translation.
> 
> Shortly before qemu 1.0, I made a start on fixing this by providing
> helper functions for PCI DMA.  These are currently just stubs which
> call the direct access functions, but mean that an IOMMU can be
> implemented in one place, rather than for every PCI device.
> 
> Clearly, this doesn't help for non PCI devices, which could also be
> IOMMU translated on some platforms.  It is also problematic for the
> devices which have both PCI and non-PCI version (e.g. OHCI, AHCI) - we
> cannot use the the pci_dma_*() functions, because they assume the
> presence of a PCIDevice, but we don't want to have to check between
> pci_dma_*() and cpu_physical_memory_*() every time we do a DMA in the
> device code.
> 
> This patch makes the first step on addressing both these problems, by
> introducing new (stub) dma helper functions which can be used for any
> DMA capable device.
> 
> These dma functions take a DMAContext *, a new (currently empty)
> variable describing the DMA address space in which the operation is to
> take place.  NULL indicates untranslated DMA directly into guest
> physical address space.  The intention is that in future non-NULL
> values will given information about any necessary IOMMU translation.
> 
> DMA using devices must obtain a DMAContext (or, potentially, contexts)
> from their bus or platform.  For now this patch just converts the PCI
> wrappers to be implemented in terms of the universal wrappers,
> converting other drivers can take place over time.
> 
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Joerg Rodel <Joerg.Rodel@amd.com>
> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> Cc: Richard Henderson <rth@twiddle.net>
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

I'm a bit confused with all the stubbing going on.
Is this the final form of the pci_* functions or just
a stub? If the final form, we probably should just
open-code them - they don't buy us much.
If not, let's add a comment?

> ---
>  dma.h    |   87 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/pci.h |   21 ++++++++------
>  2 files changed, 99 insertions(+), 9 deletions(-)
> 
> diff --git a/dma.h b/dma.h
> index 79be131..d7428df 100644
> --- a/dma.h
> +++ b/dma.h
> @@ -28,6 +28,93 @@ typedef enum {
>      DMA_DIRECTION_FROM_DEVICE = 1,
>  } DMADirection;
>  
> +typedef struct DMAContext {

why do we need the empty struct? Someone
will allocate an instance of it?
If not,
typedef struct DMAContext DMAContext;
in qemu-common.h would be enough.

> +
> +typedef void DMAInvalidateMapFunc(void *);
> +
> +static inline int dma_memory_rw(DMAContext *dma, dma_addr_t addr,
> +                                void *buf, dma_addr_t len, DMADirection dir)
> +{
> +    cpu_physical_memory_rw(addr, buf, (target_phys_addr_t)len,
> +                           dir == DMA_DIRECTION_FROM_DEVICE);
> +    return 0;
> +}
> +
> +static inline int dma_memory_read(DMAContext *dma, dma_addr_t addr,
> +                                  void *buf, dma_addr_t len)
> +{
> +    return dma_memory_rw(dma, addr, buf, len, DMA_DIRECTION_TO_DEVICE);
> +}
> +
> +static inline int dma_memory_write(DMAContext *dma, dma_addr_t addr,
> +                                   const void *buf, dma_addr_t len)
> +{
> +    return dma_memory_rw(dma, addr, (void *)buf, len,
> +                         DMA_DIRECTION_FROM_DEVICE);
> +}
> +
> +static inline void *dma_memory_map(DMAContext *dma,
> +                                   DMAInvalidateMapFunc *cb, void *opaque,
> +                                   dma_addr_t addr, dma_addr_t *len,
> +                                   DMADirection dir)
> +{
> +    target_phys_addr_t xlen = *len;
> +    void *p;
> +
> +    p = cpu_physical_memory_map(addr, &xlen,
> +                                dir == DMA_DIRECTION_FROM_DEVICE);
> +    *len = xlen;
> +    return p;
> +}
> +
> +static inline void dma_memory_unmap(DMAContext *dma,
> +                                    void *buffer, dma_addr_t len,
> +                                    DMADirection dir, dma_addr_t access_len)
> +{
> +    return cpu_physical_memory_unmap(buffer, (target_phys_addr_t)len,
> +                                     dir == DMA_DIRECTION_FROM_DEVICE,
> +                                     access_len);
> +}
> +
> +#define DEFINE_LDST_DMA(_lname, _sname, _bits, _end) \
> +    static inline uint##_bits##_t ld##_lname##_##_end##_dma(DMAContext *dma, \
> +                                                            dma_addr_t addr) \
> +    {                                                                   \
> +        uint##_bits##_t val;                                            \
> +        dma_memory_read(dma, addr, &val, (_bits) / 8);                  \
> +        return _end##_bits##_to_cpu(val);                               \
> +    }                                                                   \
> +    static inline void st##_sname##_##_end##_dma(DMAContext *dma,       \
> +                                                 dma_addr_t addr,       \
> +                                                 uint##_bits##_t val)   \
> +    {                                                                   \
> +        val = cpu_to_##_end##_bits(val);                                \
> +        dma_memory_write(dma, addr, &val, (_bits) / 8);                 \
> +    }
> +
> +static inline uint8_t ldub_dma(DMAContext *dma, dma_addr_t addr)
> +{
> +    uint8_t val;
> +
> +    dma_memory_read(dma, addr, &val, 1);
> +    return val;
> +}
> +
> +static inline void stb_dma(DMAContext *dma, dma_addr_t addr, uint8_t val)
> +{
> +    dma_memory_write(dma, addr, &val, 1);
> +}
> +
> +DEFINE_LDST_DMA(uw, w, 16, le);
> +DEFINE_LDST_DMA(l, l, 32, le);
> +DEFINE_LDST_DMA(q, q, 64, le);
> +DEFINE_LDST_DMA(uw, w, 16, be);
> +DEFINE_LDST_DMA(l, l, 32, be);
> +DEFINE_LDST_DMA(q, q, 64, be);
> +
> +#undef DEFINE_LDST_DMA
> +
>  struct ScatterGatherEntry {
>      dma_addr_t base;
>      dma_addr_t len;
> diff --git a/hw/pci.h b/hw/pci.h
> index ee53f26..64734a1 100644
> --- a/hw/pci.h
> +++ b/hw/pci.h
> @@ -544,10 +544,15 @@ static inline uint32_t pci_config_size(const PCIDevice *d)
>  }
>  
>  /* DMA access functions */
> +static inline DMAContext *pci_dma_context(PCIDevice *dev)
> +{
> +    return NULL;

This is a stub, right? Pls add a comment.

> +}
> +
>  static inline int pci_dma_rw(PCIDevice *dev, dma_addr_t addr,
>                               void *buf, dma_addr_t len, DMADirection dir)
>  {
> -    cpu_physical_memory_rw(addr, buf, len, dir == DMA_DIRECTION_FROM_DEVICE);
> +    dma_memory_rw(pci_dma_context(dev), addr, buf, len, dir);
>      return 0;
>  }
>  
> @@ -567,12 +572,12 @@ static inline int pci_dma_write(PCIDevice *dev, dma_addr_t addr,
>      static inline uint##_bits##_t ld##_l##_pci_dma(PCIDevice *dev,      \
>                                                     dma_addr_t addr)     \
>      {                                                                   \
> -        return ld##_l##_phys(addr);                                     \
> +        return ld##_l##_dma(pci_dma_context(dev), addr);                \
>      }                                                                   \
>      static inline void st##_s##_pci_dma(PCIDevice *dev,                 \
> -                          dma_addr_t addr, uint##_bits##_t val)         \
> +                                        dma_addr_t addr, uint##_bits##_t val) \
>      {                                                                   \
> -        st##_s##_phys(addr, val);                                       \
> +        st##_s##_dma(pci_dma_context(dev), addr, val);                  \
>      }
>  
>  PCI_DMA_DEFINE_LDST(ub, b, 8);
> @@ -586,21 +591,19 @@ PCI_DMA_DEFINE_LDST(q_be, q_be, 64);
>  #undef PCI_DMA_DEFINE_LDST
>  
>  static inline void *pci_dma_map(PCIDevice *dev, dma_addr_t addr,
> +                                DMAInvalidateMapFunc *cb, void *opaque,
>                                  dma_addr_t *plen, DMADirection dir)
>  {
> -    target_phys_addr_t len = *plen;
>      void *buf;
>  
> -    buf = cpu_physical_memory_map(addr, &len, dir == DMA_DIRECTION_FROM_DEVICE);
> -    *plen = len;
> +    buf = dma_memory_map(pci_dma_context(dev), cb, opaque, addr, plen, dir);
>      return buf;
>  }
>  
>  static inline void pci_dma_unmap(PCIDevice *dev, void *buffer, dma_addr_t len,
>                                   DMADirection dir, dma_addr_t access_len)
>  {
> -    cpu_physical_memory_unmap(buffer, len, dir == DMA_DIRECTION_FROM_DEVICE,
> -                              access_len);
> +    dma_memory_unmap(pci_dma_context(dev), buffer, len, dir, access_len);
>  }
>  
>  static inline void pci_dma_sglist_init(QEMUSGList *qsg, PCIDevice *dev,
> -- 
> 1.7.9
Eduard - Gabriel Munteanu - Feb. 26, 2012, 9:09 p.m.
On Sun, Feb 26, 2012 at 12:04:49PM +0200, Michael S. Tsirkin wrote:
> On Fri, Feb 24, 2012 at 02:27:39PM +1100, David Gibson wrote:
> > Not that long ago, every device implementation using DMA directly
> > accessed guest memory using cpu_physical_memory_*().  This meant that
> > adding support for a guest visible IOMMU would require changing every
> > one of these devices to go through IOMMU translation.
> > 
> > Shortly before qemu 1.0, I made a start on fixing this by providing
> > helper functions for PCI DMA.  These are currently just stubs which
> > call the direct access functions, but mean that an IOMMU can be
> > implemented in one place, rather than for every PCI device.
> > 
> > Clearly, this doesn't help for non PCI devices, which could also be
> > IOMMU translated on some platforms.  It is also problematic for the
> > devices which have both PCI and non-PCI version (e.g. OHCI, AHCI) - we
> > cannot use the the pci_dma_*() functions, because they assume the
> > presence of a PCIDevice, but we don't want to have to check between
> > pci_dma_*() and cpu_physical_memory_*() every time we do a DMA in the
> > device code.
> > 
> > This patch makes the first step on addressing both these problems, by
> > introducing new (stub) dma helper functions which can be used for any
> > DMA capable device.
> > 
> > These dma functions take a DMAContext *, a new (currently empty)
> > variable describing the DMA address space in which the operation is to
> > take place.  NULL indicates untranslated DMA directly into guest
> > physical address space.  The intention is that in future non-NULL
> > values will given information about any necessary IOMMU translation.
> > 
> > DMA using devices must obtain a DMAContext (or, potentially, contexts)
> > from their bus or platform.  For now this patch just converts the PCI
> > wrappers to be implemented in terms of the universal wrappers,
> > converting other drivers can take place over time.
> > 
> > Cc: Michael S. Tsirkin <mst@redhat.com>
> > Cc: Joerg Rodel <Joerg.Rodel@amd.com>
> > Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> > Cc: Richard Henderson <rth@twiddle.net>
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> 
> I'm a bit confused with all the stubbing going on.
> Is this the final form of the pci_* functions or just
> a stub? If the final form, we probably should just
> open-code them - they don't buy us much.
> If not, let's add a comment?
> 

It is a stub. The final form needs to deal with address translation,
permissions checking and invalidating memory maps.

[snip]


	Eduard
Eduard - Gabriel Munteanu - Feb. 26, 2012, 9:24 p.m.
On Fri, Feb 24, 2012 at 02:27:39PM +1100, David Gibson wrote:
> Not that long ago, every device implementation using DMA directly
> accessed guest memory using cpu_physical_memory_*().  This meant that
> adding support for a guest visible IOMMU would require changing every
> one of these devices to go through IOMMU translation.
> 
> Shortly before qemu 1.0, I made a start on fixing this by providing
> helper functions for PCI DMA.  These are currently just stubs which
> call the direct access functions, but mean that an IOMMU can be
> implemented in one place, rather than for every PCI device.
> 
> Clearly, this doesn't help for non PCI devices, which could also be
> IOMMU translated on some platforms.  It is also problematic for the
> devices which have both PCI and non-PCI version (e.g. OHCI, AHCI) - we
> cannot use the the pci_dma_*() functions, because they assume the
> presence of a PCIDevice, but we don't want to have to check between
> pci_dma_*() and cpu_physical_memory_*() every time we do a DMA in the
> device code.
> 
> This patch makes the first step on addressing both these problems, by
> introducing new (stub) dma helper functions which can be used for any
> DMA capable device.
> 
> These dma functions take a DMAContext *, a new (currently empty)
> variable describing the DMA address space in which the operation is to
> take place.  NULL indicates untranslated DMA directly into guest
> physical address space.  The intention is that in future non-NULL
> values will given information about any necessary IOMMU translation.
> 
> DMA using devices must obtain a DMAContext (or, potentially, contexts)
> from their bus or platform.  For now this patch just converts the PCI
> wrappers to be implemented in terms of the universal wrappers,
> converting other drivers can take place over time.
> 
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Joerg Rodel <Joerg.Rodel@amd.com>
> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> Cc: Richard Henderson <rth@twiddle.net>
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Hi,

Thanks for pushing this forward. I got caught up in other stuff, perhaps
I'll be able to beat you to submitting the actual implementation soon. :-)

[snip]


	Eduard
David Gibson - Feb. 27, 2012, 12:22 a.m.
On Sun, Feb 26, 2012 at 12:04:49PM +0200, Michael S. Tsirkin wrote:
> On Fri, Feb 24, 2012 at 02:27:39PM +1100, David Gibson wrote:
> > Not that long ago, every device implementation using DMA directly
> > accessed guest memory using cpu_physical_memory_*().  This meant that
> > adding support for a guest visible IOMMU would require changing every
> > one of these devices to go through IOMMU translation.
> > 
> > Shortly before qemu 1.0, I made a start on fixing this by providing
> > helper functions for PCI DMA.  These are currently just stubs which
> > call the direct access functions, but mean that an IOMMU can be
> > implemented in one place, rather than for every PCI device.
> > 
> > Clearly, this doesn't help for non PCI devices, which could also be
> > IOMMU translated on some platforms.  It is also problematic for the
> > devices which have both PCI and non-PCI version (e.g. OHCI, AHCI) - we
> > cannot use the the pci_dma_*() functions, because they assume the
> > presence of a PCIDevice, but we don't want to have to check between
> > pci_dma_*() and cpu_physical_memory_*() every time we do a DMA in the
> > device code.
> > 
> > This patch makes the first step on addressing both these problems, by
> > introducing new (stub) dma helper functions which can be used for any
> > DMA capable device.
> > 
> > These dma functions take a DMAContext *, a new (currently empty)
> > variable describing the DMA address space in which the operation is to
> > take place.  NULL indicates untranslated DMA directly into guest
> > physical address space.  The intention is that in future non-NULL
> > values will given information about any necessary IOMMU translation.
> > 
> > DMA using devices must obtain a DMAContext (or, potentially, contexts)
> > from their bus or platform.  For now this patch just converts the PCI
> > wrappers to be implemented in terms of the universal wrappers,
> > converting other drivers can take place over time.
> > 
> > Cc: Michael S. Tsirkin <mst@redhat.com>
> > Cc: Joerg Rodel <Joerg.Rodel@amd.com>
> > Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> > Cc: Richard Henderson <rth@twiddle.net>
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> 
> I'm a bit confused with all the stubbing going on.
> Is this the final form of the pci_* functions or just
> a stub? If the final form, we probably should just
> open-code them - they don't buy us much.
> If not, let's add a comment?

Well.. it's the intended final form of pci_dma_*() - which do become
trivial wrappers, yes.  It's _not_ the intended final form of dma_*(),
which need to grow code to do actual IOMMU translation.  I'll add a
comment about this in the next round.
Michael S. Tsirkin - Feb. 27, 2012, 10:33 a.m.
On Mon, Feb 27, 2012 at 11:22:43AM +1100, David Gibson wrote:
> On Sun, Feb 26, 2012 at 12:04:49PM +0200, Michael S. Tsirkin wrote:
> > On Fri, Feb 24, 2012 at 02:27:39PM +1100, David Gibson wrote:
> > > Not that long ago, every device implementation using DMA directly
> > > accessed guest memory using cpu_physical_memory_*().  This meant that
> > > adding support for a guest visible IOMMU would require changing every
> > > one of these devices to go through IOMMU translation.
> > > 
> > > Shortly before qemu 1.0, I made a start on fixing this by providing
> > > helper functions for PCI DMA.  These are currently just stubs which
> > > call the direct access functions, but mean that an IOMMU can be
> > > implemented in one place, rather than for every PCI device.
> > > 
> > > Clearly, this doesn't help for non PCI devices, which could also be
> > > IOMMU translated on some platforms.  It is also problematic for the
> > > devices which have both PCI and non-PCI version (e.g. OHCI, AHCI) - we
> > > cannot use the the pci_dma_*() functions, because they assume the
> > > presence of a PCIDevice, but we don't want to have to check between
> > > pci_dma_*() and cpu_physical_memory_*() every time we do a DMA in the
> > > device code.
> > > 
> > > This patch makes the first step on addressing both these problems, by
> > > introducing new (stub) dma helper functions which can be used for any
> > > DMA capable device.
> > > 
> > > These dma functions take a DMAContext *, a new (currently empty)
> > > variable describing the DMA address space in which the operation is to
> > > take place.  NULL indicates untranslated DMA directly into guest
> > > physical address space.  The intention is that in future non-NULL
> > > values will given information about any necessary IOMMU translation.
> > > 
> > > DMA using devices must obtain a DMAContext (or, potentially, contexts)
> > > from their bus or platform.  For now this patch just converts the PCI
> > > wrappers to be implemented in terms of the universal wrappers,
> > > converting other drivers can take place over time.
> > > 
> > > Cc: Michael S. Tsirkin <mst@redhat.com>
> > > Cc: Joerg Rodel <Joerg.Rodel@amd.com>
> > > Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> > > Cc: Richard Henderson <rth@twiddle.net>
> > > 
> > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > 
> > I'm a bit confused with all the stubbing going on.
> > Is this the final form of the pci_* functions or just
> > a stub? If the final form, we probably should just
> > open-code them - they don't buy us much.
> > If not, let's add a comment?
> 
> Well.. it's the intended final form of pci_dma_*() - which do become
> trivial wrappers, yes.

I'd say let's drop them then (in a follow-up patch).  The topic is
confusing enough without having to wade through layers of wrappers :)

> It's _not_ the intended final form of dma_*(),
> which need to grow code to do actual IOMMU translation.  I'll add a
> comment about this in the next round.
> 
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson
Eduard - Gabriel Munteanu - Feb. 27, 2012, 11:55 a.m.
On Mon, Feb 27, 2012 at 12:33:49PM +0200, Michael S. Tsirkin wrote:
> On Mon, Feb 27, 2012 at 11:22:43AM +1100, David Gibson wrote:
> > On Sun, Feb 26, 2012 at 12:04:49PM +0200, Michael S. Tsirkin wrote:
> > > On Fri, Feb 24, 2012 at 02:27:39PM +1100, David Gibson wrote:

[snip]

> > > 
> > > I'm a bit confused with all the stubbing going on.
> > > Is this the final form of the pci_* functions or just
> > > a stub? If the final form, we probably should just
> > > open-code them - they don't buy us much.
> > > If not, let's add a comment?
> > 
> > Well.. it's the intended final form of pci_dma_*() - which do become
> > trivial wrappers, yes.
> 
> I'd say let's drop them then (in a follow-up patch).  The topic is
> confusing enough without having to wade through layers of wrappers :)
>

Drop them how? Using dma_* stuff directly? That might work, but I
remember others suggesting we should use a specialized PCI wrapper.
Perhaps it makes sense if some other bus, or PCI itself at some point,
needs to do something special.

> > It's _not_ the intended final form of dma_*(),
> > which need to grow code to do actual IOMMU translation.  I'll add a
> > comment about this in the next round.
> > 
> > -- 
> > David Gibson			| I'll have my music baroque, and my code
> > david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> > 				| _way_ _around_!
> > http://www.ozlabs.org/~dgibson

Patch

diff --git a/dma.h b/dma.h
index 79be131..d7428df 100644
--- a/dma.h
+++ b/dma.h
@@ -28,6 +28,93 @@  typedef enum {
     DMA_DIRECTION_FROM_DEVICE = 1,
 } DMADirection;
 
+typedef struct DMAContext {
+} DMAContext;
+
+typedef void DMAInvalidateMapFunc(void *);
+
+static inline int dma_memory_rw(DMAContext *dma, dma_addr_t addr,
+                                void *buf, dma_addr_t len, DMADirection dir)
+{
+    cpu_physical_memory_rw(addr, buf, (target_phys_addr_t)len,
+                           dir == DMA_DIRECTION_FROM_DEVICE);
+    return 0;
+}
+
+static inline int dma_memory_read(DMAContext *dma, dma_addr_t addr,
+                                  void *buf, dma_addr_t len)
+{
+    return dma_memory_rw(dma, addr, buf, len, DMA_DIRECTION_TO_DEVICE);
+}
+
+static inline int dma_memory_write(DMAContext *dma, dma_addr_t addr,
+                                   const void *buf, dma_addr_t len)
+{
+    return dma_memory_rw(dma, addr, (void *)buf, len,
+                         DMA_DIRECTION_FROM_DEVICE);
+}
+
+static inline void *dma_memory_map(DMAContext *dma,
+                                   DMAInvalidateMapFunc *cb, void *opaque,
+                                   dma_addr_t addr, dma_addr_t *len,
+                                   DMADirection dir)
+{
+    target_phys_addr_t xlen = *len;
+    void *p;
+
+    p = cpu_physical_memory_map(addr, &xlen,
+                                dir == DMA_DIRECTION_FROM_DEVICE);
+    *len = xlen;
+    return p;
+}
+
+static inline void dma_memory_unmap(DMAContext *dma,
+                                    void *buffer, dma_addr_t len,
+                                    DMADirection dir, dma_addr_t access_len)
+{
+    return cpu_physical_memory_unmap(buffer, (target_phys_addr_t)len,
+                                     dir == DMA_DIRECTION_FROM_DEVICE,
+                                     access_len);
+}
+
+#define DEFINE_LDST_DMA(_lname, _sname, _bits, _end) \
+    static inline uint##_bits##_t ld##_lname##_##_end##_dma(DMAContext *dma, \
+                                                            dma_addr_t addr) \
+    {                                                                   \
+        uint##_bits##_t val;                                            \
+        dma_memory_read(dma, addr, &val, (_bits) / 8);                  \
+        return _end##_bits##_to_cpu(val);                               \
+    }                                                                   \
+    static inline void st##_sname##_##_end##_dma(DMAContext *dma,       \
+                                                 dma_addr_t addr,       \
+                                                 uint##_bits##_t val)   \
+    {                                                                   \
+        val = cpu_to_##_end##_bits(val);                                \
+        dma_memory_write(dma, addr, &val, (_bits) / 8);                 \
+    }
+
+static inline uint8_t ldub_dma(DMAContext *dma, dma_addr_t addr)
+{
+    uint8_t val;
+
+    dma_memory_read(dma, addr, &val, 1);
+    return val;
+}
+
+static inline void stb_dma(DMAContext *dma, dma_addr_t addr, uint8_t val)
+{
+    dma_memory_write(dma, addr, &val, 1);
+}
+
+DEFINE_LDST_DMA(uw, w, 16, le);
+DEFINE_LDST_DMA(l, l, 32, le);
+DEFINE_LDST_DMA(q, q, 64, le);
+DEFINE_LDST_DMA(uw, w, 16, be);
+DEFINE_LDST_DMA(l, l, 32, be);
+DEFINE_LDST_DMA(q, q, 64, be);
+
+#undef DEFINE_LDST_DMA
+
 struct ScatterGatherEntry {
     dma_addr_t base;
     dma_addr_t len;
diff --git a/hw/pci.h b/hw/pci.h
index ee53f26..64734a1 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -544,10 +544,15 @@  static inline uint32_t pci_config_size(const PCIDevice *d)
 }
 
 /* DMA access functions */
+static inline DMAContext *pci_dma_context(PCIDevice *dev)
+{
+    return NULL;
+}
+
 static inline int pci_dma_rw(PCIDevice *dev, dma_addr_t addr,
                              void *buf, dma_addr_t len, DMADirection dir)
 {
-    cpu_physical_memory_rw(addr, buf, len, dir == DMA_DIRECTION_FROM_DEVICE);
+    dma_memory_rw(pci_dma_context(dev), addr, buf, len, dir);
     return 0;
 }
 
@@ -567,12 +572,12 @@  static inline int pci_dma_write(PCIDevice *dev, dma_addr_t addr,
     static inline uint##_bits##_t ld##_l##_pci_dma(PCIDevice *dev,      \
                                                    dma_addr_t addr)     \
     {                                                                   \
-        return ld##_l##_phys(addr);                                     \
+        return ld##_l##_dma(pci_dma_context(dev), addr);                \
     }                                                                   \
     static inline void st##_s##_pci_dma(PCIDevice *dev,                 \
-                          dma_addr_t addr, uint##_bits##_t val)         \
+                                        dma_addr_t addr, uint##_bits##_t val) \
     {                                                                   \
-        st##_s##_phys(addr, val);                                       \
+        st##_s##_dma(pci_dma_context(dev), addr, val);                  \
     }
 
 PCI_DMA_DEFINE_LDST(ub, b, 8);
@@ -586,21 +591,19 @@  PCI_DMA_DEFINE_LDST(q_be, q_be, 64);
 #undef PCI_DMA_DEFINE_LDST
 
 static inline void *pci_dma_map(PCIDevice *dev, dma_addr_t addr,
+                                DMAInvalidateMapFunc *cb, void *opaque,
                                 dma_addr_t *plen, DMADirection dir)
 {
-    target_phys_addr_t len = *plen;
     void *buf;
 
-    buf = cpu_physical_memory_map(addr, &len, dir == DMA_DIRECTION_FROM_DEVICE);
-    *plen = len;
+    buf = dma_memory_map(pci_dma_context(dev), cb, opaque, addr, plen, dir);
     return buf;
 }
 
 static inline void pci_dma_unmap(PCIDevice *dev, void *buffer, dma_addr_t len,
                                  DMADirection dir, dma_addr_t access_len)
 {
-    cpu_physical_memory_unmap(buffer, len, dir == DMA_DIRECTION_FROM_DEVICE,
-                              access_len);
+    dma_memory_unmap(pci_dma_context(dev), buffer, len, dir, access_len);
 }
 
 static inline void pci_dma_sglist_init(QEMUSGList *qsg, PCIDevice *dev,