mbox series

[v7,0/7] Add virtio-iommu driver

Message ID 20190115121959.23763-1-jean-philippe.brucker@arm.com
Headers show
Series Add virtio-iommu driver | expand

Message

Jean-Philippe Brucker Jan. 15, 2019, 12:19 p.m. UTC
Implement the virtio-iommu driver, following specification v0.9 [1].

This is a simple rebase onto Linux v5.0-rc2. We now use the
dev_iommu_fwspec_get() helper introduced in v5.0 instead of accessing
dev->iommu_fwspec, but there aren't any functional change from v6 [2].

Our current goal for virtio-iommu is to get a paravirtual IOMMU working
on Arm, and enable device assignment to guest userspace. In this
use-case the mappings are static, and don't require optimal performance,
so this series tries to keep things simple. However there is plenty more
to do for features and optimizations, and having this base in v5.1 would
be good. Given that most of the changes are to drivers/iommu, I believe
the driver and future changes should go via the IOMMU tree.

You can find Linux driver and kvmtool device on v0.9.2 branches [3],
module and x86 support on virtio-iommu/devel. Also tested with Eric's
QEMU device [4]. Please note that the series depends on Robin's
probe-deferral fix [5], which will hopefully land in v5.0.

[1] Virtio-iommu specification v0.9, sources and pdf
    git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9
    http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf

[2] [PATCH v6 0/7] Add virtio-iommu driver
    https://lists.linuxfoundation.org/pipermail/iommu/2018-December/032127.html

[3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9.2
    git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9.2

[4] [RFC v9 00/17] VIRTIO-IOMMU device
    https://www.mail-archive.com/qemu-devel@nongnu.org/msg575578.html

[5] [PATCH] iommu/of: Fix probe-deferral
    https://www.spinics.net/lists/arm-kernel/msg698371.html

Jean-Philippe Brucker (7):
  dt-bindings: virtio-mmio: Add IOMMU description
  dt-bindings: virtio: Add virtio-pci-iommu node
  of: Allow the iommu-map property to omit untranslated devices
  PCI: OF: Initialize dev->fwnode appropriately
  iommu: Add virtio-iommu driver
  iommu/virtio: Add probe request
  iommu/virtio: Add event queue

 .../devicetree/bindings/virtio/iommu.txt      |   66 +
 .../devicetree/bindings/virtio/mmio.txt       |   30 +
 MAINTAINERS                                   |    7 +
 drivers/iommu/Kconfig                         |   11 +
 drivers/iommu/Makefile                        |    1 +
 drivers/iommu/virtio-iommu.c                  | 1158 +++++++++++++++++
 drivers/of/base.c                             |   10 +-
 drivers/pci/of.c                              |    7 +
 include/uapi/linux/virtio_ids.h               |    1 +
 include/uapi/linux/virtio_iommu.h             |  161 +++
 10 files changed, 1449 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt
 create mode 100644 drivers/iommu/virtio-iommu.c
 create mode 100644 include/uapi/linux/virtio_iommu.h

Comments

Michael S. Tsirkin Jan. 18, 2019, 3:51 p.m. UTC | #1
On Tue, Jan 15, 2019 at 12:19:52PM +0000, Jean-Philippe Brucker wrote:
> Implement the virtio-iommu driver, following specification v0.9 [1].
> 
> This is a simple rebase onto Linux v5.0-rc2. We now use the
> dev_iommu_fwspec_get() helper introduced in v5.0 instead of accessing
> dev->iommu_fwspec, but there aren't any functional change from v6 [2].
> 
> Our current goal for virtio-iommu is to get a paravirtual IOMMU working
> on Arm, and enable device assignment to guest userspace. In this
> use-case the mappings are static, and don't require optimal performance,
> so this series tries to keep things simple. However there is plenty more
> to do for features and optimizations, and having this base in v5.1 would
> be good. Given that most of the changes are to drivers/iommu, I believe
> the driver and future changes should go via the IOMMU tree.
> 
> You can find Linux driver and kvmtool device on v0.9.2 branches [3],
> module and x86 support on virtio-iommu/devel. Also tested with Eric's
> QEMU device [4]. Please note that the series depends on Robin's
> probe-deferral fix [5], which will hopefully land in v5.0.
> 
> [1] Virtio-iommu specification v0.9, sources and pdf
>     git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9
>     http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf
> 
> [2] [PATCH v6 0/7] Add virtio-iommu driver
>     https://lists.linuxfoundation.org/pipermail/iommu/2018-December/032127.html
> 
> [3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9.2
>     git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9.2
> 
> [4] [RFC v9 00/17] VIRTIO-IOMMU device
>     https://www.mail-archive.com/qemu-devel@nongnu.org/msg575578.html
> 
> [5] [PATCH] iommu/of: Fix probe-deferral
>     https://www.spinics.net/lists/arm-kernel/msg698371.html

Thanks for the work!
So really my only issue with this is that there's no
way for the IOMMU to describe the devices that it
covers.

As a result that is then done in a platform-specific way.

And this means that for example it does not solve the problem that e.g.
some power people have in that their platform simply does not have a way
to specify which devices are covered by the IOMMU.

Solving that problem would make me much more excited about
this device.

On the other hand I can see that while there have been some
developments most of the code has been stable for quite a while now.

So what I am trying to do right about now, is making a small module that
loads early and pokes at the IOMMU sufficiently to get the data about
which devices use the IOMMU out of it using standard virtio config
space.  IIUC it's claimed to be impossible without messy changes to the
boot sequence.

If I succeed at least on some platforms I'll ask that this design is
worked into this device, minimizing info that goes through DT/ACPI.  If
I see I can't make it in time to meet the next merge window, I plan
merging the existing patches using DT (barring surprises).

As I only have a very small amount of time to spend on this attempt, If
someone else wants to try doing that in parallel, that would be great!


> Jean-Philippe Brucker (7):
>   dt-bindings: virtio-mmio: Add IOMMU description
>   dt-bindings: virtio: Add virtio-pci-iommu node
>   of: Allow the iommu-map property to omit untranslated devices
>   PCI: OF: Initialize dev->fwnode appropriately
>   iommu: Add virtio-iommu driver
>   iommu/virtio: Add probe request
>   iommu/virtio: Add event queue
> 
>  .../devicetree/bindings/virtio/iommu.txt      |   66 +
>  .../devicetree/bindings/virtio/mmio.txt       |   30 +
>  MAINTAINERS                                   |    7 +
>  drivers/iommu/Kconfig                         |   11 +
>  drivers/iommu/Makefile                        |    1 +
>  drivers/iommu/virtio-iommu.c                  | 1158 +++++++++++++++++
>  drivers/of/base.c                             |   10 +-
>  drivers/pci/of.c                              |    7 +
>  include/uapi/linux/virtio_ids.h               |    1 +
>  include/uapi/linux/virtio_iommu.h             |  161 +++
>  10 files changed, 1449 insertions(+), 3 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt
>  create mode 100644 drivers/iommu/virtio-iommu.c
>  create mode 100644 include/uapi/linux/virtio_iommu.h
> 
> -- 
> 2.19.1
Jean-Philippe Brucker Jan. 21, 2019, 11:29 a.m. UTC | #2
Hi,

On 18/01/2019 15:51, Michael S. Tsirkin wrote:
> 
> On Tue, Jan 15, 2019 at 12:19:52PM +0000, Jean-Philippe Brucker wrote:
>> Implement the virtio-iommu driver, following specification v0.9 [1].
>>
>> This is a simple rebase onto Linux v5.0-rc2. We now use the
>> dev_iommu_fwspec_get() helper introduced in v5.0 instead of accessing
>> dev->iommu_fwspec, but there aren't any functional change from v6 [2].
>>
>> Our current goal for virtio-iommu is to get a paravirtual IOMMU working
>> on Arm, and enable device assignment to guest userspace. In this
>> use-case the mappings are static, and don't require optimal performance,
>> so this series tries to keep things simple. However there is plenty more
>> to do for features and optimizations, and having this base in v5.1 would
>> be good. Given that most of the changes are to drivers/iommu, I believe
>> the driver and future changes should go via the IOMMU tree.
>>
>> You can find Linux driver and kvmtool device on v0.9.2 branches [3],
>> module and x86 support on virtio-iommu/devel. Also tested with Eric's
>> QEMU device [4]. Please note that the series depends on Robin's
>> probe-deferral fix [5], which will hopefully land in v5.0.
>>
>> [1] Virtio-iommu specification v0.9, sources and pdf
>>     git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9
>>     http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf
>>
>> [2] [PATCH v6 0/7] Add virtio-iommu driver
>>     https://lists.linuxfoundation.org/pipermail/iommu/2018-December/032127.html
>>
>> [3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9.2
>>     git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9.2
>>
>> [4] [RFC v9 00/17] VIRTIO-IOMMU device
>>     https://www.mail-archive.com/qemu-devel@nongnu.org/msg575578.html
>>
>> [5] [PATCH] iommu/of: Fix probe-deferral
>>     https://www.spinics.net/lists/arm-kernel/msg698371.html
> 
> Thanks for the work!
> So really my only issue with this is that there's no
> way for the IOMMU to describe the devices that it
> covers.
> 
> As a result that is then done in a platform-specific way.
> 
> And this means that for example it does not solve the problem that e.g.
> some power people have in that their platform simply does not have a way
> to specify which devices are covered by the IOMMU.

Isn't power using device tree? I haven't looked much at power because I
was told a while ago that they already paravirtualize their IOMMU and
don't need virtio-iommu, except perhaps for some legacy platforms. Or
something along those lines. But I would certainly be interested in
enabling the IOMMU for more architectures.

As for the enumeration problem, I still don't think we can get much
better than DT and ACPI as solutions (and IMO they are necessary to make
this device portable). But I believe that getting DT and ACPI support is
just a one-off inconvenience. That is, once the required bindings are
accepted, any future extension can then be done at the virtio level with
feature bits and probe requests, without having to update ACPI or DT.

Thanks,
Jean

> Solving that problem would make me much more excited about
> this device.
> 
> On the other hand I can see that while there have been some
> developments most of the code has been stable for quite a while now.
> 
> So what I am trying to do right about now, is making a small module that
> loads early and pokes at the IOMMU sufficiently to get the data about
> which devices use the IOMMU out of it using standard virtio config
> space.  IIUC it's claimed to be impossible without messy changes to the
> boot sequence.
> 
> If I succeed at least on some platforms I'll ask that this design is
> worked into this device, minimizing info that goes through DT/ACPI.  If
> I see I can't make it in time to meet the next merge window, I plan
> merging the existing patches using DT (barring surprises).
> 
> As I only have a very small amount of time to spend on this attempt, If
> someone else wants to try doing that in parallel, that would be great!
> 
> 
>> Jean-Philippe Brucker (7):
>>   dt-bindings: virtio-mmio: Add IOMMU description
>>   dt-bindings: virtio: Add virtio-pci-iommu node
>>   of: Allow the iommu-map property to omit untranslated devices
>>   PCI: OF: Initialize dev->fwnode appropriately
>>   iommu: Add virtio-iommu driver
>>   iommu/virtio: Add probe request
>>   iommu/virtio: Add event queue
>>
>>  .../devicetree/bindings/virtio/iommu.txt      |   66 +
>>  .../devicetree/bindings/virtio/mmio.txt       |   30 +
>>  MAINTAINERS                                   |    7 +
>>  drivers/iommu/Kconfig                         |   11 +
>>  drivers/iommu/Makefile                        |    1 +
>>  drivers/iommu/virtio-iommu.c                  | 1158 +++++++++++++++++
>>  drivers/of/base.c                             |   10 +-
>>  drivers/pci/of.c                              |    7 +
>>  include/uapi/linux/virtio_ids.h               |    1 +
>>  include/uapi/linux/virtio_iommu.h             |  161 +++
>>  10 files changed, 1449 insertions(+), 3 deletions(-)
>>  create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt
>>  create mode 100644 drivers/iommu/virtio-iommu.c
>>  create mode 100644 include/uapi/linux/virtio_iommu.h
>>
>> -- 
>> 2.19.1
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
>
Joerg Roedel Jan. 23, 2019, 8:34 a.m. UTC | #3
Hi Jean-Philippe,

thanks for all your hard work on this!

On Tue, Jan 15, 2019 at 12:19:52PM +0000, Jean-Philippe Brucker wrote:
> Implement the virtio-iommu driver, following specification v0.9 [1].

To make progress on this I think the spec needs to be close to something
that can be included into the official virtio-specification. Have you
proposed the specification for inclusion there?

This is because I can't merge a driver that might be incompatible to
future implementations because the specification needs to be changed on
its way to an official standard.

I had a short discussion with Michael S. Tsirkin about that and from
what I understood the spec needs to be proposed for inclusion on the
virtio-comment[1] mailing list and later the TC needs to vote on it.
Please work with Michael on this to get the specification official (or
at least pretty close to something that will be part of the official
virtio standard).

Regards,

	Joerg

[1] https://www.oasis-open.org/committees/comments/index.php?wg_abbrev=virtio
Jean-Philippe Brucker Jan. 24, 2019, 4:03 p.m. UTC | #4
Hi Joerg,

On 23/01/2019 08:34, Joerg Roedel wrote:
> Hi Jean-Philippe,
> 
> thanks for all your hard work on this!
> 
> On Tue, Jan 15, 2019 at 12:19:52PM +0000, Jean-Philippe Brucker wrote:
>> Implement the virtio-iommu driver, following specification v0.9 [1].
> 
> To make progress on this I think the spec needs to be close to something
> that can be included into the official virtio-specification. Have you
> proposed the specification for inclusion there?

I haven't yet. I did send a few drafts of the spec to the mailing list,
using arbitrary version numbers (0.1 - 0.9), and received excellent
feedback from Eric, Kevin, Ashok and others [2], but I hadn't formally
asked for inclusion yet. Since I haven't made any major change to the
interface in a while, I'll get on that.

> This is because I can't merge a driver that might be incompatible to
> future implementations because the specification needs to be changed on
> its way to an official standard.

Makes sense, though I think other virtio devices have been developed a
little more organically: device and driver code got upstreamed first,
and then the specification describing their interface got merged into
the standard. For example I believe that code for crypto, input and GPU
devices were upstreamed long before the specification was merged. Once
an implementation is upstream, the interface is expected to be
backward-compatible (all subsequent changes are introduced using feature
bits).

So I've been working with this process in mind, also described by Jens
at KVM forum 2017 [3]:
(1) Reserve a device ID, and get that merged into virtio (ID 23 for
virtio-iommu was reserved last year)
(2) Open-source an implementation (this driver and Eric's device)
(3) Formalize and upstream the device specification

But I get that some overlap between (2) and (3) would have been better.
So far the spec document has been reviewed mainly from the IOMMU point
of view, and might require more changes to be in line with the other
virtio devices -- hopefully just wording changes. I'll kick off step
(3), but I think the virtio folks are a bit busy with finalizing the 1.1
spec so I expect it to take a while.

Thanks,
Jean

[2] RFC https://markmail.org/thread/l6b2rpc46nua4egs
    0.4 https://markmail.org/thread/f5k37mab7tnrslin
    0.5 https://markmail.org/thread/tz65oolu5do7hi6n
    0.6 https://markmail.org/thread/dppbg6owzrx2km2n
    0.7 https://markmail.org/thread/dgdy4hicswpakmsq

[3] The future of virtio: riddles, myths and surprises
    https://www.linux-kvm.org/images/0/03/Virtio_fall_2017.pdf
    https://www.youtube.com/watch?v=z9cWwgYH97A


> I had a short discussion with Michael S. Tsirkin about that and from
> what I understood the spec needs to be proposed for inclusion on the
> virtio-comment[1] mailing list and later the TC needs to vote on it.
> Please work with Michael on this to get the specification official (or
> at least pretty close to something that will be part of the official
> virtio standard).
> 
> Regards,
> 
> 	Joerg
> 
> [1] https://www.oasis-open.org/committees/comments/index.php?wg_abbrev=virtio
Michael S. Tsirkin Jan. 29, 2019, 6:54 p.m. UTC | #5
On Mon, Jan 21, 2019 at 11:29:05AM +0000, Jean-Philippe Brucker wrote:
> Hi,
> 
> On 18/01/2019 15:51, Michael S. Tsirkin wrote:
> > 
> > On Tue, Jan 15, 2019 at 12:19:52PM +0000, Jean-Philippe Brucker wrote:
> >> Implement the virtio-iommu driver, following specification v0.9 [1].
> >>
> >> This is a simple rebase onto Linux v5.0-rc2. We now use the
> >> dev_iommu_fwspec_get() helper introduced in v5.0 instead of accessing
> >> dev->iommu_fwspec, but there aren't any functional change from v6 [2].
> >>
> >> Our current goal for virtio-iommu is to get a paravirtual IOMMU working
> >> on Arm, and enable device assignment to guest userspace. In this
> >> use-case the mappings are static, and don't require optimal performance,
> >> so this series tries to keep things simple. However there is plenty more
> >> to do for features and optimizations, and having this base in v5.1 would
> >> be good. Given that most of the changes are to drivers/iommu, I believe
> >> the driver and future changes should go via the IOMMU tree.
> >>
> >> You can find Linux driver and kvmtool device on v0.9.2 branches [3],
> >> module and x86 support on virtio-iommu/devel. Also tested with Eric's
> >> QEMU device [4]. Please note that the series depends on Robin's
> >> probe-deferral fix [5], which will hopefully land in v5.0.
> >>
> >> [1] Virtio-iommu specification v0.9, sources and pdf
> >>     git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9
> >>     http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf
> >>
> >> [2] [PATCH v6 0/7] Add virtio-iommu driver
> >>     https://lists.linuxfoundation.org/pipermail/iommu/2018-December/032127.html
> >>
> >> [3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9.2
> >>     git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9.2
> >>
> >> [4] [RFC v9 00/17] VIRTIO-IOMMU device
> >>     https://www.mail-archive.com/qemu-devel@nongnu.org/msg575578.html
> >>
> >> [5] [PATCH] iommu/of: Fix probe-deferral
> >>     https://www.spinics.net/lists/arm-kernel/msg698371.html
> > 
> > Thanks for the work!
> > So really my only issue with this is that there's no
> > way for the IOMMU to describe the devices that it
> > covers.
> > 
> > As a result that is then done in a platform-specific way.
> > 
> > And this means that for example it does not solve the problem that e.g.
> > some power people have in that their platform simply does not have a way
> > to specify which devices are covered by the IOMMU.
> 
> Isn't power using device tree? I haven't looked much at power because I
> was told a while ago that they already paravirtualize their IOMMU and
> don't need virtio-iommu, except perhaps for some legacy platforms. Or
> something along those lines. But I would certainly be interested in
> enabling the IOMMU for more architectures.

I have CC'd the relevant ppc developers, let's see what do they think.


> As for the enumeration problem, I still don't think we can get much
> better than DT and ACPI as solutions (and IMO they are necessary to make
> this device portable). But I believe that getting DT and ACPI support is
> just a one-off inconvenience. That is, once the required bindings are
> accepted, any future extension can then be done at the virtio level with
> feature bits and probe requests, without having to update ACPI or DT.
> 
> Thanks,
> Jean
> 
> > Solving that problem would make me much more excited about
> > this device.
> > 
> > On the other hand I can see that while there have been some
> > developments most of the code has been stable for quite a while now.
> > 
> > So what I am trying to do right about now, is making a small module that
> > loads early and pokes at the IOMMU sufficiently to get the data about
> > which devices use the IOMMU out of it using standard virtio config
> > space.  IIUC it's claimed to be impossible without messy changes to the
> > boot sequence.
> > 
> > If I succeed at least on some platforms I'll ask that this design is
> > worked into this device, minimizing info that goes through DT/ACPI.  If
> > I see I can't make it in time to meet the next merge window, I plan
> > merging the existing patches using DT (barring surprises).
> > 
> > As I only have a very small amount of time to spend on this attempt, If
> > someone else wants to try doing that in parallel, that would be great!
> > 
> > 
> >> Jean-Philippe Brucker (7):
> >>   dt-bindings: virtio-mmio: Add IOMMU description
> >>   dt-bindings: virtio: Add virtio-pci-iommu node
> >>   of: Allow the iommu-map property to omit untranslated devices
> >>   PCI: OF: Initialize dev->fwnode appropriately
> >>   iommu: Add virtio-iommu driver
> >>   iommu/virtio: Add probe request
> >>   iommu/virtio: Add event queue
> >>
> >>  .../devicetree/bindings/virtio/iommu.txt      |   66 +
> >>  .../devicetree/bindings/virtio/mmio.txt       |   30 +
> >>  MAINTAINERS                                   |    7 +
> >>  drivers/iommu/Kconfig                         |   11 +
> >>  drivers/iommu/Makefile                        |    1 +
> >>  drivers/iommu/virtio-iommu.c                  | 1158 +++++++++++++++++
> >>  drivers/of/base.c                             |   10 +-
> >>  drivers/pci/of.c                              |    7 +
> >>  include/uapi/linux/virtio_ids.h               |    1 +
> >>  include/uapi/linux/virtio_iommu.h             |  161 +++
> >>  10 files changed, 1449 insertions(+), 3 deletions(-)
> >>  create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt
> >>  create mode 100644 drivers/iommu/virtio-iommu.c
> >>  create mode 100644 include/uapi/linux/virtio_iommu.h
> >>
> >> -- 
> >> 2.19.1
> > _______________________________________________
> > iommu mailing list
> > iommu@lists.linux-foundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/iommu
> >
Thiago Jung Bauermann Feb. 21, 2019, 9:57 p.m. UTC | #6
Michael S. Tsirkin <mst@redhat.com> writes:

> On Mon, Jan 21, 2019 at 11:29:05AM +0000, Jean-Philippe Brucker wrote:
>> Hi,
>>
>> On 18/01/2019 15:51, Michael S. Tsirkin wrote:
>> >
>> > On Tue, Jan 15, 2019 at 12:19:52PM +0000, Jean-Philippe Brucker wrote:
>> >> Implement the virtio-iommu driver, following specification v0.9 [1].
>> >>
>> >> This is a simple rebase onto Linux v5.0-rc2. We now use the
>> >> dev_iommu_fwspec_get() helper introduced in v5.0 instead of accessing
>> >> dev->iommu_fwspec, but there aren't any functional change from v6 [2].
>> >>
>> >> Our current goal for virtio-iommu is to get a paravirtual IOMMU working
>> >> on Arm, and enable device assignment to guest userspace. In this
>> >> use-case the mappings are static, and don't require optimal performance,
>> >> so this series tries to keep things simple. However there is plenty more
>> >> to do for features and optimizations, and having this base in v5.1 would
>> >> be good. Given that most of the changes are to drivers/iommu, I believe
>> >> the driver and future changes should go via the IOMMU tree.
>> >>
>> >> You can find Linux driver and kvmtool device on v0.9.2 branches [3],
>> >> module and x86 support on virtio-iommu/devel. Also tested with Eric's
>> >> QEMU device [4]. Please note that the series depends on Robin's
>> >> probe-deferral fix [5], which will hopefully land in v5.0.
>> >>
>> >> [1] Virtio-iommu specification v0.9, sources and pdf
>> >>     git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9
>> >>     http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf
>> >>
>> >> [2] [PATCH v6 0/7] Add virtio-iommu driver
>> >>     https://lists.linuxfoundation.org/pipermail/iommu/2018-December/032127.html
>> >>
>> >> [3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9.2
>> >>     git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9.2
>> >>
>> >> [4] [RFC v9 00/17] VIRTIO-IOMMU device
>> >>     https://www.mail-archive.com/qemu-devel@nongnu.org/msg575578.html
>> >>
>> >> [5] [PATCH] iommu/of: Fix probe-deferral
>> >>     https://www.spinics.net/lists/arm-kernel/msg698371.html
>> >
>> > Thanks for the work!
>> > So really my only issue with this is that there's no
>> > way for the IOMMU to describe the devices that it
>> > covers.
>> >
>> > As a result that is then done in a platform-specific way.
>> >
>> > And this means that for example it does not solve the problem that e.g.
>> > some power people have in that their platform simply does not have a way
>> > to specify which devices are covered by the IOMMU.
>>
>> Isn't power using device tree? I haven't looked much at power because I
>> was told a while ago that they already paravirtualize their IOMMU and
>> don't need virtio-iommu, except perhaps for some legacy platforms. Or
>> something along those lines. But I would certainly be interested in
>> enabling the IOMMU for more architectures.
>
> I have CC'd the relevant ppc developers, let's see what do they think.

I'm far from being an expert, but what could be very useful for us is to
have a way for the guest to request IOMMU bypass for a device.

From what I understand, the pSeries platform used by POWER guests always
puts devices behind an IOMMU, so at least for current systems a
description of which devices are covered by the IOMMU would always say
"all of them".

>> As for the enumeration problem, I still don't think we can get much
>> better than DT and ACPI as solutions (and IMO they are necessary to make
>> this device portable). But I believe that getting DT and ACPI support is
>> just a one-off inconvenience. That is, once the required bindings are
>> accepted, any future extension can then be done at the virtio level with
>> feature bits and probe requests, without having to update ACPI or DT.

There is a device tree binding that can specify devices connected to a
given IOMMU in Documentation/devicetree/bindings/iommu/iommu.txt.
I don't believe POWER machines use it though.

>> Thanks,
>> Jean
>>
>> > Solving that problem would make me much more excited about
>> > this device.
>> >
>> > On the other hand I can see that while there have been some
>> > developments most of the code has been stable for quite a while now.
>> >
>> > So what I am trying to do right about now, is making a small module that
>> > loads early and pokes at the IOMMU sufficiently to get the data about
>> > which devices use the IOMMU out of it using standard virtio config
>> > space.  IIUC it's claimed to be impossible without messy changes to the
>> > boot sequence.
>> >
>> > If I succeed at least on some platforms I'll ask that this design is
>> > worked into this device, minimizing info that goes through DT/ACPI.  If
>> > I see I can't make it in time to meet the next merge window, I plan
>> > merging the existing patches using DT (barring surprises).
>> >
>> > As I only have a very small amount of time to spend on this attempt, If
>> > someone else wants to try doing that in parallel, that would be great!

--
Thiago Jung Bauermann
IBM Linux Technology Center
Thiago Jung Bauermann Feb. 21, 2019, 10:18 p.m. UTC | #7
Hello Jean-Philippe,

Jean-Philippe Brucker <jean-philippe.brucker@arm.com> writes:
> Makes sense, though I think other virtio devices have been developed a
> little more organically: device and driver code got upstreamed first,
> and then the specification describing their interface got merged into
> the standard. For example I believe that code for crypto, input and GPU
> devices were upstreamed long before the specification was merged. Once
> an implementation is upstream, the interface is expected to be
> backward-compatible (all subsequent changes are introduced using feature
> bits).
>
> So I've been working with this process in mind, also described by Jens
> at KVM forum 2017 [3]:
> (1) Reserve a device ID, and get that merged into virtio (ID 23 for
> virtio-iommu was reserved last year)
> (2) Open-source an implementation (this driver and Eric's device)
> (3) Formalize and upstream the device specification
>
> But I get that some overlap between (2) and (3) would have been better.
> So far the spec document has been reviewed mainly from the IOMMU point
> of view, and might require more changes to be in line with the other
> virtio devices -- hopefully just wording changes. I'll kick off step
> (3), but I think the virtio folks are a bit busy with finalizing the 1.1
> spec so I expect it to take a while.

I read v0.9 of the spec and have some minor comments, hope this is a
good place to send them:

1. In section 2.6.2, one reads

    If the VIRTIO_IOMMU_F_INPUT_RANGE feature is offered and the range
    described by fields virt_start and virt_end doesn’t fit in the range
    described by input_range, the device MAY set status to VIRTIO_-
    IOMMU_S_RANGE and ignore the request.

Shouldn't int say "If the VIRTIO_IOMMU_F_INPUT_RANGE feature is
negotiated" instead?

2. There's a typo at the end of section 2.6.5:

    The VIRTIO_IOMMU_MAP_F_MMIO flag is a memory type rather than a
    protection lag.

s/lag/flag/

3. In section 3.1.2.1.1, the viommu compatible field says "virtio,mmio".
Shouldn't it say "virtio,mmio-iommu" instead, to be consistent with
"virtio,pci-iommu"?

4. There's a typo in section 3.3:

    A host bridge may limit the input address space – transaction
    accessing some addresses won’t reach the physical IOMMU.

s/transaction/transactions/

I also have one last comment which you may freely ignore, considering
it's clearly just personal opinion and also considering that the
specification is mature at this point: it specifies memory ranges by
specifying start and end addresses. My experience has been that this is
error prone, leading to confusion and bugs regarding whether the end
address is inclusive or exclusive. I tend to prefer expressing memory
ranges by specifying a start address and a length, which eliminates
ambiguity.

--
Thiago Jung Bauermann
IBM Linux Technology Center
Jean-Philippe Brucker Feb. 22, 2019, 12:18 p.m. UTC | #8
Hi Thiago,

On 21/02/2019 22:18, Thiago Jung Bauermann wrote:
> 
> Hello Jean-Philippe,
> 
> Jean-Philippe Brucker <jean-philippe.brucker@arm.com> writes:
>> Makes sense, though I think other virtio devices have been developed a
>> little more organically: device and driver code got upstreamed first,
>> and then the specification describing their interface got merged into
>> the standard. For example I believe that code for crypto, input and GPU
>> devices were upstreamed long before the specification was merged. Once
>> an implementation is upstream, the interface is expected to be
>> backward-compatible (all subsequent changes are introduced using feature
>> bits).
>>
>> So I've been working with this process in mind, also described by Jens
>> at KVM forum 2017 [3]:
>> (1) Reserve a device ID, and get that merged into virtio (ID 23 for
>> virtio-iommu was reserved last year)
>> (2) Open-source an implementation (this driver and Eric's device)
>> (3) Formalize and upstream the device specification
>>
>> But I get that some overlap between (2) and (3) would have been better.
>> So far the spec document has been reviewed mainly from the IOMMU point
>> of view, and might require more changes to be in line with the other
>> virtio devices -- hopefully just wording changes. I'll kick off step
>> (3), but I think the virtio folks are a bit busy with finalizing the 1.1
>> spec so I expect it to take a while.
> 
> I read v0.9 of the spec and have some minor comments, hope this is a
> good place to send them:

Thanks a lot, I'll fix them in my next posting. Note that I recently
sent v0.10 to virtio-comment, to request inclusion into the standard [1]
but your comments still apply to v0.10.

[1]
https://lists.oasis-open.org/archives/virtio-comment/201901/msg00016.html

> 1. In section 2.6.2, one reads
> 
>     If the VIRTIO_IOMMU_F_INPUT_RANGE feature is offered and the range
>     described by fields virt_start and virt_end doesn’t fit in the range
>     described by input_range, the device MAY set status to VIRTIO_-
>     IOMMU_S_RANGE and ignore the request.
> 
> Shouldn't int say "If the VIRTIO_IOMMU_F_INPUT_RANGE feature is
> negotiated" instead?

Yes, that seems clearer and more consistent with other devices, I'll
change it. In this case "offered" is equivalent to "negotiated", because
the driver SHOULD accept the feature or else the device may refuse to
set FEATURES_OK. A valid input_range field generally indicates that the
device is incapable of creating mappings outside this range, and it's
important that the driver acknowledges it.

> 2. There's a typo at the end of section 2.6.5:
> 
>     The VIRTIO_IOMMU_MAP_F_MMIO flag is a memory type rather than a
>     protection lag.
> 
> s/lag/flag/

Fixed in v0.10

> 3. In section 3.1.2.1.1, the viommu compatible field says "virtio,mmio".
> Shouldn't it say "virtio,mmio-iommu" instead, to be consistent with
> "virtio,pci-iommu"?

"virtio,mmio" already exists, and allows the virtio-mmio driver to pick
up any virtio device. The device type is then discovered while probing,
and doesn't need to be in the compatible string.

"virtio,pci-iommu" is something I introduced specifically for the
virtio-iommu, since it's the only virtio-pci device that requires a
device tree node - to describe the IOMMU topology earlier than the PCI
probe. If we want symmetry I'd rather replace "virtio,pci-iommu" with
"virtio,pci", but it wouldn't be used by other virtio device types. And
I have to admit I'm reluctant to change this binding now, given that it
has been reviewed (patch 2/7) and is ready to go.

> 4. There's a typo in section 3.3:
> 
>     A host bridge may limit the input address space – transaction
>     accessing some addresses won’t reach the physical IOMMU.
> 
> s/transaction/transactions/

I'll fix it, thanks

> I also have one last comment which you may freely ignore, considering
> it's clearly just personal opinion and also considering that the
> specification is mature at this point: it specifies memory ranges by
> specifying start and end addresses. My experience has been that this is
> error prone, leading to confusion and bugs regarding whether the end
> address is inclusive or exclusive. I tend to prefer expressing memory
> ranges by specifying a start address and a length, which eliminates
> ambiguity.

While the initial versions had start and length, I changed it because it
cannot express the whole 64-bit range. If the guest wants to do
unmap-all (and the input range is 64-bit), then it can send a single
UNMAP request with start=0, end=~0ULL, which wouldn't be possible with
start and length. Arguably a very rare use-case, but one I've tried to
implement at least twice with VFIO :) I'll see if I can make it more
obvious that end is inclusive, since the word doesn't appear at all in
the current draft.

Thanks,
Jean
Tomasz Nowicki Feb. 25, 2019, 1:20 p.m. UTC | #9
Hi Jean,

On 15.01.2019 13:19, Jean-Philippe Brucker wrote:
> Implement the virtio-iommu driver, following specification v0.9 [1].
> 
> This is a simple rebase onto Linux v5.0-rc2. We now use the
> dev_iommu_fwspec_get() helper introduced in v5.0 instead of accessing
> dev->iommu_fwspec, but there aren't any functional change from v6 [2].
> 
> Our current goal for virtio-iommu is to get a paravirtual IOMMU working
> on Arm, and enable device assignment to guest userspace. In this
> use-case the mappings are static, and don't require optimal performance,
> so this series tries to keep things simple. However there is plenty more
> to do for features and optimizations, and having this base in v5.1 would
> be good. Given that most of the changes are to drivers/iommu, I believe
> the driver and future changes should go via the IOMMU tree.
> 
> You can find Linux driver and kvmtool device on v0.9.2 branches [3],
> module and x86 support on virtio-iommu/devel. Also tested with Eric's
> QEMU device [4]. Please note that the series depends on Robin's
> probe-deferral fix [5], which will hopefully land in v5.0.
> 
> [1] Virtio-iommu specification v0.9, sources and pdf
>      git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9
>      http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf
> 
> [2] [PATCH v6 0/7] Add virtio-iommu driver
>      https://lists.linuxfoundation.org/pipermail/iommu/2018-December/032127.html
> 
> [3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9.2
>      git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9.2
> 
> [4] [RFC v9 00/17] VIRTIO-IOMMU device
>      https://www.mail-archive.com/qemu-devel@nongnu.org/msg575578.html
> 
> [5] [PATCH] iommu/of: Fix probe-deferral
>      https://www.spinics.net/lists/arm-kernel/msg698371.html
> 
> Jean-Philippe Brucker (7):
>    dt-bindings: virtio-mmio: Add IOMMU description
>    dt-bindings: virtio: Add virtio-pci-iommu node
>    of: Allow the iommu-map property to omit untranslated devices
>    PCI: OF: Initialize dev->fwnode appropriately
>    iommu: Add virtio-iommu driver
>    iommu/virtio: Add probe request
>    iommu/virtio: Add event queue
> 
>   .../devicetree/bindings/virtio/iommu.txt      |   66 +
>   .../devicetree/bindings/virtio/mmio.txt       |   30 +
>   MAINTAINERS                                   |    7 +
>   drivers/iommu/Kconfig                         |   11 +
>   drivers/iommu/Makefile                        |    1 +
>   drivers/iommu/virtio-iommu.c                  | 1158 +++++++++++++++++
>   drivers/of/base.c                             |   10 +-
>   drivers/pci/of.c                              |    7 +
>   include/uapi/linux/virtio_ids.h               |    1 +
>   include/uapi/linux/virtio_iommu.h             |  161 +++
>   10 files changed, 1449 insertions(+), 3 deletions(-)
>   create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt
>   create mode 100644 drivers/iommu/virtio-iommu.c
>   create mode 100644 include/uapi/linux/virtio_iommu.h
> 

I have tested the whole series and Eric's QEMU patchset [4] for 
virtio-net-pci and virtio-blk-pci devices and faced no issues on 
ThunderX2. The multiqueue mode for both devices is working fine too so:

Tested-by: Tomasz Nowicki <tnowicki@marvell.com>

Thanks,
Tomasz
Michael S. Tsirkin May 12, 2019, 4:31 p.m. UTC | #10
On Tue, Jan 15, 2019 at 12:19:52PM +0000, Jean-Philippe Brucker wrote:
> Implement the virtio-iommu driver, following specification v0.9 [1].
> 
> This is a simple rebase onto Linux v5.0-rc2. We now use the
> dev_iommu_fwspec_get() helper introduced in v5.0 instead of accessing
> dev->iommu_fwspec, but there aren't any functional change from v6 [2].
> 
> Our current goal for virtio-iommu is to get a paravirtual IOMMU working
> on Arm, and enable device assignment to guest userspace. In this
> use-case the mappings are static, and don't require optimal performance,
> so this series tries to keep things simple. However there is plenty more
> to do for features and optimizations, and having this base in v5.1 would
> be good. Given that most of the changes are to drivers/iommu, I believe
> the driver and future changes should go via the IOMMU tree.
> 
> You can find Linux driver and kvmtool device on v0.9.2 branches [3],
> module and x86 support on virtio-iommu/devel. Also tested with Eric's
> QEMU device [4]. Please note that the series depends on Robin's
> probe-deferral fix [5], which will hopefully land in v5.0.
> 
> [1] Virtio-iommu specification v0.9, sources and pdf
>     git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9
>     http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf
> 
> [2] [PATCH v6 0/7] Add virtio-iommu driver
>     https://lists.linuxfoundation.org/pipermail/iommu/2018-December/032127.html
> 
> [3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9.2
>     git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9.2
> 
> [4] [RFC v9 00/17] VIRTIO-IOMMU device
>     https://www.mail-archive.com/qemu-devel@nongnu.org/msg575578.html
> 
> [5] [PATCH] iommu/of: Fix probe-deferral
>     https://www.spinics.net/lists/arm-kernel/msg698371.html


OK this has been in next for a while.

Last time IOMMU maintainers objected. Are objections
still in force?

If not could we get acks please?


> Jean-Philippe Brucker (7):
>   dt-bindings: virtio-mmio: Add IOMMU description
>   dt-bindings: virtio: Add virtio-pci-iommu node
>   of: Allow the iommu-map property to omit untranslated devices
>   PCI: OF: Initialize dev->fwnode appropriately
>   iommu: Add virtio-iommu driver
>   iommu/virtio: Add probe request
>   iommu/virtio: Add event queue
> 
>  .../devicetree/bindings/virtio/iommu.txt      |   66 +
>  .../devicetree/bindings/virtio/mmio.txt       |   30 +
>  MAINTAINERS                                   |    7 +
>  drivers/iommu/Kconfig                         |   11 +
>  drivers/iommu/Makefile                        |    1 +
>  drivers/iommu/virtio-iommu.c                  | 1158 +++++++++++++++++
>  drivers/of/base.c                             |   10 +-
>  drivers/pci/of.c                              |    7 +
>  include/uapi/linux/virtio_ids.h               |    1 +
>  include/uapi/linux/virtio_iommu.h             |  161 +++
>  10 files changed, 1449 insertions(+), 3 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt
>  create mode 100644 drivers/iommu/virtio-iommu.c
>  create mode 100644 include/uapi/linux/virtio_iommu.h
> 
> -- 
> 2.19.1
Michael S. Tsirkin May 12, 2019, 5:15 p.m. UTC | #11
On Tue, Jan 15, 2019 at 12:19:52PM +0000, Jean-Philippe Brucker wrote:
> Implement the virtio-iommu driver, following specification v0.9 [1].
> 
> This is a simple rebase onto Linux v5.0-rc2. We now use the
> dev_iommu_fwspec_get() helper introduced in v5.0 instead of accessing
> dev->iommu_fwspec, but there aren't any functional change from v6 [2].
> 
> Our current goal for virtio-iommu is to get a paravirtual IOMMU working
> on Arm, and enable device assignment to guest userspace. In this
> use-case the mappings are static, and don't require optimal performance,
> so this series tries to keep things simple. However there is plenty more
> to do for features and optimizations, and having this base in v5.1 would
> be good. Given that most of the changes are to drivers/iommu, I believe
> the driver and future changes should go via the IOMMU tree.
> 
> You can find Linux driver and kvmtool device on v0.9.2 branches [3],
> module and x86 support on virtio-iommu/devel. Also tested with Eric's
> QEMU device [4]. Please note that the series depends on Robin's
> probe-deferral fix [5], which will hopefully land in v5.0.
> 
> [1] Virtio-iommu specification v0.9, sources and pdf
>     git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9
>     http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf
> 
> [2] [PATCH v6 0/7] Add virtio-iommu driver
>     https://lists.linuxfoundation.org/pipermail/iommu/2018-December/032127.html
> 
> [3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9.2
>     git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9.2
> 
> [4] [RFC v9 00/17] VIRTIO-IOMMU device
>     https://www.mail-archive.com/qemu-devel@nongnu.org/msg575578.html
> 
> [5] [PATCH] iommu/of: Fix probe-deferral
>     https://www.spinics.net/lists/arm-kernel/msg698371.html

For virtio things:

Acked-by: Michael S. Tsirkin <mst@redhat.com>



> Jean-Philippe Brucker (7):
>   dt-bindings: virtio-mmio: Add IOMMU description
>   dt-bindings: virtio: Add virtio-pci-iommu node
>   of: Allow the iommu-map property to omit untranslated devices
>   PCI: OF: Initialize dev->fwnode appropriately
>   iommu: Add virtio-iommu driver
>   iommu/virtio: Add probe request
>   iommu/virtio: Add event queue
> 
>  .../devicetree/bindings/virtio/iommu.txt      |   66 +
>  .../devicetree/bindings/virtio/mmio.txt       |   30 +
>  MAINTAINERS                                   |    7 +
>  drivers/iommu/Kconfig                         |   11 +
>  drivers/iommu/Makefile                        |    1 +
>  drivers/iommu/virtio-iommu.c                  | 1158 +++++++++++++++++
>  drivers/of/base.c                             |   10 +-
>  drivers/pci/of.c                              |    7 +
>  include/uapi/linux/virtio_ids.h               |    1 +
>  include/uapi/linux/virtio_iommu.h             |  161 +++
>  10 files changed, 1449 insertions(+), 3 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt
>  create mode 100644 drivers/iommu/virtio-iommu.c
>  create mode 100644 include/uapi/linux/virtio_iommu.h
> 
> -- 
> 2.19.1
Joerg Roedel May 27, 2019, 9:26 a.m. UTC | #12
On Sun, May 12, 2019 at 12:31:59PM -0400, Michael S. Tsirkin wrote:
> OK this has been in next for a while.
> 
> Last time IOMMU maintainers objected. Are objections
> still in force?
> 
> If not could we get acks please?

No objections against the code, I only hesitated because the Spec was
not yet official.

So for the code:

	Acked-by: Joerg Roedel <jroedel@suse.de>
Michael S. Tsirkin May 27, 2019, 3:15 p.m. UTC | #13
On Mon, May 27, 2019 at 11:26:04AM +0200, Joerg Roedel wrote:
> On Sun, May 12, 2019 at 12:31:59PM -0400, Michael S. Tsirkin wrote:
> > OK this has been in next for a while.
> > 
> > Last time IOMMU maintainers objected. Are objections
> > still in force?
> > 
> > If not could we get acks please?
> 
> No objections against the code, I only hesitated because the Spec was
> not yet official.
> 
> So for the code:
> 
> 	Acked-by: Joerg Roedel <jroedel@suse.de>

Last spec patch had a bunch of comments not yet addressed.
But I do not remember whether comments are just about wording
or about the host/guest interface as well.
Jean-Philippe could you remind me please?
Jean-Philippe Brucker May 28, 2019, 9:18 a.m. UTC | #14
On 27/05/2019 16:15, Michael S. Tsirkin wrote:
> On Mon, May 27, 2019 at 11:26:04AM +0200, Joerg Roedel wrote:
>> On Sun, May 12, 2019 at 12:31:59PM -0400, Michael S. Tsirkin wrote:
>>> OK this has been in next for a while.
>>>
>>> Last time IOMMU maintainers objected. Are objections
>>> still in force?
>>>
>>> If not could we get acks please?
>>
>> No objections against the code, I only hesitated because the Spec was
>> not yet official.
>>
>> So for the code:
>>
>> 	Acked-by: Joerg Roedel <jroedel@suse.de>
> 
> Last spec patch had a bunch of comments not yet addressed.
> But I do not remember whether comments are just about wording
> or about the host/guest interface as well.
> Jean-Philippe could you remind me please?

It's mostly wording, but there is a small change in the config space
layout and two new feature bits. I'll send a new version of the driver
when possible.

Thanks,
Jean