diff mbox

[v2] intel_iommu: check misordered init when realize

Message ID 1487910561-17825-1-git-send-email-peterx@redhat.com
State New
Headers show

Commit Message

Peter Xu Feb. 24, 2017, 4:29 a.m. UTC
Intel vIOMMU devices are created with "-device" parameter, while here
actually we need to make sure the dmar device be created before other
PCI devices (like vfio-pci) so that we know iommu_fn will be setup
correctly before realizations of those PCI devices (it is sensible that
PCI device fetch these info during its realization). Now this ordering
yet cannot be achieved elsewhere, and devices will be created in the
order that user specified. That might be dangerous.

Here we add one more function to detect this kind of misordering issue,
then report to guest. Currently, the only known device that is affected
by this VT-d defect is the vfio-pci typed devices. So for now we just
check against it to make sure we are safe.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/intel_iommu.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

Comments

Pankaj Gupta Feb. 24, 2017, 5:07 a.m. UTC | #1
Hello Peter,

This solution looks to check dependency of 'vfio-pci' over 'intel-iommu'
before 'intel-iommu' is not initialized.

Overall it looks good to me, just a small nit below.
 
> 
> Intel vIOMMU devices are created with "-device" parameter, while here
> actually we need to make sure the dmar device be created before other
> PCI devices (like vfio-pci) so that we know iommu_fn will be setup
> correctly before realizations of those PCI devices (it is sensible that
> PCI device fetch these info during its realization). Now this ordering
> yet cannot be achieved elsewhere, and devices will be created in the
> order that user specified. That might be dangerous.
> 
> Here we add one more function to detect this kind of misordering issue,
> then report to guest. Currently, the only known device that is affected
> by this VT-d defect is the vfio-pci typed devices. So for now we just
> check against it to make sure we are safe.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  hw/i386/intel_iommu.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 22d8226..b723ece 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -2560,6 +2560,24 @@ static bool vtd_decide_config(IntelIOMMUState *s,
> Error **errp)
>      return true;
>  }
>  
> +/*
> + * TODO: we should have a better way to achieve the ordering rather
> + * than this misorder check explicitly against vfio-pci. After all, no
> + * one should be blamed for this, and vfio-pci did nothing wrong.
> + */
> +static bool vtd_detected_misorder_init(Error **errp)
> +{
> +    Object *dev = object_resolve_path_type("", "vfio-pci", NULL);
> +
> +    if (dev) {
> +        error_setg(errp, "Please specify \"intel-iommu\" before all the rest

            "before all the rest" does not give much clue to user. Do you think better
            error message would help? just a thought.
> "
> +                   "of the devices.");
> +        return true;
> +    }
> +
> +    return false;
> +}
> +
>  static void vtd_realize(DeviceState *dev, Error **errp)
>  {
>      PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
> @@ -2567,6 +2585,10 @@ static void vtd_realize(DeviceState *dev, Error
> **errp)
>      IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev);
>      X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(dev);
>  
> +    if (vtd_detected_misorder_init(errp)) {
> +        return;
> +    }
> +
>      VTD_DPRINTF(GENERAL, "");
>      x86_iommu->type = TYPE_INTEL;
>  
> --
> 2.7.4
> 
> 
>
Peter Xu Feb. 24, 2017, 5:50 a.m. UTC | #2
On Fri, Feb 24, 2017 at 12:07:33AM -0500, Pankaj Gupta wrote:
> Hello Peter,

Hi, Pankaj!

> 
> This solution looks to check dependency of 'vfio-pci' over 'intel-iommu'
> before 'intel-iommu' is not initialized.
> 
> Overall it looks good to me, just a small nit below.
>  
> > 
> > Intel vIOMMU devices are created with "-device" parameter, while here
> > actually we need to make sure the dmar device be created before other
> > PCI devices (like vfio-pci) so that we know iommu_fn will be setup
> > correctly before realizations of those PCI devices (it is sensible that
> > PCI device fetch these info during its realization). Now this ordering
> > yet cannot be achieved elsewhere, and devices will be created in the
> > order that user specified. That might be dangerous.
> > 
> > Here we add one more function to detect this kind of misordering issue,
> > then report to guest. Currently, the only known device that is affected
> > by this VT-d defect is the vfio-pci typed devices. So for now we just
> > check against it to make sure we are safe.
> > 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  hw/i386/intel_iommu.c | 22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> > 
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > index 22d8226..b723ece 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -2560,6 +2560,24 @@ static bool vtd_decide_config(IntelIOMMUState *s,
> > Error **errp)
> >      return true;
> >  }
> >  
> > +/*
> > + * TODO: we should have a better way to achieve the ordering rather
> > + * than this misorder check explicitly against vfio-pci. After all, no
> > + * one should be blamed for this, and vfio-pci did nothing wrong.
> > + */
> > +static bool vtd_detected_misorder_init(Error **errp)
> > +{
> > +    Object *dev = object_resolve_path_type("", "vfio-pci", NULL);
> > +
> > +    if (dev) {
> > +        error_setg(errp, "Please specify \"intel-iommu\" before all the rest
> 
>             "before all the rest" does not give much clue to user. Do you think better
>             error message would help? just a thought.

Yes this is my intention to emphasize that we should possibly put
intel-iommu even before all the PCI devices. As mentioned above (and
also in the commit message), although vfio-pci is the only one that is
affected by this, we should probably put intel-iommu even before all
the rest of PCI devices. E.g., in the future we can have new devices
that need this dependency as well (that vt-d being inited before that
device), which is still legal imho.

Meanwhile, I intended to avoid naming "vfio-pci" here since it'll let
user think of "something bad with vfio-pci" but actually vfio-pci did
nothing wrong.

Thanks,

-- peterx
Pankaj Gupta Feb. 24, 2017, 6:35 a.m. UTC | #3
> 
> On Fri, Feb 24, 2017 at 12:07:33AM -0500, Pankaj Gupta wrote:
> > Hello Peter,
> 
> Hi, Pankaj!
> 
> > 
> > This solution looks to check dependency of 'vfio-pci' over 'intel-iommu'
> > before 'intel-iommu' is not initialized.
> > 
> > Overall it looks good to me, just a small nit below.
> >  
> > > 
> > > Intel vIOMMU devices are created with "-device" parameter, while here
> > > actually we need to make sure the dmar device be created before other
> > > PCI devices (like vfio-pci) so that we know iommu_fn will be setup
> > > correctly before realizations of those PCI devices (it is sensible that
> > > PCI device fetch these info during its realization). Now this ordering
> > > yet cannot be achieved elsewhere, and devices will be created in the
> > > order that user specified. That might be dangerous.
> > > 
> > > Here we add one more function to detect this kind of misordering issue,
> > > then report to guest. Currently, the only known device that is affected
> > > by this VT-d defect is the vfio-pci typed devices. So for now we just
> > > check against it to make sure we are safe.
> > > 
> > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > ---
> > >  hw/i386/intel_iommu.c | 22 ++++++++++++++++++++++
> > >  1 file changed, 22 insertions(+)
> > > 
> > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > > index 22d8226..b723ece 100644
> > > --- a/hw/i386/intel_iommu.c
> > > +++ b/hw/i386/intel_iommu.c
> > > @@ -2560,6 +2560,24 @@ static bool vtd_decide_config(IntelIOMMUState *s,
> > > Error **errp)
> > >      return true;
> > >  }
> > >  
> > > +/*
> > > + * TODO: we should have a better way to achieve the ordering rather
> > > + * than this misorder check explicitly against vfio-pci. After all, no
> > > + * one should be blamed for this, and vfio-pci did nothing wrong.
> > > + */
> > > +static bool vtd_detected_misorder_init(Error **errp)
> > > +{
> > > +    Object *dev = object_resolve_path_type("", "vfio-pci", NULL);
> > > +
> > > +    if (dev) {
> > > +        error_setg(errp, "Please specify \"intel-iommu\" before all the
> > > rest
> > 
> >             "before all the rest" does not give much clue to user. Do you
> >             think better
> >             error message would help? just a thought.
> 
> Yes this is my intention to emphasize that we should possibly put
> intel-iommu even before all the PCI devices. As mentioned above (and
> also in the commit message), although vfio-pci is the only one that is
> affected by this, we should probably put intel-iommu even before all
> the rest of PCI devices. E.g., in the future we can have new devices
> that need this dependency as well (that vt-d being inited before that
> device), which is still legal imho.

yes, something like this can help "intel-iommu should be specified as first device"?
Or we can just check for "intel-iommu" as first device at the start in place of
checking for "vfio-pci". This will also help us to check for all other devices. 

Thanks. 

> 
> Meanwhile, I intended to avoid naming "vfio-pci" here since it'll let
> user think of "something bad with vfio-pci" but actually vfio-pci did
> nothing wrong.

o.k 
> 
> Thanks,
> 
> -- peterx
>
Peter Xu Feb. 24, 2017, 7:10 a.m. UTC | #4
On Fri, Feb 24, 2017 at 01:35:10AM -0500, Pankaj Gupta wrote:
> 
> > 
> > On Fri, Feb 24, 2017 at 12:07:33AM -0500, Pankaj Gupta wrote:
> > > Hello Peter,
> > 
> > Hi, Pankaj!
> > 
> > > 
> > > This solution looks to check dependency of 'vfio-pci' over 'intel-iommu'
> > > before 'intel-iommu' is not initialized.
> > > 
> > > Overall it looks good to me, just a small nit below.
> > >  
> > > > 
> > > > Intel vIOMMU devices are created with "-device" parameter, while here
> > > > actually we need to make sure the dmar device be created before other
> > > > PCI devices (like vfio-pci) so that we know iommu_fn will be setup
> > > > correctly before realizations of those PCI devices (it is sensible that
> > > > PCI device fetch these info during its realization). Now this ordering
> > > > yet cannot be achieved elsewhere, and devices will be created in the
> > > > order that user specified. That might be dangerous.
> > > > 
> > > > Here we add one more function to detect this kind of misordering issue,
> > > > then report to guest. Currently, the only known device that is affected
> > > > by this VT-d defect is the vfio-pci typed devices. So for now we just
> > > > check against it to make sure we are safe.
> > > > 
> > > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > > ---
> > > >  hw/i386/intel_iommu.c | 22 ++++++++++++++++++++++
> > > >  1 file changed, 22 insertions(+)
> > > > 
> > > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > > > index 22d8226..b723ece 100644
> > > > --- a/hw/i386/intel_iommu.c
> > > > +++ b/hw/i386/intel_iommu.c
> > > > @@ -2560,6 +2560,24 @@ static bool vtd_decide_config(IntelIOMMUState *s,
> > > > Error **errp)
> > > >      return true;
> > > >  }
> > > >  
> > > > +/*
> > > > + * TODO: we should have a better way to achieve the ordering rather
> > > > + * than this misorder check explicitly against vfio-pci. After all, no
> > > > + * one should be blamed for this, and vfio-pci did nothing wrong.
> > > > + */
> > > > +static bool vtd_detected_misorder_init(Error **errp)
> > > > +{
> > > > +    Object *dev = object_resolve_path_type("", "vfio-pci", NULL);
> > > > +
> > > > +    if (dev) {
> > > > +        error_setg(errp, "Please specify \"intel-iommu\" before all the
> > > > rest
> > > 
> > >             "before all the rest" does not give much clue to user. Do you
> > >             think better
> > >             error message would help? just a thought.
> > 
> > Yes this is my intention to emphasize that we should possibly put
> > intel-iommu even before all the PCI devices. As mentioned above (and
> > also in the commit message), although vfio-pci is the only one that is
> > affected by this, we should probably put intel-iommu even before all
> > the rest of PCI devices. E.g., in the future we can have new devices
> > that need this dependency as well (that vt-d being inited before that
> > device), which is still legal imho.
> 
> yes, something like this can help "intel-iommu should be specified as first device"?

I can switch this description if you prefer this one. :) Not sure
whether this can be touched up during merge if this patch is good for
2.9, or I'll just repost with yours if there is further comments.

> Or we can just check for "intel-iommu" as first device at the start in place of
> checking for "vfio-pci". This will also help us to check for all other devices. 

Actually your suggestion is just what I did in version 1. The problem
is that, it's not easy to let vt-d really be the first device to be
inited... Please check pc_q35_init(), within that we have:

    pc_vga_init(isa_bus, host_bus);
    pc_nic_init(isa_bus, host_bus);

These are integrated devices along with ICH9. Such devices are
possibly pci devices as well, but they are created much earlier than
vt-d device, since that's still during machine init.

Thanks,

-- peterx
Pankaj Gupta Feb. 24, 2017, 8:42 a.m. UTC | #5
> On Fri, Feb 24, 2017 at 01:35:10AM -0500, Pankaj Gupta wrote:
> > 
> > > 
> > > On Fri, Feb 24, 2017 at 12:07:33AM -0500, Pankaj Gupta wrote:
> > > > Hello Peter,
> > > 
> > > Hi, Pankaj!
> > > 
> > > > 
> > > > This solution looks to check dependency of 'vfio-pci' over
> > > > 'intel-iommu'
> > > > before 'intel-iommu' is not initialized.
> > > > 
> > > > Overall it looks good to me, just a small nit below.
> > > >  
> > > > > 
> > > > > Intel vIOMMU devices are created with "-device" parameter, while here
> > > > > actually we need to make sure the dmar device be created before other
> > > > > PCI devices (like vfio-pci) so that we know iommu_fn will be setup
> > > > > correctly before realizations of those PCI devices (it is sensible
> > > > > that
> > > > > PCI device fetch these info during its realization). Now this
> > > > > ordering
> > > > > yet cannot be achieved elsewhere, and devices will be created in the
> > > > > order that user specified. That might be dangerous.
> > > > > 
> > > > > Here we add one more function to detect this kind of misordering
> > > > > issue,
> > > > > then report to guest. Currently, the only known device that is
> > > > > affected
> > > > > by this VT-d defect is the vfio-pci typed devices. So for now we just
> > > > > check against it to make sure we are safe.
> > > > > 
> > > > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > > > ---
> > > > >  hw/i386/intel_iommu.c | 22 ++++++++++++++++++++++
> > > > >  1 file changed, 22 insertions(+)
> > > > > 
> > > > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > > > > index 22d8226..b723ece 100644
> > > > > --- a/hw/i386/intel_iommu.c
> > > > > +++ b/hw/i386/intel_iommu.c
> > > > > @@ -2560,6 +2560,24 @@ static bool vtd_decide_config(IntelIOMMUState
> > > > > *s,
> > > > > Error **errp)
> > > > >      return true;
> > > > >  }
> > > > >  
> > > > > +/*
> > > > > + * TODO: we should have a better way to achieve the ordering rather
> > > > > + * than this misorder check explicitly against vfio-pci. After all,
> > > > > no
> > > > > + * one should be blamed for this, and vfio-pci did nothing wrong.
> > > > > + */
> > > > > +static bool vtd_detected_misorder_init(Error **errp)
> > > > > +{
> > > > > +    Object *dev = object_resolve_path_type("", "vfio-pci", NULL);
> > > > > +
> > > > > +    if (dev) {
> > > > > +        error_setg(errp, "Please specify \"intel-iommu\" before all
> > > > > the
> > > > > rest
> > > > 
> > > >             "before all the rest" does not give much clue to user. Do
> > > >             you
> > > >             think better
> > > >             error message would help? just a thought.
> > > 
> > > Yes this is my intention to emphasize that we should possibly put
> > > intel-iommu even before all the PCI devices. As mentioned above (and
> > > also in the commit message), although vfio-pci is the only one that is
> > > affected by this, we should probably put intel-iommu even before all
> > > the rest of PCI devices. E.g., in the future we can have new devices
> > > that need this dependency as well (that vt-d being inited before that
> > > device), which is still legal imho.
> > 
> > yes, something like this can help "intel-iommu should be specified as first
> > device"?
> 
> I can switch this description if you prefer this one. :) Not sure
> whether this can be touched up during merge if this patch is good for
> 2.9, or I'll just repost with yours if there is further comments.

Sure.
> 
> > Or we can just check for "intel-iommu" as first device at the start in
> > place of
> > checking for "vfio-pci". This will also help us to check for all other
> > devices.
> 
> Actually your suggestion is just what I did in version 1. The problem
> is that, it's not easy to let vt-d really be the first device to be
> inited... Please check pc_q35_init(), within that we have:
> 
>     pc_vga_init(isa_bus, host_bus);
>     pc_nic_init(isa_bus, host_bus);
> 
> These are integrated devices along with ICH9. Such devices are
> possibly pci devices as well, but they are created much earlier than
> vt-d device, since that's still during machine init.

o.k. 

Thanks,
Pankaj
> 
> Thanks,
> 
> -- peterx
>
Marcel Apfelbaum Feb. 28, 2017, 2:42 p.m. UTC | #6
On 02/24/2017 06:29 AM, Peter Xu wrote:
> Intel vIOMMU devices are created with "-device" parameter, while here
> actually we need to make sure the dmar device be created before other
> PCI devices (like vfio-pci) so that we know iommu_fn will be setup
> correctly before realizations of those PCI devices (it is sensible that
> PCI device fetch these info during its realization). Now this ordering
> yet cannot be achieved elsewhere, and devices will be created in the
> order that user specified. That might be dangerous.
>
> Here we add one more function to detect this kind of misordering issue,
> then report to guest. Currently, the only known device that is affected
> by this VT-d defect is the vfio-pci typed devices. So for now we just
> check against it to make sure we are safe.
>

Hi,
I can't say that I like it but if we want it for 2.9 maybe we don't have a choice.

I mentioned in another thread other idea:
     Maybe we should follow the same "template" as disk/drive, nic/netdev ?
     I mean something like -device iommu,id=i1, -device vfio-pci,iommu=e1 .
You are able to change the order, I didn't look how it is done.

A nice side effect is that you can:
  1. Limit the iommu scope only to the devices you want to protect (tweaking APCI tables)
  2. In the future we can support multiple iommu devices.

Thanks,
Marcel

> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  hw/i386/intel_iommu.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 22d8226..b723ece 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -2560,6 +2560,24 @@ static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
>      return true;
>  }
>
> +/*
> + * TODO: we should have a better way to achieve the ordering rather
> + * than this misorder check explicitly against vfio-pci. After all, no
> + * one should be blamed for this, and vfio-pci did nothing wrong.
> + */
> +static bool vtd_detected_misorder_init(Error **errp)
> +{
> +    Object *dev = object_resolve_path_type("", "vfio-pci", NULL);
> +
> +    if (dev) {
> +        error_setg(errp, "Please specify \"intel-iommu\" before all the rest "
> +                   "of the devices.");
> +        return true;
> +    }
> +
> +    return false;
> +}
> +
>  static void vtd_realize(DeviceState *dev, Error **errp)
>  {
>      PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
> @@ -2567,6 +2585,10 @@ static void vtd_realize(DeviceState *dev, Error **errp)
>      IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev);
>      X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(dev);
>
> +    if (vtd_detected_misorder_init(errp)) {
> +        return;
> +    }
> +
>      VTD_DPRINTF(GENERAL, "");
>      x86_iommu->type = TYPE_INTEL;
>
>
Peter Xu March 1, 2017, 2:36 a.m. UTC | #7
On Tue, Feb 28, 2017 at 04:42:25PM +0200, Marcel Apfelbaum wrote:
> On 02/24/2017 06:29 AM, Peter Xu wrote:
> >Intel vIOMMU devices are created with "-device" parameter, while here
> >actually we need to make sure the dmar device be created before other
> >PCI devices (like vfio-pci) so that we know iommu_fn will be setup
> >correctly before realizations of those PCI devices (it is sensible that
> >PCI device fetch these info during its realization). Now this ordering
> >yet cannot be achieved elsewhere, and devices will be created in the
> >order that user specified. That might be dangerous.
> >
> >Here we add one more function to detect this kind of misordering issue,
> >then report to guest. Currently, the only known device that is affected
> >by this VT-d defect is the vfio-pci typed devices. So for now we just
> >check against it to make sure we are safe.
> >
> 
> Hi,
> I can't say that I like it but if we want it for 2.9 maybe we don't have a choice.
> 
> I mentioned in another thread other idea:
>     Maybe we should follow the same "template" as disk/drive, nic/netdev ?
>     I mean something like -device iommu,id=i1, -device vfio-pci,iommu=e1 .
> You are able to change the order, I didn't look how it is done.

I suppose that's done by using different keywords. For example:

  -netdev user,id=net0 -device e1000,netdev=net0

Here we are using "netdev" for the backend and "device" for the
frontend.

Since we have this difference, we can just make sure the ordering by
init netdev first (in net_init_clients()), then the devices (in
device_init_func()).

> 
> A nice side effect is that you can:
>  1. Limit the iommu scope only to the devices you want to protect (tweaking APCI tables)
>  2. In the future we can support multiple iommu devices.

Yes. Thanks,

-- peterx
Michael S. Tsirkin March 1, 2017, 3:23 a.m. UTC | #8
On Wed, Mar 01, 2017 at 10:36:35AM +0800, Peter Xu wrote:
> On Tue, Feb 28, 2017 at 04:42:25PM +0200, Marcel Apfelbaum wrote:
> > On 02/24/2017 06:29 AM, Peter Xu wrote:
> > >Intel vIOMMU devices are created with "-device" parameter, while here
> > >actually we need to make sure the dmar device be created before other
> > >PCI devices (like vfio-pci) so that we know iommu_fn will be setup
> > >correctly before realizations of those PCI devices (it is sensible that
> > >PCI device fetch these info during its realization). Now this ordering
> > >yet cannot be achieved elsewhere, and devices will be created in the
> > >order that user specified. That might be dangerous.
> > >
> > >Here we add one more function to detect this kind of misordering issue,
> > >then report to guest. Currently, the only known device that is affected
> > >by this VT-d defect is the vfio-pci typed devices. So for now we just
> > >check against it to make sure we are safe.
> > >
> > 
> > Hi,
> > I can't say that I like it but if we want it for 2.9 maybe we don't have a choice.
> > 
> > I mentioned in another thread other idea:
> >     Maybe we should follow the same "template" as disk/drive, nic/netdev ?
> >     I mean something like -device iommu,id=i1, -device vfio-pci,iommu=e1 .
> > You are able to change the order, I didn't look how it is done.
> 
> I suppose that's done by using different keywords. For example:
> 
>   -netdev user,id=net0 -device e1000,netdev=net0
> 
> Here we are using "netdev" for the backend and "device" for the
> frontend.
> 
> Since we have this difference, we can just make sure the ordering by
> init netdev first (in net_init_clients()), then the devices (in
> device_init_func()).
> 
> > 
> > A nice side effect is that you can:
> >  1. Limit the iommu scope only to the devices you want to protect (tweaking APCI tables)
> >  2. In the future we can support multiple iommu devices.

Unfortunately AFAIK this needs a bunch of work in practice:
host and guest side.
So this opens a can of worms: you can all too easily create
configurations that we don't support.


> 
> Yes. Thanks,
> 
> -- peterx

While I agree this fixes the specific problem, we have the ordering
issue in many other places.  For example, this is one of the reasons we
don't create built-in PC devices using QOM composition.

So I think that support for dependencies does make a lot of sense
generally.
Jason Wang March 1, 2017, 4:14 a.m. UTC | #9
On 2017年03月01日 11:23, Michael S. Tsirkin wrote:
> On Wed, Mar 01, 2017 at 10:36:35AM +0800, Peter Xu wrote:
>> On Tue, Feb 28, 2017 at 04:42:25PM +0200, Marcel Apfelbaum wrote:
>>> On 02/24/2017 06:29 AM, Peter Xu wrote:
>>>> Intel vIOMMU devices are created with "-device" parameter, while here
>>>> actually we need to make sure the dmar device be created before other
>>>> PCI devices (like vfio-pci) so that we know iommu_fn will be setup
>>>> correctly before realizations of those PCI devices (it is sensible that
>>>> PCI device fetch these info during its realization). Now this ordering
>>>> yet cannot be achieved elsewhere, and devices will be created in the
>>>> order that user specified. That might be dangerous.
>>>>
>>>> Here we add one more function to detect this kind of misordering issue,
>>>> then report to guest. Currently, the only known device that is affected
>>>> by this VT-d defect is the vfio-pci typed devices. So for now we just
>>>> check against it to make sure we are safe.
>>>>
>>> Hi,
>>> I can't say that I like it but if we want it for 2.9 maybe we don't have a choice.
>>>
>>> I mentioned in another thread other idea:
>>>      Maybe we should follow the same "template" as disk/drive, nic/netdev ?
>>>      I mean something like -device iommu,id=i1, -device vfio-pci,iommu=e1 .
>>> You are able to change the order, I didn't look how it is done.
>> I suppose that's done by using different keywords. For example:
>>
>>    -netdev user,id=net0 -device e1000,netdev=net0
>>
>> Here we are using "netdev" for the backend and "device" for the
>> frontend.
>>
>> Since we have this difference, we can just make sure the ordering by
>> init netdev first (in net_init_clients()), then the devices (in
>> device_init_func()).
>>
>>> A nice side effect is that you can:
>>>   1. Limit the iommu scope only to the devices you want to protect (tweaking APCI tables)
>>>   2. In the future we can support multiple iommu devices.
> Unfortunately AFAIK this needs a bunch of work in practice:
> host and guest side.
> So this opens a can of worms: you can all too easily create
> configurations that we don't support.
>
>
>> Yes. Thanks,
>>
>> -- peterx
> While I agree this fixes the specific problem, we have the ordering
> issue in many other places.

Yes, just post a fix for virtio-pci with IOMMU. With this fix, we meet 
the exact issue as this patch, IOMMU needs to be created before any 
virtio-pci devices.

Thanks

>   For example, this is one of the reasons we
> don't create built-in PC devices using QOM composition.
>
> So I think that support for dependencies does make a lot of sense
> generally.
>
>
Marcel Apfelbaum March 1, 2017, 7:03 a.m. UTC | #10
On 03/01/2017 06:14 AM, Jason Wang wrote:
>
>
> On 2017年03月01日 11:23, Michael S. Tsirkin wrote:
>> On Wed, Mar 01, 2017 at 10:36:35AM +0800, Peter Xu wrote:
>>> On Tue, Feb 28, 2017 at 04:42:25PM +0200, Marcel Apfelbaum wrote:
>>>> On 02/24/2017 06:29 AM, Peter Xu wrote:
>>>>> Intel vIOMMU devices are created with "-device" parameter, while here
>>>>> actually we need to make sure the dmar device be created before other
>>>>> PCI devices (like vfio-pci) so that we know iommu_fn will be setup
>>>>> correctly before realizations of those PCI devices (it is sensible that
>>>>> PCI device fetch these info during its realization). Now this ordering
>>>>> yet cannot be achieved elsewhere, and devices will be created in the
>>>>> order that user specified. That might be dangerous.
>>>>>
>>>>> Here we add one more function to detect this kind of misordering issue,
>>>>> then report to guest. Currently, the only known device that is affected
>>>>> by this VT-d defect is the vfio-pci typed devices. So for now we just
>>>>> check against it to make sure we are safe.
>>>>>
>>>> Hi,
>>>> I can't say that I like it but if we want it for 2.9 maybe we don't have a choice.
>>>>
>>>> I mentioned in another thread other idea:
>>>>      Maybe we should follow the same "template" as disk/drive, nic/netdev ?
>>>>      I mean something like -device iommu,id=i1, -device vfio-pci,iommu=e1 .
>>>> You are able to change the order, I didn't look how it is done.
>>> I suppose that's done by using different keywords. For example:
>>>
>>>    -netdev user,id=net0 -device e1000,netdev=net0
>>>
>>> Here we are using "netdev" for the backend and "device" for the
>>> frontend.
>>>
>>> Since we have this difference, we can just make sure the ordering by
>>> init netdev first (in net_init_clients()), then the devices (in
>>> device_init_func()).
>>>
>>>> A nice side effect is that you can:
>>>>   1. Limit the iommu scope only to the devices you want to protect (tweaking APCI tables)
>>>>   2. In the future we can support multiple iommu devices.
>> Unfortunately AFAIK this needs a bunch of work in practice:
>> host and guest side.
>> So this opens a can of worms: you can all too easily create
>> configurations that we don't support.
>>
>>
>>> Yes. Thanks,
>>>
>>> -- peterx
>> While I agree this fixes the specific problem, we have the ordering
>> issue in many other places.
>
> Yes, just post a fix for virtio-pci with IOMMU. With this fix, we meet the exact issue as this patch, IOMMU needs to be created before any virtio-pci devices.
>

Hi Jason,

I am not saying we don't need to create the IOMMU before some other device,
I just don't think that the command line order should matter to user.

BTW, are you working on a way to limit the IOMMU scope to a specific set of devices?
If yes, this approach will also help you work.

Thanks,
Marcel

> Thanks
>
>>   For example, this is one of the reasons we
>> don't create built-in PC devices using QOM composition.
>>
>> So I think that support for dependencies does make a lot of sense
>> generally.
>>
>>
>
Jason Wang March 1, 2017, 8:43 a.m. UTC | #11
On 2017年03月01日 15:03, Marcel Apfelbaum wrote:
> On 03/01/2017 06:14 AM, Jason Wang wrote:
>>
>>
>> On 2017年03月01日 11:23, Michael S. Tsirkin wrote:
>>> On Wed, Mar 01, 2017 at 10:36:35AM +0800, Peter Xu wrote:
>>>> On Tue, Feb 28, 2017 at 04:42:25PM +0200, Marcel Apfelbaum wrote:
>>>>> On 02/24/2017 06:29 AM, Peter Xu wrote:
>>>>>> Intel vIOMMU devices are created with "-device" parameter, while 
>>>>>> here
>>>>>> actually we need to make sure the dmar device be created before 
>>>>>> other
>>>>>> PCI devices (like vfio-pci) so that we know iommu_fn will be setup
>>>>>> correctly before realizations of those PCI devices (it is 
>>>>>> sensible that
>>>>>> PCI device fetch these info during its realization). Now this 
>>>>>> ordering
>>>>>> yet cannot be achieved elsewhere, and devices will be created in the
>>>>>> order that user specified. That might be dangerous.
>>>>>>
>>>>>> Here we add one more function to detect this kind of misordering 
>>>>>> issue,
>>>>>> then report to guest. Currently, the only known device that is 
>>>>>> affected
>>>>>> by this VT-d defect is the vfio-pci typed devices. So for now we 
>>>>>> just
>>>>>> check against it to make sure we are safe.
>>>>>>
>>>>> Hi,
>>>>> I can't say that I like it but if we want it for 2.9 maybe we 
>>>>> don't have a choice.
>>>>>
>>>>> I mentioned in another thread other idea:
>>>>>      Maybe we should follow the same "template" as disk/drive, 
>>>>> nic/netdev ?
>>>>>      I mean something like -device iommu,id=i1, -device 
>>>>> vfio-pci,iommu=e1 .
>>>>> You are able to change the order, I didn't look how it is done.
>>>> I suppose that's done by using different keywords. For example:
>>>>
>>>>    -netdev user,id=net0 -device e1000,netdev=net0
>>>>
>>>> Here we are using "netdev" for the backend and "device" for the
>>>> frontend.
>>>>
>>>> Since we have this difference, we can just make sure the ordering by
>>>> init netdev first (in net_init_clients()), then the devices (in
>>>> device_init_func()).
>>>>
>>>>> A nice side effect is that you can:
>>>>>   1. Limit the iommu scope only to the devices you want to protect 
>>>>> (tweaking APCI tables)
>>>>>   2. In the future we can support multiple iommu devices.
>>> Unfortunately AFAIK this needs a bunch of work in practice:
>>> host and guest side.
>>> So this opens a can of worms: you can all too easily create
>>> configurations that we don't support.
>>>
>>>
>>>> Yes. Thanks,
>>>>
>>>> -- peterx
>>> While I agree this fixes the specific problem, we have the ordering
>>> issue in many other places.
>>
>> Yes, just post a fix for virtio-pci with IOMMU. With this fix, we 
>> meet the exact issue as this patch, IOMMU needs to be created before 
>> any virtio-pci devices.
>>
>
> Hi Jason,
>
> I am not saying we don't need to create the IOMMU before some other 
> device,
> I just don't think that the command line order should matter to user.
>
> BTW, are you working on a way to limit the IOMMU scope to a specific 
> set of devices?
> If yes, this approach will also help you work.

Not yet, this may work (after 2.9) but I believe we still need a fix for 
2.9.

Thanks

>
> Thanks,
> Marcel
Marcel Apfelbaum March 1, 2017, 9:05 a.m. UTC | #12
On 03/01/2017 10:43 AM, Jason Wang wrote:
>
>
> On 2017年03月01日 15:03, Marcel Apfelbaum wrote:
>> On 03/01/2017 06:14 AM, Jason Wang wrote:
>>>
>>>
>>> On 2017年03月01日 11:23, Michael S. Tsirkin wrote:
>>>> On Wed, Mar 01, 2017 at 10:36:35AM +0800, Peter Xu wrote:
>>>>> On Tue, Feb 28, 2017 at 04:42:25PM +0200, Marcel Apfelbaum wrote:
>>>>>> On 02/24/2017 06:29 AM, Peter Xu wrote:
>>>>>>> Intel vIOMMU devices are created with "-device" parameter, while here
>>>>>>> actually we need to make sure the dmar device be created before other
>>>>>>> PCI devices (like vfio-pci) so that we know iommu_fn will be setup
>>>>>>> correctly before realizations of those PCI devices (it is sensible that
>>>>>>> PCI device fetch these info during its realization). Now this ordering
>>>>>>> yet cannot be achieved elsewhere, and devices will be created in the
>>>>>>> order that user specified. That might be dangerous.
>>>>>>>
>>>>>>> Here we add one more function to detect this kind of misordering issue,
>>>>>>> then report to guest. Currently, the only known device that is affected
>>>>>>> by this VT-d defect is the vfio-pci typed devices. So for now we just
>>>>>>> check against it to make sure we are safe.
>>>>>>>
>>>>>> Hi,
>>>>>> I can't say that I like it but if we want it for 2.9 maybe we don't have a choice.
>>>>>>
>>>>>> I mentioned in another thread other idea:
>>>>>>      Maybe we should follow the same "template" as disk/drive, nic/netdev ?
>>>>>>      I mean something like -device iommu,id=i1, -device vfio-pci,iommu=e1 .
>>>>>> You are able to change the order, I didn't look how it is done.
>>>>> I suppose that's done by using different keywords. For example:
>>>>>
>>>>>    -netdev user,id=net0 -device e1000,netdev=net0
>>>>>
>>>>> Here we are using "netdev" for the backend and "device" for the
>>>>> frontend.
>>>>>
>>>>> Since we have this difference, we can just make sure the ordering by
>>>>> init netdev first (in net_init_clients()), then the devices (in
>>>>> device_init_func()).
>>>>>
>>>>>> A nice side effect is that you can:
>>>>>>   1. Limit the iommu scope only to the devices you want to protect (tweaking APCI tables)
>>>>>>   2. In the future we can support multiple iommu devices.
>>>> Unfortunately AFAIK this needs a bunch of work in practice:
>>>> host and guest side.
>>>> So this opens a can of worms: you can all too easily create
>>>> configurations that we don't support.
>>>>
>>>>
>>>>> Yes. Thanks,
>>>>>
>>>>> -- peterx
>>>> While I agree this fixes the specific problem, we have the ordering
>>>> issue in many other places.
>>>
>>> Yes, just post a fix for virtio-pci with IOMMU. With this fix, we meet the exact issue as this patch, IOMMU needs to be created before any virtio-pci devices.
>>>
>>
>> Hi Jason,
>>
>> I am not saying we don't need to create the IOMMU before some other device,
>> I just don't think that the command line order should matter to user.
>>
>> BTW, are you working on a way to limit the IOMMU scope to a specific set of devices?
>> If yes, this approach will also help you work.
>
> Not yet, this may work (after 2.9) but I believe we still need a fix for 2.9.
>

I agree, I am thinking about sending a patch that at least takes the "vfio-pci"
check outside the iommu code in a separate header file. We can then add other
devices that need to be crated before the iommu.

Thanks,
Marcel

> Thanks
>
>>
>> Thanks,
>> Marcel
>
Peter Xu March 1, 2017, 9:18 a.m. UTC | #13
On Wed, Mar 01, 2017 at 09:03:47AM +0200, Marcel Apfelbaum wrote:
> On 03/01/2017 06:14 AM, Jason Wang wrote:

[...]

> Hi Jason,
> 
> I am not saying we don't need to create the IOMMU before some other device,
> I just don't think that the command line order should matter to user.
> 
> BTW, are you working on a way to limit the IOMMU scope to a specific set of devices?
> If yes, this approach will also help you work.

Marcel,

Do you have any plan after 2.9 to re-arrange the init order thing a
bit? In general, maybe we need an ordering support for devices.
Besides that, I am thinking whether it would be wise that we just init
the IOMMUs during machine init just like before, but keep the
"-device" interface? After all, at least the root IOMMU device should
be with root complex, and it feels a little strange we just split the
init process apart (we delay the IOMMU init into device
initializations).

Not sure whether above makes sense though. Thanks,

-- peterx
Marcel Apfelbaum March 1, 2017, 9:29 a.m. UTC | #14
On 03/01/2017 11:18 AM, Peter Xu wrote:
> On Wed, Mar 01, 2017 at 09:03:47AM +0200, Marcel Apfelbaum wrote:
>> On 03/01/2017 06:14 AM, Jason Wang wrote:
>
> [...]
>
>> Hi Jason,
>>
>> I am not saying we don't need to create the IOMMU before some other device,
>> I just don't think that the command line order should matter to user.
>>
>> BTW, are you working on a way to limit the IOMMU scope to a specific set of devices?
>> If yes, this approach will also help you work.
>
> Marcel,
>
> Do you have any plan after 2.9 to re-arrange the init order thing a
> bit? In general, maybe we need an ordering support for devices.

I don;t know when I'll start it, but yes, I am thinking about
taking this project. We need the ordering in order to be able
to have less "built-in" devices.

> Besides that, I am thinking whether it would be wise that we just init
> the IOMMUs during machine init just like before, but keep the
> "-device" interface? After all, at least the root IOMMU device should
> be with root complex, and it feels a little strange we just split the
> init process apart (we delay the IOMMU init into device
> initializations).
>

Keeping the IOMMU creation as part of the Root Complex is problematic.
What happens if we want to limit the IOMMU scope to a subset of devices?
How will the command line look? Also we may want/need multiple iommu
devices per system. It will not happen tomorrow, but we don't want to loose
this possibility.

The device re-ordering is not 2.9 material of course, and we need your patch.
The only thing we can do better is to take out the "vfio-pci" in another header
file and and have it in a array of devices that should be checked
(in order to avoid polluting the IOMMU code with a not related device)

I can try and send a patch for it if you prefer.

Thanks,
Marcel

> Not sure whether above makes sense though. Thanks,
>
> -- peterx
>
Peter Xu March 1, 2017, 9:59 a.m. UTC | #15
On Wed, Mar 01, 2017 at 11:29:47AM +0200, Marcel Apfelbaum wrote:
> On 03/01/2017 11:18 AM, Peter Xu wrote:
> >On Wed, Mar 01, 2017 at 09:03:47AM +0200, Marcel Apfelbaum wrote:
> >>On 03/01/2017 06:14 AM, Jason Wang wrote:
> >
> >[...]
> >
> >>Hi Jason,
> >>
> >>I am not saying we don't need to create the IOMMU before some other device,
> >>I just don't think that the command line order should matter to user.
> >>
> >>BTW, are you working on a way to limit the IOMMU scope to a specific set of devices?
> >>If yes, this approach will also help you work.
> >
> >Marcel,
> >
> >Do you have any plan after 2.9 to re-arrange the init order thing a
> >bit? In general, maybe we need an ordering support for devices.
> 
> I don;t know when I'll start it, but yes, I am thinking about
> taking this project. We need the ordering in order to be able
> to have less "built-in" devices.

That'll be great!

> 
> >Besides that, I am thinking whether it would be wise that we just init
> >the IOMMUs during machine init just like before, but keep the
> >"-device" interface? After all, at least the root IOMMU device should
> >be with root complex, and it feels a little strange we just split the
> >init process apart (we delay the IOMMU init into device
> >initializations).
> >
> 
> Keeping the IOMMU creation as part of the Root Complex is problematic.
> What happens if we want to limit the IOMMU scope to a subset of devices?
> How will the command line look? Also we may want/need multiple iommu
> devices per system. It will not happen tomorrow, but we don't want to loose
> this possibility.

I think that won't be a big problem. E.g., I don't see big problem if
we just create all these vIOMMUs along with machine init. Is there?

IMHO we can just pick these vIOMMU "devices" out of the device list,
and we can avoid doing that again in general device_init_func().

> 
> The device re-ordering is not 2.9 material of course, and we need your patch.
> The only thing we can do better is to take out the "vfio-pci" in another header
> file and and have it in a array of devices that should be checked
> (in order to avoid polluting the IOMMU code with a not related device)
> 
> I can try and send a patch for it if you prefer.

IMHO it'll be okay we "pollute" vtd code for a short while. We can
revert my patch (or any workarounds, like Jason's just posted one)
when we have device ordering support, and when we have a correct
bus_master_as when device init.

Thanks,

-- peterx
Marcel Apfelbaum March 1, 2017, 12:32 p.m. UTC | #16
On 03/01/2017 11:59 AM, Peter Xu wrote:
> On Wed, Mar 01, 2017 at 11:29:47AM +0200, Marcel Apfelbaum wrote:
>> On 03/01/2017 11:18 AM, Peter Xu wrote:
>>> On Wed, Mar 01, 2017 at 09:03:47AM +0200, Marcel Apfelbaum wrote:
>>>> On 03/01/2017 06:14 AM, Jason Wang wrote:
>>>
>>> [...]
>>>
>>>> Hi Jason,
>>>>
>>>> I am not saying we don't need to create the IOMMU before some other device,
>>>> I just don't think that the command line order should matter to user.
>>>>
>>>> BTW, are you working on a way to limit the IOMMU scope to a specific set of devices?
>>>> If yes, this approach will also help you work.
>>>
>>> Marcel,
>>>
>>> Do you have any plan after 2.9 to re-arrange the init order thing a
>>> bit? In general, maybe we need an ordering support for devices.
>>
>> I don;t know when I'll start it, but yes, I am thinking about
>> taking this project. We need the ordering in order to be able
>> to have less "built-in" devices.
>
> That'll be great!
>
>>
>>> Besides that, I am thinking whether it would be wise that we just init
>>> the IOMMUs during machine init just like before, but keep the
>>> "-device" interface? After all, at least the root IOMMU device should
>>> be with root complex, and it feels a little strange we just split the
>>> init process apart (we delay the IOMMU init into device
>>> initializations).
>>>
>>
>> Keeping the IOMMU creation as part of the Root Complex is problematic.
>> What happens if we want to limit the IOMMU scope to a subset of devices?
>> How will the command line look? Also we may want/need multiple iommu
>> devices per system. It will not happen tomorrow, but we don't want to loose
>> this possibility.
>
> I think that won't be a big problem. E.g., I don't see big problem if
> we just create all these vIOMMUs along with machine init. Is there?
>

How would you assign them do devices?

The "normal" QEMU cmd line would look like I mentioned earlier:

-device iommu,id=iommu1 -device pcie-root,iommu=iommu1 \
-device pcie-root \
-device iommu,id=iommu2 -device pcie-root,iommu=iommu2 \

How do you propose to do it otherwise?

> IMHO we can just pick these vIOMMU "devices" out of the device list,
> and we can avoid doing that again in general device_init_func().
>
>>
>> The device re-ordering is not 2.9 material of course, and we need your patch.
>> The only thing we can do better is to take out the "vfio-pci" in another header
>> file and and have it in a array of devices that should be checked
>> (in order to avoid polluting the IOMMU code with a not related device)
>>
>> I can try and send a patch for it if you prefer.
>
> IMHO it'll be okay we "pollute" vtd code for a short while. We can
> revert my patch (or any workarounds, like Jason's just posted one)

Can you please send me a link to Jason's patch?

If nobody else objects I am OK with it.

Thanks,
Marcel

> when we have device ordering support, and when we have a correct
> bus_master_as when device init.
>
> Thanks,
>
> -- peterx
>
Peter Xu March 2, 2017, 3:39 a.m. UTC | #17
On Wed, Mar 01, 2017 at 12:14:21PM +0800, Jason Wang wrote:
> On 2017年03月01日 11:23, Michael S. Tsirkin wrote:

[...]

> >While I agree this fixes the specific problem, we have the ordering
> >issue in many other places.
> 
> Yes, just post a fix for virtio-pci with IOMMU. With this fix, we meet the
> exact issue as this patch, IOMMU needs to be created before any virtio-pci
> devices.

I posted v3 to include detection of virtio-pci devices. Hope that's
what we needed for 2.9. Thanks,

-- peterx
Peter Xu March 2, 2017, 3:45 a.m. UTC | #18
On Wed, Mar 01, 2017 at 02:32:34PM +0200, Marcel Apfelbaum wrote:
> On 03/01/2017 11:59 AM, Peter Xu wrote:
> >On Wed, Mar 01, 2017 at 11:29:47AM +0200, Marcel Apfelbaum wrote:
> >>On 03/01/2017 11:18 AM, Peter Xu wrote:
> >>>On Wed, Mar 01, 2017 at 09:03:47AM +0200, Marcel Apfelbaum wrote:
> >>>>On 03/01/2017 06:14 AM, Jason Wang wrote:
> >>>
> >>>[...]
> >>>
> >>>>Hi Jason,
> >>>>
> >>>>I am not saying we don't need to create the IOMMU before some other device,
> >>>>I just don't think that the command line order should matter to user.
> >>>>
> >>>>BTW, are you working on a way to limit the IOMMU scope to a specific set of devices?
> >>>>If yes, this approach will also help you work.
> >>>
> >>>Marcel,
> >>>
> >>>Do you have any plan after 2.9 to re-arrange the init order thing a
> >>>bit? In general, maybe we need an ordering support for devices.
> >>
> >>I don;t know when I'll start it, but yes, I am thinking about
> >>taking this project. We need the ordering in order to be able
> >>to have less "built-in" devices.
> >
> >That'll be great!
> >
> >>
> >>>Besides that, I am thinking whether it would be wise that we just init
> >>>the IOMMUs during machine init just like before, but keep the
> >>>"-device" interface? After all, at least the root IOMMU device should
> >>>be with root complex, and it feels a little strange we just split the
> >>>init process apart (we delay the IOMMU init into device
> >>>initializations).
> >>>
> >>
> >>Keeping the IOMMU creation as part of the Root Complex is problematic.
> >>What happens if we want to limit the IOMMU scope to a subset of devices?
> >>How will the command line look? Also we may want/need multiple iommu
> >>devices per system. It will not happen tomorrow, but we don't want to loose
> >>this possibility.
> >
> >I think that won't be a big problem. E.g., I don't see big problem if
> >we just create all these vIOMMUs along with machine init. Is there?
> >
> 
> How would you assign them do devices?
> 
> The "normal" QEMU cmd line would look like I mentioned earlier:
> 
> -device iommu,id=iommu1 -device pcie-root,iommu=iommu1 \
> -device pcie-root \
> -device iommu,id=iommu2 -device pcie-root,iommu=iommu2 \
> 
> How do you propose to do it otherwise?

I totally agree that we can use the way above when we wants to bind
device with specific vIOMMU device. I was just wondering whether we
can move the init of "-device intel-iommu,..." to machine init phase,
no matter whether it's the vIOMMU on the root complex or not.

> 
> >IMHO we can just pick these vIOMMU "devices" out of the device list,
> >and we can avoid doing that again in general device_init_func().
> >
> >>
> >>The device re-ordering is not 2.9 material of course, and we need your patch.
> >>The only thing we can do better is to take out the "vfio-pci" in another header
> >>file and and have it in a array of devices that should be checked
> >>(in order to avoid polluting the IOMMU code with a not related device)
> >>
> >>I can try and send a patch for it if you prefer.
> >
> >IMHO it'll be okay we "pollute" vtd code for a short while. We can
> >revert my patch (or any workarounds, like Jason's just posted one)
> 
> Can you please send me a link to Jason's patch?

Sorry for the unclearness! This one:

  https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg07844.html

I just posted v3 for this patch to co-op with Jason's patch.

Thanks,

-- peterx
diff mbox

Patch

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 22d8226..b723ece 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2560,6 +2560,24 @@  static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
     return true;
 }
 
+/*
+ * TODO: we should have a better way to achieve the ordering rather
+ * than this misorder check explicitly against vfio-pci. After all, no
+ * one should be blamed for this, and vfio-pci did nothing wrong.
+ */
+static bool vtd_detected_misorder_init(Error **errp)
+{
+    Object *dev = object_resolve_path_type("", "vfio-pci", NULL);
+
+    if (dev) {
+        error_setg(errp, "Please specify \"intel-iommu\" before all the rest "
+                   "of the devices.");
+        return true;
+    }
+
+    return false;
+}
+
 static void vtd_realize(DeviceState *dev, Error **errp)
 {
     PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
@@ -2567,6 +2585,10 @@  static void vtd_realize(DeviceState *dev, Error **errp)
     IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev);
     X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(dev);
 
+    if (vtd_detected_misorder_init(errp)) {
+        return;
+    }
+
     VTD_DPRINTF(GENERAL, "");
     x86_iommu->type = TYPE_INTEL;