diff mbox

Add Xen platform PCI device version 2.

Message ID 1371634333-23565-1-git-send-email-paul.durrant@citrix.com
State New
Headers show

Commit Message

Paul Durrant June 19, 2013, 9:32 a.m. UTC
The XenServer 6.1+ Citrix Windows PV bus driver binds to a new version
of the Xen platform device (since the newer driver set cannot co-exist
with previous drivers which bind to the existing "xen-platform" type of
device). This patch introduces a new "xen-platform-2" device type with
the appropriate device_id and revision.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
 hw/xen/xen_platform.c    |   75 ++++++++++++++++++++++++++++++++++++++--------
 include/hw/pci/pci_ids.h |    1 +
 2 files changed, 63 insertions(+), 13 deletions(-)

Comments

Ian Campbell June 19, 2013, 9:42 a.m. UTC | #1
On Wed, 2013-06-19 at 10:32 +0100, Paul Durrant wrote:
> The XenServer 6.1+ Citrix Windows PV bus driver binds to a new version
> of the Xen platform device (since the newer driver set cannot co-exist
> with previous drivers which bind to the existing "xen-platform" type of
> device). This patch introduces a new "xen-platform-2" device type with
> the appropriate device_id and revision.

What is the difference between these two devices?

Ian.
Paul Durrant June 19, 2013, 9:43 a.m. UTC | #2
> -----Original Message-----

> From: Ian Campbell

> Sent: 19 June 2013 10:42

> To: Paul Durrant

> Cc: qemu-devel@nongnu.org; xen-devel@lists.xen.org

> Subject: Re: [Xen-devel] [PATCH] Add Xen platform PCI device version 2.

> 

> On Wed, 2013-06-19 at 10:32 +0100, Paul Durrant wrote:

> > The XenServer 6.1+ Citrix Windows PV bus driver binds to a new version

> > of the Xen platform device (since the newer driver set cannot co-exist

> > with previous drivers which bind to the existing "xen-platform" type of

> > device). This patch introduces a new "xen-platform-2" device type with

> > the appropriate device_id and revision.

> 

> What is the difference between these two devices?

> 


As the comment implies, the device_id (2 rather than 1) and the revision (2 rather than 1).

  Paul
Ian Campbell June 19, 2013, 10:07 a.m. UTC | #3
On Wed, 2013-06-19 at 10:43 +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Ian Campbell
> > Sent: 19 June 2013 10:42
> > To: Paul Durrant
> > Cc: qemu-devel@nongnu.org; xen-devel@lists.xen.org
> > Subject: Re: [Xen-devel] [PATCH] Add Xen platform PCI device version 2.
> > 
> > On Wed, 2013-06-19 at 10:32 +0100, Paul Durrant wrote:
> > > The XenServer 6.1+ Citrix Windows PV bus driver binds to a new version
> > > of the Xen platform device (since the newer driver set cannot co-exist
> > > with previous drivers which bind to the existing "xen-platform" type of
> > > device). This patch introduces a new "xen-platform-2" device type with
> > > the appropriate device_id and revision.
> > 
> > What is the difference between these two devices?
> > 
> 
> As the comment implies, the device_id (2 rather than 1) and the revision (2 rather than 1).

So what is the point of it?

Changing these IDs won't work for any other existing PV drivers, it will
break the Linux PVHVM ones and the BSD ones etc.

We obviously can't say to users "Are you running Windows and are you
running PV drivers >= X.Y, if so set lever A to position B, otherwise if
you are running some other OS or an earlier version of the Windows PV
driver set it to position A".

It sounds to me as if this is a hack to workaround some shortcoming of
the Windows driver model. If this is the case then I'm afraid you are
going to have to find a way to deal with this from within Windows and/or
your drivers. Failing that then I think you'll just have to document an
upgrade path for the users of your older drivers. Pushing the pain onto
the entire world is just not the right way to go about this.

Ian.
Paul Durrant June 19, 2013, 10:13 a.m. UTC | #4
> -----Original Message-----

> From: Ian Campbell

> Sent: 19 June 2013 11:08

> To: Paul Durrant

> Cc: qemu-devel@nongnu.org; xen-devel@lists.xen.org

> Subject: Re: [Xen-devel] [PATCH] Add Xen platform PCI device version 2.

> 

> On Wed, 2013-06-19 at 10:43 +0100, Paul Durrant wrote:

> > > -----Original Message-----

> > > From: Ian Campbell

> > > Sent: 19 June 2013 10:42

> > > To: Paul Durrant

> > > Cc: qemu-devel@nongnu.org; xen-devel@lists.xen.org

> > > Subject: Re: [Xen-devel] [PATCH] Add Xen platform PCI device version 2.

> > >

> > > On Wed, 2013-06-19 at 10:32 +0100, Paul Durrant wrote:

> > > > The XenServer 6.1+ Citrix Windows PV bus driver binds to a new version

> > > > of the Xen platform device (since the newer driver set cannot co-exist

> > > > with previous drivers which bind to the existing "xen-platform" type of

> > > > device). This patch introduces a new "xen-platform-2" device type with

> > > > the appropriate device_id and revision.

> > >

> > > What is the difference between these two devices?

> > >

> >

> > As the comment implies, the device_id (2 rather than 1) and the revision (2

> rather than 1).

> 

> So what is the point of it?

> 

> Changing these IDs won't work for any other existing PV drivers, it will

> break the Linux PVHVM ones and the BSD ones etc.

> 

> We obviously can't say to users "Are you running Windows and are you

> running PV drivers >= X.Y, if so set lever A to position B, otherwise if

> you are running some other OS or an earlier version of the Windows PV

> driver set it to position A".

> 


Why not? The device can be chosen on a per-VM basis.

> It sounds to me as if this is a hack to workaround some shortcoming of

> the Windows driver model. If this is the case then I'm afraid you are

> going to have to find a way to deal with this from within Windows and/or

> your drivers. Failing that then I think you'll just have to document an

> upgrade path for the users of your older drivers. Pushing the pain onto

> the entire world is just not the right way to go about this.

> 


After many months of worrying about this I had to conclude that there really is no way round this. If we bind newer drivers to the old device id and then hope to push them to Windows Update; much chaos, pain and anguish would occur. The only option is a new device.
Is it such a bad idea to be able to support multiple HVM VM configurations at the same time?

  Paul
Tim Deegan June 19, 2013, 10:36 a.m. UTC | #5
At 11:07 +0100 on 19 Jun (1371640052), Ian Campbell wrote:
> On Wed, 2013-06-19 at 10:43 +0100, Paul Durrant wrote:
> > > -----Original Message-----
> > > From: Ian Campbell
> > > Sent: 19 June 2013 10:42
> > > To: Paul Durrant
> > > Cc: qemu-devel@nongnu.org; xen-devel@lists.xen.org
> > > Subject: Re: [Xen-devel] [PATCH] Add Xen platform PCI device version 2.
> > > 
> > > On Wed, 2013-06-19 at 10:32 +0100, Paul Durrant wrote:
> > > > The XenServer 6.1+ Citrix Windows PV bus driver binds to a new version
> > > > of the Xen platform device (since the newer driver set cannot co-exist
> > > > with previous drivers which bind to the existing "xen-platform" type of
> > > > device). This patch introduces a new "xen-platform-2" device type with
> > > > the appropriate device_id and revision.
> 
> We obviously can't say to users "Are you running Windows and are you
> running PV drivers >= X.Y, if so set lever A to position B, otherwise if
> you are running some other OS or an earlier version of the Windows PV
> driver set it to position A".

Agreed - that's a horrible interface, and an egregious layer violation.
The VM's admin should be able to install/upgrade software without
involving the host admin.

Can you explain a bit more about why this is needed?  Presumably 'real'
device manufacturers can ship new versions of their drivers through
Windows Update without needing to rev their hardware or get the user to
change BIOS settings.

Tim.
Alex Bligh June 19, 2013, 10:42 a.m. UTC | #6
--On 19 June 2013 10:13:17 +0000 Paul Durrant <Paul.Durrant@citrix.com> 
wrote:

>> We obviously can't say to users "Are you running Windows and are you
>> running PV drivers >= X.Y, if so set lever A to position B, otherwise if
>> you are running some other OS or an earlier version of the Windows PV
>> driver set it to position A".
>
> Why not? The device can be chosen on a per-VM basis.

Not everyone knows what guest some random user will be running
(consider cloud platforms).

Expecting an admin to introspect the guest is not a good plan.
Having the same set of levers that work at least reasonably with
everything is the way to go.
Ian Campbell June 19, 2013, 10:43 a.m. UTC | #7
On Wed, 2013-06-19 at 11:13 +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Ian Campbell
> > Sent: 19 June 2013 11:08
> > To: Paul Durrant
> > Cc: qemu-devel@nongnu.org; xen-devel@lists.xen.org
> > Subject: Re: [Xen-devel] [PATCH] Add Xen platform PCI device version 2.
> > 
> > On Wed, 2013-06-19 at 10:43 +0100, Paul Durrant wrote:
> > > > -----Original Message-----
> > > > From: Ian Campbell
> > > > Sent: 19 June 2013 10:42
> > > > To: Paul Durrant
> > > > Cc: qemu-devel@nongnu.org; xen-devel@lists.xen.org
> > > > Subject: Re: [Xen-devel] [PATCH] Add Xen platform PCI device version 2.
> > > >
> > > > On Wed, 2013-06-19 at 10:32 +0100, Paul Durrant wrote:
> > > > > The XenServer 6.1+ Citrix Windows PV bus driver binds to a new version
> > > > > of the Xen platform device (since the newer driver set cannot co-exist
> > > > > with previous drivers which bind to the existing "xen-platform" type of
> > > > > device). This patch introduces a new "xen-platform-2" device type with
> > > > > the appropriate device_id and revision.
> > > >
> > > > What is the difference between these two devices?
> > > >
> > >
> > > As the comment implies, the device_id (2 rather than 1) and the revision (2
> > rather than 1).
> > 
> > So what is the point of it?
> > 
> > Changing these IDs won't work for any other existing PV drivers, it will
> > break the Linux PVHVM ones and the BSD ones etc.
> > 
> > We obviously can't say to users "Are you running Windows and are you
> > running PV drivers >= X.Y, if so set lever A to position B, otherwise if
> > you are running some other OS or an earlier version of the Windows PV
> > driver set it to position A".
> > 
> 
> Why not? The device can be chosen on a per-VM basis.

Pushing this decision onto the users is not the right design IMHO.

> > It sounds to me as if this is a hack to workaround some shortcoming of
> > the Windows driver model. If this is the case then I'm afraid you are
> > going to have to find a way to deal with this from within Windows and/or
> > your drivers. Failing that then I think you'll just have to document an
> > upgrade path for the users of your older drivers. Pushing the pain onto
> > the entire world is just not the right way to go about this.
> > 
> 
> After many months of worrying about this I had to conclude that there really is no way round this.

Then it'll have to be documentation, "remove these old drivers before
upgrading or enabling Windows update".

Perhaps blacklisting the old drivers via the I/O port protocol might
help, depending on how Windows reacts.

> Is it such a bad idea to be able to support multiple HVM VM
> configurations at the same time?

This isn't a new configuration, it is exactly the same as the old
configuration except it happens to trigger different behaviour in some
particular OSes driver loading code.

Sorry, I don't think this approach is going to fly.

Ian.
James Harper June 19, 2013, 10:54 a.m. UTC | #8
> 
> > -----Original Message-----
> > From: Ian Campbell
> > Sent: 19 June 2013 10:42
> > To: Paul Durrant
> > Cc: qemu-devel@nongnu.org; xen-devel@lists.xen.org
> > Subject: Re: [Xen-devel] [PATCH] Add Xen platform PCI device version 2.
> >
> > On Wed, 2013-06-19 at 10:32 +0100, Paul Durrant wrote:
> > > The XenServer 6.1+ Citrix Windows PV bus driver binds to a new version
> > > of the Xen platform device (since the newer driver set cannot co-exist
> > > with previous drivers which bind to the existing "xen-platform" type of
> > > device). This patch introduces a new "xen-platform-2" device type with
> > > the appropriate device_id and revision.
> >
> > What is the difference between these two devices?
> >
> 
> As the comment implies, the device_id (2 rather than 1) and the revision (2
> rather than 1).
> 

Are the new drivers not backwards compatible?

Do any of the new driver names conflict with the old names? If not and if the bus driver in turn enumerates devices with different names than the old then the two should be able to exist side-by-side with only one being active depending on what the vm is configured with. 

James
Paul Durrant June 19, 2013, 11:23 a.m. UTC | #9
> -----Original Message-----
> From: James Harper [mailto:james.harper@bendigoit.com.au]
> Sent: 19 June 2013 11:55
> To: Paul Durrant; Ian Campbell
> Cc: qemu-devel@nongnu.org; xen-devel@lists.xen.org
> Subject: RE: [Xen-devel] [PATCH] Add Xen platform PCI device version 2.
> 
> >
> > > -----Original Message-----
> > > From: Ian Campbell
> > > Sent: 19 June 2013 10:42
> > > To: Paul Durrant
> > > Cc: qemu-devel@nongnu.org; xen-devel@lists.xen.org
> > > Subject: Re: [Xen-devel] [PATCH] Add Xen platform PCI device version 2.
> > >
> > > On Wed, 2013-06-19 at 10:32 +0100, Paul Durrant wrote:
> > > > The XenServer 6.1+ Citrix Windows PV bus driver binds to a new version
> > > > of the Xen platform device (since the newer driver set cannot co-exist
> > > > with previous drivers which bind to the existing "xen-platform" type of
> > > > device). This patch introduces a new "xen-platform-2" device type with
> > > > the appropriate device_id and revision.
> > >
> > > What is the difference between these two devices?
> > >
> >
> > As the comment implies, the device_id (2 rather than 1) and the revision (2
> > rather than 1).
> >
> 
> Are the new drivers not backwards compatible?
> 
> Do any of the new driver names conflict with the old names? If not and if the
> bus driver in turn enumerates devices with different names than the old
> then the two should be able to exist side-by-side with only one being active
> depending on what the vm is configured with.
> 

The problem is that the old Citrix PV drivers bind their version of xenvbd directly to the platform device (id=1). The new PV drivers bind their xenbus driver to their platform device, because to go onto Windows Update you cannot have an installer and therefore cannot bind to a root enumerated node. Now, if I post  a version of xenbus onto Windows Update that binds to id=1 then I cannot prevent VMs installed with the old PV drivers from downloading the new version of xenbus and using that (since it is newer) in preference to their version of xenvbd. The result would then be a BSOD 0x7B after reboot because the system disk has vanished.

The issue going forward is that the only way of controlling driver download from Windows Update is by OS version and device binding. A PCI device binding only consists of vendor id, device id, subsystem vendor id, subsystem device id and revision. So if we want to control which version of the PV bus driver is applied to a VM, for a given OS, then we must vary one of those things.

  Paul
Paul Durrant June 19, 2013, 11:31 a.m. UTC | #10
> -----Original Message-----
> From: qemu-devel-bounces+paul.durrant=citrix.com@nongnu.org
> [mailto:qemu-devel-bounces+paul.durrant=citrix.com@nongnu.org] On
> Behalf Of Tim Deegan
> Sent: 19 June 2013 11:36
> To: Ian Campbell
> Cc: Paul Durrant; qemu-devel@nongnu.org; xen-devel@lists.xen.org
> Subject: Re: [Qemu-devel] [Xen-devel] [PATCH] Add Xen platform PCI device
> version 2.
> 
> At 11:07 +0100 on 19 Jun (1371640052), Ian Campbell wrote:
> > On Wed, 2013-06-19 at 10:43 +0100, Paul Durrant wrote:
> > > > -----Original Message-----
> > > > From: Ian Campbell
> > > > Sent: 19 June 2013 10:42
> > > > To: Paul Durrant
> > > > Cc: qemu-devel@nongnu.org; xen-devel@lists.xen.org
> > > > Subject: Re: [Xen-devel] [PATCH] Add Xen platform PCI device version
> 2.
> > > >
> > > > On Wed, 2013-06-19 at 10:32 +0100, Paul Durrant wrote:
> > > > > The XenServer 6.1+ Citrix Windows PV bus driver binds to a new
> version
> > > > > of the Xen platform device (since the newer driver set cannot co-exist
> > > > > with previous drivers which bind to the existing "xen-platform" type
> of
> > > > > device). This patch introduces a new "xen-platform-2" device type
> with
> > > > > the appropriate device_id and revision.
> >
> > We obviously can't say to users "Are you running Windows and are you
> > running PV drivers >= X.Y, if so set lever A to position B, otherwise if
> > you are running some other OS or an earlier version of the Windows PV
> > driver set it to position A".
> 
> Agreed - that's a horrible interface, and an egregious layer violation.
> The VM's admin should be able to install/upgrade software without
> involving the host admin.
> 
> Can you explain a bit more about why this is needed?  Presumably 'real'
> device manufacturers can ship new versions of their drivers through
> Windows Update without needing to rev their hardware or get the user to
> change BIOS settings.
> 

That's correct. If a vendor wishes to ship a new driver for an existing piece of h/w they just post it. However, at some point the vendor will sell a new piece of h/w which requires a driver that will not work with older h/w. So, they update their device id or revision to something new, leave the old driver on windows update alone, and post a new driver that will only bind to the new h/w.
This is model we have followed in XenServer: the platform device represents 'the set of PV drivers' and therefore to ship a new and non-backwards-compatible set of PV drivers we incremented the platform device id. That fits with the way that Windows update works and seems like a totally reasonable way to think of the platform device IMO. Clearly my opinion, and the reality of Windows Update, are somewhat contentious :-(

  Paul
Paul Durrant June 19, 2013, 11:35 a.m. UTC | #11
> -----Original Message-----
> From: Alex Bligh [mailto:alex@alex.org.uk]
> Sent: 19 June 2013 11:42
> To: Paul Durrant; Ian Campbell
> Cc: qemu-devel@nongnu.org; xen-devel@lists.xen.org; Alex Bligh
> Subject: Re: [Qemu-devel] [Xen-devel] [PATCH] Add Xen platform PCI device
> version 2.
> 
> 
> 
> --On 19 June 2013 10:13:17 +0000 Paul Durrant <Paul.Durrant@citrix.com>
> wrote:
> 
> >> We obviously can't say to users "Are you running Windows and are you
> >> running PV drivers >= X.Y, if so set lever A to position B, otherwise if
> >> you are running some other OS or an earlier version of the Windows PV
> >> driver set it to position A".
> >
> > Why not? The device can be chosen on a per-VM basis.
> 
> Not everyone knows what guest some random user will be running
> (consider cloud platforms).
> 
> Expecting an admin to introspect the guest is not a good plan.
> Having the same set of levers that work at least reasonably with
> everything is the way to go.
> 

If I can get agreement on the need for a new device (id/revision) then it would be possible to create both the new and the old device. If both were present then the admin doesn't need to know what the guest is. For instance, linux PVonHVM drivers should just continue to work as is. The only visible thing in the guest would be a device for which there was no driver.

  Paul
Paul Durrant June 19, 2013, 11:40 a.m. UTC | #12
> -----Original Message-----
> From: James Harper [mailto:james.harper@bendigoit.com.au]
> Sent: 19 June 2013 11:55
> To: Paul Durrant; Ian Campbell
> Cc: qemu-devel@nongnu.org; xen-devel@lists.xen.org
> Subject: RE: [Xen-devel] [PATCH] Add Xen platform PCI device version 2.
> 
> >
> > > -----Original Message-----
> > > From: Ian Campbell
> > > Sent: 19 June 2013 10:42
> > > To: Paul Durrant
> > > Cc: qemu-devel@nongnu.org; xen-devel@lists.xen.org
> > > Subject: Re: [Xen-devel] [PATCH] Add Xen platform PCI device version 2.
> > >
> > > On Wed, 2013-06-19 at 10:32 +0100, Paul Durrant wrote:
> > > > The XenServer 6.1+ Citrix Windows PV bus driver binds to a new version
> > > > of the Xen platform device (since the newer driver set cannot co-exist
> > > > with previous drivers which bind to the existing "xen-platform" type of
> > > > device). This patch introduces a new "xen-platform-2" device type with
> > > > the appropriate device_id and revision.
> > >
> > > What is the difference between these two devices?
> > >
> >
> > As the comment implies, the device_id (2 rather than 1) and the revision (2
> > rather than 1).
> >
> 
> Are the new drivers not backwards compatible?
> 

No, the new drivers are not backward compatible (as I've explained in other mails) because of the way the old ones used the PCI device. It was an unfortunate choice but it was made many years ago - probably for good reasons at the time - and we now have to live with it.

  Paul

> Do any of the new driver names conflict with the old names? If not and if the
> bus driver in turn enumerates devices with different names than the old
> then the two should be able to exist side-by-side with only one being active
> depending on what the vm is configured with.
> 
> James
Tim Deegan June 19, 2013, 3:46 p.m. UTC | #13
At 11:23 +0000 on 19 Jun (1371641017), Paul Durrant wrote:
> The problem is that the old Citrix PV drivers bind their version of
> xenvbd directly to the platform device (id=1). The new PV drivers bind
> their xenbus driver to their platform device, because to go onto
> Windows Update you cannot have an installer and therefore cannot bind
> to a root enumerated node. Now, if I post a version of xenbus onto
> Windows Update that binds to id=1 then I cannot prevent VMs installed
> with the old PV drivers from downloading the new version of xenbus and
> using that (since it is newer)

Is there room in the versioning to make sure that the drivers you push
via Windows Update are always 'older' than any other PV drivers
currently in circulation?

Tim.
Paul Durrant June 19, 2013, 4:03 p.m. UTC | #14
> -----Original Message-----
> From: Tim Deegan [mailto:tim@xen.org]
> Sent: 19 June 2013 16:46
> To: Paul Durrant
> Cc: James Harper; Ian Campbell; qemu-devel@nongnu.org; xen-
> devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH] Add Xen platform PCI device version 2.
> 
> At 11:23 +0000 on 19 Jun (1371641017), Paul Durrant wrote:
> > The problem is that the old Citrix PV drivers bind their version of
> > xenvbd directly to the platform device (id=1). The new PV drivers bind
> > their xenbus driver to their platform device, because to go onto
> > Windows Update you cannot have an installer and therefore cannot bind
> > to a root enumerated node. Now, if I post a version of xenbus onto
> > Windows Update that binds to id=1 then I cannot prevent VMs installed
> > with the old PV drivers from downloading the new version of xenbus and
> > using that (since it is newer)
> 
> Is there room in the versioning to make sure that the drivers you push
> via Windows Update are always 'older' than any other PV drivers
> currently in circulation?
> 

The scoring rules are quite complex and I have a feeling that Windows will always select a newer (by date) version of a driver. I'd need to check.

  Paul
Paolo Bonzini June 19, 2013, 4:40 p.m. UTC | #15
Il 19/06/2013 13:31, Paul Durrant ha scritto:
> This is model we have followed in XenServer: the platform device
> represents 'the set of PV drivers' and therefore to ship a new and
> non-backwards-compatible set of PV drivers we incremented the
> platform device id.

But what is exactly the incompatibility about?

Paolo
Matt Wilson June 19, 2013, 6:21 p.m. UTC | #16
On Wed, Jun 19, 2013 at 11:42:06AM +0100, Alex Bligh wrote:
>
> --On 19 June 2013 10:13:17 +0000 Paul Durrant
> <Paul.Durrant@citrix.com> wrote:
> 
> >>We obviously can't say to users "Are you running Windows and are you
> >>running PV drivers >= X.Y, if so set lever A to position B, otherwise if
> >>you are running some other OS or an earlier version of the Windows PV
> >>driver set it to position A".
> >
> >Why not? The device can be chosen on a per-VM basis.
> 
> Not everyone knows what guest some random user will be running
> (consider cloud platforms).

I agree. If this is really the only solution, we would need to have
both versions presented to the guest so that old drivers continue to
work without any intervention.

Matt

> Expecting an admin to introspect the guest is not a good plan.
> Having the same set of levers that work at least reasonably with
> everything is the way to go.
>
Tim Deegan June 19, 2013, 8:15 p.m. UTC | #17
At 11:21 -0700 on 19 Jun (1371640904), Matt Wilson wrote:
> On Wed, Jun 19, 2013 at 11:42:06AM +0100, Alex Bligh wrote:
> >
> > --On 19 June 2013 10:13:17 +0000 Paul Durrant
> > <Paul.Durrant@citrix.com> wrote:
> > 
> > >>We obviously can't say to users "Are you running Windows and are you
> > >>running PV drivers >= X.Y, if so set lever A to position B, otherwise if
> > >>you are running some other OS or an earlier version of the Windows PV
> > >>driver set it to position A".
> > >
> > >Why not? The device can be chosen on a per-VM basis.
> > 
> > Not everyone knows what guest some random user will be running
> > (consider cloud platforms).
> 
> I agree. If this is really the only solution, we would need to have
> both versions presented to the guest so that old drivers continue to
> work without any intervention.

I suspect that if we expose both, both sets of drivers try to run the
same PV connections, and hilarity ensues.

Tim.
Paul Durrant June 20, 2013, 7:47 a.m. UTC | #18
> -----Original Message-----
> From: Tim Deegan [mailto:tim@xen.org]
> Sent: 19 June 2013 21:15
> To: Matt Wilson
> Cc: Alex Bligh; Paul Durrant; xen-devel@lists.xen.org; Ian Campbell; qemu-
> devel@nongnu.org
> Subject: Re: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device
> version 2.
> 
> At 11:21 -0700 on 19 Jun (1371640904), Matt Wilson wrote:
> > On Wed, Jun 19, 2013 at 11:42:06AM +0100, Alex Bligh wrote:
> > >
> > > --On 19 June 2013 10:13:17 +0000 Paul Durrant
> > > <Paul.Durrant@citrix.com> wrote:
> > >
> > > >>We obviously can't say to users "Are you running Windows and are you
> > > >>running PV drivers >= X.Y, if so set lever A to position B, otherwise if
> > > >>you are running some other OS or an earlier version of the Windows
> PV
> > > >>driver set it to position A".
> > > >
> > > >Why not? The device can be chosen on a per-VM basis.
> > >
> > > Not everyone knows what guest some random user will be running
> > > (consider cloud platforms).
> >
> > I agree. If this is really the only solution, we would need to have
> > both versions presented to the guest so that old drivers continue to
> > work without any intervention.
> 
> I suspect that if we expose both, both sets of drivers try to run the
> same PV connections, and hilarity ensues.
> 

Actually I think I can make that work, and it is the conclusion I came to after Alex's comment. I'll create a new patch which introduces a new device, let's call it citrix-pv-bus or somesuch, which will have the necessary device id and revision and will be a dedicate device purely for the Citrix PV drivers. Then, if someone wants to create a VM which will be able use Citrix PV drivers they add this device to their config but leave all other aspects of the config unchanged, thus not precluding using that VM with any drivers that bind to the xen platform device.
If someone has a VM that has the old Citrix drivers installed, or GPLPV, I think I should be able to spot this and make sure that the new bus driver quiesces itself to prevent strangeness ensuing. If and when said previous drivers are un-installed then the new bus driver can wake up and enumerate the device nodes for the other pv drivers and Windows Update can carry on doing its stuff.

  Paul
Alex Bligh June 20, 2013, 8:08 a.m. UTC | #19
--On 20 June 2013 07:47:12 +0000 Paul Durrant <Paul.Durrant@citrix.com> 
wrote:

> If someone has a VM that has the old Citrix drivers installed, or GPLPV,
> I think I should be able to spot this and make sure that the new bus
> driver quiesces itself to prevent strangeness ensuing. If and when said
> previous drivers are un-installed then the new bus driver can wake up and
> enumerate the device nodes for the other pv drivers and Windows Update
> can carry on doing its stuff.

I have no clue about Windows device drivers, so this may be a silly
suggestion. If your suggestion above already requires a Xen code change,
one possibility might be copy the idea behind the PCI unplug logic. Either
if the new PCI device is used, it could unplug the old one, or vice versa.
Drivers magically unplugging themselves may not be ideal, but it beats
having 2 drivers fighting over the same device.
Paul Durrant June 20, 2013, 8:19 a.m. UTC | #20
> -----Original Message-----
> From: Alex Bligh [mailto:alex@alex.org.uk]
> Sent: 20 June 2013 09:09
> To: Paul Durrant; Tim (Xen.org); Matt Wilson
> Cc: xen-devel@lists.xen.org; Ian Campbell; qemu-devel@nongnu.org; Alex
> Bligh
> Subject: RE: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device
> version 2.
> 
> 
> 
> --On 20 June 2013 07:47:12 +0000 Paul Durrant <Paul.Durrant@citrix.com>
> wrote:
> 
> > If someone has a VM that has the old Citrix drivers installed, or GPLPV,
> > I think I should be able to spot this and make sure that the new bus
> > driver quiesces itself to prevent strangeness ensuing. If and when said
> > previous drivers are un-installed then the new bus driver can wake up and
> > enumerate the device nodes for the other pv drivers and Windows Update
> > can carry on doing its stuff.
> 
> I have no clue about Windows device drivers, so this may be a silly
> suggestion. If your suggestion above already requires a Xen code change,
> one possibility might be copy the idea behind the PCI unplug logic. Either
> if the new PCI device is used, it could unplug the old one, or vice versa.
> Drivers magically unplugging themselves may not be ideal, but it beats
> having 2 drivers fighting over the same device.
> 

Unfortunately, whilst it sounds good on the face of it, it's not as straightforward as that. The old Citrix PV drivers did not just bind to the Xen platform device, and make that device go away automagically would actually cause the system disk to disappear without any clean fallback to emulation.
As long as nothing actually breaks if and when Windows fetches the new PV bus driver from Windows Update then we can document the need to manually uninstall any other PV drivers.

  Paul
Tim Deegan June 20, 2013, 8:56 a.m. UTC | #21
At 07:47 +0000 on 20 Jun (1371714432), Paul Durrant wrote:
> > > I agree. If this is really the only solution, we would need to have
> > > both versions presented to the guest so that old drivers continue to
> > > work without any intervention.
> > 
> > I suspect that if we expose both, both sets of drivers try to run the
> > same PV connections, and hilarity ensues.
> > 
> 
> Actually I think I can make that work, and it is the conclusion I came
> to after Alex's comment.

Ah, nice!  In that case, I'm a lot less worried -- we can just expose
both versions/devices by default and there's no need for a visible
control knob tied to driver version (except maybe for debugging).

It means an 'unsupported' device appearing on other/older OSes, which is
unfortunate, but ISTR only Windows really complains visibly about that.

Tim.
Paul Durrant June 20, 2013, 9:25 a.m. UTC | #22
> -----Original Message-----
> From: Tim Deegan [mailto:tim@xen.org]
> Sent: 20 June 2013 09:56
> To: Paul Durrant
> Cc: Matt Wilson; Alex Bligh; xen-devel@lists.xen.org; Ian Campbell; qemu-
> devel@nongnu.org
> Subject: Re: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device
> version 2.
> 
> At 07:47 +0000 on 20 Jun (1371714432), Paul Durrant wrote:
> > > > I agree. If this is really the only solution, we would need to have
> > > > both versions presented to the guest so that old drivers continue to
> > > > work without any intervention.
> > >
> > > I suspect that if we expose both, both sets of drivers try to run the
> > > same PV connections, and hilarity ensues.
> > >
> >
> > Actually I think I can make that work, and it is the conclusion I came
> > to after Alex's comment.
> 
> Ah, nice!  In that case, I'm a lot less worried -- we can just expose
> both versions/devices by default and there's no need for a visible
> control knob tied to driver version (except maybe for debugging).
> 
> It means an 'unsupported' device appearing on other/older OSes, which is
> unfortunate, but ISTR only Windows really complains visibly about that.
> 

Yes, I think only Windows complains and we should be able to post an article somewhere saying 'don't worry about it' :-)

  Paul
Anthony Liguori June 20, 2013, 12:17 p.m. UTC | #23
Paul Durrant <paul.durrant@citrix.com> writes:

> The XenServer 6.1+ Citrix Windows PV bus driver binds to a new version
> of the Xen platform device (since the newer driver set cannot co-exist
> with previous drivers which bind to the existing "xen-platform" type of
> device). This patch introduces a new "xen-platform-2" device type with
> the appropriate device_id and revision.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> ---
>  hw/xen/xen_platform.c    |   75 ++++++++++++++++++++++++++++++++++++++--------
>  include/hw/pci/pci_ids.h |    1 +
>  2 files changed, 63 insertions(+), 13 deletions(-)
>
> diff --git a/hw/xen/xen_platform.c b/hw/xen/xen_platform.c
> index b6c6793..6edb850 100644
> --- a/hw/xen/xen_platform.c
> +++ b/hw/xen/xen_platform.c
> @@ -48,6 +48,20 @@
>  
>  #define PFFLAG_ROM_LOCK 1 /* Sets whether ROM memory area is RW or RO */
>  
> +typedef struct {
> +    const char *name;
> +    const char *desc;
> +    uint16_t device_id;
> +    uint8_t revision;
> +    uint16_t subsystem_vendor_id;
> +    uint16_t subsystem_id;
> +} PCIXenPlatformDeviceInfo;
> +
> +typedef struct PCIXenPlatformDeviceClass {
> +    PCIDeviceClass              parent_class;
> +    PCIXenPlatformDeviceInfo    info;
> +} PCIXenPlatformDeviceClass;
> +
>  typedef struct PCIXenPlatformState {
>      PCIDevice  pci_dev;
>      MemoryRegion fixed_io;
> @@ -372,8 +386,13 @@ static const VMStateDescription vmstate_xen_platform = {
>  static int xen_platform_initfn(PCIDevice *dev)
>  {
>      PCIXenPlatformState *d = DO_UPCAST(PCIXenPlatformState, pci_dev, dev);
> +    PCIDeviceClass *k = PCI_DEVICE_GET_CLASS(dev);
> +    __attribute__((unused)) PCIXenPlatformDeviceClass *u;
>      uint8_t *pci_conf;
>  
> +    u = container_of(k, PCIXenPlatformDeviceClass, parent_class);
> +    DPRINTF("initializing %s\n", u->info.name);
> +
>      pci_conf = d->pci_dev.config;
>  
>      pci_set_word(pci_conf + PCI_COMMAND, PCI_COMMAND_IO | PCI_COMMAND_MEMORY);
> @@ -402,33 +421,63 @@ static void platform_reset(DeviceState *dev)
>      platform_fixed_ioport_reset(s);
>  }
>  
> +static PCIXenPlatformDeviceInfo platform_devices[] = {
> +    {
> +        .name = "xen-platform",
> +        .desc = "XEN platform pci device (version 1)",
> +        .device_id = PCI_DEVICE_ID_XEN_PLATFORM,
> +        .revision = 1,
> +        .subsystem_vendor_id = PCI_VENDOR_ID_XEN,
> +        .subsystem_id = PCI_DEVICE_ID_XEN_PLATFORM,
> +    }, {
> +        .name = "xen-platform-2",
> +        .desc = "XEN platform pci device (version 2)",
> +        .device_id = PCI_DEVICE_ID_XEN_PLATFORM_V2,
> +        .revision = 2,
> +        .subsystem_vendor_id = PCI_VENDOR_ID_XEN,
> +        .subsystem_id = PCI_DEVICE_ID_XEN_PLATFORM_V2,
> +    }
> +};
> +
>  static void xen_platform_class_init(ObjectClass *klass, void *data)
>  {
>      DeviceClass *dc = DEVICE_CLASS(klass);
>      PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> +    PCIXenPlatformDeviceInfo *info = data;
> +    PCIXenPlatformDeviceClass *u;
> +
> +    u = container_of(k, PCIXenPlatformDeviceClass, parent_class);
>  
>      k->init = xen_platform_initfn;
>      k->vendor_id = PCI_VENDOR_ID_XEN;
> -    k->device_id = PCI_DEVICE_ID_XEN_PLATFORM;
> +    k->device_id = info->device_id;
>      k->class_id = PCI_CLASS_OTHERS << 8 | 0x80;
> -    k->subsystem_vendor_id = PCI_VENDOR_ID_XEN;
> -    k->subsystem_id = PCI_DEVICE_ID_XEN_PLATFORM;
> -    k->revision = 1;
> -    dc->desc = "XEN platform pci device";
> +    k->subsystem_vendor_id = info->subsystem_vendor_id;
> +    k->subsystem_id = info->subsystem_id;
> +    k->revision = info->revision;
> +    dc->desc = info->desc;
>      dc->reset = platform_reset;
>      dc->vmsd = &vmstate_xen_platform;
> +    u->info = *info;
>  }
>  
> -static const TypeInfo xen_platform_info = {
> -    .name          = "xen-platform",
> -    .parent        = TYPE_PCI_DEVICE,
> -    .instance_size = sizeof(PCIXenPlatformState),
> -    .class_init    = xen_platform_class_init,
> -};
> -
>  static void xen_platform_register_types(void)
>  {
> -    type_register_static(&xen_platform_info);
> +    TypeInfo type_info = {
> +        .parent        = TYPE_PCI_DEVICE,
> +        .instance_size = sizeof(PCIXenPlatformState),
> +        .class_size    = sizeof(PCIXenPlatformDeviceClass),
> +        .class_init    = xen_platform_class_init,
> +    };
> +    int i;
> +    for (i = 0; i < ARRAY_SIZE(platform_devices); i++) {
> +        PCIXenPlatformDeviceInfo *info = &platform_devices[i];
> +
> +        type_info.name = info->name;
> +        type_info.class_data = info;
> +        
> +        type_register(&type_info);
> +    }

I can't tell if this is an RFC or meant a complete patch.  But the
approach you're taking is overly complex.  v2 of the device can just
derive from v1 and in the class_init function change the PCI
information.

Also, if you are going to be adding logic for v2, you should use QOM
cast macros, not container_of.

I don't understand why two devices are required here and the thread
doesn't really answer that either.  Is there a spec for the Xen platform
devices?  Take a look at docs/specs for some examples in the tree.

It certainly helps to have one for discussions like this.

Regards,

Anthony Liguori

>  }
>  
>  type_init(xen_platform_register_types)
> diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
> index d8dc2f1..2039fba 100644
> --- a/include/hw/pci/pci_ids.h
> +++ b/include/hw/pci/pci_ids.h
> @@ -144,6 +144,7 @@
>  
>  #define PCI_VENDOR_ID_XEN               0x5853
>  #define PCI_DEVICE_ID_XEN_PLATFORM      0x0001
> +#define PCI_DEVICE_ID_XEN_PLATFORM_V2   0x0002
>  
>  #define PCI_VENDOR_ID_NEC                0x1033
>  #define PCI_DEVICE_ID_NEC_UPD720200      0x0194
> -- 
> 1.7.10.4
Paul Durrant June 20, 2013, 12:44 p.m. UTC | #24
> -----Original Message-----
> I don't understand why two devices are required here and the thread
> doesn't really answer that either.  Is there a spec for the Xen platform
> devices?  Take a look at docs/specs for some examples in the tree.
> 
> It certainly helps to have one for discussions like this.
> 

Anthony,

  I'm going to take a different approach so please disregard this patch now.

  Paul
Ian Campbell June 26, 2013, 10:38 a.m. UTC | #25
On Wed, 2013-06-19 at 12:31 +0100, Paul Durrant wrote:

> That's correct. If a vendor wishes to ship a new driver for an
> existing piece of h/w they just post it. However, at some point the
> vendor will sell a new piece of h/w which requires a driver that will
> not work with older h/w. So, they update their device id or revision
> to something new, leave the old driver on windows update alone, and
> post a new driver that will only bind to the new h/w.

Right, but that is not analogous to this change. This change is more
akin to the driver folks going to the hardware guys and saying "we've
ended up painted into a bit of corner, the easiest way to get out of it
is for you guys to reissue the exact same hardware (ASIC?) with a
different device id". I can imagine the hardware guys would be thrilled
with that prospect!

> This is model we have followed in XenServer: the platform device
> represents 'the set of PV drivers'

You may have followed that model in XenServer but it has never been the
case for upstream. The platform device simply provides an end point for
enabling/triggering baseline Xen detection/support. The "set of PV
drivers" (in so far as such a thing even exists in the mix and match
world of upstream Xen) is represented via the xenbus entries for the
devices and the associated feature flags and the like.

>  and therefore to ship a new and non-backwards-compatible set of PV
> drivers we incremented the platform device id. That fits with the way
> that Windows update works and seems like a totally reasonable way to
> think of the platform device IMO. Clearly my opinion, and the reality
> of Windows Update, are somewhat contentious :-(

What's contentious IMHO is adding hacks to the Xen (or qemu) code base
to workaround an issue which is entirely internal to the guest Windows
update/driver bindings.

Ian.
Ian Campbell June 26, 2013, 10:39 a.m. UTC | #26
On Thu, 2013-06-20 at 09:56 +0100, Tim Deegan wrote:
> At 07:47 +0000 on 20 Jun (1371714432), Paul Durrant wrote:
> > > > I agree. If this is really the only solution, we would need to have
> > > > both versions presented to the guest so that old drivers continue to
> > > > work without any intervention.
> > > 
> > > I suspect that if we expose both, both sets of drivers try to run the
> > > same PV connections, and hilarity ensues.
> > > 
> > 
> > Actually I think I can make that work, and it is the conclusion I came
> > to after Alex's comment.
> 
> Ah, nice!  In that case, I'm a lot less worried -- we can just expose
> both versions/devices by default and there's no need for a visible
> control knob tied to driver version (except maybe for debugging).
> 
> It means an 'unsupported' device appearing on other/older OSes, which is
> unfortunate, but ISTR only Windows really complains visibly about that.

I'm not at all convinced this is a good approach. Are we going to add a
third, forth and fifth device whenever Linux, BSD, $other-OS paint
themselves into a corner somehow WRT their internal driver model vs
their Xen PV drivers?

AFAIK the Citrix PV drivers have never been formally supported on
anything other than XenServer and XCP (and I'm not sure about "formally"
for XCP), so this is really an issue of supporting upgrades for people
running those. I think rather than making hacks upstream, which will
effectively need to be supported forever, the hack should be done on the
XenServer side and take advantage of whatever the supported upgrade path
is (N+1 or N+2 or whatever). This way the hack can eventually go away.
For anyone who grabbed the older drivers and used them outside of the
context of XenServer or XCP this is a documentation/awareness issue.

Can we use the blacklisting functionality of the PV unplug protocol to
blacklist previous versions of the Citrix PV drivers? I wouldn't
consider this an unsuitable thing to do in upstream, in fact it would be
using it for exactly the purpose for which it was designed. As long as
this is sufficient to boot with emulated devices in order to switch to
the newer drivers that should be good enough.

Ian.
Paul Durrant June 26, 2013, 11:23 a.m. UTC | #27
> -----Original Message-----

> From: Ian Campbell

> Sent: 26 June 2013 11:40

> To: Tim (Xen.org)

> Cc: Paul Durrant; Matt Wilson; Alex Bligh; xen-devel@lists.xen.org; qemu-

> devel@nongnu.org

> Subject: Re: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device

> version 2.

> 

> On Thu, 2013-06-20 at 09:56 +0100, Tim Deegan wrote:

> > At 07:47 +0000 on 20 Jun (1371714432), Paul Durrant wrote:

> > > > > I agree. If this is really the only solution, we would need to have

> > > > > both versions presented to the guest so that old drivers continue to

> > > > > work without any intervention.

> > > >

> > > > I suspect that if we expose both, both sets of drivers try to run the

> > > > same PV connections, and hilarity ensues.

> > > >

> > >

> > > Actually I think I can make that work, and it is the conclusion I came

> > > to after Alex's comment.

> >

> > Ah, nice!  In that case, I'm a lot less worried -- we can just expose

> > both versions/devices by default and there's no need for a visible

> > control knob tied to driver version (except maybe for debugging).

> >

> > It means an 'unsupported' device appearing on other/older OSes, which is

> > unfortunate, but ISTR only Windows really complains visibly about that.

> 

> I'm not at all convinced this is a good approach. Are we going to add a

> third, forth and fifth device whenever Linux, BSD, $other-OS paint

> themselves into a corner somehow WRT their internal driver model vs

> their Xen PV drivers?

> 


Is there any harm in having a separate device for each OS's PV drivers? I agree it's not entirely elegant, but at least it allows for revision control when you need it.

> AFAIK the Citrix PV drivers have never been formally supported on

> anything other than XenServer and XCP (and I'm not sure about "formally"

> for XCP), so this is really an issue of supporting upgrades for people

> running those. I think rather than making hacks upstream, which will

> effectively need to be supported forever, the hack should be done on the

> XenServer side and take advantage of whatever the supported upgrade path

> is (N+1 or N+2 or whatever). This way the hack can eventually go away.

> For anyone who grabbed the older drivers and used them outside of the

> context of XenServer or XCP this is a documentation/awareness issue.

> 

> Can we use the blacklisting functionality of the PV unplug protocol to

> blacklist previous versions of the Citrix PV drivers? I wouldn't

> consider this an unsuitable thing to do in upstream, in fact it would be

> using it for exactly the purpose for which it was designed. As long as

> this is sufficient to boot with emulated devices in order to switch to

> the newer drivers that should be good enough.

> 


We could blacklist all existing Citrix PV drivers in upstream QEMU, to avoid the clash, but that seems like a very unfriendly approach. Also, it's not going to stop someone with an existing VM, who happens to be using legacy Citrix PV drivers (an AWS VM for instance) receiving a driver from Windows Update that will blue-screen their VM on next reboot. Hence the only way forward is to bind the new drivers to something new, that we can control, so we know what driver a VM is going to get from Windows Update. And we may indeed need to modify its revision in future so that we can retire old sets of PV drivers and replace them with new ones, but only for newer XenServer releases. Thus, I also propose to make the PCI revision of the new device a command line parameter.

  Paul
Ian Campbell June 26, 2013, 11:53 a.m. UTC | #28
On Wed, 2013-06-26 at 12:23 +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Ian Campbell
> > Sent: 26 June 2013 11:40
> > To: Tim (Xen.org)
> > Cc: Paul Durrant; Matt Wilson; Alex Bligh; xen-devel@lists.xen.org; qemu-
> > devel@nongnu.org
> > Subject: Re: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device
> > version 2.
> > 
> > On Thu, 2013-06-20 at 09:56 +0100, Tim Deegan wrote:
> > > At 07:47 +0000 on 20 Jun (1371714432), Paul Durrant wrote:
> > > > > > I agree. If this is really the only solution, we would need to have
> > > > > > both versions presented to the guest so that old drivers continue to
> > > > > > work without any intervention.
> > > > >
> > > > > I suspect that if we expose both, both sets of drivers try to run the
> > > > > same PV connections, and hilarity ensues.
> > > > >
> > > >
> > > > Actually I think I can make that work, and it is the conclusion I came
> > > > to after Alex's comment.
> > >
> > > Ah, nice!  In that case, I'm a lot less worried -- we can just expose
> > > both versions/devices by default and there's no need for a visible
> > > control knob tied to driver version (except maybe for debugging).
> > >
> > > It means an 'unsupported' device appearing on other/older OSes, which is
> > > unfortunate, but ISTR only Windows really complains visibly about that.
> > 
> > I'm not at all convinced this is a good approach. Are we going to add a
> > third, forth and fifth device whenever Linux, BSD, $other-OS paint
> > themselves into a corner somehow WRT their internal driver model vs
> > their Xen PV drivers?
> > 
> 
> Is there any harm in having a separate device for each OS's PV
> drivers? I agree it's not entirely elegant, but at least it allows for
> revision control when you need it.
> 
> > AFAIK the Citrix PV drivers have never been formally supported on
> > anything other than XenServer and XCP (and I'm not sure about "formally"
> > for XCP), so this is really an issue of supporting upgrades for people
> > running those. I think rather than making hacks upstream, which will
> > effectively need to be supported forever, the hack should be done on the
> > XenServer side and take advantage of whatever the supported upgrade path
> > is (N+1 or N+2 or whatever). This way the hack can eventually go away.
> > For anyone who grabbed the older drivers and used them outside of the
> > context of XenServer or XCP this is a documentation/awareness issue.
> > 
> > Can we use the blacklisting functionality of the PV unplug protocol to
> > blacklist previous versions of the Citrix PV drivers? I wouldn't
> > consider this an unsuitable thing to do in upstream, in fact it would be
> > using it for exactly the purpose for which it was designed. As long as
> > this is sufficient to boot with emulated devices in order to switch to
> > the newer drivers that should be good enough.
> > 
> 
> We could blacklist all existing Citrix PV drivers in upstream QEMU, to
> avoid the clash, but that seems like a very unfriendly approach. Also,
> it's not going to stop someone with an existing VM, who happens to be
> using legacy Citrix PV drivers (an AWS VM for instance) receiving a
> driver from Windows Update that will blue-screen their VM on next
> reboot. Hence the only way forward is to bind the new drivers to
> something new, that we can control, so we know what driver a VM is
> going to get from Windows Update.

Is it not also possible to issue a new version of the old driver via
Windows update, e.g. a stunt one which just disables itself? Or to have
the new one somehow reach out and nobble the old one. Or have the new
one detect the presence of the old one and refuse to install until it
has been removed?

>  And we may indeed need to modify its revision in future so that we
> can retire old sets of PV drivers and replace them with new ones, but
> only for newer XenServer releases. Thus, I also propose to make the
> PCI revision of the new device a command line parameter.

So this ugliness is really just the thin end of the wedge and all users
of these drivers in the future are going to need to be educated on which
magic qemu option they need to go with the version of the driver they
are running? And when they get an update they might need to go into
their configuration and change it or else risk a blue screen on the next
reboot? That sounds like madness to me...

It may be acceptable for XenServer to tie versions of the PV drivers to
specific versions of XenServer but I'm afraid that is not acceptable
upstream, we can't just go around willy nilly breaking compatibility
between front and backends, this is why we have feature flags,
negotiation and fallbacks.

It would be one thing to accept this unfortunate event and take a one
time hack to dig you out of a hole. But in that case it would really
need to be the case that the new version of the drivers are designed
with sufficient future proofing mechanisms that any future changes can
be handled internally. It seems to me like you intend to treat this
mechanism as an ongoing deliberate mechanism rather than a one off fix
for a historical mistake.

Ian.
Tim Deegan June 26, 2013, 11:57 a.m. UTC | #29
At 11:23 +0000 on 26 Jun (1372245783), Paul Durrant wrote:
>  We could blacklist all existing Citrix PV drivers in upstream QEMU,
> to avoid the clash, but that seems like a very unfriendly
> approach. Also, it's not going to stop someone with an existing VM,
> who happens to be using legacy Citrix PV drivers (an AWS VM for
> instance) receiving a driver from Windows Update that will blue-screen
> their VM on next reboot. Hence the only way forward is to bind the new
> drivers to something new, that we can control, so we know what driver
> a VM is going to get from Windows Update.

I don't think you ever got an answer about whether this could be
finagled using version numbers in the drivers.

Also: would it not be possible to have a blkfront driver (even a trivial
one) in your new bus driver so it can detect and avoid this problem?

Or: have your bus driver detect when the blkfront driver is missing and
not unplug the emulated disk?  In fact wouldn't having the emulated disk
driver do the unplug solve it for free?

> And we may indeed need to
> modify its revision in future so that we can retire old sets of PV
> drivers and replace them with new ones, but only for newer XenServer
> releases. Thus, I also propose to make the PCI revision of the new
> device a command line parameter.

I'd rather not.  This gets straight back to having host-admin controls
that have to manually be matched to in-guest software. 

Tim.
Paul Durrant June 26, 2013, 12:06 p.m. UTC | #30
> -----Original Message-----
> From: Tim Deegan [mailto:tim@xen.org]
> Sent: 26 June 2013 12:58
> To: Paul Durrant
> Cc: Ian Campbell; Matt Wilson; Alex Bligh; xen-devel@lists.xen.org; qemu-
> devel@nongnu.org
> Subject: Re: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device
> version 2.
> 
> At 11:23 +0000 on 26 Jun (1372245783), Paul Durrant wrote:
> >  We could blacklist all existing Citrix PV drivers in upstream QEMU,
> > to avoid the clash, but that seems like a very unfriendly
> > approach. Also, it's not going to stop someone with an existing VM,
> > who happens to be using legacy Citrix PV drivers (an AWS VM for
> > instance) receiving a driver from Windows Update that will blue-screen
> > their VM on next reboot. Hence the only way forward is to bind the new
> > drivers to something new, that we can control, so we know what driver
> > a VM is going to get from Windows Update.
> 
> I don't think you ever got an answer about whether this could be
> finagled using version numbers in the drivers.

No, we thought about that and I don't believe it would be possible.
 
> Also: would it not be possible to have a blkfront driver (even a trivial
> one) in your new bus driver so it can detect and avoid this problem?
> 
> Or: have your bus driver detect when the blkfront driver is missing and
> not unplug the emulated disk?  In fact wouldn't having the emulated disk
> driver do the unplug solve it for free?
>

The issue is the old s/w not the new s/w. The old drivers make assumptions about each other's presence as we can't change that because they are already out there.
 
> > And we may indeed need to
> > modify its revision in future so that we can retire old sets of PV
> > drivers and replace them with new ones, but only for newer XenServer
> > releases. Thus, I also propose to make the PCI revision of the new
> > device a command line parameter.
> 
> I'd rather not.  This gets straight back to having host-admin controls
> that have to manually be matched to in-guest software.
> 

Well not really. This is just the same as a h/w vendor shipping a new device. The drivers for the old device are still there on Windows Update; so no change. The new drivers are for the new device so only download when it is present. Setting the revision number for the 'citrix pv bus' device would only be like choosing which emulated NIC you want.

  Paul
Tim Deegan June 26, 2013, 12:36 p.m. UTC | #31
At 12:06 +0000 on 26 Jun (1372248391), Paul Durrant wrote:
> > -----Original Message-----
> > From: Tim Deegan [mailto:tim@xen.org]
> > Sent: 26 June 2013 12:58
> > To: Paul Durrant
> > Cc: Ian Campbell; Matt Wilson; Alex Bligh; xen-devel@lists.xen.org; qemu-
> > devel@nongnu.org
> > Subject: Re: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device
> > version 2.
> > 
> > At 11:23 +0000 on 26 Jun (1372245783), Paul Durrant wrote:
> > >  We could blacklist all existing Citrix PV drivers in upstream QEMU,
> > > to avoid the clash, but that seems like a very unfriendly
> > > approach. Also, it's not going to stop someone with an existing VM,
> > > who happens to be using legacy Citrix PV drivers (an AWS VM for
> > > instance) receiving a driver from Windows Update that will blue-screen
> > > their VM on next reboot. Hence the only way forward is to bind the new
> > > drivers to something new, that we can control, so we know what driver
> > > a VM is going to get from Windows Update.
> > 
> > I don't think you ever got an answer about whether this could be
> > finagled using version numbers in the drivers.
> 
> No, we thought about that and I don't believe it would be possible.

This doc makes it look like it's just a matter of choosing appropriate
version and dates:
http://msdn.microsoft.com/en-us/library/windows/hardware/gg487453.aspx

> > Also: would it not be possible to have a blkfront driver (even a trivial
> > one) in your new bus driver so it can detect and avoid this problem?
> > 
> > Or: have your bus driver detect when the blkfront driver is missing and
> > not unplug the emulated disk?  In fact wouldn't having the emulated disk
> > driver do the unplug solve it for free?
> 
> The issue is the old s/w not the new s/w. The old drivers make
> assumptions about each other's presence as we can't change that
> because they are already out there.

The issue you mentioned was that the old drivers bound the block driver
to the PCI device, and when your new driver is installed you get STOP 7B
because the system disk is missing (because I guess you'd need the new
xenbus driver to come up before it will trigger installing the new block
driver).

So can the new driver not fix this, either by running a trivial blkfront
itself or by allowing the emulated IDE controller to live?

> > > And we may indeed need to
> > > modify its revision in future so that we can retire old sets of PV
> > > drivers and replace them with new ones, but only for newer XenServer
> > > releases. Thus, I also propose to make the PCI revision of the new
> > > device a command line parameter.
> > 
> > I'd rather not.  This gets straight back to having host-admin controls
> > that have to manually be matched to in-guest software.
> 
> Well not really. This is just the same as a h/w vendor shipping a new
> device.

Well, that would be more like having the PCI revision reflect the Xen
version.  Which might be a reasonable idea, if there is to be a second
PCI device.  

But metaphors aside, it's still requiring an admin change in order to
match the software inside the VM, which as we've seen is unpopular with
admins. :)

Tim.
Paul Durrant June 26, 2013, 1 p.m. UTC | #32
> -----Original Message-----
> From: Tim Deegan [mailto:tim@xen.org]
> Sent: 26 June 2013 13:36
> To: Paul Durrant
> Cc: Ian Campbell; Matt Wilson; Alex Bligh; xen-devel@lists.xen.org; qemu-
> devel@nongnu.org
> Subject: Re: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device
> version 2.
> 
> At 12:06 +0000 on 26 Jun (1372248391), Paul Durrant wrote:
> > > -----Original Message-----
> > > From: Tim Deegan [mailto:tim@xen.org]
> > > Sent: 26 June 2013 12:58
> > > To: Paul Durrant
> > > Cc: Ian Campbell; Matt Wilson; Alex Bligh; xen-devel@lists.xen.org;
> qemu-
> > > devel@nongnu.org
> > > Subject: Re: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI
> device
> > > version 2.
> > >
> > > At 11:23 +0000 on 26 Jun (1372245783), Paul Durrant wrote:
> > > >  We could blacklist all existing Citrix PV drivers in upstream QEMU,
> > > > to avoid the clash, but that seems like a very unfriendly
> > > > approach. Also, it's not going to stop someone with an existing VM,
> > > > who happens to be using legacy Citrix PV drivers (an AWS VM for
> > > > instance) receiving a driver from Windows Update that will blue-screen
> > > > their VM on next reboot. Hence the only way forward is to bind the
> new
> > > > drivers to something new, that we can control, so we know what driver
> > > > a VM is going to get from Windows Update.
> > >
> > > I don't think you ever got an answer about whether this could be
> > > finagled using version numbers in the drivers.
> >
> > No, we thought about that and I don't believe it would be possible.
> 
> This doc makes it look like it's just a matter of choosing appropriate
> version and dates:
> http://msdn.microsoft.com/en-
> us/library/windows/hardware/gg487453.aspx
> 

The issue, I believe, is that DriverVerDate trumps DriverVerVersion and AFAIK we can't fake an old date because the driver signing process will not allow it.

> > > Also: would it not be possible to have a blkfront driver (even a trivial
> > > one) in your new bus driver so it can detect and avoid this problem?
> > >
> > > Or: have your bus driver detect when the blkfront driver is missing and
> > > not unplug the emulated disk?  In fact wouldn't having the emulated disk
> > > driver do the unplug solve it for free?
> >
> > The issue is the old s/w not the new s/w. The old drivers make
> > assumptions about each other's presence as we can't change that
> > because they are already out there.
> 
> The issue you mentioned was that the old drivers bound the block driver
> to the PCI device, and when your new driver is installed you get STOP 7B
> because the system disk is missing (because I guess you'd need the new
> xenbus driver to come up before it will trigger installing the new block
> driver).
> 
> So can the new driver not fix this, either by running a trivial blkfront
> itself or by allowing the emulated IDE controller to live?
> 

Aha, the old drivers have a registry override which will cause them to quiesce so I could have the new driver go and set that to make them shut up. That might allow us to cleanly take over on the next reboot. 

> > > > And we may indeed need to
> > > > modify its revision in future so that we can retire old sets of PV
> > > > drivers and replace them with new ones, but only for newer XenServer
> > > > releases. Thus, I also propose to make the PCI revision of the new
> > > > device a command line parameter.
> > >
> > > I'd rather not.  This gets straight back to having host-admin controls
> > > that have to manually be matched to in-guest software.
> >
> > Well not really. This is just the same as a h/w vendor shipping a new
> > device.
> 
> Well, that would be more like having the PCI revision reflect the Xen
> version.  Which might be a reasonable idea, if there is to be a second
> PCI device.
> 
> But metaphors aside, it's still requiring an admin change in order to
> match the software inside the VM, which as we've seen is unpopular with
> admins. :)
> 

Yes, but that's more of a business decision than an architectural one. As you say, it may well be unpopular and thus is something I'd prefer we didn't do unless we absolutely have to. What we can't do is have drivers being downloaded from Windows Update into systems where we know they are going to break things so I would like the new device to have enough tunability to avoid a trainwreck, and a parameterised revision is just enough :-) Thus, I still prefer to have a new device and leave the old one alone.

  Paul
Alex Bligh June 26, 2013, 8 p.m. UTC | #33
--On 26 June 2013 12:06:31 +0000 Paul Durrant <Paul.Durrant@citrix.com> 
wrote:

> The issue is the old s/w not the new s/w. The old drivers make
> assumptions about each other's presence as we can't change that because
> they are already out there.

Then (without knowing the details) what's to prevent the new drivers not
making such assumptions, and carrying some versioning information, such
that we need one new PCI device now, but no more in the future?
Paul Durrant June 27, 2013, 8:29 a.m. UTC | #34
> -----Original Message-----
> From: Alex Bligh [mailto:alex@alex.org.uk]
> Sent: 26 June 2013 21:00
> To: Paul Durrant; Tim (Xen.org)
> Cc: Ian Campbell; Matt Wilson; xen-devel@lists.xen.org; qemu-
> devel@nongnu.org; Alex Bligh
> Subject: RE: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI device
> version 2.
> 
> 
> 
> --On 26 June 2013 12:06:31 +0000 Paul Durrant <Paul.Durrant@citrix.com>
> wrote:
> 
> > The issue is the old s/w not the new s/w. The old drivers make
> > assumptions about each other's presence as we can't change that because
> > they are already out there.
> 
> Then (without knowing the details) what's to prevent the new drivers not
> making such assumptions, and carrying some versioning information, such
> that we need one new PCI device now, but no more in the future?
> 

The new drivers are architected very differently such that they are suitable for Windows Update. That means each driver is separately installable and upgradeable and can make no assumptions about presence or version of any other driver. Thus all discovery is done at runtime and each individual interface carries a version number.
I still think we need the option to control some aspect of the PCI device that the top level bus driver binds to so that we have the possibility of using different PV drivers sets on different VMs. I'm not saying that this is necessarily a good idea but the idea of a virtual hardware platform version (which is essentially what I believe this top level device represents, since we have no other way of indicating that to Windows Update) has precedent; VMWare have had such a concept for a very long time (http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1003746) and so it seems quite reasonable for a product such as XenServer to have a similar concept. I'll submit code for a new Citrix PV Bus device shortly.

  Paul
Paul Durrant June 27, 2013, 10:58 a.m. UTC | #35
> -----Original Message-----
> From: qemu-devel-bounces+paul.durrant=citrix.com@nongnu.org
> [mailto:qemu-devel-bounces+paul.durrant=citrix.com@nongnu.org] On
> Behalf Of Paul Durrant
> Sent: 27 June 2013 09:29
> To: Alex Bligh; Tim (Xen.org)
> Cc: qemu-devel@nongnu.org; Ian Campbell; Matt Wilson; xen-
> devel@lists.xen.org
> Subject: Re: [Qemu-devel] [Xen-devel] [PATCH] Add Xen platform PCI device
> version 2.
> 
> > -----Original Message-----
> > From: Alex Bligh [mailto:alex@alex.org.uk]
> > Sent: 26 June 2013 21:00
> > To: Paul Durrant; Tim (Xen.org)
> > Cc: Ian Campbell; Matt Wilson; xen-devel@lists.xen.org; qemu-
> > devel@nongnu.org; Alex Bligh
> > Subject: RE: [Xen-devel] [Qemu-devel] [PATCH] Add Xen platform PCI
> device
> > version 2.
> >
> >
> >
> > --On 26 June 2013 12:06:31 +0000 Paul Durrant <Paul.Durrant@citrix.com>
> > wrote:
> >
> > > The issue is the old s/w not the new s/w. The old drivers make
> > > assumptions about each other's presence as we can't change that
> because
> > > they are already out there.
> >
> > Then (without knowing the details) what's to prevent the new drivers not
> > making such assumptions, and carrying some versioning information, such
> > that we need one new PCI device now, but no more in the future?
> >
> 
> The new drivers are architected very differently such that they are suitable
> for Windows Update. That means each driver is separately installable and
> upgradeable and can make no assumptions about presence or version of any
> other driver. Thus all discovery is done at runtime and each individual
> interface carries a version number.
> I still think we need the option to control some aspect of the PCI device that
> the top level bus driver binds to so that we have the possibility of using
> different PV drivers sets on different VMs. I'm not saying that this is
> necessarily a good idea but the idea of a virtual hardware platform version
> (which is essentially what I believe this top level device represents, since we
> have no other way of indicating that to Windows Update) has precedent;
> VMWare have had such a concept for a very long time
> (http://kb.vmware.com/selfservice/microsites/search.do?language=en_US
> &cmd=displayKC&externalId=1003746) and so it seems quite reasonable for
> a product such as XenServer to have a similar concept. I'll submit code for a
> new Citrix PV Bus device shortly.
> 

I had a chat with Tim and there may be chance we can do something sensible and still bind to the usual Xen platform device ID so I'm going to give that a try first.

  Paul
diff mbox

Patch

diff --git a/hw/xen/xen_platform.c b/hw/xen/xen_platform.c
index b6c6793..6edb850 100644
--- a/hw/xen/xen_platform.c
+++ b/hw/xen/xen_platform.c
@@ -48,6 +48,20 @@ 
 
 #define PFFLAG_ROM_LOCK 1 /* Sets whether ROM memory area is RW or RO */
 
+typedef struct {
+    const char *name;
+    const char *desc;
+    uint16_t device_id;
+    uint8_t revision;
+    uint16_t subsystem_vendor_id;
+    uint16_t subsystem_id;
+} PCIXenPlatformDeviceInfo;
+
+typedef struct PCIXenPlatformDeviceClass {
+    PCIDeviceClass              parent_class;
+    PCIXenPlatformDeviceInfo    info;
+} PCIXenPlatformDeviceClass;
+
 typedef struct PCIXenPlatformState {
     PCIDevice  pci_dev;
     MemoryRegion fixed_io;
@@ -372,8 +386,13 @@  static const VMStateDescription vmstate_xen_platform = {
 static int xen_platform_initfn(PCIDevice *dev)
 {
     PCIXenPlatformState *d = DO_UPCAST(PCIXenPlatformState, pci_dev, dev);
+    PCIDeviceClass *k = PCI_DEVICE_GET_CLASS(dev);
+    __attribute__((unused)) PCIXenPlatformDeviceClass *u;
     uint8_t *pci_conf;
 
+    u = container_of(k, PCIXenPlatformDeviceClass, parent_class);
+    DPRINTF("initializing %s\n", u->info.name);
+
     pci_conf = d->pci_dev.config;
 
     pci_set_word(pci_conf + PCI_COMMAND, PCI_COMMAND_IO | PCI_COMMAND_MEMORY);
@@ -402,33 +421,63 @@  static void platform_reset(DeviceState *dev)
     platform_fixed_ioport_reset(s);
 }
 
+static PCIXenPlatformDeviceInfo platform_devices[] = {
+    {
+        .name = "xen-platform",
+        .desc = "XEN platform pci device (version 1)",
+        .device_id = PCI_DEVICE_ID_XEN_PLATFORM,
+        .revision = 1,
+        .subsystem_vendor_id = PCI_VENDOR_ID_XEN,
+        .subsystem_id = PCI_DEVICE_ID_XEN_PLATFORM,
+    }, {
+        .name = "xen-platform-2",
+        .desc = "XEN platform pci device (version 2)",
+        .device_id = PCI_DEVICE_ID_XEN_PLATFORM_V2,
+        .revision = 2,
+        .subsystem_vendor_id = PCI_VENDOR_ID_XEN,
+        .subsystem_id = PCI_DEVICE_ID_XEN_PLATFORM_V2,
+    }
+};
+
 static void xen_platform_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
     PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+    PCIXenPlatformDeviceInfo *info = data;
+    PCIXenPlatformDeviceClass *u;
+
+    u = container_of(k, PCIXenPlatformDeviceClass, parent_class);
 
     k->init = xen_platform_initfn;
     k->vendor_id = PCI_VENDOR_ID_XEN;
-    k->device_id = PCI_DEVICE_ID_XEN_PLATFORM;
+    k->device_id = info->device_id;
     k->class_id = PCI_CLASS_OTHERS << 8 | 0x80;
-    k->subsystem_vendor_id = PCI_VENDOR_ID_XEN;
-    k->subsystem_id = PCI_DEVICE_ID_XEN_PLATFORM;
-    k->revision = 1;
-    dc->desc = "XEN platform pci device";
+    k->subsystem_vendor_id = info->subsystem_vendor_id;
+    k->subsystem_id = info->subsystem_id;
+    k->revision = info->revision;
+    dc->desc = info->desc;
     dc->reset = platform_reset;
     dc->vmsd = &vmstate_xen_platform;
+    u->info = *info;
 }
 
-static const TypeInfo xen_platform_info = {
-    .name          = "xen-platform",
-    .parent        = TYPE_PCI_DEVICE,
-    .instance_size = sizeof(PCIXenPlatformState),
-    .class_init    = xen_platform_class_init,
-};
-
 static void xen_platform_register_types(void)
 {
-    type_register_static(&xen_platform_info);
+    TypeInfo type_info = {
+        .parent        = TYPE_PCI_DEVICE,
+        .instance_size = sizeof(PCIXenPlatformState),
+        .class_size    = sizeof(PCIXenPlatformDeviceClass),
+        .class_init    = xen_platform_class_init,
+    };
+    int i;
+    for (i = 0; i < ARRAY_SIZE(platform_devices); i++) {
+        PCIXenPlatformDeviceInfo *info = &platform_devices[i];
+
+        type_info.name = info->name;
+        type_info.class_data = info;
+        
+        type_register(&type_info);
+    }
 }
 
 type_init(xen_platform_register_types)
diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
index d8dc2f1..2039fba 100644
--- a/include/hw/pci/pci_ids.h
+++ b/include/hw/pci/pci_ids.h
@@ -144,6 +144,7 @@ 
 
 #define PCI_VENDOR_ID_XEN               0x5853
 #define PCI_DEVICE_ID_XEN_PLATFORM      0x0001
+#define PCI_DEVICE_ID_XEN_PLATFORM_V2   0x0002
 
 #define PCI_VENDOR_ID_NEC                0x1033
 #define PCI_DEVICE_ID_NEC_UPD720200      0x0194