diff mbox series

xen-pcifront: Handle missed Connected state

Message ID 20220829151536.8578-1-jandryuk@gmail.com
State New
Headers show
Series xen-pcifront: Handle missed Connected state | expand

Commit Message

Jason Andryuk Aug. 29, 2022, 3:15 p.m. UTC
An HVM guest with linux stubdom and 2 PCI devices failed to start as
libxl timed out waiting for the PCI devices to be added.  It happens
intermittently but with some regularity.  libxl wrote the two xenstore
entries for the devices, but then timed out waiting for backend state 4
(Connected) - the state stayed at 7 (Reconfiguring).  (PCI passthrough
to an HVM with stubdomain is PV passthrough to the stubdomain and then
HVM passthrough with the QEMU inside the stubdomain.)

The stubdom kernel never printed "pcifront pci-0: Installing PCI
frontend", so it seems to have missed state 4 which would have
called pcifront_try_connect -> pcifront_connect_and_init_dma

Have pcifront_detach_devices special-case state Initialised and call
pcifront_connect_and_init_dma.  Don't use pcifront_try_connect because
that sets the xenbus state which may throw off the backend.  After
connecting, skip the remainder of detach_devices since none have been
initialized yet.  When the backend switches to Reconfigured,
pcifront_attach_devices will pick them up again.

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
---
 drivers/pci/xen-pcifront.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

Comments

Rich Persaud Sept. 1, 2022, 2:35 a.m. UTC | #1
On Aug 29, 2022, at 11:16 AM, Jason Andryuk <jandryuk@gmail.com> wrote:
> 
> An HVM guest with linux stubdom and 2 PCI devices failed to start as
> libxl timed out waiting for the PCI devices to be added.  It happens
> intermittently but with some regularity.  libxl wrote the two xenstore
> entries for the devices, but then timed out waiting for backend state 4
> (Connected) - the state stayed at 7 (Reconfiguring).  (PCI passthrough
> to an HVM with stubdomain is PV passthrough to the stubdomain and then
> HVM passthrough with the QEMU inside the stubdomain.)
> 
> The stubdom kernel never printed "pcifront pci-0: Installing PCI
> frontend", so it seems to have missed state 4 which would have
> called pcifront_try_connect -> pcifront_connect_and_init_dma

Is there a state machine doc/flowchart for LibXL and Xen PCI device passthrough to Linux? This would be a valuable addition to Xen's developer docs, even as a whiteboard photo in this thread.

Rich
Jason Andryuk Sept. 1, 2022, 12:55 p.m. UTC | #2
On Wed, Aug 31, 2022 at 10:35 PM Rich Persaud <persaur@gmail.com> wrote:
>
> On Aug 29, 2022, at 11:16 AM, Jason Andryuk <jandryuk@gmail.com> wrote:
> >
> > An HVM guest with linux stubdom and 2 PCI devices failed to start as
> > libxl timed out waiting for the PCI devices to be added.  It happens
> > intermittently but with some regularity.  libxl wrote the two xenstore
> > entries for the devices, but then timed out waiting for backend state 4
> > (Connected) - the state stayed at 7 (Reconfiguring).  (PCI passthrough
> > to an HVM with stubdomain is PV passthrough to the stubdomain and then
> > HVM passthrough with the QEMU inside the stubdomain.)
> >
> > The stubdom kernel never printed "pcifront pci-0: Installing PCI
> > frontend", so it seems to have missed state 4 which would have
> > called pcifront_try_connect -> pcifront_connect_and_init_dma
>
> Is there a state machine doc/flowchart for LibXL and Xen PCI device passthrough to Linux? This would be a valuable addition to Xen's developer docs, even as a whiteboard photo in this thread.

I am not aware of one.

-Jason
Bjorn Helgaas Sept. 2, 2022, 4:59 p.m. UTC | #3
The conventional style for subject (from "git log --oneline") is:

  xen/pcifront: Handle ...

On Mon, Aug 29, 2022 at 11:15:36AM -0400, Jason Andryuk wrote:
> An HVM guest with linux stubdom and 2 PCI devices failed to start as

"stubdom" might be handy shorthand in the Xen world, but I think
it would be nice to consistently spell out "stubdomain" since you use
both forms randomly in this commit log and newbies like me have to
wonder whether they're the same or different.

> libxl timed out waiting for the PCI devices to be added.  It happens
> intermittently but with some regularity.  libxl wrote the two xenstore
> entries for the devices, but then timed out waiting for backend state 4
> (Connected) - the state stayed at 7 (Reconfiguring).  (PCI passthrough
> to an HVM with stubdomain is PV passthrough to the stubdomain and then
> HVM passthrough with the QEMU inside the stubdomain.)
> 
> The stubdom kernel never printed "pcifront pci-0: Installing PCI
> frontend", so it seems to have missed state 4 which would have
> called pcifront_try_connect -> pcifront_connect_and_init_dma

Add "()" after function names for clarity.

> Have pcifront_detach_devices special-case state Initialised and call
> pcifront_connect_and_init_dma.  Don't use pcifront_try_connect because
> that sets the xenbus state which may throw off the backend.  After
> connecting, skip the remainder of detach_devices since none have been
> initialized yet.  When the backend switches to Reconfigured,
> pcifront_attach_devices will pick them up again.

Bjorn
Jason Andryuk Sept. 6, 2022, 3:18 p.m. UTC | #4
On Fri, Sep 2, 2022 at 12:59 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> The conventional style for subject (from "git log --oneline") is:
>
>   xen/pcifront: Handle ...
>
> On Mon, Aug 29, 2022 at 11:15:36AM -0400, Jason Andryuk wrote:
> > An HVM guest with linux stubdom and 2 PCI devices failed to start as
>
> "stubdom" might be handy shorthand in the Xen world, but I think
> it would be nice to consistently spell out "stubdomain" since you use
> both forms randomly in this commit log and newbies like me have to
> wonder whether they're the same or different.
>
> > libxl timed out waiting for the PCI devices to be added.  It happens
> > intermittently but with some regularity.  libxl wrote the two xenstore
> > entries for the devices, but then timed out waiting for backend state 4
> > (Connected) - the state stayed at 7 (Reconfiguring).  (PCI passthrough
> > to an HVM with stubdomain is PV passthrough to the stubdomain and then
> > HVM passthrough with the QEMU inside the stubdomain.)
> >
> > The stubdom kernel never printed "pcifront pci-0: Installing PCI
> > frontend", so it seems to have missed state 4 which would have
> > called pcifront_try_connect -> pcifront_connect_and_init_dma
>
> Add "()" after function names for clarity.
>
> > Have pcifront_detach_devices special-case state Initialised and call
> > pcifront_connect_and_init_dma.  Don't use pcifront_try_connect because
> > that sets the xenbus state which may throw off the backend.  After
> > connecting, skip the remainder of detach_devices since none have been
> > initialized yet.  When the backend switches to Reconfigured,
> > pcifront_attach_devices will pick them up again.

Thanks for taking a look, Bjorn.  That all sounds good.  I'll wait a
little longer to see if there is any more feedback before sending a
v2.

Regards,
Jason
Jürgen Groß Oct. 4, 2022, 2:22 p.m. UTC | #5
On 29.08.22 17:15, Jason Andryuk wrote:
> An HVM guest with linux stubdom and 2 PCI devices failed to start as
> libxl timed out waiting for the PCI devices to be added.  It happens
> intermittently but with some regularity.  libxl wrote the two xenstore
> entries for the devices, but then timed out waiting for backend state 4
> (Connected) - the state stayed at 7 (Reconfiguring).  (PCI passthrough
> to an HVM with stubdomain is PV passthrough to the stubdomain and then
> HVM passthrough with the QEMU inside the stubdomain.)
> 
> The stubdom kernel never printed "pcifront pci-0: Installing PCI
> frontend", so it seems to have missed state 4 which would have
> called pcifront_try_connect -> pcifront_connect_and_init_dma
> 
> Have pcifront_detach_devices special-case state Initialised and call
> pcifront_connect_and_init_dma.  Don't use pcifront_try_connect because
> that sets the xenbus state which may throw off the backend.  After
> connecting, skip the remainder of detach_devices since none have been
> initialized yet.  When the backend switches to Reconfigured,
> pcifront_attach_devices will pick them up again.
> 
> Signed-off-by: Jason Andryuk <jandryuk@gmail.com>

Reviewed-by: Juergen Gross <jgross@suse.com>

The modifications of the commit message requested by Bjorn can be done
while committing.


Juergen
Jürgen Groß Oct. 6, 2022, 5:12 a.m. UTC | #6
On 29.08.22 17:15, Jason Andryuk wrote:
> An HVM guest with linux stubdom and 2 PCI devices failed to start as
> libxl timed out waiting for the PCI devices to be added.  It happens
> intermittently but with some regularity.  libxl wrote the two xenstore
> entries for the devices, but then timed out waiting for backend state 4
> (Connected) - the state stayed at 7 (Reconfiguring).  (PCI passthrough
> to an HVM with stubdomain is PV passthrough to the stubdomain and then
> HVM passthrough with the QEMU inside the stubdomain.)
> 
> The stubdom kernel never printed "pcifront pci-0: Installing PCI
> frontend", so it seems to have missed state 4 which would have
> called pcifront_try_connect -> pcifront_connect_and_init_dma
> 
> Have pcifront_detach_devices special-case state Initialised and call
> pcifront_connect_and_init_dma.  Don't use pcifront_try_connect because
> that sets the xenbus state which may throw off the backend.  After
> connecting, skip the remainder of detach_devices since none have been
> initialized yet.  When the backend switches to Reconfigured,
> pcifront_attach_devices will pick them up again.
> 
> Signed-off-by: Jason Andryuk <jandryuk@gmail.com>

Pushed to xen/tip.git for-linus-6.1


Juergen
diff mbox series

Patch

--- a/drivers/pci/xen-pcifront.c
+++ b/drivers/pci/xen-pcifront.c
@@ -1012,13 +1012,26 @@  static int pcifront_detach_devices(struc
 {
 	int err = 0;
 	int i, num_devs;
+	enum xenbus_state state;
 	unsigned int domain, bus, slot, func;
 	struct pci_dev *pci_dev;
 	char str[64];
 
-	if (xenbus_read_driver_state(pdev->xdev->nodename) !=
-	    XenbusStateConnected)
+	state = xenbus_read_driver_state(pdev->xdev->nodename);
+	if (state == XenbusStateInitialised) {
+		dev_dbg(&pdev->xdev->dev, "Handle skipped connect.\n");
+		/* We missed Connected and need to initialize. */
+		err = pcifront_connect_and_init_dma(pdev);
+		if (err && err != -EEXIST) {
+			xenbus_dev_fatal(pdev->xdev, err,
+					 "Error setting up PCI Frontend");
+			goto out;
+		}
+
+		goto out_switch_state;
+	} else if (state != XenbusStateConnected) {
 		goto out;
+	}
 
 	err = xenbus_scanf(XBT_NIL, pdev->xdev->otherend, "num_devs", "%d",
 			   &num_devs);
@@ -1079,6 +1092,7 @@  static int pcifront_detach_devices(struc
 			domain, bus, slot, func);
 	}
 
+ out_switch_state:
 	err = xenbus_switch_state(pdev->xdev, XenbusStateReconfiguring);
 
 out: