diff mbox series

[RFC,for-5.1,4/4] spapr: Don't allow unplug of NVLink2 devices

Message ID 20200326054009.454477-5-david@gibson.dropbear.id.au
State New
Headers show
Series Better handling of attempt NVLink2 unplug | expand

Commit Message

David Gibson March 26, 2020, 5:40 a.m. UTC
Currently, we can't properly handle unplug of NVLink2 devices, because we
don't have code to tear down their special memory resources.  There's not
a lot of impetus to implement that. Since hardware NVLink2 devices can't
be hot unplugged, the guest side drivers don't usually support unplug
anyway.

Therefore, simply prevent unplug of NVLink2 devices.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr_pci.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Greg Kurz March 26, 2020, 12:27 p.m. UTC | #1
On Thu, 26 Mar 2020 16:40:09 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> Currently, we can't properly handle unplug of NVLink2 devices, because we
> don't have code to tear down their special memory resources.  There's not
> a lot of impetus to implement that. Since hardware NVLink2 devices can't
> be hot unplugged, the guest side drivers don't usually support unplug
> anyway.
> 
> Therefore, simply prevent unplug of NVLink2 devices.
> 

This could maybe considered as a valid fix for 5.0 since this prevents
guest crashes IIUC. But since this requires the two preliminary cleanup
patches, I understand you may prefer to postpone that to 5.1.

> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  hw/ppc/spapr_pci.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 55ca9dee1e..5c8262413a 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1666,6 +1666,11 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
>              return;
>          }
>  
> +        if (spapr_phb_is_nvlink_dev(pdev, phb)) {
> +            error_setg(errp, "PCI: Cannot unplug NVLink2 devices");
> +            return;
> +        }
> +
>          /* ensure any other present functions are pending unplug */
>          if (PCI_FUNC(pdev->devfn) == 0) {
>              for (i = 1; i < 8; i++) {
David Gibson March 26, 2020, 11:56 p.m. UTC | #2
On Thu, Mar 26, 2020 at 01:27:40PM +0100, Greg Kurz wrote:
> On Thu, 26 Mar 2020 16:40:09 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > Currently, we can't properly handle unplug of NVLink2 devices, because we
> > don't have code to tear down their special memory resources.  There's not
> > a lot of impetus to implement that. Since hardware NVLink2 devices can't
> > be hot unplugged, the guest side drivers don't usually support unplug
> > anyway.
> > 
> > Therefore, simply prevent unplug of NVLink2 devices.
> > 
> 
> This could maybe considered as a valid fix for 5.0 since this prevents
> guest crashes IIUC. But since this requires the two preliminary cleanup
> patches, I understand you may prefer to postpone that to 5.1.

Yeah, it's arguably a bug, but not a regression, so I'm inclined to
leave it to 5.1.

> 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> 
> Reviewed-by: Greg Kurz <groug@kaod.org>
> 
> >  hw/ppc/spapr_pci.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index 55ca9dee1e..5c8262413a 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -1666,6 +1666,11 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
> >              return;
> >          }
> >  
> > +        if (spapr_phb_is_nvlink_dev(pdev, phb)) {
> > +            error_setg(errp, "PCI: Cannot unplug NVLink2 devices");
> > +            return;
> > +        }
> > +
> >          /* ensure any other present functions are pending unplug */
> >          if (PCI_FUNC(pdev->devfn) == 0) {
> >              for (i = 1; i < 8; i++) {
>
Alexey Kardashevskiy March 28, 2020, 12:32 p.m. UTC | #3
On 26/03/2020 16:40, David Gibson wrote:
> Currently, we can't properly handle unplug of NVLink2 devices, because we
> don't have code to tear down their special memory resources.  There's not
> a lot of impetus to implement that. Since hardware NVLink2 devices can't
> be hot unplugged, the guest side drivers don't usually support unplug
> anyway.
> 
> Therefore, simply prevent unplug of NVLink2 devices.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/ppc/spapr_pci.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 55ca9dee1e..5c8262413a 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1666,6 +1666,11 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
>              return;
>          }
>  
> +        if (spapr_phb_is_nvlink_dev(pdev, phb)) {
> +            error_setg(errp, "PCI: Cannot unplug NVLink2 devices");
> +            return;
> +        }


Just this would do as well:

Object *po = OBJECT(pdev);
uint64_t tgt = object_property_get_uint(po, "nvlink2-tgt", NULL);

if (tgt) {
     error_setg(errp, "PCI: Cannot unplug NVLink2 devices");
     return;
}

honestly, I admin what 1/4 fixes is cryptic but since there is not going
to be any more new nvlinkX, this does not deserve this many patches imho.

	

> +
>          /* ensure any other present functions are pending unplug */
>          if (PCI_FUNC(pdev->devfn) == 0) {
>              for (i = 1; i < 8; i++) {
>
David Gibson March 31, 2020, 3:25 a.m. UTC | #4
On Sat, Mar 28, 2020 at 11:32:18PM +1100, Alexey Kardashevskiy wrote:
> 
> 
> On 26/03/2020 16:40, David Gibson wrote:
> > Currently, we can't properly handle unplug of NVLink2 devices, because we
> > don't have code to tear down their special memory resources.  There's not
> > a lot of impetus to implement that. Since hardware NVLink2 devices can't
> > be hot unplugged, the guest side drivers don't usually support unplug
> > anyway.
> > 
> > Therefore, simply prevent unplug of NVLink2 devices.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  hw/ppc/spapr_pci.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index 55ca9dee1e..5c8262413a 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -1666,6 +1666,11 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
> >              return;
> >          }
> >  
> > +        if (spapr_phb_is_nvlink_dev(pdev, phb)) {
> > +            error_setg(errp, "PCI: Cannot unplug NVLink2 devices");
> > +            return;
> > +        }
> 
> 
> Just this would do as well:
> 
> Object *po = OBJECT(pdev);
> uint64_t tgt = object_property_get_uint(po, "nvlink2-tgt", NULL);
> 
> if (tgt) {
>      error_setg(errp, "PCI: Cannot unplug NVLink2 devices");
>      return;
> }
> 
> honestly, I admin what 1/4 fixes is cryptic but since there is not going
> to be any more new nvlinkX, this does not deserve this many patches
> imho.

Good point, that is a simpler approach.

> 
> 	
> 
> > +
> >          /* ensure any other present functions are pending unplug */
> >          if (PCI_FUNC(pdev->devfn) == 0) {
> >              for (i = 1; i < 8; i++) {
> > 
>
diff mbox series

Patch

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 55ca9dee1e..5c8262413a 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1666,6 +1666,11 @@  static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
             return;
         }
 
+        if (spapr_phb_is_nvlink_dev(pdev, phb)) {
+            error_setg(errp, "PCI: Cannot unplug NVLink2 devices");
+            return;
+        }
+
         /* ensure any other present functions are pending unplug */
         if (PCI_FUNC(pdev->devfn) == 0) {
             for (i = 1; i < 8; i++) {