Message ID | 20171207222145.9769-3-Govinda.Tatti@Oracle.COM |
---|---|
State | Not Applicable |
Headers | show |
Series | Xen/PCIback: PCI reset using 'reset' SysFS attribute | expand |
>>> On 07.12.17 at 23:21, <Govinda.Tatti@Oracle.COM> wrote: > Due to the complexity with the PCI lock we cannot do the reset when a > device is bound ('echo $BDF > bind') or when unbound ('echo $BDF > unbind') > as the pci_[slot|bus]_reset also takes the same lock resulting in a > dead-lock. It took me a moment to figure that here you're referring to the process of (un)binding, not the state. To avoid that ambiguity in wording, how about "... we cannot do the reset while a device is being bound (...) or while it is being unbound ..."? > --- a/Documentation/ABI/testing/sysfs-driver-pciback > +++ b/Documentation/ABI/testing/sysfs-driver-pciback > @@ -11,3 +11,18 @@ Description: > #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks > will allow the guest to read and write to the configuration > register 0x0E. > + > +What: /sys/bus/pci/drivers/pciback/reset > +Date: Dec 2017 > +KernelVersion: 4.15 > +Contact: xen-devel@lists.xenproject.org > +Description: > + An option to perform a flr/slot/bus reset when a PCI device > + is owned by Xen PCI backend. Writing a string of DDDD:BB:DD.F SSSS:BB:DD.F (or else the D-s are ambiguous, the more that "domain" in Xen code is ambiguous anyway - I continue to be mislead by struct pcistub_device_id's domain field) Also I assume the SSSS part is optional (default zero), which probably can and should be expressed in some way. > --- a/drivers/xen/xen-pciback/pci_stub.c > +++ b/drivers/xen/xen-pciback/pci_stub.c > @@ -313,6 +313,102 @@ void pcistub_put_pci_dev(struct pci_dev *dev) > up_write(&pcistub_sem); > } > > +struct pcistub_args { > + const struct pci_dev *dev; > + unsigned int dcount; The sole use of this field is for a debug message. Why not drop it and make "dev" the "data" argument without further indirection? > +static int pcistub_device_search(struct pci_dev *dev, void *data) > +{ > + struct pcistub_device *psdev; > + struct pcistub_args *arg = data; > + bool found = false; > + unsigned long flags; > + > + spin_lock_irqsave(&pcistub_devices_lock, flags); > + > + list_for_each_entry(psdev, &pcistub_devices, dev_list) { > + if (psdev->dev == dev) { > + found = true; > + arg->dcount++; > + break; Neither here nor in the caller I can see a check whether the device is currently assigned to a guest. Ownership by pciback alone imo is not sufficient to allow a reset to be performed. > +static int pcistub_device_reset(struct pci_dev *dev) > +{ > + struct xen_pcibk_dev_data *dev_data; > + bool slot = false, bus = false; > + struct pcistub_args arg = {}; > + > + if (!dev) > + return -EINVAL; > + > + dev_dbg(&dev->dev, "[%s]\n", __func__); > + > + /* First check and try FLR */ > + if (pcie_has_flr(dev)) { > + dev_dbg(&dev->dev, "resetting %s device using FLR\n", > + pci_name(dev)); > + pcie_flr(dev); The lack of error check here puzzled me, but I see the function indeed returns void right now. I think the prereq patch should change this along with exporting the function - you really don't want the device to be handed to a guest when the FLR timed out. > + return 0; > + } > + > + if (!pci_probe_reset_slot(dev->slot)) > + slot = true; > + else if ((!pci_probe_reset_bus(dev->bus)) && > + (!pci_is_root_bus(dev->bus))) Too many parentheses for my taste. > +static ssize_t reset_store(struct device_driver *drv, const char *buf, > + size_t count) > +{ > + struct pcistub_device *psdev; > + int domain, bus, slot, func; > + int err; > + > + err = str_to_slot(buf, &domain, &bus, &slot, &func); > + if (err) > + return err; > + > + psdev = pcistub_device_find(domain, bus, slot, func); > + if (psdev) { > + err = pcistub_device_reset(psdev->dev); > + pcistub_device_put(psdev); > + } else { > + err = -ENODEV; > + } > + > + if (!err) > + err = count; > + > + return err; > +} > +static DRIVER_ATTR_WO(reset); Would it be worth for reads of the file to return whether the device can be reset this way (i.e. the result of the checks you do before actually doing the reset)? Jan
Thanks Jan for your review comments. Please see below for my comments. On 12/8/2017 3:34 AM, Jan Beulich wrote: >>>> On 07.12.17 at 23:21,<Govinda.Tatti@Oracle.COM> wrote: >> Due to the complexity with the PCI lock we cannot do the reset when a >> device is bound ('echo $BDF > bind') or when unbound ('echo $BDF > unbind') >> as the pci_[slot|bus]_reset also takes the same lock resulting in a >> dead-lock. > It took me a moment to figure that here you're referring to the > process of (un)binding, not the state. To avoid that ambiguity in > wording, how about "... we cannot do the reset while a device is > being bound (...) or while it is being unbound ..."? Sure, I will fix it. >> --- a/Documentation/ABI/testing/sysfs-driver-pciback >> +++ b/Documentation/ABI/testing/sysfs-driver-pciback >> @@ -11,3 +11,18 @@ Description: >> #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks >> will allow the guest to read and write to the configuration >> register 0x0E. >> + >> +What: /sys/bus/pci/drivers/pciback/reset >> +Date: Dec 2017 >> +KernelVersion: 4.15 >> +Contact:xen-devel@lists.xenproject.org >> +Description: >> + An option to perform a flr/slot/bus reset when a PCI device >> + is owned by Xen PCI backend. Writing a string of DDDD:BB:DD.F > SSSS:BB:DD.F (or else the D-s are ambiguous, the more that "domain" > in Xen code is ambiguous anyway - I continue to be mislead by struct > pcistub_device_id's domain field) Thanks for catching this issue. I will fix it. > Also I assume the SSSS part is optional (default zero), which > probably can and should be expressed in some way. SSSS can be 0 or non-zero, subject to system configuration. >> --- a/drivers/xen/xen-pciback/pci_stub.c >> +++ b/drivers/xen/xen-pciback/pci_stub.c >> @@ -313,6 +313,102 @@ void pcistub_put_pci_dev(struct pci_dev *dev) >> up_write(&pcistub_sem); >> } >> >> +struct pcistub_args { >> + const struct pci_dev *dev; >> + unsigned int dcount; > The sole use of this field is for a debug message. Why not drop it > and make "dev" the "data" argument without further indirection? I prefer to keep this data structure since it will be helpful to debug any issues orfor future enhancements. >> +static int pcistub_device_search(struct pci_dev *dev, void *data) >> +{ >> + struct pcistub_device *psdev; >> + struct pcistub_args *arg = data; >> + bool found = false; >> + unsigned long flags; >> + >> + spin_lock_irqsave(&pcistub_devices_lock, flags); >> + >> + list_for_each_entry(psdev, &pcistub_devices, dev_list) { >> + if (psdev->dev == dev) { >> + found = true; >> + arg->dcount++; >> + break; > Neither here nor in the caller I can see a check whether the device > is currently assigned to a guest. Ownership by pciback alone imo is > not sufficient to allow a reset to be performed. I can add the following check if ((psdev->dev == dev) && (pci_is_dev_assigned(dev))) >> +static int pcistub_device_reset(struct pci_dev *dev) >> +{ >> + struct xen_pcibk_dev_data *dev_data; >> + bool slot = false, bus = false; >> + struct pcistub_args arg = {}; >> + >> + if (!dev) >> + return -EINVAL; >> + >> + dev_dbg(&dev->dev, "[%s]\n", __func__); >> + >> + /* First check and try FLR */ >> + if (pcie_has_flr(dev)) { >> + dev_dbg(&dev->dev, "resetting %s device using FLR\n", >> + pci_name(dev)); >> + pcie_flr(dev); > The lack of error check here puzzled me, but I see the function > indeed returns void right now. I think the prereq patch should > change this along with exporting the function - you really don't > want the device to be handed to a guest when the FLR timed > out. We will change pcie_flr() to return error code. I will make this change in the next version of this patch. >> + return 0; >> + } >> + >> + if (!pci_probe_reset_slot(dev->slot)) >> + slot = true; >> + else if ((!pci_probe_reset_bus(dev->bus)) && >> + (!pci_is_root_bus(dev->bus))) > Too many parentheses for my taste. I will fix it. >> +static ssize_t reset_store(struct device_driver *drv, const char *buf, >> + size_t count) >> +{ >> + struct pcistub_device *psdev; >> + int domain, bus, slot, func; >> + int err; >> + >> + err = str_to_slot(buf, &domain, &bus, &slot, &func); >> + if (err) >> + return err; >> + >> + psdev = pcistub_device_find(domain, bus, slot, func); >> + if (psdev) { >> + err = pcistub_device_reset(psdev->dev); >> + pcistub_device_put(psdev); >> + } else { >> + err = -ENODEV; >> + } >> + >> + if (!err) >> + err = count; >> + >> + return err; >> +} >> +static DRIVER_ATTR_WO(reset); > Would it be worth for reads of the file to return whether the device > can be reset this way (i.e. the result of the checks you do before > actually doing the reset)? I don't think so. Plus, it makes this interface and its usage more complicated. Cheers GOVINDA
>>> On 12.12.17 at 15:48, <Govinda.Tatti@Oracle.COM> wrote: > Thanks Jan for your review comments. Please see below for my comments. First of all - can you please do something about your reply style? HTML mail should be avoided. You'll see that the (plain text) reply as a result is rather hard to follow, too. > --- a/Documentation/ABI/testing/sysfs-driver-pciback > +++ b/Documentation/ABI/testing/sysfs-driver-pciback > @@ -11,3 +11,18 @@ Description: > #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks > will allow the guest to read and write to the configuration > register 0x0E. > + > +What: /sys/bus/pci/drivers/pciback/reset > +Date: Dec 2017 > +KernelVersion: 4.15 > +Contact: xen-devel@lists.xenproject.org > +Description: > + An option to perform a flr/slot/bus reset when a PCI device > + is owned by Xen PCI backend. Writing a string of DDDD:BB:DD.F > SSSS:BB:DD.F (or else the D-s are ambiguous, the more that "domain" > in Xen code is ambiguous anyway - I continue to be mislead by struct > pcistub_device_id's domain field) Thanks for catching this issue. I will > fix it. > > > Also I assume the SSSS part is optional (default zero), which > probably can and should be expressed in some way. SSSS can be 0 or > non-zero, subject to system configuration. The question isn't system configuration, but whether the field can be omitted on input, with zero being assumed in such a case. That's a common shorthand, considering that the vast majority of x86 (and maybe other) systems aren't using segments other than zero. Jan
On 12/12/2017 9:01 AM, Jan Beulich wrote: >>>> On 12.12.17 at 15:48, <Govinda.Tatti@Oracle.COM> wrote: >> Thanks Jan for your review comments. Please see below for my comments. > First of all - can you please do something about your reply style? > HTML mail should be avoided. You'll see that the (plain text) reply > as a result is rather hard to follow, too. Sorry about it. I had an issue with my Thunderbird setting. > >> --- a/Documentation/ABI/testing/sysfs-driver-pciback >> +++ b/Documentation/ABI/testing/sysfs-driver-pciback >> @@ -11,3 +11,18 @@ Description: >> #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks >> will allow the guest to read and write to the configuration >> register 0x0E. >> + >> +What: /sys/bus/pci/drivers/pciback/reset >> +Date: Dec 2017 >> +KernelVersion: 4.15 >> +Contact: xen-devel@lists.xenproject.org >> +Description: >> + An option to perform a flr/slot/bus reset when a PCI device >> + is owned by Xen PCI backend. Writing a string of DDDD:BB:DD.F >> SSSS:BB:DD.F (or else the D-s are ambiguous, the more that "domain" >> in Xen code is ambiguous anyway - I continue to be mislead by struct >> pcistub_device_id's domain field) Thanks for catching this issue. I will >> fix it. >> >> >> Also I assume the SSSS part is optional (default zero), which >> probably can and should be expressed in some way. SSSS can be 0 or >> non-zero, subject to system configuration. > The question isn't system configuration, but whether the field can > be omitted on input, with zero being assumed in such a case. That's > a common shorthand, considering that the vast majority of x86 > (and maybe other) systems aren't using segments other than zero Yes, it can be omitted if SSSS is zero.I will add this information to above documentation file. Cheers GOVINDA
Jan, One quick update on pcie_flr() specific implementation. Please see below. >>> +static int pcistub_device_reset(struct pci_dev *dev) >>> +{ >>> + struct xen_pcibk_dev_data *dev_data; >>> + bool slot = false, bus = false; >>> + struct pcistub_args arg = {}; >>> + >>> + if (!dev) >>> + return -EINVAL; >>> + >>> + dev_dbg(&dev->dev, "[%s]\n", __func__); >>> + >>> + /* First check and try FLR */ >>> + if (pcie_has_flr(dev)) { >>> + dev_dbg(&dev->dev, "resetting %s device using FLR\n", >>> + pci_name(dev)); >>> + pcie_flr(dev); >> The lack of error check here puzzled me, but I see the function >> indeed returns void right now. I think the prereq patch should >> change this along with exporting the function - you really don't >> want the device to be handed to a guest when the FLR timed >> out. > We will change pcie_flr() to return error code. I will make this change > in the next version of this patch. I exchanged some emails with Bjorn/Christoph and it looks like Christoph as some planto restructure pcie flr specific functions but I don't know the exact time-frame. For now,I am planning to use existing pcie_flr() after checking FLR capability. We will switchto revised pcie_flr() once it is available. I hope you are fine with this approach. Please let me know. Thanks. Cheers GOVINDA
>>> On 15.12.17 at 20:52, <Govinda.Tatti@Oracle.COM> wrote: >>>> +static int pcistub_device_reset(struct pci_dev *dev) >>>> +{ >>>> + struct xen_pcibk_dev_data *dev_data; >>>> + bool slot = false, bus = false; >>>> + struct pcistub_args arg = {}; >>>> + >>>> + if (!dev) >>>> + return -EINVAL; >>>> + >>>> + dev_dbg(&dev->dev, "[%s]\n", __func__); >>>> + >>>> + /* First check and try FLR */ >>>> + if (pcie_has_flr(dev)) { >>>> + dev_dbg(&dev->dev, "resetting %s device using FLR\n", >>>> + pci_name(dev)); >>>> + pcie_flr(dev); >>> The lack of error check here puzzled me, but I see the function >>> indeed returns void right now. I think the prereq patch should >>> change this along with exporting the function - you really don't >>> want the device to be handed to a guest when the FLR timed >>> out. >> We will change pcie_flr() to return error code. I will make this change >> in the next version of this patch. > I exchanged some emails with Bjorn/Christoph and it looks like Christoph > as some planto restructure pcie flr specific functions but I don't know > the exact time-frame. For now,I am planning to use existing pcie_flr() > after checking FLR capability. We will switchto revised pcie_flr() once > it is available. > > I hope you are fine with this approach. Please let me know. Thanks. I've seen that other discussion. I don't think the change here should be done prior to the error reporting being put in place, for security reasons. But in the end it'll be Konrad as the maintainer to judge. Or wait, looks like there's some confusion in ./MAINTAINERS: Konrad is listed as maintainer for "XEN PCI SUBSYSTEM", but the list of files doesn't include pciback. So it would instead be Boris or Jürgen to give you a final word. Jan
On 12/18/2017 02:36 AM, Jan Beulich wrote: >>>> On 15.12.17 at 20:52, <Govinda.Tatti@Oracle.COM> wrote: >>>>> +static int pcistub_device_reset(struct pci_dev *dev) >>>>> +{ >>>>> + struct xen_pcibk_dev_data *dev_data; >>>>> + bool slot = false, bus = false; >>>>> + struct pcistub_args arg = {}; >>>>> + >>>>> + if (!dev) >>>>> + return -EINVAL; >>>>> + >>>>> + dev_dbg(&dev->dev, "[%s]\n", __func__); >>>>> + >>>>> + /* First check and try FLR */ >>>>> + if (pcie_has_flr(dev)) { >>>>> + dev_dbg(&dev->dev, "resetting %s device using FLR\n", >>>>> + pci_name(dev)); >>>>> + pcie_flr(dev); >>>> The lack of error check here puzzled me, but I see the function >>>> indeed returns void right now. I think the prereq patch should >>>> change this along with exporting the function - you really don't >>>> want the device to be handed to a guest when the FLR timed >>>> out. >>> We will change pcie_flr() to return error code. I will make this change >>> in the next version of this patch. >> I exchanged some emails with Bjorn/Christoph and it looks like Christoph >> as some planto restructure pcie flr specific functions but I don't know >> the exact time-frame. For now,I am planning to use existing pcie_flr() >> after checking FLR capability. We will switchto revised pcie_flr() once >> it is available. >> >> I hope you are fine with this approach. Please let me know. Thanks. > I've seen that other discussion. I don't think the change here > should be done prior to the error reporting being put in place, > for security reasons. But in the end it'll be Konrad as the > maintainer to judge. > > Or wait, looks like there's some confusion in ./MAINTAINERS: > Konrad is listed as maintainer for "XEN PCI SUBSYSTEM", but the > list of files doesn't include pciback. So it would instead be Boris > or Jürgen to give you a final word. This is now 4.16 material so we can at least wait until closer to opening of the merge window when we may have the PCI updates. (And I just noticed that you responded to Christoph.) Besides, we don't want to make kernel changes until the interface is settled (i.e the toolstack changes are accepted). -boris
Hi, On Mon, Dec 18, 2017 at 12:32:11PM -0500, Boris Ostrovsky wrote: > On 12/18/2017 02:36 AM, Jan Beulich wrote: > >>>> On 15.12.17 at 20:52, <Govinda.Tatti@Oracle.COM> wrote: > >>>>> +static int pcistub_device_reset(struct pci_dev *dev) > >>>>> +{ > >>>>> + struct xen_pcibk_dev_data *dev_data; > >>>>> + bool slot = false, bus = false; > >>>>> + struct pcistub_args arg = {}; > >>>>> + > >>>>> + if (!dev) > >>>>> + return -EINVAL; > >>>>> + > >>>>> + dev_dbg(&dev->dev, "[%s]\n", __func__); > >>>>> + > >>>>> + /* First check and try FLR */ > >>>>> + if (pcie_has_flr(dev)) { > >>>>> + dev_dbg(&dev->dev, "resetting %s device using FLR\n", > >>>>> + pci_name(dev)); > >>>>> + pcie_flr(dev); > >>>> The lack of error check here puzzled me, but I see the function > >>>> indeed returns void right now. I think the prereq patch should > >>>> change this along with exporting the function - you really don't > >>>> want the device to be handed to a guest when the FLR timed > >>>> out. > >>> We will change pcie_flr() to return error code. I will make this change > >>> in the next version of this patch. > >> I exchanged some emails with Bjorn/Christoph and it looks like Christoph > >> as some planto restructure pcie flr specific functions but I don't know > >> the exact time-frame. For now,I am planning to use existing pcie_flr() > >> after checking FLR capability. We will switchto revised pcie_flr() once > >> it is available. > >> > >> I hope you are fine with this approach. Please let me know. Thanks. > > I've seen that other discussion. I don't think the change here > > should be done prior to the error reporting being put in place, > > for security reasons. But in the end it'll be Konrad as the > > maintainer to judge. > > > > Or wait, looks like there's some confusion in ./MAINTAINERS: > > Konrad is listed as maintainer for "XEN PCI SUBSYSTEM", but the > > list of files doesn't include pciback. So it would instead be Boris > > or Jürgen to give you a final word. > > > This is now 4.16 material so we can at least wait until closer to > opening of the merge window when we may have the PCI updates. (And I > just noticed that you responded to Christoph.) > > Besides, we don't want to make kernel changes until the interface is > settled (i.e the toolstack changes are accepted). > It seems Govinda's email address is giving an error, so I assume someone else needs to pick up this pciback 'reset' feature. Is it likely someone else from Oracle can/will pick up and refresh this patch, with the review comments addressed? Meanwhile the pcie_has_flr() has been exported in upstream Linux kernel, so that's already available for use now. "PCI: Export pcie_has_flr()": https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2d2917f7747805a1f4188672f308d82a8ba01700 > -boris > Thanks, -- Pasi
On 9/16/18 7:43 AM, Pasi Kärkkäinen wrote: > Hi, > > On Mon, Dec 18, 2017 at 12:32:11PM -0500, Boris Ostrovsky wrote: >> On 12/18/2017 02:36 AM, Jan Beulich wrote: >>>>>> On 15.12.17 at 20:52, <Govinda.Tatti@Oracle.COM> wrote: >>>>>>> +static int pcistub_device_reset(struct pci_dev *dev) >>>>>>> +{ >>>>>>> + struct xen_pcibk_dev_data *dev_data; >>>>>>> + bool slot = false, bus = false; >>>>>>> + struct pcistub_args arg = {}; >>>>>>> + >>>>>>> + if (!dev) >>>>>>> + return -EINVAL; >>>>>>> + >>>>>>> + dev_dbg(&dev->dev, "[%s]\n", __func__); >>>>>>> + >>>>>>> + /* First check and try FLR */ >>>>>>> + if (pcie_has_flr(dev)) { >>>>>>> + dev_dbg(&dev->dev, "resetting %s device using FLR\n", >>>>>>> + pci_name(dev)); >>>>>>> + pcie_flr(dev); >>>>>> The lack of error check here puzzled me, but I see the function >>>>>> indeed returns void right now. I think the prereq patch should >>>>>> change this along with exporting the function - you really don't >>>>>> want the device to be handed to a guest when the FLR timed >>>>>> out. >>>>> We will change pcie_flr() to return error code. I will make this change >>>>> in the next version of this patch. >>>> I exchanged some emails with Bjorn/Christoph and it looks like Christoph >>>> as some planto restructure pcie flr specific functions but I don't know >>>> the exact time-frame. For now,I am planning to use existing pcie_flr() >>>> after checking FLR capability. We will switchto revised pcie_flr() once >>>> it is available. >>>> >>>> I hope you are fine with this approach. Please let me know. Thanks. >>> I've seen that other discussion. I don't think the change here >>> should be done prior to the error reporting being put in place, >>> for security reasons. But in the end it'll be Konrad as the >>> maintainer to judge. >>> >>> Or wait, looks like there's some confusion in ./MAINTAINERS: >>> Konrad is listed as maintainer for "XEN PCI SUBSYSTEM", but the >>> list of files doesn't include pciback. So it would instead be Boris >>> or Jürgen to give you a final word. >> >> This is now 4.16 material so we can at least wait until closer to >> opening of the merge window when we may have the PCI updates. (And I >> just noticed that you responded to Christoph.) >> >> Besides, we don't want to make kernel changes until the interface is >> settled (i.e the toolstack changes are accepted). >> > It seems Govinda's email address is giving an error, so I assume someone else needs to pick up this pciback 'reset' feature. > Is it likely someone else from Oracle can/will pick up and refresh this patch, with the review comments addressed? Govinda is no longer at Oracle. What about the toolstack changes? Have they been accepted? I vaguely recall there was a discussion about those changes but don't remember how it ended. -boris > > > Meanwhile the pcie_has_flr() has been exported in upstream Linux kernel, so that's already available for use now. > > "PCI: Export pcie_has_flr()": > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2d2917f7747805a1f4188672f308d82a8ba01700 > > >
Hi, On Mon, Sep 17, 2018 at 02:06:02PM -0400, Boris Ostrovsky wrote: > On 9/16/18 7:43 AM, Pasi Kärkkäinen wrote: > > Hi, > > > > On Mon, Dec 18, 2017 at 12:32:11PM -0500, Boris Ostrovsky wrote: > >> On 12/18/2017 02:36 AM, Jan Beulich wrote: > >>>>>> On 15.12.17 at 20:52, <Govinda.Tatti@Oracle.COM> wrote: > >>>>>>> +static int pcistub_device_reset(struct pci_dev *dev) > >>>>>>> +{ > >>>>>>> + struct xen_pcibk_dev_data *dev_data; > >>>>>>> + bool slot = false, bus = false; > >>>>>>> + struct pcistub_args arg = {}; > >>>>>>> + > >>>>>>> + if (!dev) > >>>>>>> + return -EINVAL; > >>>>>>> + > >>>>>>> + dev_dbg(&dev->dev, "[%s]\n", __func__); > >>>>>>> + > >>>>>>> + /* First check and try FLR */ > >>>>>>> + if (pcie_has_flr(dev)) { > >>>>>>> + dev_dbg(&dev->dev, "resetting %s device using FLR\n", > >>>>>>> + pci_name(dev)); > >>>>>>> + pcie_flr(dev); > >>>>>> The lack of error check here puzzled me, but I see the function > >>>>>> indeed returns void right now. I think the prereq patch should > >>>>>> change this along with exporting the function - you really don't > >>>>>> want the device to be handed to a guest when the FLR timed > >>>>>> out. > >>>>> We will change pcie_flr() to return error code. I will make this change > >>>>> in the next version of this patch. > >>>> I exchanged some emails with Bjorn/Christoph and it looks like Christoph > >>>> as some planto restructure pcie flr specific functions but I don't know > >>>> the exact time-frame. For now,I am planning to use existing pcie_flr() > >>>> after checking FLR capability. We will switchto revised pcie_flr() once > >>>> it is available. > >>>> > >>>> I hope you are fine with this approach. Please let me know. Thanks. > >>> I've seen that other discussion. I don't think the change here > >>> should be done prior to the error reporting being put in place, > >>> for security reasons. But in the end it'll be Konrad as the > >>> maintainer to judge. > >>> > >>> Or wait, looks like there's some confusion in ./MAINTAINERS: > >>> Konrad is listed as maintainer for "XEN PCI SUBSYSTEM", but the > >>> list of files doesn't include pciback. So it would instead be Boris > >>> or Jürgen to give you a final word. > >> > >> This is now 4.16 material so we can at least wait until closer to > >> opening of the merge window when we may have the PCI updates. (And I > >> just noticed that you responded to Christoph.) > >> > >> Besides, we don't want to make kernel changes until the interface is > >> settled (i.e the toolstack changes are accepted). > >> > > It seems Govinda's email address is giving an error, so I assume someone else needs to pick up this pciback 'reset' feature. > > Is it likely someone else from Oracle can/will pick up and refresh this patch, with the review comments addressed? > > > Govinda is no longer at Oracle. > Yep, thought so. Removed from CC list. > What about the toolstack changes? Have they been accepted? I vaguely > recall there was a discussion about those changes but don't remember how > it ended. > I don't think toolstack/libxl patch has been applied yet either. "[PATCH V1 0/1] Xen/Tools: PCI reset using 'reset' SysFS attribute": https://lists.xen.org/archives/html/xen-devel/2017-12/msg00664.html "[PATCH V1 1/1] Xen/libxl: Perform PCI reset using 'reset' SysFS attribute": https://lists.xen.org/archives/html/xen-devel/2017-12/msg00663.html George asked for some clarifications: https://lists.xen.org/archives/html/xen-devel/2017-12/msg01044.html https://lists.xen.org/archives/html/xen-devel/2017-12/msg01116.html > > -boris > Thanks, -- Pasi > > > > > > > Meanwhile the pcie_has_flr() has been exported in upstream Linux kernel, so that's already available for use now. > > > > "PCI: Export pcie_has_flr()": > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2d2917f7747805a1f4188672f308d82a8ba01700 > > > > > > >
> On Sep 18, 2018, at 8:15 AM, Pasi Kärkkäinen <pasik@iki.fi> wrote: > > Hi, > > On Mon, Sep 17, 2018 at 02:06:02PM -0400, Boris Ostrovsky wrote: >> On 9/16/18 7:43 AM, Pasi Kärkkäinen wrote: >>> Hi, >>> >>> On Mon, Dec 18, 2017 at 12:32:11PM -0500, Boris Ostrovsky wrote: >>>> On 12/18/2017 02:36 AM, Jan Beulich wrote: >>>>>>>> On 15.12.17 at 20:52, <Govinda.Tatti@Oracle.COM> wrote: >>>>>>>>> +static int pcistub_device_reset(struct pci_dev *dev) >>>>>>>>> +{ >>>>>>>>> + struct xen_pcibk_dev_data *dev_data; >>>>>>>>> + bool slot = false, bus = false; >>>>>>>>> + struct pcistub_args arg = {}; >>>>>>>>> + >>>>>>>>> + if (!dev) >>>>>>>>> + return -EINVAL; >>>>>>>>> + >>>>>>>>> + dev_dbg(&dev->dev, "[%s]\n", __func__); >>>>>>>>> + >>>>>>>>> + /* First check and try FLR */ >>>>>>>>> + if (pcie_has_flr(dev)) { >>>>>>>>> + dev_dbg(&dev->dev, "resetting %s device using FLR\n", >>>>>>>>> + pci_name(dev)); >>>>>>>>> + pcie_flr(dev); >>>>>>>> The lack of error check here puzzled me, but I see the function >>>>>>>> indeed returns void right now. I think the prereq patch should >>>>>>>> change this along with exporting the function - you really don't >>>>>>>> want the device to be handed to a guest when the FLR timed >>>>>>>> out. >>>>>>> We will change pcie_flr() to return error code. I will make this change >>>>>>> in the next version of this patch. >>>>>> I exchanged some emails with Bjorn/Christoph and it looks like Christoph >>>>>> as some planto restructure pcie flr specific functions but I don't know >>>>>> the exact time-frame. For now,I am planning to use existing pcie_flr() >>>>>> after checking FLR capability. We will switchto revised pcie_flr() once >>>>>> it is available. >>>>>> >>>>>> I hope you are fine with this approach. Please let me know. Thanks. >>>>> I've seen that other discussion. I don't think the change here >>>>> should be done prior to the error reporting being put in place, >>>>> for security reasons. But in the end it'll be Konrad as the >>>>> maintainer to judge. >>>>> >>>>> Or wait, looks like there's some confusion in ./MAINTAINERS: >>>>> Konrad is listed as maintainer for "XEN PCI SUBSYSTEM", but the >>>>> list of files doesn't include pciback. So it would instead be Boris >>>>> or Jürgen to give you a final word. >>>> >>>> This is now 4.16 material so we can at least wait until closer to >>>> opening of the merge window when we may have the PCI updates. (And I >>>> just noticed that you responded to Christoph.) >>>> >>>> Besides, we don't want to make kernel changes until the interface is >>>> settled (i.e the toolstack changes are accepted). >>>> >>> It seems Govinda's email address is giving an error, so I assume someone else needs to pick up this pciback 'reset' feature. >>> Is it likely someone else from Oracle can/will pick up and refresh this patch, with the review comments addressed? >> >> >> Govinda is no longer at Oracle. >> > > Yep, thought so. Removed from CC list. > > >> What about the toolstack changes? Have they been accepted? I vaguely >> recall there was a discussion about those changes but don't remember how >> it ended. >> > > I don't think toolstack/libxl patch has been applied yet either. > > > "[PATCH V1 0/1] Xen/Tools: PCI reset using 'reset' SysFS attribute": > https://lists.xen.org/archives/html/xen-devel/2017-12/msg00664.html > > "[PATCH V1 1/1] Xen/libxl: Perform PCI reset using 'reset' SysFS attribute": > https://lists.xen.org/archives/html/xen-devel/2017-12/msg00663.html > > George asked for some clarifications: > https://lists.xen.org/archives/html/xen-devel/2017-12/msg01044.html > https://lists.xen.org/archives/html/xen-devel/2017-12/msg01116.html Right, the description of the patch didn’t actually tell you what was going on. It should have said something like, “xl currently attempts to reset a device using X; but that’s never been implemented in Linux. Instead, use Y, which [is better for whatever reason]”. -George
On 9/18/18 5:32 AM, George Dunlap wrote: > >> On Sep 18, 2018, at 8:15 AM, Pasi Kärkkäinen <pasik@iki.fi> wrote: >> >> Hi, >> >> On Mon, Sep 17, 2018 at 02:06:02PM -0400, Boris Ostrovsky wrote: >>> What about the toolstack changes? Have they been accepted? I vaguely >>> recall there was a discussion about those changes but don't remember how >>> it ended. >>> >> I don't think toolstack/libxl patch has been applied yet either. >> >> >> "[PATCH V1 0/1] Xen/Tools: PCI reset using 'reset' SysFS attribute": >> https://lists.xen.org/archives/html/xen-devel/2017-12/msg00664.html >> >> "[PATCH V1 1/1] Xen/libxl: Perform PCI reset using 'reset' SysFS attribute": >> https://lists.xen.org/archives/html/xen-devel/2017-12/msg00663.html Will this patch work for *BSD? Roger? >> >> George asked for some clarifications: >> https://lists.xen.org/archives/html/xen-devel/2017-12/msg01044.html >> https://lists.xen.org/archives/html/xen-devel/2017-12/msg01116.html > Right, the description of the patch didn’t actually tell you what was going on. It should have said something like, “xl currently attempts to reset a device using X; but that’s never been implemented in Linux. Instead, use Y, which [is better for whatever reason]”. Yes, the description can be tightened a bit ;-) -boris
On Tue, Sep 18, 2018 at 02:09:53PM -0400, Boris Ostrovsky wrote: > On 9/18/18 5:32 AM, George Dunlap wrote: > > > >> On Sep 18, 2018, at 8:15 AM, Pasi Kärkkäinen <pasik@iki.fi> wrote: > >> > >> Hi, > >> > >> On Mon, Sep 17, 2018 at 02:06:02PM -0400, Boris Ostrovsky wrote: > >>> What about the toolstack changes? Have they been accepted? I vaguely > >>> recall there was a discussion about those changes but don't remember how > >>> it ended. > >>> > >> I don't think toolstack/libxl patch has been applied yet either. > >> > >> > >> "[PATCH V1 0/1] Xen/Tools: PCI reset using 'reset' SysFS attribute": > >> https://lists.xen.org/archives/html/xen-devel/2017-12/msg00664.html > >> > >> "[PATCH V1 1/1] Xen/libxl: Perform PCI reset using 'reset' SysFS attribute": > >> https://lists.xen.org/archives/html/xen-devel/2017-12/msg00663.html > > > Will this patch work for *BSD? Roger? At least FreeBSD don't support pci-passthrough, so none of this works ATM. There's no sysfs on BSD, so much of what's in libxl_pci.c will have to be moved to libxl_linux.c when BSD support is added. Thanks, Roger.
On Wed, Sep 19, 2018 at 11:05:26AM +0200, Roger Pau Monné wrote: > On Tue, Sep 18, 2018 at 02:09:53PM -0400, Boris Ostrovsky wrote: > > On 9/18/18 5:32 AM, George Dunlap wrote: > > > > > >> On Sep 18, 2018, at 8:15 AM, Pasi Kärkkäinen <pasik@iki.fi> wrote: > > >> > > >> Hi, > > >> > > >> On Mon, Sep 17, 2018 at 02:06:02PM -0400, Boris Ostrovsky wrote: > > >>> What about the toolstack changes? Have they been accepted? I vaguely > > >>> recall there was a discussion about those changes but don't remember how > > >>> it ended. > > >>> > > >> I don't think toolstack/libxl patch has been applied yet either. > > >> > > >> > > >> "[PATCH V1 0/1] Xen/Tools: PCI reset using 'reset' SysFS attribute": > > >> https://lists.xen.org/archives/html/xen-devel/2017-12/msg00664.html > > >> > > >> "[PATCH V1 1/1] Xen/libxl: Perform PCI reset using 'reset' SysFS attribute": > > >> https://lists.xen.org/archives/html/xen-devel/2017-12/msg00663.html > > > > > > Will this patch work for *BSD? Roger? > > At least FreeBSD don't support pci-passthrough, so none of this works > ATM. There's no sysfs on BSD, so much of what's in libxl_pci.c will > have to be moved to libxl_linux.c when BSD support is added. > Ok. That sounds like it's OK for the initial pci 'reset' implementation in xl/libxl to be linux-only.. Thanks, -- Pasi > Thanks, Roger.
On 10/3/18 11:51 AM, Pasi Kärkkäinen wrote: > On Wed, Sep 19, 2018 at 11:05:26AM +0200, Roger Pau Monné wrote: >> On Tue, Sep 18, 2018 at 02:09:53PM -0400, Boris Ostrovsky wrote: >>> On 9/18/18 5:32 AM, George Dunlap wrote: >>>>> On Sep 18, 2018, at 8:15 AM, Pasi Kärkkäinen <pasik@iki.fi> wrote: >>>>> >>>>> Hi, >>>>> >>>>> On Mon, Sep 17, 2018 at 02:06:02PM -0400, Boris Ostrovsky wrote: >>>>>> What about the toolstack changes? Have they been accepted? I vaguely >>>>>> recall there was a discussion about those changes but don't remember how >>>>>> it ended. >>>>>> >>>>> I don't think toolstack/libxl patch has been applied yet either. >>>>> >>>>> >>>>> "[PATCH V1 0/1] Xen/Tools: PCI reset using 'reset' SysFS attribute": >>>>> https://lists.xen.org/archives/html/xen-devel/2017-12/msg00664.html >>>>> >>>>> "[PATCH V1 1/1] Xen/libxl: Perform PCI reset using 'reset' SysFS attribute": >>>>> https://lists.xen.org/archives/html/xen-devel/2017-12/msg00663.html >>> >>> Will this patch work for *BSD? Roger? >> At least FreeBSD don't support pci-passthrough, so none of this works >> ATM. There's no sysfs on BSD, so much of what's in libxl_pci.c will >> have to be moved to libxl_linux.c when BSD support is added. >> > Ok. That sounds like it's OK for the initial pci 'reset' implementation in xl/libxl to be linux-only.. > Are these two patches still needed? ISTR they were written originally to deal with guest trying to use device that was previously assigned to another guest. But pcistub_put_pci_dev() calls __pci_reset_function_locked() which first tries FLR, and it looks like it was added relatively recently. -boris
Hi, On Mon, Oct 08, 2018 at 10:32:45AM -0400, Boris Ostrovsky wrote: > On 10/3/18 11:51 AM, Pasi Kärkkäinen wrote: > > On Wed, Sep 19, 2018 at 11:05:26AM +0200, Roger Pau Monné wrote: > >> On Tue, Sep 18, 2018 at 02:09:53PM -0400, Boris Ostrovsky wrote: > >>> On 9/18/18 5:32 AM, George Dunlap wrote: > >>>>> On Sep 18, 2018, at 8:15 AM, Pasi Kärkkäinen <pasik@iki.fi> wrote: > >>>>> > >>>>> Hi, > >>>>> > >>>>> On Mon, Sep 17, 2018 at 02:06:02PM -0400, Boris Ostrovsky wrote: > >>>>>> What about the toolstack changes? Have they been accepted? I vaguely > >>>>>> recall there was a discussion about those changes but don't remember how > >>>>>> it ended. > >>>>>> > >>>>> I don't think toolstack/libxl patch has been applied yet either. > >>>>> > >>>>> > >>>>> "[PATCH V1 0/1] Xen/Tools: PCI reset using 'reset' SysFS attribute": > >>>>> https://lists.xen.org/archives/html/xen-devel/2017-12/msg00664.html > >>>>> > >>>>> "[PATCH V1 1/1] Xen/libxl: Perform PCI reset using 'reset' SysFS attribute": > >>>>> https://lists.xen.org/archives/html/xen-devel/2017-12/msg00663.html > >>> > >>> Will this patch work for *BSD? Roger? > >> At least FreeBSD don't support pci-passthrough, so none of this works > >> ATM. There's no sysfs on BSD, so much of what's in libxl_pci.c will > >> have to be moved to libxl_linux.c when BSD support is added. > >> > > Ok. That sounds like it's OK for the initial pci 'reset' implementation in xl/libxl to be linux-only.. > > > > Are these two patches still needed? ISTR they were written originally > to deal with guest trying to use device that was previously assigned to > another guest. But pcistub_put_pci_dev() calls > __pci_reset_function_locked() which first tries FLR, and it looks like > it was added relatively recently. > Replying to an old thread.. I only now realized I forgot to reply to this message earlier. afaik these patches are still needed. Håkon (CC'd) wrote to me in private that he gets a (dom0) Linux kernel crash if he doesn't have these patches applied. Here are the links to both the linux kernel and libxl patches: "[Xen-devel] [PATCH V3 0/2] Xen/PCIback: PCI reset using 'reset' SysFS attribute": https://lists.xen.org/archives/html/xen-devel/2017-12/msg00659.html [Note that PATCH V3 1/2 "Drivers/PCI: Export pcie_has_flr() interface" is already applied in upstream linux kernel, so it's not needed anymore] "[Xen-devel] [PATCH V3 2/2] Xen/PCIback: Implement PCI flr/slot/bus reset with 'reset' SysFS attribute": https://lists.xen.org/archives/html/xen-devel/2017-12/msg00661.html "[Xen-devel] [PATCH V1 0/1] Xen/Tools: PCI reset using 'reset' SysFS attribute": https://lists.xen.org/archives/html/xen-devel/2017-12/msg00664.html "[Xen-devel] [PATCH V1 1/1] Xen/libxl: Perform PCI reset using 'reset' SysFS attribute": https://lists.xen.org/archives/html/xen-devel/2017-12/msg00663.html > > -boris Thanks, -- Pasi
diff --git a/Documentation/ABI/testing/sysfs-driver-pciback b/Documentation/ABI/testing/sysfs-driver-pciback index 6a733bf..d295b42 100644 --- a/Documentation/ABI/testing/sysfs-driver-pciback +++ b/Documentation/ABI/testing/sysfs-driver-pciback @@ -11,3 +11,18 @@ Description: #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks will allow the guest to read and write to the configuration register 0x0E. + +What: /sys/bus/pci/drivers/pciback/reset +Date: Dec 2017 +KernelVersion: 4.15 +Contact: xen-devel@lists.xenproject.org +Description: + An option to perform a flr/slot/bus reset when a PCI device + is owned by Xen PCI backend. Writing a string of DDDD:BB:DD.F + will cause the pciback driver to perform a flr or slot or bus + reset if the device supports it. It will try to execute one + of these reset method, starting with FLR if it is supported. + Otherwise, it tries slot or bus reset methods. For slot or + bus reset method, it also checks to make sure that all of the + devices under the bridge are owned by Xen PCI backend before + performing those resets. diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c index 9e480fd..cad704e 100644 --- a/drivers/xen/xen-pciback/pci_stub.c +++ b/drivers/xen/xen-pciback/pci_stub.c @@ -313,6 +313,102 @@ void pcistub_put_pci_dev(struct pci_dev *dev) up_write(&pcistub_sem); } +struct pcistub_args { + const struct pci_dev *dev; + unsigned int dcount; +}; + +static int pcistub_device_search(struct pci_dev *dev, void *data) +{ + struct pcistub_device *psdev; + struct pcistub_args *arg = data; + bool found = false; + unsigned long flags; + + spin_lock_irqsave(&pcistub_devices_lock, flags); + + list_for_each_entry(psdev, &pcistub_devices, dev_list) { + if (psdev->dev == dev) { + found = true; + arg->dcount++; + break; + } + } + + spin_unlock_irqrestore(&pcistub_devices_lock, flags); + + /* Device not owned by pcistub. Abort the walk */ + if (!found) { + arg->dev = dev; + return 1; + } + + return 0; +} + +static int pcistub_device_reset(struct pci_dev *dev) +{ + struct xen_pcibk_dev_data *dev_data; + bool slot = false, bus = false; + struct pcistub_args arg = {}; + + if (!dev) + return -EINVAL; + + dev_dbg(&dev->dev, "[%s]\n", __func__); + + /* First check and try FLR */ + if (pcie_has_flr(dev)) { + dev_dbg(&dev->dev, "resetting %s device using FLR\n", + pci_name(dev)); + pcie_flr(dev); + return 0; + } + + if (!pci_probe_reset_slot(dev->slot)) + slot = true; + else if ((!pci_probe_reset_bus(dev->bus)) && + (!pci_is_root_bus(dev->bus))) + bus = true; + + if (!bus && !slot) + return -EOPNOTSUPP; + + /* + * Make sure all devices on this bus are owned by the + * PCI backend so that we can safely reset the whole bus. + */ + pci_walk_bus(dev->bus, pcistub_device_search, &arg); + + /* All devices under the bus should be part of pcistub! */ + if (arg.dev) { + dev_err(&dev->dev, + "%s on the same bus as %s and is not owned by " + DRV_NAME "\n", pci_name(arg.dev), pci_name(dev)); + + return -EBUSY; + } + + dev_dbg(&dev->dev, "pcistub owns %d devices on PCI Bus %04x:%02x", + arg.dcount, pci_domain_nr(dev->bus), dev->bus->number); + + dev_data = pci_get_drvdata(dev); + if (!pci_load_saved_state(dev, dev_data->pci_saved_state)) + pci_restore_state(dev); + + /* This disables the device. */ + xen_pcibk_reset_device(dev); + + /* Cleanup up any emulated fields */ + xen_pcibk_config_reset_dev(dev); + + dev_dbg(&dev->dev, "resetting %s device using %s reset\n", + pci_name(dev), slot ? "slot" : "bus"); + + return slot ? pci_try_reset_slot(dev->slot) : + pci_try_reset_bus(dev->bus); +} + static int pcistub_match_one(struct pci_dev *dev, struct pcistub_device_id *pdev_id) { @@ -1430,6 +1526,32 @@ static ssize_t permissive_show(struct device_driver *drv, char *buf) } static DRIVER_ATTR_RW(permissive); +static ssize_t reset_store(struct device_driver *drv, const char *buf, + size_t count) +{ + struct pcistub_device *psdev; + int domain, bus, slot, func; + int err; + + err = str_to_slot(buf, &domain, &bus, &slot, &func); + if (err) + return err; + + psdev = pcistub_device_find(domain, bus, slot, func); + if (psdev) { + err = pcistub_device_reset(psdev->dev); + pcistub_device_put(psdev); + } else { + err = -ENODEV; + } + + if (!err) + err = count; + + return err; +} +static DRIVER_ATTR_WO(reset); + static void pcistub_exit(void) { driver_remove_file(&xen_pcibk_pci_driver.driver, &driver_attr_new_slot); @@ -1443,6 +1565,8 @@ static void pcistub_exit(void) &driver_attr_irq_handlers); driver_remove_file(&xen_pcibk_pci_driver.driver, &driver_attr_irq_handler_state); + driver_remove_file(&xen_pcibk_pci_driver.driver, + &driver_attr_reset); pci_unregister_driver(&xen_pcibk_pci_driver); } @@ -1536,6 +1660,10 @@ static int __init pcistub_init(void) if (!err) err = driver_create_file(&xen_pcibk_pci_driver.driver, &driver_attr_irq_handler_state); + if (!err) + err = driver_create_file(&xen_pcibk_pci_driver.driver, + &driver_attr_reset); + if (err) pcistub_exit();