diff mbox series

pci: Warn if BME cannot be turned off during kexec

Message ID 20200104225052.27275-1-deepa.kernel@gmail.com
State New
Headers show
Series pci: Warn if BME cannot be turned off during kexec | expand

Commit Message

Deepa Dinamani Jan. 4, 2020, 10:50 p.m. UTC
BME not being off is a security risk, so for whatever
reason if we cannot disable it, print a warning.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
---
 drivers/pci/pci-driver.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

Bjorn Helgaas Jan. 6, 2020, 1:54 p.m. UTC | #1
Hi Deepa,

Thanks for the patches.  Since these two patches touch the same piece
of code in pci_device_shutdown(), they conflict with each other.  I
could resolve this myself, but maybe you could make them a series that
applies cleanly together?

Can you also please edit the subject lines so they match the
convention (use "git log --oneline drivers/pci/pci-driver.c" to see
it).

On Sat, Jan 04, 2020 at 02:50:52PM -0800, Deepa Dinamani wrote:
> BME not being off is a security risk, so for whatever
> reason if we cannot disable it, print a warning.

"BME" is not a common term in drivers/pci; can you use "Bus Master
Enable" (to match the PCIe spec) or "PCI_COMMAND_MASTER" (to match the
Linux code)?

Can you also explain why this is a security risk?  It looks like we
disable bus mastering if the device is in D0-D3hot.  If the device is
in D3cold, it's powered off, so we can't read/write config space.  But
if it's in D3cold, the device is powered off, so it can't be a bus
master either, so why would we warn about it?

> Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
> ---
>  drivers/pci/pci-driver.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 0454ca0e4e3f..6c866a81f46c 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -491,8 +491,12 @@ static void pci_device_shutdown(struct device *dev)
>  	 * If it is not a kexec reboot, firmware will hit the PCI
>  	 * devices with big hammer and stop their DMA any way.
>  	 */
> -	if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
> -		pci_clear_master(pci_dev);
> +	if (kexec_in_progress) {
> +		if (likely(pci_dev->current_state <= PCI_D3hot))

No need to use "likely" here unless you can measure a difference.  I
doubt this is a performance path.

> +			pci_clear_master(pci_dev);
> +		else
> +			dev_warn(dev, "Unable to turn off BME during kexec");

How often and for what sort of devices would you expect this warning
to be emitted?  If this is a common situation and the user can't do
anything about it, the warnings will clutter the logs and train users
to ignore them.

Bjorn

> +	}
>  }
>  
>  #ifdef CONFIG_PM
> -- 
> 2.17.1
>
Deepa Dinamani Jan. 6, 2020, 7:38 p.m. UTC | #2
On Mon, Jan 6, 2020 at 5:54 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> Hi Deepa,
>
> Thanks for the patches.  Since these two patches touch the same piece
> of code in pci_device_shutdown(), they conflict with each other.  I
> could resolve this myself, but maybe you could make them a series that
> applies cleanly together?

Sure, will make this a series.

> Can you also please edit the subject lines so they match the
> convention (use "git log --oneline drivers/pci/pci-driver.c" to see
> it).

Will do.

> On Sat, Jan 04, 2020 at 02:50:52PM -0800, Deepa Dinamani wrote:
> > BME not being off is a security risk, so for whatever
> > reason if we cannot disable it, print a warning.
>
> "BME" is not a common term in drivers/pci; can you use "Bus Master
> Enable" (to match the PCIe spec) or "PCI_COMMAND_MASTER" (to match the
> Linux code)?

Will do.

> Can you also explain why this is a security risk?  It looks like we
> disable bus mastering if the device is in D0-D3hot.  If the device is
> in D3cold, it's powered off, so we can't read/write config space.  But
> if it's in D3cold, the device is powered off, so it can't be a bus
> master either, so why would we warn about it?

I was mainly concerned about the PCI_UNKNOWN state here. Maybe we can
add a specific check for the unknown state if that is preferable.

> > Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
> > ---
> >  drivers/pci/pci-driver.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > index 0454ca0e4e3f..6c866a81f46c 100644
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -491,8 +491,12 @@ static void pci_device_shutdown(struct device *dev)
> >        * If it is not a kexec reboot, firmware will hit the PCI
> >        * devices with big hammer and stop their DMA any way.
> >        */
> > -     if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
> > -             pci_clear_master(pci_dev);
> > +     if (kexec_in_progress) {
> > +             if (likely(pci_dev->current_state <= PCI_D3hot))
>
> No need to use "likely" here unless you can measure a difference.  I
> doubt this is a performance path.
>
> > +                     pci_clear_master(pci_dev);
> > +             else
> > +                     dev_warn(dev, "Unable to turn off BME during kexec");
>
> How often and for what sort of devices would you expect this warning
> to be emitted?  If this is a common situation and the user can't do
> anything about it, the warnings will clutter the logs and train users
> to ignore them.

This is not a common situation. I think I have seen this only a couple
of times in my kexec experiments. I have not found any documentation
about if a device can go into an unknown power state and still be
harmful or not. But, if you need more testing, I could check the patch
into the Google datacenter code and sweep the logs to see if these get
printed at all.

-Deepa
Deepa Dinamani Jan. 8, 2020, 5:44 p.m. UTC | #3
On Mon, Jan 6, 2020 at 11:38 AM Deepa Dinamani <deepa.kernel@gmail.com> wrote:
>
> On Mon, Jan 6, 2020 at 5:54 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> > Hi Deepa,
> >
> > Thanks for the patches.  Since these two patches touch the same piece
> > of code in pci_device_shutdown(), they conflict with each other.  I
> > could resolve this myself, but maybe you could make them a series that
> > applies cleanly together?
>
> Sure, will make this a series.
>
> > Can you also please edit the subject lines so they match the
> > convention (use "git log --oneline drivers/pci/pci-driver.c" to see
> > it).
>
> Will do.
>
> > On Sat, Jan 04, 2020 at 02:50:52PM -0800, Deepa Dinamani wrote:
> > > BME not being off is a security risk, so for whatever
> > > reason if we cannot disable it, print a warning.
> >
> > "BME" is not a common term in drivers/pci; can you use "Bus Master
> > Enable" (to match the PCIe spec) or "PCI_COMMAND_MASTER" (to match the
> > Linux code)?
>
> Will do.
>
> > Can you also explain why this is a security risk?  It looks like we
> > disable bus mastering if the device is in D0-D3hot.  If the device is
> > in D3cold, it's powered off, so we can't read/write config space.  But
> > if it's in D3cold, the device is powered off, so it can't be a bus
> > master either, so why would we warn about it?
>
> I was mainly concerned about the PCI_UNKNOWN state here. Maybe we can
> add a specific check for the unknown state if that is preferable.

I did some more testing. You are right, these messages are printed
more often than I had remembered.

For some more context on what I am trying to do: I recently merged
another patch that disable iommu unconditionally at kexec:
https://lkml.org/lkml/2019/11/10/146
And, if we do not have IOMMU on and BME is on then we have no way of
controlling memory accesses from devices. This is why I wanted these
warning messages printed. Alternatively, it might just be enough to
warn if we cannot turn off BME on root ports rather than all the
devices. Does this seem like a better compromise?

-Deepa
diff mbox series

Patch

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 0454ca0e4e3f..6c866a81f46c 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -491,8 +491,12 @@  static void pci_device_shutdown(struct device *dev)
 	 * If it is not a kexec reboot, firmware will hit the PCI
 	 * devices with big hammer and stop their DMA any way.
 	 */
-	if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
-		pci_clear_master(pci_dev);
+	if (kexec_in_progress) {
+		if (likely(pci_dev->current_state <= PCI_D3hot))
+			pci_clear_master(pci_dev);
+		else
+			dev_warn(dev, "Unable to turn off BME during kexec");
+	}
 }
 
 #ifdef CONFIG_PM