powerpc/powernv: Enable reset_devices parameter to issue a PHB reset

Message ID 20171012222153.24905-1-gpiccoli@linux.vnet.ibm.com
State Changes Requested
Headers show
Series
  • powerpc/powernv: Enable reset_devices parameter to issue a PHB reset
Related show

Commit Message

Guilherme G. Piccoli Oct. 12, 2017, 10:21 p.m.
During a kdump kernel boot in PowerPC, we request a reset of the
PHBs to the FW. It makes sense, since if we are booting a kdump
kernel it means we had some trouble before and we cannot rely in
the adapters' health; they could be in a bad state, hence the
reset is needed.

But not only in a kdump kernel we could use this reset - there are
situations, specially when debugging drivers, that we could break
an adapter in a way it requires such reset. One can tell to just
go ahead and reboot the machine, but happens that many times doing
kexec is much faster, and so preferable than a full power cycle.
Also, we could have situations in which adapters are in bad state
due to adapter's FW issue, and only a PHB Fundamental Reset could
revive them.

This patch enables the reset_devices parameter to perform such reset.
The parameter is barely used - only few drivers make use of it.
This is a PowerPC-only change.

Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
---
This patch was built/tested against powerpc/next branch.

We recently had a situation in which i40e driver couldn't start,
even after a full power cycle, due to a bug in its FW triggered
by a DCB condition in switch (thanks Mauro for narrowing this).
This patch enabled us to revive the adapter and use network
while debugging.


 Documentation/admin-guide/kernel-parameters.txt | 3 ++-
 arch/powerpc/platforms/powernv/pci-ioda.c       | 7 +++++--
 init/main.c                                     | 4 ++--
 3 files changed, 9 insertions(+), 5 deletions(-)

Comments

Michael Ellerman Oct. 13, 2017, 8:37 a.m. | #1
"Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com> writes:

> During a kdump kernel boot in PowerPC, we request a reset of the
> PHBs to the FW. It makes sense, since if we are booting a kdump
> kernel it means we had some trouble before and we cannot rely in
> the adapters' health; they could be in a bad state, hence the
> reset is needed.
>
> But not only in a kdump kernel we could use this reset - there are
> situations, specially when debugging drivers, that we could break
> an adapter in a way it requires such reset. One can tell to just
> go ahead and reboot the machine, but happens that many times doing
> kexec is much faster, and so preferable than a full power cycle.
> Also, we could have situations in which adapters are in bad state
> due to adapter's FW issue, and only a PHB Fundamental Reset could
> revive them.
>
> This patch enables the reset_devices parameter to perform such reset.
> The parameter is barely used - only few drivers make use of it.
> This is a PowerPC-only change.
>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
> ---
> This patch was built/tested against powerpc/next branch.
>
> We recently had a situation in which i40e driver couldn't start,
> even after a full power cycle, due to a bug in its FW triggered
> by a DCB condition in switch (thanks Mauro for narrowing this).
> This patch enabled us to revive the adapter and use network
> while debugging.

I really dislike this.

You're basically saying the kernel can't work out how to get a device
working, so let's leave it up to the user.

The driver should be fixed to detect that the device is not responding
and request a reset.

cheers
Guilherme G. Piccoli Oct. 13, 2017, 1:02 p.m. | #2
On 10/13/2017 05:37 AM, Michael Ellerman wrote:
> 
> I really dislike this.
> 
> You're basically saying the kernel can't work out how to get a device
> working, so let's leave it up to the user.

Oh, it was never my intention to say such blasphemy :)
It meant to be just a debug option to help the users, specifically the
ones debugging drivers, to try using a hammer to recover bad devices! To
this issue that I mentioned as an example, the fix specifically goes in
the FW of the adapter.

Anyway, since you really dislike it, let's drop it, no big deal!
Cheers,


Guilherme

> 
> The driver should be fixed to detect that the device is not responding
> and request a reset.
> 
> cheers
>
Benjamin Herrenschmidt Oct. 14, 2017, 9:13 a.m. | #3
On Fri, 2017-10-13 at 19:37 +1100, Michael Ellerman wrote:
> > We recently had a situation in which i40e driver couldn't start,
> > even after a full power cycle, due to a bug in its FW triggered
> > by a DCB condition in switch (thanks Mauro for narrowing this).
> > This patch enabled us to revive the adapter and use network
> > while debugging.
> 
> I really dislike this.
> 
> You're basically saying the kernel can't work out how to get a device
> working, so let's leave it up to the user.
> 
> The driver should be fixed to detect that the device is not responding
> and request a reset.

No, he's saying this is useful for the developers when debugging the
kernel driver (or for asking users to "test" something as part of
debugging a driver problem).

It's common to have various command line options affecting PCIe
behaviour. I don't see a fundamental problem with this one.

One could argue in fact that we should always PERST everything, the
main reason for not doing so is that on "normal" boot, OPAL has already
done it and it would slow the boot process down.

My only objection here is the actual name of the argument. We've had a
number of pci options so far, it makes sense to make sure we still have
"pci" in the name.

Also having the driver do a "reset" is not always simple, we don't
always have full control of PERST on a per-device basis. The patch
proposed will PERST top level PHBs which will propagate as hot reset
down switches, not 100% PERST but still useful.

Cheers,
Ben.
Guilherme G. Piccoli Oct. 15, 2017, 9:51 p.m. | #4
On 10/14/2017 06:13 AM, Benjamin Herrenschmidt wrote:
> No, he's saying this is useful for the developers when debugging the
> kernel driver (or for asking users to "test" something as part of
> debugging a driver problem).
> 
> It's common to have various command line options affecting PCIe
> behaviour. I don't see a fundamental problem with this one.

Thanks Ben, very well explained.

> 
> One could argue in fact that we should always PERST everything, the
> main reason for not doing so is that on "normal" boot, OPAL has already
> done it and it would slow the boot process down.
> 
> My only objection here is the actual name of the argument. We've had a
> number of pci options so far, it makes sense to make sure we still have
> "pci" in the name.

I just wanted to take advantage of the already existing argument,
reset_devices. And..in fact, the PERST is a reset right?
But if you prefer, we can change it - pci_force_reset?
Suggestions are welcome.

> 
> Also having the driver do a "reset" is not always simple, we don't
> always have full control of PERST on a per-device basis. The patch
> proposed will PERST top level PHBs which will propagate as hot reset
> down switches, not 100% PERST but still useful.
> 

Exactly, this reset happens on early arch stage of PCI initialization,
not trivial to drivers perform it.

Anyway, let me know if you want a V2 with improvements or to drop it..I
still see use cases for this.

Thanks,


Guilherme

> Cheers,
> Ben.
>

Patch

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 05496622b4ef..e75cf4a9345d 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3656,7 +3656,8 @@ 
 			the bottom of the address space.
 
 	reset_devices	[KNL] Force drivers to reset the underlying device
-			during initialization.
+			during initialization. In PowerPC bare-metal (aka
+			PowerNV), it also issues a Fundamental Reset on PHBs.
 
 	resume=		[SWSUSP]
 			Specify the partition device for software suspend
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index fb5cd7511189..0a5f2a221a09 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -4014,9 +4014,12 @@  static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	 * If we're running in kdump kernel, the previous kernel never
 	 * shutdown PCI devices correctly. We already got IODA table
 	 * cleaned out. So we have to issue PHB reset to stop all PCI
-	 * transactions from previous kernel.
+	 * transactions from previous kernel. Also, reset_devices
+	 * parameter will force the issue of reset (useful in debug
+	 * scenarios).
 	 */
-	if (is_kdump_kernel()) {
+
+	if (is_kdump_kernel() || reset_devices) {
 		pr_info("  Issue PHB reset ...\n");
 		pnv_eeh_phb_reset(hose, EEH_RESET_FUNDAMENTAL);
 		pnv_eeh_phb_reset(hose, EEH_RESET_DEACTIVATE);
diff --git a/init/main.c b/init/main.c
index 0ee9c6866ada..9d3dc50fadfe 100644
--- a/init/main.c
+++ b/init/main.c
@@ -157,10 +157,10 @@  EXPORT_SYMBOL(reset_devices);
 static int __init set_reset_devices(char *str)
 {
 	reset_devices = 1;
-	return 1;
+	return 0;
 }
 
-__setup("reset_devices", set_reset_devices);
+early_param("reset_devices", set_reset_devices);
 
 static const char *argv_init[MAX_INIT_ARGS+2] = { "init", NULL, };
 const char *envp_init[MAX_INIT_ENVS+2] = { "HOME=/", "TERM=linux", NULL, };