diff mbox series

[iwl-net,v1] ice: reset first in crash dump kernels

Message ID 20230919212959.1939749-1-jesse.brandeburg@intel.com
State Changes Requested
Headers show
Series [iwl-net,v1] ice: reset first in crash dump kernels | expand

Commit Message

Jesse Brandeburg Sept. 19, 2023, 9:29 p.m. UTC
When booting into the crash dump kernels there are cases where upon
enabling the device, the system under test will panic or machine check.

One such test is to
- load ice driver
$ modprobe ice
- enable SR-IOV (2 VFs)
$ echo 2 > /sys/class/net/eth0/device/sriov_num_vfs
- crash
echo c > /proc/sysrq-trigger
- load ice driver (or happens automatically)
modprobe ice
- crash during pcim_enable_device()

Avoid this problem by issuing a FLR to the device via PCIe config space
on the crash kernel, to clear out any outstanding transactions and stop
all queues and interrupts. Restore config space afterword because the
driver won't load successfully otherwise.

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_main.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

Comments

Paul Menzel Sept. 20, 2023, 5:18 a.m. UTC | #1
Dear Jesse,


Thank you for your patch.

Am 19.09.23 um 23:29 schrieb Jesse Brandeburg:
> When booting into the crash dump kernels there are cases where upon
> enabling the device, the system under test will panic or machine check.
> 
> One such test is to
> - load ice driver
> $ modprobe ice
> - enable SR-IOV (2 VFs)
> $ echo 2 > /sys/class/net/eth0/device/sriov_num_vfs
> - crash
> echo c > /proc/sysrq-trigger

Above you prepended a $.

> - load ice driver (or happens automatically)
> modprobe ice
> - crash during pcim_enable_device()
> 
> Avoid this problem by issuing a FLR to the device via PCIe config space
> on the crash kernel, to clear out any outstanding transactions and stop
> all queues and interrupts. Restore config space afterword because the

afterw*a*rd

> driver won't load successfully otherwise.

Excuse my ignorance, could you please add, what the crashdump kernel 
does differently from the “normal” kernel, so this special handling is 
needed?

> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> ---
>   drivers/net/ethernet/intel/ice/ice_main.c | 15 +++++++++++++++
>   1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index c8286adae946..6550c46e4e36 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -6,6 +6,7 @@
>   #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>   
>   #include <generated/utsrelease.h>
> +#include <linux/crash_dump.h>
>   #include "ice.h"
>   #include "ice_base.h"
>   #include "ice_lib.h"
> @@ -5014,6 +5015,20 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>   		return -EINVAL;
>   	}
>   
> +	/* when under a kdump kernel initiate a reset before enabling the
> +	 * device in order to clear out any pending DMA transactions. These
> +	 * transactions can cause some systems to machine check when doing
> +	 * the pcim_enable_device() below.
> +	 */
> +	if (is_kdump_kernel()) {
> +		pci_save_state(pdev);
> +		pci_clear_master(pdev);
> +		err = pcie_flr(pdev);
> +		if (err)
> +			return err;
> +		pci_restore_state(pdev);
> +	}
> +

Should this be added to the common PCI code? Maybe loop the PCI 
subsystem folks in?

>   	/* this driver uses devres, see
>   	 * Documentation/driver-api/driver-model/devres.rst
>   	 */


Kind regards,

Paul
Jesse Brandeburg Oct. 2, 2023, 6:16 p.m. UTC | #2
On 9/19/2023 10:18 PM, Paul Menzel wrote:
> Dear Jesse,
> 
> 
> Thank you for your patch.
> 
> Am 19.09.23 um 23:29 schrieb Jesse Brandeburg:
>> When booting into the crash dump kernels there are cases where upon
>> enabling the device, the system under test will panic or machine check.
>>
>> One such test is to
>> - load ice driver
>> $ modprobe ice
>> - enable SR-IOV (2 VFs)
>> $ echo 2 > /sys/class/net/eth0/device/sriov_num_vfs
>> - crash
>> echo c > /proc/sysrq-trigger
> 
> Above you prepended a $.

Fixed in v2.

> 
>> - load ice driver (or happens automatically)
>> modprobe ice
>> - crash during pcim_enable_device()
>>
>> Avoid this problem by issuing a FLR to the device via PCIe config space
>> on the crash kernel, to clear out any outstanding transactions and stop
>> all queues and interrupts. Restore config space afterword because the
> 
> afterw*a*rd

Fixed in v2.


> 
>> driver won't load successfully otherwise.
> 
> Excuse my ignorance, could you please add, what the crashdump kernel
> does differently from the “normal” kernel, so this special handling is
> needed?

I added more description in the v2 commit message, I hope that helps.

In summary: the crashdump kernel is starting up on "dirty" state of
hardware, due to the surprise crash of the previously running kernel
that had running devices when it "panicked"

> 
>> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
>> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
>> ---
>>   drivers/net/ethernet/intel/ice/ice_main.c | 15 +++++++++++++++
>>   1 file changed, 15 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c
>> b/drivers/net/ethernet/intel/ice/ice_main.c
>> index c8286adae946..6550c46e4e36 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_main.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
>> @@ -6,6 +6,7 @@
>>   #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>>     #include <generated/utsrelease.h>
>> +#include <linux/crash_dump.h>
>>   #include "ice.h"
>>   #include "ice_base.h"
>>   #include "ice_lib.h"
>> @@ -5014,6 +5015,20 @@ ice_probe(struct pci_dev *pdev, const struct
>> pci_device_id __always_unused *ent)
>>           return -EINVAL;
>>       }
>>   +    /* when under a kdump kernel initiate a reset before enabling the
>> +     * device in order to clear out any pending DMA transactions. These
>> +     * transactions can cause some systems to machine check when doing
>> +     * the pcim_enable_device() below.
>> +     */
>> +    if (is_kdump_kernel()) {
>> +        pci_save_state(pdev);
>> +        pci_clear_master(pdev);
>> +        err = pcie_flr(pdev);
>> +        if (err)
>> +            return err;
>> +        pci_restore_state(pdev);
>> +    }
>> +
> 
> Should this be added to the common PCI code? Maybe loop the PCI
> subsystem folks in?

Ok, I'll cc: linux-pci when I send v2.


> 
>>       /* this driver uses devres, see
>>        * Documentation/driver-api/driver-model/devres.rst
>>        */
> 
> 
> Kind regards,
> 
> Paul
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index c8286adae946..6550c46e4e36 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -6,6 +6,7 @@ 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <generated/utsrelease.h>
+#include <linux/crash_dump.h>
 #include "ice.h"
 #include "ice_base.h"
 #include "ice_lib.h"
@@ -5014,6 +5015,20 @@  ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
 		return -EINVAL;
 	}
 
+	/* when under a kdump kernel initiate a reset before enabling the
+	 * device in order to clear out any pending DMA transactions. These
+	 * transactions can cause some systems to machine check when doing
+	 * the pcim_enable_device() below.
+	 */
+	if (is_kdump_kernel()) {
+		pci_save_state(pdev);
+		pci_clear_master(pdev);
+		err = pcie_flr(pdev);
+		if (err)
+			return err;
+		pci_restore_state(pdev);
+	}
+
 	/* this driver uses devres, see
 	 * Documentation/driver-api/driver-model/devres.rst
 	 */