diff mbox series

[4/7] capp/phb4: Force CAPP to PCIe mode during kernel shutdown

Message ID 20180915143926.12870-5-vaibhav@linux.ibm.com
State Changes Requested
Headers show
Series Enable fast-reboot support for CAPI-2 | expand

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success master/apply_patch Successfully applied

Commit Message

Vaibhav Jain Sept. 15, 2018, 2:39 p.m. UTC
This patch introduces a new opal syncer for PHB4 named
phb4_host_sync_reset(). We register this opal syncer when CAPP is
activated successfully in phb4_set_capi_mode() so that it will be
called at kernel shutdown during fast-reset.

The function will then repeatedly call phb->ops->set_capi_mode() to
switch switch CAPP to PCIe mode. In case set_capi_mode() indicates its
OPAL_BUSY. It calls slot->ops.run_sm() to ensure that Opal slot reset
state machine makes forward progress.

Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
 hw/phb4.c | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

Comments

Andrew Donnellan Sept. 17, 2018, 8:46 a.m. UTC | #1
On 16/9/18 12:39 am, Vaibhav Jain wrote:
> This patch introduces a new opal syncer for PHB4 named
> phb4_host_sync_reset(). We register this opal syncer when CAPP is
> activated successfully in phb4_set_capi_mode() so that it will be
> called at kernel shutdown during fast-reset.
> 
> The function will then repeatedly call phb->ops->set_capi_mode() to
> switch switch CAPP to PCIe mode. In case set_capi_mode() indicates its
> OPAL_BUSY. It calls slot->ops.run_sm() to ensure that Opal slot reset
> state machine makes forward progress.
> 
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>

Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Stewart Smith Sept. 19, 2018, 8:36 a.m. UTC | #2
Vaibhav Jain <vaibhav@linux.ibm.com> writes:

> This patch introduces a new opal syncer for PHB4 named
> phb4_host_sync_reset(). We register this opal syncer when CAPP is
> activated successfully in phb4_set_capi_mode() so that it will be
> called at kernel shutdown during fast-reset.
>
> The function will then repeatedly call phb->ops->set_capi_mode() to
> switch switch CAPP to PCIe mode. In case set_capi_mode() indicates its
> OPAL_BUSY. It calls slot->ops.run_sm() to ensure that Opal slot reset
> state machine makes forward progress.
>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
>  hw/phb4.c | 36 +++++++++++++++++++++++++++++++++++-
>  1 file changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/hw/phb4.c b/hw/phb4.c
> index 4284c467..615cda66 100644
> --- a/hw/phb4.c
> +++ b/hw/phb4.c
> @@ -2747,6 +2747,36 @@ static void phb4_training_trace(struct phb4 *p)
>  	}
>  }
>  
> +/*
> + * Trigger a creset to disable CAPI mode on kernel shutdown.
> + *
> + * This helper is called repeatedly by the host sync notifier mechanism, which
> + * relies on the kernel to regularly poll the OPAL_SYNC_HOST_REBOOT call as it
> + * shuts down.
> + */
> +static bool phb4_host_sync_reset(void *data)
> +{
> +	struct phb4 *p = (struct phb4 *)data;
> +	struct phb *phb = &p->phb;
> +	struct pci_slot *slot = p->phb.slot;
> +	int64_t rc = 0;
> +
> +	/* Make sure no-one modifies the phb flags while we are active */
> +	phb_lock(phb);
> +
> +	/* Call phb ops to disable capi */
> +	rc = phb->ops->set_capi_mode(phb, OPAL_PHB_CAPI_MODE_PCIE,
> +				       phb->ops->get_reserved_pe_number(phb));
> +	if (rc == OPAL_BUSY) {
> +		/* Run the phb reset state machine */
> +		rc = slot->ops.run_sm(slot);
> +	}
> +
> +	phb_unlock(phb);
> +
> +	return rc <= OPAL_SUCCESS;

return true if success or failure? Is this one of the fun "true when
done" functions?
Andrew Donnellan Sept. 19, 2018, 9:15 a.m. UTC | #3
On 19/9/18 6:36 pm, Stewart Smith wrote:>> +	return rc <= OPAL_SUCCESS;
> 
> return true if success or failure? Is this one of the fun "true when
> done" functions?

Yep - there's no way to report an error as such. If you report false, 
that turns into an OPAL_BUSY being returned to Linux and you get polled 
a few ms later. If you report true, that turns into an OPAL_SUCCESS and 
you actually reboot.

Though if the PHB reset is failing, do we catch that and force a full 
IPL at some point?
Vaibhav Jain Sept. 19, 2018, 9:59 a.m. UTC | #4
Andrew Donnellan <andrew.donnellan@au1.ibm.com> writes:

> On 19/9/18 6:36 pm, Stewart Smith wrote:>> +	return rc <= OPAL_SUCCESS;
>> 
>> return true if success or failure? Is this one of the fun "true when
>> done" functions?
>
> Yep - there's no way to report an error as such. If you report false, 
> that turns into an OPAL_BUSY being returned to Linux and you get polled 
> a few ms later. If you report true, that turns into an OPAL_SUCCESS and 
> you actually reboot.
Agree

>
> Though if the PHB reset is failing, do we catch that and force a full 
> IPL at some point?
Yes, if PHB reset fails its handled in phb4_slot_sm_run_completed()
where we disable fast reset to do full IPL. I have tested it by manually
injecting a reset failure in code and found it to be working.
Frederic Barrat Sept. 21, 2018, 12:33 p.m. UTC | #5
Le 15/09/2018 à 16:39, Vaibhav Jain a écrit :
> This patch introduces a new opal syncer for PHB4 named
> phb4_host_sync_reset(). We register this opal syncer when CAPP is
> activated successfully in phb4_set_capi_mode() so that it will be
> called at kernel shutdown during fast-reset.
> 
> The function will then repeatedly call phb->ops->set_capi_mode() to
> switch switch CAPP to PCIe mode. In case set_capi_mode() indicates its
> OPAL_BUSY. It calls slot->ops.run_sm() to ensure that Opal slot reset
> state machine makes forward progress.
> 
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
>   hw/phb4.c | 36 +++++++++++++++++++++++++++++++++++-
>   1 file changed, 35 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/phb4.c b/hw/phb4.c
> index 4284c467..615cda66 100644
> --- a/hw/phb4.c
> +++ b/hw/phb4.c
> @@ -2747,6 +2747,36 @@ static void phb4_training_trace(struct phb4 *p)
>   	}
>   }
>   
> +/*
> + * Trigger a creset to disable CAPI mode on kernel shutdown.
> + *
> + * This helper is called repeatedly by the host sync notifier mechanism, which
> + * relies on the kernel to regularly poll the OPAL_SYNC_HOST_REBOOT call as it
> + * shuts down.
> + */
> +static bool phb4_host_sync_reset(void *data)
> +{
> +	struct phb4 *p = (struct phb4 *)data;
> +	struct phb *phb = &p->phb;
> +	struct pci_slot *slot = p->phb.slot;
> +	int64_t rc = 0;
> +
> +	/* Make sure no-one modifies the phb flags while we are active */
> +	phb_lock(phb);
> +
> +	/* Call phb ops to disable capi */
> +	rc = phb->ops->set_capi_mode(phb, OPAL_PHB_CAPI_MODE_PCIE,
> +				       phb->ops->get_reserved_pe_number(phb));
> +	if (rc == OPAL_BUSY) {
> +		/* Run the phb reset state machine */
> +		rc = slot->ops.run_sm(slot);
> +	}

Nitpick, but since you're preparing a new version: at this point, the 
comment looks weird, since there's no reason for a reset to occur, we 
haven't changed the state of the slot yet. It's coming in a later patch, 
by calling creset from set_capi_mode()

   Fred


> +
> +	phb_unlock(phb);
> +
> +	return rc <= OPAL_SUCCESS;
> +}
> +
>   static int64_t phb4_poll_link(struct pci_slot *slot)
>   {
>   	struct phb4 *p = phb_to_phb4(slot->phb);
> @@ -4381,9 +4411,13 @@ static int64_t phb4_set_capi_mode(struct phb *phb, uint64_t mode,
>   					(CAPP_MAX_STQ_ENGINES |
>   					 CAPP_MIN_DMA_READ_ENGINES)));
>   
> -		if (ret == OPAL_SUCCESS)
> +		if (ret == OPAL_SUCCESS) {
> +			/* register notification on system shutdown */
> +			opal_add_host_sync_notifier(&phb4_host_sync_reset, p);
> +
>   			/* Disable fast reboot for CAPP */
>   			disable_fast_reboot("CAPP being enabled");
> +		}
>   		else
>   			/* In case of an error mark the PHB detached */
>   			chip->capp_phb4_attached_mask ^= 1 << p->index;
>
Vaibhav Jain Sept. 24, 2018, 5:28 a.m. UTC | #6
Thanks for reviewing this patch Fred,

Frederic Barrat <fbarrat@linux.ibm.com> writes:

>> +	if (rc == OPAL_BUSY) {
>> +		/* Run the phb reset state machine */
>> +		rc = slot->ops.run_sm(slot);
>> +	}
>
> Nitpick, but since you're preparing a new version: at this point, the 
> comment looks weird, since there's no reason for a reset to occur, we 
> haven't changed the state of the slot yet. It's coming in a later patch, 
> by calling creset from set_capi_mode()
Agreed, will fix this comment in v2.
diff mbox series

Patch

diff --git a/hw/phb4.c b/hw/phb4.c
index 4284c467..615cda66 100644
--- a/hw/phb4.c
+++ b/hw/phb4.c
@@ -2747,6 +2747,36 @@  static void phb4_training_trace(struct phb4 *p)
 	}
 }
 
+/*
+ * Trigger a creset to disable CAPI mode on kernel shutdown.
+ *
+ * This helper is called repeatedly by the host sync notifier mechanism, which
+ * relies on the kernel to regularly poll the OPAL_SYNC_HOST_REBOOT call as it
+ * shuts down.
+ */
+static bool phb4_host_sync_reset(void *data)
+{
+	struct phb4 *p = (struct phb4 *)data;
+	struct phb *phb = &p->phb;
+	struct pci_slot *slot = p->phb.slot;
+	int64_t rc = 0;
+
+	/* Make sure no-one modifies the phb flags while we are active */
+	phb_lock(phb);
+
+	/* Call phb ops to disable capi */
+	rc = phb->ops->set_capi_mode(phb, OPAL_PHB_CAPI_MODE_PCIE,
+				       phb->ops->get_reserved_pe_number(phb));
+	if (rc == OPAL_BUSY) {
+		/* Run the phb reset state machine */
+		rc = slot->ops.run_sm(slot);
+	}
+
+	phb_unlock(phb);
+
+	return rc <= OPAL_SUCCESS;
+}
+
 static int64_t phb4_poll_link(struct pci_slot *slot)
 {
 	struct phb4 *p = phb_to_phb4(slot->phb);
@@ -4381,9 +4411,13 @@  static int64_t phb4_set_capi_mode(struct phb *phb, uint64_t mode,
 					(CAPP_MAX_STQ_ENGINES |
 					 CAPP_MIN_DMA_READ_ENGINES)));
 
-		if (ret == OPAL_SUCCESS)
+		if (ret == OPAL_SUCCESS) {
+			/* register notification on system shutdown */
+			opal_add_host_sync_notifier(&phb4_host_sync_reset, p);
+
 			/* Disable fast reboot for CAPP */
 			disable_fast_reboot("CAPP being enabled");
+		}
 		else
 			/* In case of an error mark the PHB detached */
 			chip->capp_phb4_attached_mask ^= 1 << p->index;