
[09/12] opal: Inform fsp about the topology switch.

Message ID 20150328093621.31780.11347.stgit@mars
State Changes Requested

Commit Message

Mahesh J Salgaonkar March 28, 2015, 9:36 a.m. UTC
From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

After the topology switch, we may have a non-functional backup topology.
This means we won't be able to recover from future TOD errors that
require a topology switch. Someone needs to either fix it or configure
a new functional backup topology.

Bit 18 of the Pervasive local FIR (SCOM: EH.TPCHIP.TPC.LOCAL_FIR: 0x0104000C)
is used to signal that TOD error analysis needs to be performed. This
allows FSP/PRD to investigate and reconfigure a new backup topology if
required. Once the new backup topology is configured and ready, FSP sends
a mailbox command xE6, s/c 0x06, mod 0, to enable the backup topology.

This isn't documented anywhere. This info was provided by the FSP folks.

This patch implements setting of bit 18 in Pervasive local FIR. The next
patch will handle FSP mailbox command xE6, s/c 0x06, mod 0.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
 hw/chiptod.c |   32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

Comments

Stewart Smith April 22, 2015, 4:17 a.m. UTC | #1
Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> writes:
> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>
> After the topology switch, we may have a non-functional backup topology.
> This means, we won't be able to recover from future TOD errors that
> requires topology switch. Someone needs to either fix it OR configure
> new functional backup topology.
>
> Bit 18 of the Pervasive local FIR (SCOM: EH.TPCHIP.TPC.LOCAL_FIR: 0x0104000C)
> is used to signal that TOD error analysis needs to be performed. This
> allows FSP/PRD to investigate and re-configure new backup topology if
> required. Once new backup topology is configured and ready, FSP sends a
> mailbox command xE6, s/c 0x06, mod 0, to enable the backup topology.
>
> This isn't documented anywhere. This info is provided by FSP folks.

Any chance they're going to fix their documentation?

We should have it down solid how this is going to work on non-FSP
systems too. Will it be magic with the prd daemon we have?

> This patch implements setting of bit 18 in Pervasive local FIR. The next
> patch will handle FSP mailbox command xE6, s/c 0x06, mod 0.
>
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> ---
>  hw/chiptod.c |   32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>
> diff --git a/hw/chiptod.c b/hw/chiptod.c
> index 1f6e2f1..f43d973 100644
> --- a/hw/chiptod.c
> +++ b/hw/chiptod.c
> @@ -94,6 +94,10 @@
>  #define   TOD_ERR_TTYPE4_RECVD		PPC_BIT(42)
>  #define   TOD_ERR_TTYPE5_RECVD		PPC_BIT(43)
>  
> +/* Local FIR EH.TPCHIP.TPC.LOCAL_FIR */
> +#define LOCAL_CORE_FIR		0x0104000C
> +#define LFIR_SWITCH_COMPLETE	PPC_BIT(18)
> +
>  /* Magic TB value. One step cycle ahead of sync */
>  #define INIT_TB	0x000000000001ff0
>  
> @@ -1002,6 +1006,33 @@ static bool chiptod_backup_valid(void)
>  	return false;
>  }
>  
> +static void chiptod_topology_switch_complete(void)
> +{
> +	/*
> +	 * After the topology switch, we may have a non-functional backup
> +	 * topology, and we won't be able to recover from future TOD errors
> +	 * that requires topology switch. Someone needs to either fix it OR
> +	 * configure new functional backup topology.
> +	 *
> +	 * Bit 18 of the Pervasive FIR is used to signal that TOD error
> +	 * analysis needs to be performed. This allows FSP/PRD to
> +	 * investigate and re-configure new backup topology if required.
> +	 * Once new backup topology is configured and ready, FSP sends a
> +	 * mailbox command xE6, s/c 0x06, mod 0, to enable the backup
> +	 * topology.
> +	 *
> +	 * This isn't documented anywhere. This info is provided by FSP
> +	 * folks.
> +	 */
> +	if (xscom_writeme(LOCAL_CORE_FIR, LFIR_SWITCH_COMPLETE) != 0) {
> +		prerror("CHIPTOD: XSCOM error writing LOCAL_CORE_FIR\n");
> +		return;
> +	}
> +
> +	prlog(PR_DEBUG, "CHIPTOD: Topology switch complete\n");
> +	print_topology_info();
> +}
> +
>  /*
>   * Sync up TOD with other chips and get TOD in running state.
>   * Check if current topology is active and running. If not, then
> @@ -1045,6 +1076,7 @@ static int chiptod_start_tod(void)
>  		current_topology = query_current_topology();
>  		chiptod_update_topology(chiptod_topo_primary);
>  		chiptod_update_topology(chiptod_topo_secondary);
> +		chiptod_topology_switch_complete();
>  	}
>  
>  	if (!chiptod_master_running()) {
>
Mahesh J Salgaonkar May 5, 2015, 11:09 a.m. UTC | #2
On 04/22/2015 09:47 AM, Stewart Smith wrote:
> Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> writes:
>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>
>> After the topology switch, we may have a non-functional backup topology.
>> This means, we won't be able to recover from future TOD errors that
>> requires topology switch. Someone needs to either fix it OR configure
>> new functional backup topology.
>>
>> Bit 18 of the Pervasive local FIR (SCOM: EH.TPCHIP.TPC.LOCAL_FIR: 0x0104000C)
>> is used to signal that TOD error analysis needs to be performed. This
>> allows FSP/PRD to investigate and re-configure new backup topology if
>> required. Once new backup topology is configured and ready, FSP sends a
>> mailbox command xE6, s/c 0x06, mod 0, to enable the backup topology.
>>
>> This isn't documented anywhere. This info is provided by FSP folks.
> 
> Any chance they're going to fix their documentation?

I doubt it :-).

> 
> We should have it down solid how this is going to work on non-FSP
> systems too. Will it be magic with the prd daemon we have?

I assume the PRD daemon should behave the same way as it does on an
FSP based system. However, I need to verify this. I plan to verify HMI
on a non-FSP based system next week, and I may have a better idea by
then.

> 
>> This patch implements setting of bit 18 in Pervasive local FIR. The next
>> patch will handle FSP mailbox command xE6, s/c 0x06, mod 0.

Also, we may have to define an HBRT interface equivalent to the above FSP
mailbox command that reports the new backup topology.

>>
>> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>> ---
>>  hw/chiptod.c |   32 ++++++++++++++++++++++++++++++++
>>  1 file changed, 32 insertions(+)
>>
>> diff --git a/hw/chiptod.c b/hw/chiptod.c
>> index 1f6e2f1..f43d973 100644
>> --- a/hw/chiptod.c
>> +++ b/hw/chiptod.c
>> @@ -94,6 +94,10 @@
>>  #define   TOD_ERR_TTYPE4_RECVD		PPC_BIT(42)
>>  #define   TOD_ERR_TTYPE5_RECVD		PPC_BIT(43)
>>  
>> +/* Local FIR EH.TPCHIP.TPC.LOCAL_FIR */
>> +#define LOCAL_CORE_FIR		0x0104000C
>> +#define LFIR_SWITCH_COMPLETE	PPC_BIT(18)
>> +
>>  /* Magic TB value. One step cycle ahead of sync */
>>  #define INIT_TB	0x000000000001ff0
>>  
>> @@ -1002,6 +1006,33 @@ static bool chiptod_backup_valid(void)
>>  	return false;
>>  }
>>  
>> +static void chiptod_topology_switch_complete(void)
>> +{
>> +	/*
>> +	 * After the topology switch, we may have a non-functional backup
>> +	 * topology, and we won't be able to recover from future TOD errors
>> +	 * that requires topology switch. Someone needs to either fix it OR
>> +	 * configure new functional backup topology.
>> +	 *
>> +	 * Bit 18 of the Pervasive FIR is used to signal that TOD error
>> +	 * analysis needs to be performed. This allows FSP/PRD to
>> +	 * investigate and re-configure new backup topology if required.
>> +	 * Once new backup topology is configured and ready, FSP sends a
>> +	 * mailbox command xE6, s/c 0x06, mod 0, to enable the backup
>> +	 * topology.
>> +	 *
>> +	 * This isn't documented anywhere. This info is provided by FSP
>> +	 * folks.
>> +	 */
>> +	if (xscom_writeme(LOCAL_CORE_FIR, LFIR_SWITCH_COMPLETE) != 0) {
>> +		prerror("CHIPTOD: XSCOM error writing LOCAL_CORE_FIR\n");
>> +		return;
>> +	}
>> +
>> +	prlog(PR_DEBUG, "CHIPTOD: Topology switch complete\n");
>> +	print_topology_info();
>> +}
>> +
>>  /*
>>   * Sync up TOD with other chips and get TOD in running state.
>>   * Check if current topology is active and running. If not, then
>> @@ -1045,6 +1076,7 @@ static int chiptod_start_tod(void)
>>  		current_topology = query_current_topology();
>>  		chiptod_update_topology(chiptod_topo_primary);
>>  		chiptod_update_topology(chiptod_topo_secondary);
>> +		chiptod_topology_switch_complete();
>>  	}
>>  
>>  	if (!chiptod_master_running()) {
>>

Patch

diff --git a/hw/chiptod.c b/hw/chiptod.c
index 1f6e2f1..f43d973 100644
--- a/hw/chiptod.c
+++ b/hw/chiptod.c
@@ -94,6 +94,10 @@ 
 #define   TOD_ERR_TTYPE4_RECVD		PPC_BIT(42)
 #define   TOD_ERR_TTYPE5_RECVD		PPC_BIT(43)
 
+/* Local FIR EH.TPCHIP.TPC.LOCAL_FIR */
+#define LOCAL_CORE_FIR		0x0104000C
+#define LFIR_SWITCH_COMPLETE	PPC_BIT(18)
+
 /* Magic TB value. One step cycle ahead of sync */
 #define INIT_TB	0x000000000001ff0
 
@@ -1002,6 +1006,33 @@  static bool chiptod_backup_valid(void)
 	return false;
 }
 
+static void chiptod_topology_switch_complete(void)
+{
+	/*
+	 * After the topology switch, we may have a non-functional backup
+	 * topology, and we won't be able to recover from future TOD errors
+	 * that require a topology switch. Someone needs to either fix it
+	 * or configure a new functional backup topology.
+	 *
+	 * Bit 18 of the Pervasive local FIR is used to signal that TOD
+	 * error analysis needs to be performed. This allows FSP/PRD to
+	 * investigate and reconfigure a new backup topology if required.
+	 * Once the new backup topology is configured and ready, FSP sends
+	 * a mailbox command xE6, s/c 0x06, mod 0, to enable the backup
+	 * topology.
+	 *
+	 * This isn't documented anywhere. This info was provided by the
+	 * FSP folks.
+	 */
+	if (xscom_writeme(LOCAL_CORE_FIR, LFIR_SWITCH_COMPLETE) != 0) {
+		prerror("CHIPTOD: XSCOM error writing LOCAL_CORE_FIR\n");
+		return;
+	}
+
+	prlog(PR_DEBUG, "CHIPTOD: Topology switch complete\n");
+	print_topology_info();
+}
+
 /*
  * Sync up TOD with other chips and get TOD in running state.
  * Check if current topology is active and running. If not, then
@@ -1045,6 +1076,7 @@  static int chiptod_start_tod(void)
 		current_topology = query_current_topology();
 		chiptod_update_topology(chiptod_topo_primary);
 		chiptod_update_topology(chiptod_topo_secondary);
+		chiptod_topology_switch_complete();
 	}
 
 	if (!chiptod_master_running()) {