powerpc/85xx: disable timebase synchronization under the hypervisor

Message ID 1308092673-13045-1-git-send-email-timur@freescale.com (mailing list archive)
State Accepted, archived
Commit 14497d31e65cca73c9814a1ff373ae294aae616b
Delegated to: Kumar Gala

Commit Message

Timur Tabi June 14, 2011, 11:04 p.m. UTC
The Freescale hypervisor does not allow guests to write to the timebase
registers (virtualizing the timebase register was deemed too complicated),
so don't try to synchronize the timebase registers when we're running
under the hypervisor.

This typically happens when kexec support is enabled.

Signed-off-by: Timur Tabi <timur@freescale.com>
---
 arch/powerpc/platforms/85xx/p3041_ds.c |   11 +++++++++++
 arch/powerpc/platforms/85xx/p4080_ds.c |   11 +++++++++++
 arch/powerpc/platforms/85xx/p5020_ds.c |   11 +++++++++++
 3 files changed, 33 insertions(+), 0 deletions(-)
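
For illustration only, the mechanism the patch relies on can be modelled in
user space. This is a sketch under stated assumptions, not the kernel code:
struct smp_ops_model, probe_model() and bring_up_secondary() are hypothetical
stand-ins, and the sketch assumes that the SMP bring-up path invokes a
timebase hook only when its pointer is non-NULL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct smp_ops_t; only the
 * two timebase hooks touched by the patch are modelled. */
struct smp_ops_model {
	void (*give_timebase)(void);
	void (*take_timebase)(void);
};

int sync_calls;	/* counts how often a sync hook actually ran */

void model_give_timebase(void) { sync_calls++; }
void model_take_timebase(void) { sync_calls++; }

/* Probe-time step: install the hooks, then drop them again when the
 * root compatible names the hypervisor variant of the board. */
void probe_model(struct smp_ops_model *ops, const char *root_compat,
		 const char *hv_compat)
{
	ops->give_timebase = model_give_timebase;
	ops->take_timebase = model_take_timebase;

	if (strcmp(root_compat, hv_compat) == 0) {
		ops->give_timebase = NULL;
		ops->take_timebase = NULL;
	}
}

/* Bring-up step: assumed to skip any hook that is NULL. */
void bring_up_secondary(const struct smp_ops_model *ops)
{
	if (ops->give_timebase)
		ops->give_timebase();
	if (ops->take_timebase)
		ops->take_timebase();
}
```

With a root compatible of "fsl,P4080DS-hv" both hooks end up NULL and the
bring-up step never attempts a timebase write; with "fsl,P4080DS" the hooks
stay in place.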

Comments

Scott Wood June 14, 2011, 11:14 p.m. UTC | #1
On Tue, 14 Jun 2011 18:04:33 -0500
Timur Tabi <timur@freescale.com> wrote:

> The Freescale hypervisor does not allow guests to write to the timebase
> registers (virtualizing the timebase register was deemed too complicated),
> so don't try to synchronize the timebase registers when we're running
> under the hypervisor.
> 
> This typically happens when kexec support is enabled.
> 
> Signed-off-by: Timur Tabi <timur@freescale.com>

FWIW, it's not supported under KVM either -- though we don't support an SMP
guest under KVM yet, and KVM silently ignores it rather than logs errors as
the FSL HV does.

-Scott
Timur Tabi June 14, 2011, 11:15 p.m. UTC | #2
Scott Wood wrote:
> FWIW, it's not supported under KVM either -- though we don't support an SMP
> guest under KVM yet, and KVM silently ignores it rather than logs errors as
> the FSL HV does.

Does KVM set the root compatible to "fsl,P4080DS-hv"?
Scott Wood June 14, 2011, 11:25 p.m. UTC | #3
On Tue, 14 Jun 2011 18:15:26 -0500
Timur Tabi <timur@freescale.com> wrote:

> Scott Wood wrote:
> > FWIW, it's not supported under KVM either -- though we don't support an SMP
> > guest under KVM yet, and KVM silently ignores it rather than logs errors as
> > the FSL HV does.
> 
> Does KVM set the root compatible to "fsl,P4080DS-hv"?

No, Qemu/KVM like to pretend they're fully emulating concrete hardware,
even though there's stuff missing.

The only upstream e500 KVM platform is currently mpc8544ds.

For now, there's no SMP KVM guest support.  Maybe kexec can be fixed to not
hard-reset the core by the time that changes. :-)

-Scott
Benjamin Herrenschmidt June 15, 2011, 1:58 a.m. UTC | #4
On Tue, 2011-06-14 at 18:25 -0500, Scott Wood wrote:
> On Tue, 14 Jun 2011 18:15:26 -0500
> Timur Tabi <timur@freescale.com> wrote:
> 
> > Scott Wood wrote:
> > > FWIW, it's not supported under KVM either -- though we don't support an SMP
> > > guest under KVM yet, and KVM silently ignores it rather than logs errors as
> > > the FSL HV does.
> > 
> > Does KVM set the root compatible to "fsl,P4080DS-hv"?
> 
> No, Qemu/KVM like to pretend they're fully emulating concrete hardware,
> even though there's stuff missing.
> 
> The only upstream e500 KVM platform is currently mpc8544ds.
> 
> For now, there's no SMP KVM guest support.  Maybe kexec can be fixed to not
> hard-reset the core by the time that changes. :-)

We might want to generically have a CPU feature bit indicating we are
running in guest vs. HV mode. I know Paulus is planning to introduce one
so you may want to sync with him.

Cheers,
Ben.
Tabi Timur-B04825 June 15, 2011, 2:10 a.m. UTC | #5
Benjamin Herrenschmidt wrote:
> We might want to generically have a CPU feature bit indicating we are
> running in guest vs. HV mode. I know Paulus is planning to introduce one
> so you may want to sync with him.

Are you talking about CPU_FTR_HVMODE_206?
Benjamin Herrenschmidt June 15, 2011, 2:33 a.m. UTC | #6
On Wed, 2011-06-15 at 02:10 +0000, Tabi Timur-B04825 wrote:
> Benjamin Herrenschmidt wrote:
> > We might want to generically have a CPU feature bit indicating we are
> > running in guest vs. HV mode. I know Paulus is planning to introduce one
> > so you may want to sync with him.
> 
> Are you talking about CPU_FTR_HVMODE_206?

Well, not exactly. Paul wants to break that up since we're adding some
primitive support for 201 HV mode too (for 970's). Last we discussed,
the plan was to go for a generic HV mode bit and a separate bit for the
version.

Cheers,
Ben.
Kumar Gala June 22, 2011, 11:44 a.m. UTC | #7
On Jun 14, 2011, at 9:33 PM, Benjamin Herrenschmidt wrote:

> On Wed, 2011-06-15 at 02:10 +0000, Tabi Timur-B04825 wrote:
>> Benjamin Herrenschmidt wrote:
>>> We might want to generically have a CPU feature bit indicating we are
>>> running in guest vs. HV mode. I know Paulus is planning to introduce one
>>> so you may want to sync with him.
>> 
>> Are you talking about CPU_FTR_HVMODE_206?
> 
> Well, not exactly. Paul wants to break that up since we're adding some
> primitive support for 201 HV mode too (for 970's). Last we discussed,
> the plan was to go for a generic HV mode bit and a separate bit for the
> version.
> 
> Cheers,
> Ben.

Any ETA on Paul's intro of the FTR bit?  If not I'll pull this into my 'next' tree and we can clean up later.

- k
Timur Tabi June 22, 2011, 2:55 p.m. UTC | #8
Kumar Gala wrote:
>> > 
>> > Well, not exactly. Paul wants to break that up since we're adding some
>> > primitive support for 201 HV mode too (for 970's). Last we discussed,
>> > the plan was to go for a generic HV mode bit and a separate bit for the
>> > version.
>> > 
>> > Cheers,
>> > Ben.

> Any ETA on Paul's intro of the FTR bit?  If not I'll pull this into my 'next' tree and we can clean up later.

Just FYI, this particular patch is because of a limitation in the Freescale
hypervisor.  It's not because we're running in guest mode.  If the hypervisor
provided full emulation of the timebase register, then we wouldn't need this
patch.  The same can be said of KVM or any other hypervisor.

So a generic HV mode bit is not going to help me, unless there's also a bit
that's specific to our hypervisor.  And even then, we would need some way to
differentiate among different versions of our hypervisor, in case some future
version adds timebase support.  We currently use the device tree for all this,
so I'm not sure what a FTR bit will gain us.
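
As a rough illustration of the device-tree approach described above, the
check amounts to searching the root node's compatible property, which is a
sequence of back-to-back NUL-terminated strings. The helper below is a
simplified user-space sketch; compat_list_has() is a hypothetical name, and
the exact-match rule is an assumption that may differ from what the kernel's
own helpers do.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if `want` is one of the strings packed into `prop`, which
 * holds `len` bytes of back-to-back NUL-terminated strings; else 0.
 * Simplified model: exact, case-sensitive comparison only. */
int compat_list_has(const char *prop, size_t len, const char *want)
{
	size_t off = 0;

	while (off < len) {
		const char *cur = prop + off;
		const char *end = memchr(cur, '\0', len - off);

		if (end == NULL)	/* unterminated tail: give up */
			return 0;
		if (strcmp(cur, want) == 0)
			return 1;
		off += (size_t)(end - cur) + 1;	/* skip string and its NUL */
	}
	return 0;
}
```

A future hypervisor version that did support timebase writes could then be
told apart by a different compatible string, without any new feature bit.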
Scott Wood June 23, 2011, 5:22 p.m. UTC | #9
On Wed, 22 Jun 2011 09:55:36 -0500
Timur Tabi <timur@freescale.com> wrote:

> Kumar Gala wrote:
> >> > 
> >> > Well, not exactly. Paul wants to break that up since we're adding some
> >> > primitive support for 201 HV mode too (for 970's). Last we discussed,
> >> > the plan was to go for a generic HV mode bit and a separate bit for the
> >> > version.
> >> > 
> >> > Cheers,
> >> > Ben.
> 
> > Any ETA on Paul's intro of the FTR bit?  If not I'll pull this into my 'next' tree and we can clean up later.
> 
> Just FYI, this particular patch is because of a limitation in the Freescale
> hypervisor.  It's not because we're running in guest mode.  If the hypervisor
> provided full emulation of the timebase register, then we wouldn't need this
> patch.  The same can be said of KVM or any other hypervisor.

From Power ISA 2.06B, book III-E, section 9.2.1:

Virtualized Implementation Note:

In virtualized implementations, TBU and TBL are
read-only.

> So a generic HV mode bit is not going to help me, unless there's also a bit
> that's specific to our hypervisor.  And even then, we would need some way to
> differentiate among different versions of our hypervisor, in case some future
> version adds timebase support. 

That's very unlikely to happen.

Ideally we would avoid doing this sync even when not running under a
hypervisor, as long as firmware has done the sync, and kexec hasn't messed
it up.  Besides being a waste of boot time, the firmware's sync is
probably tighter since it can use a platform-specific mechanism to start all
the timebases at once.

-Scott
Timur Tabi June 23, 2011, 5:33 p.m. UTC | #10
Scott Wood wrote:
> From Power ISA 2.06B, book III-E, section 9.2.1:
> 
> Virtualized Implementation Note:
> 
> In virtualized implementations, TBU and TBL are
> read-only.

But does that mean that a guest should never be allowed to modify a virtualized
timebase register, even if the hypervisor can support it?

>> > So a generic HV mode bit is not going to help me, unless there's also a bit
>> > that's specific to our hypervisor.  And even then, we would need some way to
>> > differentiate among different versions of our hypervisor, in case some future
>> > version adds timebase support. 

> That's very unlikely to happen.

I know. I was just being architecturally pedantic.

> Ideally we would avoid doing this sync even when not running under a
> hypervisor, as long as firmware has done the sync, and kexec hasn't messed
> it up.  Besides being a waste of boot time, the firmware's sync is
> probably tighter since it can use a platform-specific mechanism to start all
> the timebases at once.

I agree with that, but for now, I need to work around that kexec "limitation".
Scott Wood June 23, 2011, 5:48 p.m. UTC | #11
On Thu, 23 Jun 2011 12:33:40 -0500
Timur Tabi <timur@freescale.com> wrote:

> Scott Wood wrote:
> > From Power ISA 2.06B, book III-E, section 9.2.1:
> > 
> > Virtualized Implementation Note:
> > 
> > In virtualized implementations, TBU and TBL are
> > read-only.
> 
> But does that mean that a guest should never be allowed to modify a virtualized
> timebase register, even if the hypervisor can support it?

The book3e mtspr writeup doesn't appear to specify the behavior when
writing to a read-only SPR, so perhaps you could argue that something other
than a no-op is implementation-specific behavior.

For a guest, the safe thing is to not write to those registers unless you
specifically know it's going to do what you want under a particular
implementation.  It's not specifically a Topaz limitation.

> >> > So a generic HV mode bit is not going to help me, unless there's also a bit
> >> > that's specific to our hypervisor.  And even then, we would need some way to
> >> > differentiate among different versions of our hypervisor, in case some future
> >> > version adds timebase support. 
> 
> > That's very unlikely to happen.
> 
> I know. I was just being architecturally pedantic.

It's not as if it would hurt anything to ignore such a capability.

> > Ideally we would avoid doing this sync even when not running under a
> > hypervisor, as long as firmware has done the sync, and kexec hasn't messed
> > it up.  Besides being a waste of boot time, the firmware's sync is
> > probably tighter since it can use a platform-specific mechanism to start all
> > the timebases at once.
> 
> I agree with that, but for now, I need to work around that kexec "limitation".

Is there any way we can detect whether we booted via kexec (as opposed to
just having kexec support enabled), and only do the sync in that case?

-Scott
Segher Boessenkool June 24, 2011, 2:36 a.m. UTC | #12
>> But does that mean that a guest should never be allowed to modify a
>> virtualized timebase register, even if the hypervisor can support it?
>
> The book3e mtspr writeup doesn't appear to specify the behavior when
> writing to a read-only SPR, so perhaps you could argue that something
> other than a no-op is implementation-specific behavior.

v2.06 III-E 9.2.1:
"Writing the Time Base is hypervisor privileged."

v2.06 III-E 2.1:
"If a hypervisor-privileged register is accessed in the guest supervisor
state (MSR[GS PR] = 0b10), an Embedded Hypervisor Privilege exception occurs."

(v2.06 III-E 5.4.1, the big SPR table, also shows the TB regs (for writing,
i.e. 284 and 285) to be hypervisor privileged.  Consistency, hurray :-) )


Segher
Tabi Timur-B04825 June 24, 2011, 2:38 a.m. UTC | #13
Segher Boessenkool wrote:
>
> v2.06 III-E 9.2.1:
> "Writing the Time Base is hypervisor privileged."
>
> v2.06 III-E 2.1:
> "If a hypervisor-privileged register is accessed in the guest supervisor
> state (MSR[GS PR] = 0b10), an Embedded Hypervisor Privilege exception
> occurs."
>
> (v2.06 III-E 5.4.1, the big SPR table, also shows the TB regs (for writing,
> i.e. 284 and 285) to be hypervisor privileged.  Consistency, hurray :-) )

To me, all this means that a guest cannot write to the actual timebase 
register.  I'm not interpreting this to mean that a hypervisor can't 
virtualize the timebase and allow a guest to read/write a virtual timebase 
register, so that it thinks it's writing to the real hardware timebase register.
Segher Boessenkool June 24, 2011, 3:50 a.m. UTC | #14
(context put back:)

>> But does that mean that a guest should never be allowed to modify a
>> virtualized timebase register, even if the hypervisor can support it?
>
> The book3e mtspr writeup doesn't appear to specify the behavior when
> writing to a read-only SPR, so perhaps you could argue that something
> other than a no-op is implementation-specific behavior.

>> v2.06 III-E 9.2.1:
>> "Writing the Time Base is hypervisor privileged."
>>
>> v2.06 III-E 2.1:
>> "If a hypervisor-privileged register is accessed in the guest
>> supervisor state (MSR[GS PR] = 0b10), an Embedded Hypervisor Privilege
>> exception occurs."
>>
>> (v2.06 III-E 5.4.1, the big SPR table, also shows the TB regs (for
>> writing, i.e. 284 and 285) to be hypervisor privileged.  Consistency,
>> hurray :-) )
>
> To me, all this means that a guest cannot write to the actual timebase
> register.

It also means that the hypervisor gets a trap when a guest tries to do this.

>   I'm not interpreting this to mean that a hypervisor can't
> virtualize the timebase and allow a guest to read/write a virtual
> timebase register, so that it thinks it's writing to the real hardware
> timebase register.

Yes, a hypervisor can do this.  The behaviour of the hardware is not
implementation-specific (modulo bugs ;-) ); when a guest tries to write
to the timebase, the hypervisor gets a trap.  The hypervisor can then
do whatever it wants with it.
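
To make "whatever it wants" concrete, here is a small user-space model of a
trap handler with three possible policies: ignore the write, refuse and count
it, or emulate it by keeping a per-guest offset. It is a sketch, not code
from any real hypervisor: struct guest_tb, the policy names and both
functions are hypothetical, and only the SPR numbers 284 and 285 come from
the table cited above. The model also assumes the hypervisor can apply the
offset to guest reads, which a real implementation would have to arrange
separately.

```c
#include <stdint.h>

#define SPR_TBL_WRITE 284	/* write-side SPR numbers cited above */
#define SPR_TBU_WRITE 285

enum tb_policy { TB_IGNORE, TB_REJECT, TB_EMULATE };

/* Hypothetical per-guest state: guest timebase = host timebase + offset. */
struct guest_tb {
	uint64_t offset;
	unsigned int rejected;	/* writes refused under TB_REJECT */
};

/* Timebase value the guest would observe for a given host timebase. */
uint64_t guest_tb_read(const struct guest_tb *g, uint64_t host_tb)
{
	return host_tb + g->offset;
}

/* Modelled trap handler for a guest mtspr.
 * Returns 1 if the SPR was a timebase write, 0 otherwise. */
int guest_tb_write(struct guest_tb *g, enum tb_policy policy,
		   unsigned int spr, uint32_t value, uint64_t host_tb)
{
	uint64_t cur, want;

	if (spr != SPR_TBL_WRITE && spr != SPR_TBU_WRITE)
		return 0;
	if (policy == TB_IGNORE)
		return 1;		/* drop the write silently */
	if (policy == TB_REJECT) {
		g->rejected++;		/* stand-in for logging an error */
		return 1;
	}

	/* TB_EMULATE: replace one 32-bit half of the guest's view. */
	cur = guest_tb_read(g, host_tb);
	if (spr == SPR_TBL_WRITE)
		want = (cur & 0xffffffff00000000ULL) | value;
	else
		want = (cur & 0x00000000ffffffffULL) | ((uint64_t)value << 32);

	g->offset = want - host_tb;	/* unsigned wraparound is intended */
	return 1;
}
```

The first two policies roughly correspond to the "silently ignores" and
"logs errors" behaviours mentioned earlier in the thread; the third is the
full virtualization that would make the guest-side workaround unnecessary.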


Segher
Scott Wood June 24, 2011, 3:16 p.m. UTC | #15
On Thu, 23 Jun 2011 21:38:58 -0500
Tabi Timur-B04825 <B04825@freescale.com> wrote:

> Segher Boessenkool wrote:
> >
> > v2.06 III-E 9.2.1:
> > "Writing the Time Base is hypervisor privileged."
> >
> > v2.06 III-E 2.1:
> > "If a hypervisor-privileged register is accessed in the guest supervisor
> > state (MSR[GS PR] = 0b10), an Embedded Hypervisor Privilege exception
> > occurs."
> >
> > (v2.06 III-E 5.4.1, the big SPR table, also shows the TB regs (for writing,
> > i.e. 284 and 285) to be hypervisor privileged.  Consistency, hurray :-) )
> 
> To me, all this means that a guest cannot write to the actual timebase 
> register.  I'm not interpreting this to mean that a hypervisor can't 
> virtualize the timebase and allow a guest to read/write a virtual timebase 
> register, so that it thinks it's writing to the real hardware timebase register.
> 

Right, I was referring to the virtualized implementation note added in
2.06B.  The virtualized implementation notes apply to what happens in the
guest as seen by the guest (considered as a separate implementation of the
Power ISA), not to what happens at a hardware level in guest mode.

-Scott
Benjamin Herrenschmidt June 24, 2011, 11:36 p.m. UTC | #16
On Wed, 2011-06-22 at 06:44 -0500, Kumar Gala wrote:
> 
> Any ETA on Paul's intro of the FTR bit?  If not I'll pull this into my
> 'next' tree and we can clean up later.

His latest KVM patch set has that.

Cheers,
Ben.
Kumar Gala June 27, 2011, 1:35 p.m. UTC | #17
On Jun 14, 2011, at 6:04 PM, Timur Tabi wrote:

> The Freescale hypervisor does not allow guests to write to the timebase
> registers (virtualizing the timebase register was deemed too complicated),
> so don't try to synchronize the timebase registers when we're running
> under the hypervisor.
> 
> This typically happens when kexec support is enabled.
> 
> Signed-off-by: Timur Tabi <timur@freescale.com>
> ---
> arch/powerpc/platforms/85xx/p3041_ds.c |   11 +++++++++++
> arch/powerpc/platforms/85xx/p4080_ds.c |   11 +++++++++++
> arch/powerpc/platforms/85xx/p5020_ds.c |   11 +++++++++++
> 3 files changed, 33 insertions(+), 0 deletions(-)

applied to next

- K

Patch

diff --git a/arch/powerpc/platforms/85xx/p3041_ds.c b/arch/powerpc/platforms/85xx/p3041_ds.c
index c0242bc..8b651dfe 100644
--- a/arch/powerpc/platforms/85xx/p3041_ds.c
+++ b/arch/powerpc/platforms/85xx/p3041_ds.c
@@ -40,6 +40,9 @@ 
 static int __init p3041_ds_probe(void)
 {
 	unsigned long root = of_get_flat_dt_root();
+#ifdef CONFIG_SMP
+	extern struct smp_ops_t smp_85xx_ops;
+#endif
 
 	if (of_flat_dt_is_compatible(root, "fsl,P3041DS"))
 		return 1;
@@ -51,6 +54,14 @@  static int __init p3041_ds_probe(void)
 		ppc_md.restart = fsl_hv_restart;
 		ppc_md.power_off = fsl_hv_halt;
 		ppc_md.halt = fsl_hv_halt;
+#ifdef CONFIG_SMP
+		/*
+		 * Disable the timebase sync operations because we can't write
+		 * to the timebase registers under the hypervisor.
+		  */
+		smp_85xx_ops.give_timebase = NULL;
+		smp_85xx_ops.take_timebase = NULL;
+#endif
 		return 1;
 	}
 
diff --git a/arch/powerpc/platforms/85xx/p4080_ds.c b/arch/powerpc/platforms/85xx/p4080_ds.c
index 32ea5eb..ae859ab 100644
--- a/arch/powerpc/platforms/85xx/p4080_ds.c
+++ b/arch/powerpc/platforms/85xx/p4080_ds.c
@@ -39,6 +39,9 @@ 
 static int __init p4080_ds_probe(void)
 {
 	unsigned long root = of_get_flat_dt_root();
+#ifdef CONFIG_SMP
+	extern struct smp_ops_t smp_85xx_ops;
+#endif
 
 	if (of_flat_dt_is_compatible(root, "fsl,P4080DS"))
 		return 1;
@@ -50,6 +53,14 @@  static int __init p4080_ds_probe(void)
 		ppc_md.restart = fsl_hv_restart;
 		ppc_md.power_off = fsl_hv_halt;
 		ppc_md.halt = fsl_hv_halt;
+#ifdef CONFIG_SMP
+		/*
+		 * Disable the timebase sync operations because we can't write
+		 * to the timebase registers under the hypervisor.
+		  */
+		smp_85xx_ops.give_timebase = NULL;
+		smp_85xx_ops.take_timebase = NULL;
+#endif
 		return 1;
 	}
 
diff --git a/arch/powerpc/platforms/85xx/p5020_ds.c b/arch/powerpc/platforms/85xx/p5020_ds.c
index 2ea9ccc..d951618 100644
--- a/arch/powerpc/platforms/85xx/p5020_ds.c
+++ b/arch/powerpc/platforms/85xx/p5020_ds.c
@@ -40,6 +40,9 @@ 
 static int __init p5020_ds_probe(void)
 {
 	unsigned long root = of_get_flat_dt_root();
+#ifdef CONFIG_SMP
+	extern struct smp_ops_t smp_85xx_ops;
+#endif
 
 	if (of_flat_dt_is_compatible(root, "fsl,P5020DS"))
 		return 1;
@@ -51,6 +54,14 @@  static int __init p5020_ds_probe(void)
 		ppc_md.restart = fsl_hv_restart;
 		ppc_md.power_off = fsl_hv_halt;
 		ppc_md.halt = fsl_hv_halt;
+#ifdef CONFIG_SMP
+		/*
+		 * Disable the timebase sync operations because we can't write
+		 * to the timebase registers under the hypervisor.
+		  */
+		smp_85xx_ops.give_timebase = NULL;
+		smp_85xx_ops.take_timebase = NULL;
+#endif
 		return 1;
 	}