silo: Don't touch %tick_cmpr on sun4v cpus.

Message ID 20120815.011416.353619816610212386.davem@davemloft.net
State Accepted
Delegated to: David Miller

Commit Message

David Miller Aug. 15, 2012, 8:14 a.m. UTC
This generates an illegal instruction exception.

This has a long history.  For the first sun4v port of SILO in commit
494770a17eea7192d3242051e76f4da6d838e3a1 ("SILO Niagara/SUN4V
support") this code was removed entirely.

But later this was found to regress older UltraSPARC boxes, so we put
it back in commit bd708e35bdcd8e92cb7c65368f2a356982df7cd8 ("Fix
Ultra10 SILO timer").  But that was wrong too.

The OBP still owns the trap table when SILO runs and it uses the
%tick_cmpr generated interrupt.  This has a bad interaction with how
we use the %tick register in SILO.

SILO first reads the %tick register and remembers this value as the
time base.

Later, we read %tick again, compute the difference, and use this to
calculate the amount of time elapsed.

OBP's %tick_cmpr interrupt handler is doing something funky, such as
resetting %tick, which makes our timeouts never actually expire.

This issue doesn't exist on sun4v machines, and we absolutely cannot
try to touch the %tick_cmpr register as that generates an illegal
instruction trap on such cpus.

Signed-off-by: David S. Miller <davem@davemloft.net>
---

I just committed this into the SILO git repo.

Debian folks, you really want this propagated into your installer as
soon as possible.  All the install ISOs will crash in SILO on all
sun4v (Niagara) machines unless an explicit SILO boot target is given
on the boot command line.  I used "boot cdrom install" to get around
this.

It triggers any time the timer mechanism is enabled (that is, whenever
"timeout=foo" is specified in silo.conf).
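
For readers who want the timing scheme from the commit message spelled
out (read %tick once as a time base, read it again later, and turn the
difference into elapsed time), here is a minimal sketch in portable C.
It is not SILO code: demo_elapsed_cs, demo_timed_out and ticks_per_cs
are names made up for illustration, and the centisecond unit is only an
assumption based on the timer.c hunk dividing "clock-frequency" by 100.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not SILO code: convert the difference between two
 * readings of a free-running counter into centiseconds.
 * ticks_per_cs is an assumed name for "counter ticks per 1/100 second".
 */
static uint64_t demo_elapsed_cs(uint64_t base, uint64_t now,
                                uint64_t ticks_per_cs)
{
    assert(ticks_per_cs != 0);
    /* The difference only means "elapsed time" if the counter has been
     * advancing undisturbed since base was sampled. */
    return (now - base) / ticks_per_cs;
}

/* Hypothetical helper: has the timeout (in centiseconds) been reached? */
static int demo_timed_out(uint64_t base, uint64_t now,
                          uint64_t ticks_per_cs, uint64_t timeout_cs)
{
    return demo_elapsed_cs(base, now, ticks_per_cs) >= timeout_cs;
}
```

With an assumed rate of 4000 ticks per centisecond, a reading 1,000,000
ticks past the base corresponds to 250 centiseconds, so a 300
centisecond timeout has not yet expired.  The sketch also shows why the
scheme is fragile: if anything else rewrites the counter between the two
reads, the difference no longer reflects elapsed time.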

 include/silo.h | 1 +
 second/main.c  | 1 +
 second/misc.c  | 4 +++-
 second/timer.c | 2 +-
 4 files changed, 6 insertions(+), 2 deletions(-)
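
The control flow of the patch (sun4v is still reported as sun4u, but a
separate flag records that the cpu is sun4v, and the %tick_cmpr access
is skipped when that flag is set) can be sketched in isolation as
follows.  This is a simplified model, not the SILO source: the demo_*
names, the reduced enum and the idea of classifying on a single suffix
character are assumptions made for illustration.

```c
/* Reduced, hypothetical stand-in for SILO's architecture enum. */
enum demo_arch { DEMO_SUN4D, DEMO_SUN4E, DEMO_SUN4U, DEMO_UNKNOWN };

struct demo_cpu {
    enum demo_arch arch;
    int is_sun4v;          /* models the new sun4v_cpu global */
};

/* Classify on an assumed suffix character ('u' for sun4u, 'v' for sun4v). */
static struct demo_cpu demo_classify(char suffix)
{
    struct demo_cpu cpu = { DEMO_UNKNOWN, 0 };

    switch (suffix) {
    case 'd':
        cpu.arch = DEMO_SUN4D;
        break;
    case 'e':
        cpu.arch = DEMO_SUN4E;
        break;
    case 'v':
        cpu.is_sun4v = 1;  /* remember sun4v, then treat it as sun4u */
        /* FALLTHRU */
    case 'u':
        cpu.arch = DEMO_SUN4U;
        break;
    default:
        break;
    }
    return cpu;
}

/* Models the guard `if (notimer && !sun4v_cpu)` from the timer.c hunk. */
static int demo_may_touch_tick_cmpr(struct demo_cpu cpu, int notimer)
{
    return notimer && !cpu.is_sun4v;
}
```

Keeping the reported architecture as sun4u means the rest of the loader
is unaffected, while the one code path that would trap on sun4v can test
the flag.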

Comments

Jurij Smakov Aug. 19, 2012, 3:41 p.m. UTC | #1
On Wed, Aug 15, 2012 at 01:14:16AM -0700, David Miller wrote:
> 
> This generates an illegal instruction exception.
> 
> This has a long history.  For the first sun4v port of SILO in commit
> 494770a17eea7192d3242051e76f4da6d838e3a1 ("SILO Niagara/SUN4V
> support") this code was removed entirely.
> 
> But later this was found to regress older UltraSPARC boxes, so we put
> it back in commit bd708e35bdcd8e92cb7c65368f2a356982df7cd8 ("Fix
> Ultra10 SILO timer").  But that was wrong too.
> 
> The OBP still owns the trap table when SILO runs and it uses the
> %tick_cmpr generated interrupt.  This has a bad interaction with how
> we use the %tick register in SILO.
> 
> SILO first reads the %tick register and remembers this value as the
> time base.
> 
> Later, we read %tick again, compute the difference, and use this to
> calculate the amount of time elapsed.
> 
> OBP's %tick_cmpr interrupt handler is doing something funky, such as
> resetting %tick, which makes our timeouts never actually expire.
> 
> This issue doesn't exist on sun4v machines, and we absolutely cannot
> try to touch the %tick_cmpr register as that generates an illegal
> instruction trap on such cpus.
> 
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
> 
> I just committed this into the SILO git repo.
> 
> Debian folks, you really want this propagated into your installer as
> soon as possible.  All the install ISOs will crash in SILO on all
> sun4v (Niagara) machines unless an explicit SILO boot target is given
> on the boot command line.  I used "boot cdrom install" to get around
> this.
> 
> It triggers any time the timer mechanism is enabled ("timeout=foo" is
> specified in silo.conf)

Thanks, David.

I just uploaded a new silo package (1.4.14+git20120819-1) including 
these fixes to unstable, and would encourage everyone to test it (it 
should appear on the mirrors within a few hours). After a grace period 
of 10 days we are going to arrange for its propagation to testing, 
given that no problems are reported.

Best regards,
David Miller Aug. 19, 2012, 10:24 p.m. UTC | #2
From: Jurij Smakov <jurij@wooyd.org>
Date: Sun, 19 Aug 2012 16:41:42 +0100

> I just uploaded a new silo package (1.4.14+git20120819-1) including 
> these fixes to unstable, and would encourage everyone to test it (it 
> should appear on the mirrors within a few hours). After a grace period 
> of 10 days we are going to arrange for its propagation to testing, 
> given that no problems are reported.

Thanks a lot Jurij.

Just FYI I also pushed an ext4 fix into the SILO tree yesterday
after I received positive feedback from a bug reporter.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jurij Smakov Aug. 19, 2012, 10:29 p.m. UTC | #3
On Sun, Aug 19, 2012 at 03:24:45PM -0700, David Miller wrote:
> From: Jurij Smakov <jurij@wooyd.org>
> Date: Sun, 19 Aug 2012 16:41:42 +0100
> 
> > I just uploaded a new silo package (1.4.14+git20120819-1) including 
> > these fixes to unstable, and would encourage everyone to test it (it 
> > should appear on the mirrors within a few hours). After a grace period 
> > of 10 days we are going to arrange for its propagation to testing, 
> > given that no problems are reported.
> 
> Thanks a lot Jurij.
> 
> Just FYI I also pushed an ext4 fix into the SILO tree yesterday
> after I received positive feedback from a bug reporter.

This fix is included in the latest uploaded version as well.

Best regards,
David Miller Aug. 19, 2012, 10:29 p.m. UTC | #4
From: Jurij Smakov <jurij@wooyd.org>
Date: Sun, 19 Aug 2012 23:29:05 +0100

> On Sun, Aug 19, 2012 at 03:24:45PM -0700, David Miller wrote:
>> From: Jurij Smakov <jurij@wooyd.org>
>> Date: Sun, 19 Aug 2012 16:41:42 +0100
>> 
>> > I just uploaded a new silo package (1.4.14+git20120819-1) including 
>> > these fixes to unstable, and would encourage everyone to test it (it 
>> > should appear on the mirrors within a few hours). After a grace period 
>> > of 10 days we are going to arrange for its propagation to testing, 
>> > given that no problems are reported.
>> 
>> Thanks a lot Jurij.
>> 
>> Just FYI I also pushed an ext4 fix into the SILO tree yesterday
>> after I received positive feedback from a bug reporter.
> 
> This fix is included in the latest uploaded version as well.

Excellent.
Jurij Smakov Sept. 7, 2012, 8:33 a.m. UTC | #5
On Sun, Aug 19, 2012 at 03:29:59PM -0700, David Miller wrote:
> From: Jurij Smakov <jurij@wooyd.org>
> Date: Sun, 19 Aug 2012 23:29:05 +0100
> 
> > On Sun, Aug 19, 2012 at 03:24:45PM -0700, David Miller wrote:
> >> From: Jurij Smakov <jurij@wooyd.org>
> >> Date: Sun, 19 Aug 2012 16:41:42 +0100
> >> 
> >> > I just uploaded a new silo package (1.4.14+git20120819-1) including 
> >> > these fixes to unstable, and would encourage everyone to test it (it 
> >> > should appear on the mirrors within a few hours). After a grace period 
> >> > of 10 days we are going to arrange for its propagation to testing, 
> >> > given that no problems are reported.
> >> 
> >> Thanks a lot Jurij.
> >> 
> >> Just FYI I also pushed an ext4 fix into the SILO tree yesterday
> >> after I received positive feedback from a bug reporter.
> > 
> > This fix is included in the latest uploaded version as well.
> 
> Excellent.

The new silo has propagated to testing and should be used in this 
installer image:

http://cdimage.debian.org/cdimage/daily-builds/daily/arch-latest/sparc/iso-cd/debian-testing-sparc-netinst.iso

If you could give it a try to confirm that it now boots successfully 
on your machine, it would be appreciated.

Thanks.
David Miller Sept. 7, 2012, 4:30 p.m. UTC | #6
From: Jurij Smakov <jurij@wooyd.org>
Date: Fri, 7 Sep 2012 09:33:58 +0100

> If you could give it a try to confirm that it now boots successfully 
> on your machine, it would be appreciated.

I'm 3600 miles away from the machine for the next few months so
this isn't practical, sorry.
Patch

diff --git a/include/silo.h b/include/silo.h
index fe5adcb..94d6e31 100644
--- a/include/silo.h
+++ b/include/silo.h
@@ -125,6 +125,7 @@  int strtol (const char *, char **, int);
 int decompress (char *, char *, unsigned char (*)(void), void (*)(void));
 /* main.c */
 extern enum arch architecture;
+extern int sun4v_cpu;
 /* timer.c */
 int init_timer ();
 void close_timer ();
diff --git a/second/main.c b/second/main.c
index 182b263..a45807d 100644
--- a/second/main.c
+++ b/second/main.c
@@ -64,6 +64,7 @@  enum {
     CMD_LS
 } load_cmd;
 enum arch architecture;
+int sun4v_cpu;
 static int timer_status = 0;
 static char *initrd_start;
 static int initrd_size;
diff --git a/second/misc.c b/second/misc.c
index 163738e..d6bcdb1 100644
--- a/second/misc.c
+++ b/second/misc.c
@@ -517,8 +517,10 @@  enum arch silo_get_architecture(void)
 	return sun4d;
     case 'e':
 	return sun4e;
-    case 'u':
     case 'v':
+	sun4v_cpu = 1;
+	/* FALLTHRU */
+    case 'u':
 	return sun4u;
     default:
     	for(i = 0; i < NUM_SUN_MACHINES; i++)
diff --git a/second/timer.c b/second/timer.c
index 51e928e..7f03996 100644
--- a/second/timer.c
+++ b/second/timer.c
@@ -156,7 +156,7 @@  static inline int sun4u_init_timer ()
     }
     if (!foundcpu || !clock_frequency)
         clock_frequency = prom_getint(prom_root_node, "clock-frequency") / 100;
-    if (notimer) {
+    if (notimer && !sun4v_cpu) {
         sun4u_notimer = 1;
         __asm__ __volatile__ ("\t"
         	"rd	%%tick_cmpr, %%g1\n\t"