Patchwork Re: ZILOG serial port broken in 2.6.32

Submitter Rob Landley
Date Dec. 8, 2009, 12:42 p.m.
Message ID <200912080642.52103.rob@landley.net>
Permalink /patch/40619/
State Superseded

Comments

Rob Landley - Dec. 8, 2009, 12:42 p.m.
On Sunday 06 December 2009 19:10:48 Benjamin Herrenschmidt wrote:
> On Sun, 2009-12-06 at 01:01 -0600, Rob Landley wrote:
> > Trying again with a few likely-looking cc's from the MAINTAINERS file:
> >
> > Summary:
> >
> > The PMACZILOG serial driver last worked in 2.6.28.  It was broken by
> > commit f751928e0ddf54ea4fe5546f35e99efc5b5d9938 by Alan Cox making bits
> > of the tty layer dynamically allocated.  The PMACZILOG driver wasn't
> > properly converted, it works with interrupts disabled (for boot
> > messages), but as soon as interrupts are enabled (PID 1 spawns) the next
> > write to the serial console panics the kernel.
>
> Ah looks like I missed that... I'll dig. Thanks for the report.
>
> Cheers,
> Ben.

Ok, here's the fix.  It's not the _right_ fix, but it Works For Me (tm) and I'll
leave it to you guys to figure out what this _means_:

Signed-off-by: Rob Landley <rob@landley.net>


That one line workaround makes the panic go away, and things seem to work fine from there.

I note that the pmac_zilog.c function pmz_receive_chars() has the following chunk:

        /* Sanity check, make sure the old bug is no longer happening */
        if (uap->port.state == NULL || uap->port.state->port.tty == NULL) {
                WARN_ON(1);
                (void)read_zsdata(uap);
                return NULL;
        }

Which doesn't catch this, because it's the write code path (not the read code path) that's
running into this.
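
To make the shape of the workaround concrete, here is a minimal userspace sketch of the same NULL guard applied on the wakeup (write) side. It is not kernel code: the struct definitions and the tty_wakeup() below are hypothetical stand-ins that only count calls, the function name uart_tasklet_action_guarded() and its return value are invented for illustration, and the real uart_tasklet_action() receives its state as an unsigned long rather than a pointer.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; NOT the real definitions. */
struct tty_struct { int wakeups; };
struct tty_port   { struct tty_struct *tty; };
struct uart_state { struct tty_port port; };

/* Mock of tty_wakeup(): it only counts calls.  Like the real function,
 * it dereferences its argument, so a NULL tty would crash here. */
static void tty_wakeup(struct tty_struct *tty)
{
	tty->wakeups++;
}

/* Mirrors the patched tasklet action: skip the wakeup when no tty is
 * attached.  Returns 1 if the wakeup was delivered and 0 if it was
 * skipped (the return value exists only for this sketch). */
static int uart_tasklet_action_guarded(struct uart_state *state)
{
	if (state->port.tty == NULL)
		return 0;
	tty_wakeup(state->port.tty);
	return 1;
}
```

With port.tty left NULL the guarded function returns without touching the tty; once a tty is attached the wakeup goes through. This only hides the symptom, and it says nothing about why the tty pointer is NULL when the tasklet runs.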

Rob
Benjamin Herrenschmidt - Jan. 8, 2010, 3 a.m.
> Ok, here's the fix.  It's not the _right_ fix, but it Works For Me (tm) and I'll
> leave it to you guys to figure out what this _means_:

I've failed to reproduce so far on both a Wallstreet powerbook (similar
generation and chipset as your beige G3) and a G5 with an added serial
port using current upstream...

Can you verify it's still there ? I might be able to reproduce on a
Beige G3 as well next week.

Cheers,
Ben.

> Signed-off-by: Rob Landley <rob@landley.net>
> 
> diff -ru build/packages/linux/drivers/serial/serial_core.c build/packages/linux2/drivers/serial/serial_core.c
> --- build/packages/linux/drivers/serial/serial_core.c	2009-12-02 21:51:21.000000000 -0600
> +++ build/packages/linux2/drivers/serial/serial_core.c	2009-12-08 06:17:06.000000000 -0600
> @@ -113,7 +113,7 @@
>  static void uart_tasklet_action(unsigned long data)
>  {
>  	struct uart_state *state = (struct uart_state *)data;
> -	tty_wakeup(state->port.tty);
> +	if (state->port.tty) tty_wakeup(state->port.tty);
>  }
>  
>  static inline void
> 
> That one line workaround makes the panic go away, and things seem to work fine from there.
> 
> I note that the pmac_zilog.c function pmz_receive_chars() has the following chunk:
> 
>         /* Sanity check, make sure the old bug is no longer happening */
>         if (uap->port.state == NULL || uap->port.state->port.tty == NULL) {
>                 WARN_ON(1);
>                 (void)read_zsdata(uap);
>                 return NULL;
>         }
> 
> Which doesn't catch this because it's the write code path (not the read code path) that's running into 
> this.
> 
> Rob
Rob Landley - Jan. 9, 2010, 8:17 a.m.
On Thursday 07 January 2010 21:00:43 Benjamin Herrenschmidt wrote:
> > Ok, here's the fix.  It's not the _right_ fix, but it Works For Me (tm)
> > and I'll leave it to you guys to figure out what this _means_:
>
> I've failed to reproduce so far on both a Wallstreet powerbook (similar
> generation and chipset as your beige G3) and a G5 with an added serial
> port using current upstream...
>
> Can you verify it's still there ? I might be able to reproduce on a
> Beige G3 as well next week.

It's still there on qemu 0.11.0's "g3beige" emulation when you use 
CONFIG_SERIAL_PMACZILOG as the serial console.  (QEMU 0.10.x used a 16550 
serial chip for its g3beige emulation instead of the actual ZILOG one.)  Still 
dunno if it's a qemu bug or a kernel bug, I just know that kernel patch 
fixes it for me, and it comes back without the patch.

I tested 2.6.32.  Haven't tried 2.6.32.3 but don't see why it would change 
this...

Rob
Benjamin Herrenschmidt - Jan. 11, 2010, 3:02 a.m.
On Sat, 2010-01-09 at 02:17 -0600, Rob Landley wrote:
> On Thursday 07 January 2010 21:00:43 Benjamin Herrenschmidt wrote:
> > > Ok, here's the fix.  It's not the _right_ fix, but it Works For Me (tm)
> > > and I'll leave it to you guys to figure out what this _means_:
> >
> > I've failed to reproduce so far on both a Wallstreet powerbook (similar
> > generation and chipset as your beige G3) and a G5 with an added serial
> > port using current upstream...
> >
> > Can you verify it's still there ? I might be able to reproduce on a
> > Beige G3 as well next week.
> 
> It's still there on qemu 0.11.0's "g3beige" emulation when you use 
> CONFIG_SERIAL_PMACZILOG as the serial console.  (QEMU 0.10.x used a 16550 
> serial chip for its g3beige emulation instead of the actual ZILOG one.)  Still 
> dunno if it's a qemu bug or a kernel bug, I just know that kernel patch 
> fixes it for me, and it comes back without the patch.
> 
> I tested 2.6.32.  Haven't tried the 2.6.32.3 but don't see why it would change 
> this...

Ok so I compiled qemu and things are a bit strange.

How do you get the output of both channels of the serial port with it ?

If I use -nographic, what happens is that OpenBIOS, for some reason,
tells qemu that the console is on the second channel of the ESCC.

I see my kernel messages in the console if I do console=ttyPZ0 but the
debug stuff goes where udbg initializes it, which is where OpenBIOS says
the FW console is, which is channel B and I don't know how to "see" that
with qemu.

I do see it crash due to a message from the kernel but I can't get into
xmon which is a pain.

If I modify the kernel to force udbg on channel A (same channel as the
console), then the problem doesn't appear (it doesn't crash) :-)

Cheers
Ben.
Rob Landley - Jan. 11, 2010, 7:22 a.m.
On Sunday 10 January 2010 21:02:16 Benjamin Herrenschmidt wrote:
> On Sat, 2010-01-09 at 02:17 -0600, Rob Landley wrote:
> > On Thursday 07 January 2010 21:00:43 Benjamin Herrenschmidt wrote:
> > > > Ok, here's the fix.  It's not the _right_ fix, but it Works For Me
> > > > (tm) and I'll leave it to you guys to figure out what this _means_:
> > >
> > > I've failed to reproduce so far on both a Wallstreet powerbook (similar
> > > generation and chipset as your beige G3) and a G5 with an added serial
> > > port using current upstream...
> > >
> > > Can you verify it's still there ? I might be able to reproduce on a
> > > Beige G3 as well next week.
> >
> > It's still there on qemu 0.11.0's "g3beige" emulation when you use
> > CONFIG_SERIAL_PMACZILOG as the serial console.  (QEMU 0.10.x used a 16550
> > serial chip for its g3beige emulation instead of the actual ZILOG one.) 
> > Still dunno if it's a qemu bug or a kernel bug, I just know that
> > kernel patch fixes it for me, and it comes back without the patch.
> >
> > I tested 2.6.32.  Haven't tried the 2.6.32.3 but don't see why it would
> > change this...
>
> Ok so I compiled qemu and things are a bit strange.
>
> How do you get the output of both channels of the serial port with it ?
>
> If I use -nographic, what happens is that OpenBIOS, for some reason,
> tells qemu that the console is on the second channel of the ESCC.

Instead of "-nographic", could you try "-serial stdio"?

> I see my kernel messages in the console if I do console=ttyPZ0 but the
> debug stuff goes where udbg initializes it, which is where OpenBIOS says
> the FW console is, which is channel B and I don't know how to "see" that
> with qemu.

I'm just trying to get a serial console, which is why I'm booting the sucker 
with:

qemu-system-ppc -M g3beige -nographic -no-reboot -kernel zImage-powerpc \
  -hda image-powerpc.sqf \
  -append "root=/dev/hda rw init=/usr/sbin/init.sh panic=1 PATH=/usr/bin console=ttyS0"

I didn't even know there were more debug messages...

I have CONFIG_SERIAL_PMACZILOG_TTYS=y of course:

pmac_zilog: 0.6 (Benjamin Herrenschmidt <benh@kernel.crashing.org>)
ttyS0 at MMIO 0x80813020 (irq = 16) is a Z85c30 ESCC - Serial port
ttyS1 at MMIO 0x80813000 (irq = 17) is a Z85c30 ESCC - Serial port

CONFIG_SERIO=y
CONFIG_SERIAL_PMACZILOG=y
CONFIG_SERIAL_PMACZILOG_TTYS=y
CONFIG_SERIAL_PMACZILOG_CONSOLE=y

> I do see it crash due to a message from the kernel but I can't get into
> xmon which is a pain.

Does the -serial stdio thing help?

(I know that to switch between screens in the qemu x11 window, it's ctrl-alt-number, 
so ctrl-alt-1, ctrl-alt-2, and so on.  I really don't use 'em much, though.)

> If I modify the kernel to force udbg on channel A (same channel as the
> console), then the problem doesn't appear (it doesn't crash) :-)

You can attach gdb to qemu via the "qemu -s" option and then in gdb use the 
"target remote" stuff like you would with gdbserver.  It acts a bit like you've 
connected it to a jtag through openocd, if that helps...

(I know qemu has many, many options I don't really use much.)

> Cheers
> Ben.

Rob

Patch

diff -ru build/packages/linux/drivers/serial/serial_core.c build/packages/linux2/drivers/serial/serial_core.c
--- build/packages/linux/drivers/serial/serial_core.c	2009-12-02 21:51:21.000000000 -0600
+++ build/packages/linux2/drivers/serial/serial_core.c	2009-12-08 06:17:06.000000000 -0600
@@ -113,7 +113,7 @@ 
 static void uart_tasklet_action(unsigned long data)
 {
 	struct uart_state *state = (struct uart_state *)data;
-	tty_wakeup(state->port.tty);
+	if (state->port.tty) tty_wakeup(state->port.tty);
 }
 
 static inline void