Patchwork cmd64x: irq 14: nobody cared - system is dreadfully slow

Submitter David Miller
Date June 22, 2009, 1:56 a.m.
Message ID <20090621.185621.31766784.davem@davemloft.net>
Permalink /patch/28963/
State Not Applicable
Delegated to: David Miller

Comments

David Miller - June 22, 2009, 1:56 a.m.
From: Frans Pop <elendil@planet.nl>
Date: Sun, 21 Jun 2009 14:46:37 +0200

> I tried the following additional patch, but unfortunately that did not help:
> --- a/drivers/ide/cmd64x.c
> +++ b/drivers/ide/cmd64x.c
> @@ -426,6 +426,7 @@ static const struct ide_port_info cmd64x_chipsets[]
>  		.port_ops	= &cmd64x_port_ops,
>  		.dma_ops	= &cmd648_dma_ops,
>  		.host_flags	= IDE_HFLAG_ABUSE_PREFETCH,
> +		.irq_flags	= IRQF_SHARED,
>  		.pio_mask	= ATA_PIO5,
>  		.mwdma_mask	= ATA_MWDMA2,
>  		.udma_mask	= ATA_UDMA2,
> 
> I got the idea for that from 255115fb and had hoped it would compensate
> for this change from Bart's commit:

That won't help.  All PCI IDE interfaces unconditionally set the
irq_flags to IRQF_SHARED.  This occurs in drivers/ide/setup-pci.c
via ide_pci_init_one(), which calls ide_pci_init_two(), which goes:

	host->irq_flags = IRQF_SHARED;

The key to this bug seems to be the setting of host->cur_port when the
interrupts arrive.  That's really the only major case where the IDE
driver interrupt handler returns without even reading the status
register, which is what would clear the interrupt.

That's why clearing the IDE_HFLAG_SERIALIZE flag makes the initial
bulk of unclearable interrupts go away.

I suspect that whatever is causing trouble due to IDE_HFLAG_SERIALIZE
is also, down the road, causing the hdd problem you still see.

Can you apply this debugging patch and print out the output?

Thanks!

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Frans Pop - June 22, 2009, 4:28 a.m.
On Monday 22 June 2009, David Miller wrote:
> Can you apply this debugging patch and print out the output?

I get completely flooded...
I get the debug output twice (ide0 and ide1 maybe?): the first time it
stops after a few hundred lines and driver init continues a bit, but the
second time it just goes on and on, and eventually I decided to reset the
box.

> diff --git a/drivers/ide/ide-io.c b/drivers/ide/ide-io.c
> index 1059f80..8992fda 100644
> --- a/drivers/ide/ide-io.c
> +++ b/drivers/ide/ide-io.c
> @@ -797,6 +797,25 @@ irqreturn_t ide_intr (int irq, void *dev_id)
>  	int plug_device = 0;
>  	struct request *uninitialized_var(rq_in_flight);
>
> +#if 1
> +	{
> +		static int times = 0;
> +
> +		if (++times <= 32)

Should this be >=32 maybe?

If you really do want to skip the first 32 and capture the rest, I can
still get it, but I'd have to split the printk as it's too wide for my 
serial console so part gets truncated.

> +			goto no_log;
> +
> +		printk(KERN_INFO "IDE-DEBUG: host->host_flags[0x%lx] "
> +		       "hwif(%p) host->cur_port(%p) "
> +		       "hwif->port_ops(%pS) hwif->handler(%pS) "
> +		       "hwif->polling(%d)\n",
> +		       host->host_flags, hwif, host->cur_port,
> +		       hwif->port_ops, hwif->handler, (int) hwif->polling);
> +
> +	no_log:
> +		;
> +	}
> +#endif
> +
>  	if (host->host_flags & IDE_HFLAG_SERIALIZE) {
>  		if (hwif != host->cur_port)
>  			goto out_early;

This is what I could get from the *second* series:
IDE-DEBUG: host->host_flags[0x10] hwif(fffff8003e326800) 
host->cur_port((null)) 
hwif->port_ops(cmd648_port_ops+0x0/0xfffffffffffffe54 [cmd64x]) 
hwif->hand

Rest is truncated. This first part seemed constant.
David Miller - June 22, 2009, 5:45 a.m.
From: Frans Pop <elendil@planet.nl>
Date: Mon, 22 Jun 2009 06:28:36 +0200

> On Monday 22 June 2009, David Miller wrote:
>> +#if 1
>> +	{
>> +		static int times = 0;
>> +
>> +		if (++times <= 32)
> 
> Should this be >=32 maybe?

Yes, it should be >= 32 :-)

> If you really do want to skip the first 32 and capture the rest, I can
> still get it, but I'd have to split the printk as it's too wide for my 
> serial console so part gets truncated.

Sure.

> This is what I could get from the *second* series:
> IDE-DEBUG: host->host_flags[0x10] hwif(fffff8003e326800) 
> host->cur_port((null)) 
> hwif->port_ops(cmd648_port_ops+0x0/0xfffffffffffffe54 [cmd64x]) 
> hwif->hand
> 
> Rest is truncated. This first part seemed constant.

The basic gist of the problem seems to be that when the chip is
brought up, the initial request_irq() forces a lingering interrupt
status in the chip to have somewhere to go.

But the IDE layer refuses to do anything to clear it because it's
not expecting any interrupts (this is the cur_port != hwif check).

Some things other than the commit in question have changed in the area
of interrupt and chip initialization and I'll try to go through
the commits to get some clues.

Thanks!

Frans Pop - June 22, 2009, 6:43 a.m.
On Monday 22 June 2009, David Miller wrote:
> Some things other than the commit in question have changed in the area
> of interrupt and chip initialization and I'll try to go through
> the commits to get some clues.

Great.

JFTR, here's the full output of your debug print. All 32 prints are
identical and the "nobody cared" is printed immediately after.
No really new info I guess...

This is on top of v2.6.30-7732-g2453d6f. I'll continue working on top of
current mainline. If you need anything done, just let me know.

cmd64x 0000:01:03.0: IDE controller (0x1095:0x0646 rev 0x03)
cmd64x 0000:01:03.0: 100% native mode on irq 14
    ide0: BM-DMA at 0x1fe02c00020-0x1fe02c00027
    ide1: BM-DMA at 0x1fe02c00028-0x1fe02c0002f
Probing IDE interface ide0...
hda: ST34342A, ATA DISK drive
hda: host max PIO5 wanted PIO255(auto-tune) selected PIO4
hda: MWDMA2 mode selected
Probing IDE interface ide1...
hdc: Maxtor 6E040L0, ATA DISK drive
hdd: CD-ROM 56X/AKH, ATAPI CD/DVD-ROM drive
hdc: host max PIO5 wanted PIO255(auto-tune) selected PIO4
hdc: MWDMA2 mode selected
hdd: host max PIO5 wanted PIO255(auto-tune) selected PIO4
hdd: bad DMA info in identify block
hdd: host max PIO5 wanted PIO255(auto-tune) selected PIO4
ide0 at 0x1fe02c00000-0x1fe02c00007,0x1fe02c0000a on irq 14
IDE-DEBUG: host->host_flags[0x10] hwif(fffff8003e393800) host->cur_port((null))
IDE-DEBUG: hwif->port_ops(cmd648_port_ops+0x0/0xfffffffffffffe54 [cmd64x])
IDE-DEBUG: hwif->handler((null)) hwif->polling(0)
David Miller - June 22, 2009, 6:44 a.m.
From: Frans Pop <elendil@planet.nl>
Date: Mon, 22 Jun 2009 08:43:13 +0200

> This is on top of v2.6.30-7732-g2453d6f. I'll continue working on top of
> current mainline. If you need anything done, just let me know.

Thanks for all of your help so far, Frans.

Patch

diff --git a/drivers/ide/ide-io.c b/drivers/ide/ide-io.c
index 1059f80..8992fda 100644
--- a/drivers/ide/ide-io.c
+++ b/drivers/ide/ide-io.c
@@ -797,6 +797,25 @@  irqreturn_t ide_intr (int irq, void *dev_id)
 	int plug_device = 0;
 	struct request *uninitialized_var(rq_in_flight);
 
+#if 1
+	{
+		static int times = 0;
+		
+		if (++times <= 32)
+			goto no_log;
+
+		printk(KERN_INFO "IDE-DEBUG: host->host_flags[0x%lx] "
+		       "hwif(%p) host->cur_port(%p) "
+		       "hwif->port_ops(%pS) hwif->handler(%pS) "
+		       "hwif->polling(%d)\n",
+		       host->host_flags, hwif, host->cur_port,
+		       hwif->port_ops, hwif->handler, (int) hwif->polling);
+
+	no_log:
+		;
+	}
+#endif
+
 	if (host->host_flags & IDE_HFLAG_SERIALIZE) {
 		if (hwif != host->cur_port)
 			goto out_early;