Patchwork sata_sil boot failure with 2.6.35

Submitter Tejun Heo
Date Aug. 24, 2010, 1:52 p.m.
Message ID <4C73CE8A.406@kernel.org>
Permalink /patch/62594/
State Not Applicable
Delegated to: David Miller

Comments

Tejun Heo - Aug. 24, 2010, 1:52 p.m.
Hello,

On 08/23/2010 05:47 PM, Jan Beulich wrote:
>>>> On 23.08.10 at 15:58, Tejun Heo <tj@kernel.org> wrote:
>>   ata1: ST-ATA: DRQ=0 without device error, dev_stat 0x50
>>
>> which is _really_ weird.  Hmmm... can you please apply the following
>> patch and see whether anything changes?
> 
> No difference, but log attached again just in case.

I can't reproduce the problem w/ my 3114 and unfortunately I don't
have a 3112 or 3512 around.  :-(

The only thing I can think of is the following.  Can you please try
this one?  If this one fails, we might have to resort to bisecting.
It's very weird.  The sff common code is used by other controllers but
until now the problem is only being reported on sata_sil.

Thanks.
Tejun Heo - Aug. 24, 2010, 3:16 p.m.
On 08/24/2010 05:17 PM, Jan Beulich wrote:
>> The only thing I can think of is the following.  Can you please try
>> this one?  If this one fails, we might have to resort to bisecting.
>> It's very weird.  The sff common code is used by other controllers but
>> until now the problem is only being reported on sata_sil.
> 
> Sorry - again no (visible) difference.

Any chance I can persuade you into bisecting the problem? :-) Also,
can you please attach the output of lspci -nn and your .config?

Thanks.
Jan Beulich - Aug. 24, 2010, 3:17 p.m.
>>> On 24.08.10 at 15:52, Tejun Heo <tj@kernel.org> wrote:
> Hello,
> 
> On 08/23/2010 05:47 PM, Jan Beulich wrote:
>>>>> On 23.08.10 at 15:58, Tejun Heo <tj@kernel.org> wrote:
>>>   ata1: ST-ATA: DRQ=0 without device error, dev_stat 0x50
>>>
>>> which is _really_ weird.  Hmmm... can you please apply the following
>>> patch and see whether anything changes?
>> 
>> No difference, but log attached again just in case.
> 
> I can't reproduce the problem w/ my 3114 and unfortunately I don't
> have a 3112 or 3512 around.  :-(
> 
> The only thing I can think of is the following.  Can you please try
> this one?  If this one fails, we might have to resort to bisecting.
> It's very weird.  The sff common code is used by other controllers but
> until now the problem is only being reported on sata_sil.

Sorry - again no (visible) difference.

Jan
Jan Beulich - Aug. 25, 2010, 8:20 a.m.
>>> On 24.08.10 at 17:16, Tejun Heo <tj@kernel.org> wrote:
> On 08/24/2010 05:17 PM, Jan Beulich wrote:
>>> The only thing I can think of is the following.  Can you please try
>>> this one?  If this one fails, we might have to resort to bisecting.
>>> It's very weird.  The sff common code is used by other controllers but
>>> until now the problem is only being reported on sata_sil.
>> 
>> Sorry - again no (visible) difference.
> 
> Any chance I can persuade you into bisecting the problem? :-) Also,
> can you please attach the output of lspci -nn and your .config?

I'll see if I can find time, but I can give no estimate. However, the
box I see this on (AMD Sahara Rev F) likely has a number of
duplicates in our labs, so maybe you could try reproducing it there?

Jan
Tejun Heo - Aug. 30, 2010, 1:39 p.m.
Hello,

On 08/25/2010 10:20 AM, Jan Beulich wrote:
>> Any chance I can persuade you into bisecting the problem? :-) Also,
>> can you please attach the output of lspci -nn and your .config?
> 
> I'll see if I can find time, but I can give no estimate. However, the
> box I see this on (AMD Sahara Rev F) likely has a number of
> duplicates in our labs, so maybe you could try reproducing it there?

I just tried 3512 and 3112 but still can't reproduce the problem.
I'll see if I can get the same model hooked onto a remote console.

Thanks.
Tejun Heo - Aug. 30, 2010, 1:42 p.m.
On 08/30/2010 03:39 PM, Tejun Heo wrote:
> Hello,
> 
> On 08/25/2010 10:20 AM, Jan Beulich wrote:
>>> Any chance I can persuade you into bisecting the problem? :-) Also,
>>> can you please attach the output of lspci -nn and your .config?
>>
>> I'll see if I can find time, but I can give no estimate. However, the
>> box I see this on (AMD Sahara Rev F) likely has a number of
>> duplicates in our labs, so maybe you could try reproducing it there?
> 
> I just tried 3512 and 3112 but still can't reproduce the problem.
> I'll see if I can get the same model hooked onto a remote console.

Just one more thing before going forward with that.  Can you please
test the current libata-dev#upstream[1] and see whether the problem is still
there?  Also, can you please post your .config?

Thanks.

[1] git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev.git upstream
Jan Beulich - Aug. 31, 2010, 3:22 p.m.
>>> On 30.08.10 at 15:42, Tejun Heo <teheo@novell.com> wrote:
> Just one more thing before going forward with that.  Can you please
> test the current libata-dev#upstream[1] and see whether the problem is still
> there?  Also, can you please post your .config?

Both 2.6.36-rc3 and the upstream branch of libata-dev fail the same
way. Attaching my .config from the latter build.

Jan

Patch

diff --git a/drivers/ata/sata_sil.c b/drivers/ata/sata_sil.c
index 3cb69d5..35bd5cc 100644
--- a/drivers/ata/sata_sil.c
+++ b/drivers/ata/sata_sil.c
@@ -565,19 +565,6 @@  static void sil_freeze(struct ata_port *ap)
 	tmp |= SIL_MASK_IDE0_INT << ap->port_no;
 	writel(tmp, mmio_base + SIL_SYSCFG);
 	readl(mmio_base + SIL_SYSCFG);	/* flush */
-
-	/* Ensure DMA_ENABLE is off.
-	 *
-	 * This is because the controller will not give us access to the
-	 * taskfile registers while a DMA is in progress
-	 */
-	iowrite8(ioread8(ap->ioaddr.bmdma_addr) & ~SIL_DMA_ENABLE,
-		 ap->ioaddr.bmdma_addr);
-
-	/* According to ata_bmdma_stop, an HDMA transition requires
-	 * on PIO cycle. But we can't read a taskfile register.
-	 */
-	ioread8(ap->ioaddr.bmdma_addr);
 }

 static void sil_thaw(struct ata_port *ap)