Message ID | 20121023101751.GA24656@liondog.tnic |
---|---|
State | Not Applicable |
Delegated to: | David Miller |
Headers | show |
On Tue, Oct 23, 2012 at 12:17:51PM +0200, Borislav Petkov wrote: > On Tue, Oct 23, 2012 at 11:05:49AM +0100, Alan Cox wrote: > > > So yes, blacklisting it and verifying that your system still operates > > > normally would be something to do. If it does, you could also build a > > > kernel with pata_acpi disabled (that is, provided you build your own > > > kernels). > > > > The crash is still a bug. It needs chasing down. > > Yes, ata_timing_find_mode gives NULL, we found that out. > > Anton, Phillip, anyone willing to run this hunk below and get us the > output: > > --- > diff --git a/drivers/ata/pata_acpi.c b/drivers/ata/pata_acpi.c > index 09723b76beac..80d594d6e7c8 100644 > --- a/drivers/ata/pata_acpi.c > +++ b/drivers/ata/pata_acpi.c > @@ -144,6 +144,13 @@ static void pacpi_set_dmamode(struct ata_port *ap, struct ata_device *adev) > > /* Now stuff the nS values into the structure */ > t = ata_timing_find_mode(adev->dma_mode); > + if (!t) { > + pr_err("%s: ata_timing_find_mode gives NULL; adev->dma_mode: 0x%x\n", > + __func__, adev->dma_mode); > + > + return; > + } > + > if (adev->dma_mode >= XFER_UDMA_0) { > acpi->gtm.drive[unit].dma = t->udma; > acpi->gtm.flags |= (1 << (2 * unit)); > -- Ok, here's the output from Anton's box (https://bugzilla.kernel.org/show_bug.cgi?id=49151#c16): pacpi_set_dmamode: ata_timing_find_mode gives NULL; adev->dma_mode: 0x0 And in that case we should've matched XFER_PIO_SLOW but it is commented out for some reason. Judging by 70cd071e4ecc06c985189665af75c108601fd5a3, I think we should involve Tejun into this. @Tejun: so basically there are two people with oopses when using pata_acpi because pacpi_set_dmamode queries xfer mode with adev->dma_mode = 0x0 and ata_timing_find_mode returns NULL (presumably a BIOS bug, I'd say). More details if you follow the thread here: http://marc.info/?l=linux-ide&m=135073445731432 I don't know how to fix this since if BIOS gives xfer_mode 0, how can each pata controller find its max mode? Sure, we can always fall back to 0x0 but that is the slowest and not necessarily optimal. One other possibility could be to load chipset-specific drivers first and pata_acpi only as a last resort but I don't know whether this is doable at all. Thanks.
Hello В Tue, 23 Oct 2012 18:12:16 +0200 Borislav Petkov wrote: > @Tejun: so basically there are two people with oopses when using > pata_acpi Not two. We have two boxes in our office (about 20 boxes) and more then ten reports from our users (only from early adopters yet). It seems that this problem doesn't exists when someone build kernel for his own box, but will be triggered very often when 3.6 will be wide used distro kernel.
On 10/23/2012 11:17 AM, Borislav Petkov wrote: > --- > diff --git a/drivers/ata/pata_acpi.c b/drivers/ata/pata_acpi.c > index 09723b76beac..80d594d6e7c8 100644 > --- a/drivers/ata/pata_acpi.c > +++ b/drivers/ata/pata_acpi.c > @@ -144,6 +144,13 @@ static void pacpi_set_dmamode(struct ata_port *ap, struct ata_device *adev) > > /* Now stuff the nS values into the structure */ > t = ata_timing_find_mode(adev->dma_mode); > + if (!t) { > + pr_err("%s: ata_timing_find_mode gives NULL; adev->dma_mode: 0x%x\n", > + __func__, adev->dma_mode); > + > + return; > + } > + > if (adev->dma_mode >= XFER_UDMA_0) { > acpi->gtm.drive[unit].dma = t->udma; > acpi->gtm.flags |= (1 << (2 * unit)); > -- pacpi_set_dmamode: ata_timing_find_mode gives NULL; adev->dma_mode: 0x0 as well here if I build pata_acpi as a module, if I build it into the kernel I don't get any message. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 24 Oct 2012 10:28:42 +0100 Phillip Wood <phillip.wood@talktalk.net> wrote: > On 10/23/2012 11:17 AM, Borislav Petkov wrote: > > --- > > diff --git a/drivers/ata/pata_acpi.c b/drivers/ata/pata_acpi.c > > index 09723b76beac..80d594d6e7c8 100644 > > --- a/drivers/ata/pata_acpi.c > > +++ b/drivers/ata/pata_acpi.c > > @@ -144,6 +144,13 @@ static void pacpi_set_dmamode(struct ata_port *ap, struct ata_device *adev) > > > > /* Now stuff the nS values into the structure */ > > t = ata_timing_find_mode(adev->dma_mode); > > + if (!t) { > > + pr_err("%s: ata_timing_find_mode gives NULL; adev->dma_mode: 0x%x\n", > > + __func__, adev->dma_mode); > > + > > + return; > > + } > > + > > if (adev->dma_mode >= XFER_UDMA_0) { > > acpi->gtm.drive[unit].dma = t->udma; > > acpi->gtm.flags |= (1 << (2 * unit)); > > -- > > pacpi_set_dmamode: ata_timing_find_mode gives NULL; adev->dma_mode: 0x0 Which is an ATA layer bug - adev->dma_mode should never be called without a DMA mode in normal use. > as well here if I build pata_acpi as a module, if I build it into the > kernel I don't get any message. If you build the drivers into the kernel the link order ensures the generic drivers execute last so the native driver will already have been used. When loading modules it is expected that the distribution is smart enough to get this right. So the built in case is covering up the failure case because the code never gets executed, Alan -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Oct 24, 2012 at 11:57:46AM +0100, Alan Cox wrote: > Which is an ATA layer bug - adev->dma_mode should never be called > without a DMA mode in normal use. Ok, it looks like this would take a while to fix. Alan, what is your suggestion for a proper fix, uncomment XFER_PIO_SLOW and drop to it with a big warning that ACPI is giving botched information on those chips and people should try using the platform-specific driver if they want better/optimal speeds? Thanks.
On Sat, 3 Nov 2012 05:26:35 +0100 Borislav Petkov <bp@alien8.de> wrote: > On Wed, Oct 24, 2012 at 11:57:46AM +0100, Alan Cox wrote: > > Which is an ATA layer bug - adev->dma_mode should never be called > > without a DMA mode in normal use. > > Ok, it looks like this would take a while to fix. > > Alan, what is your suggestion for a proper fix, uncomment XFER_PIO_SLOW > and drop to it with a big warning that ACPI is giving botched No. The proper fix is to find out how it got called with no DMA mode set. There is no reason ACPI can't return pure PIO answers. If it does however then the DMA mode setting call should not be made by the core libata code. Lots of our ATA driver code relies upon that so if it's actually what is happening that is what needs fixing. Alan -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, 3 Nov 2012 05:26:35 +0100 Borislav Petkov <bp@alien8.de> wrote: > On Wed, Oct 24, 2012 at 11:57:46AM +0100, Alan Cox wrote: > > Which is an ATA layer bug - adev->dma_mode should never be called > > without a DMA mode in normal use. > > Ok, it looks like this would take a while to fix. So a 30 second glance says that the problem is that you seem to have dma_mode uninitialised as zero which is bogus. That means either ata_acpi_gtm_xfermask broke (it should have set the bits to 0xFF if no mode is found), ata_dma_enabled is broken, or pacpi_qc_issue got called before pacpi_port_start (which seems wildly unlikely) Needs someone to go and dump the relevant values in the right places and see what is breaking in the pata_acpi setup logic. Alan -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11/03/2012 12:48 PM, Alan Cox wrote: > On Sat, 3 Nov 2012 05:26:35 +0100 > Borislav Petkov <bp@alien8.de> wrote: > >> On Wed, Oct 24, 2012 at 11:57:46AM +0100, Alan Cox wrote: >>> Which is an ATA layer bug - adev->dma_mode should never be called >>> without a DMA mode in normal use. >> >> Ok, it looks like this would take a while to fix. > > So a 30 second glance says that the problem is that you seem to have > dma_mode uninitialised as zero which is bogus. > > That means either ata_acpi_gtm_xfermask broke (it should have set the > bits to 0xFF if no mode is found), ata_dma_enabled is broken, or > pacpi_qc_issue got called before pacpi_port_start (which seems wildly > unlikely) > > Needs someone to go and dump the relevant values in the right places and > see what is breaking in the pata_acpi setup logic. Agreed -- though the WARN_ONCE() will at least give us trivially better poops. Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/ata/pata_acpi.c b/drivers/ata/pata_acpi.c index 09723b76beac..80d594d6e7c8 100644 --- a/drivers/ata/pata_acpi.c +++ b/drivers/ata/pata_acpi.c @@ -144,6 +144,13 @@ static void pacpi_set_dmamode(struct ata_port *ap, struct ata_device *adev) /* Now stuff the nS values into the structure */ t = ata_timing_find_mode(adev->dma_mode); + if (!t) { + pr_err("%s: ata_timing_find_mode gives NULL; adev->dma_mode: 0x%x\n", + __func__, adev->dma_mode); + + return; + } + if (adev->dma_mode >= XFER_UDMA_0) { acpi->gtm.drive[unit].dma = t->udma; acpi->gtm.flags |= (1 << (2 * unit));