Message ID | 1326691403.13517.21.camel@minggr |
---|---|
State | Not Applicable |
Delegated to: | David Miller |
Headers | show |
On Sun, Jan 15, 2012 at 9:23 PM, Lin Ming <ming.m.lin@intel.com> wrote: > > Set SATA mode to IDE on my machine can reproduce this problem. > The cause is that ata port async suspend was not enabled yet. Yup. Your patch helps. But why don't we fix the silly legacy mode thing? Yeah, Apple's firmware is crap and doesn't tend to initialize things, but still - we should have at least an option to not use the legacy PIIX mode just because the chipset wasn't programmed for AHCI by the bootup. That would also help with people who have a BIOS that has the option, but where the person hadn't noticed, and left it in legacy mode. The IO ports are all there, the *only* reason we don't use AHCI seems to be that the firmware didn't turn off the legacy bit, so it doesn't show as an AHCI interface. That's kind of sad. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 16, 2012 at 11:25 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > But why don't we fix the silly legacy mode thing? Yeah, Apple's > firmware is crap and doesn't tend to initialize things, but still - we > should have at least an option to not use the legacy PIIX mode just > because the chipset wasn't programmed for AHCI by the bootup. Here's the lspci output for that device, in case anybody cares: 00:1f.2 IDE interface [0101]: Intel Corporation 6 Series/C200 Series Chipset Family 4 port SATA IDE Controller [8086:1c01] (rev 05) (prog-if 8f [Master SecP SecO PriP PriO]) Subsystem: Intel Corporation Device [8086:7270] Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0 Interrupt: pin B routed to IRQ 19 Region 0: I/O ports at 2168 [size=8] Region 1: I/O ports at 217c [size=4] Region 2: I/O ports at 2160 [size=8] Region 3: I/O ports at 2178 [size=4] Region 4: I/O ports at 2060 [size=16] Region 5: I/O ports at ffe0 [size=16] Capabilities: <access denied> Kernel driver in use: ata_piix 00: 86 80 01 1c 07 00 b0 02 05 8f 01 01 00 00 00 00 10: 69 21 00 00 7d 21 00 00 61 21 00 00 79 21 00 00 20: 61 20 00 00 e1 ff 00 00 00 00 00 00 86 80 70 72 30: 00 00 00 00 70 00 00 00 00 00 00 00 0b 02 00 00 and I do think that resume used to be faster in v3.2 than it is even with your patch. But maybe that's some rose-colored glasses. I'll recompile an old kernel to check. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> The IO ports are all there, the *only* reason we don't use AHCI seems > to be that the firmware didn't turn off the legacy bit, so it doesn't > show as an AHCI interface. That's kind of sad. There are two sets of problems with that The first is that quite a few machines crap themselves if you do this because nobody has ever tested things like the IRQ routing or the firmware on suspend/resume when they find the hardwared controller has gone for a walk. Putting it back on suspend might help in some cases but then you get to pick your way through the documentation minefield, deal with device changes while in non AHCI mode, figure out how to set up registers the BIOS didn't etc. Would be a good project for someone with a lot of patience anyway as the AHCI mode is also 64bit capable and *much* faster overall as it doesn't keep stalling the CPU poking imaginary registers and turning them into messages on the SATA bus. Alan -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 16, 2012 at 11:42 AM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: > > There are two sets of problems with that > > The first is that quite a few machines crap themselves if you do this > because nobody has ever tested things like the IRQ routing or the > firmware on suspend/resume when they find the hardwared controller has > gone for a walk. I think this could trivially be solved at least on these kinds of Apple machines by just making it a config option, and just not doing it if it doesn't work. Some kernel command line to say "ahci=force" or whatever. That doesn't solve the "user never even realized" problem, but at least it gives the user the possibility of solving the "crap firmware doesn't even allow this". > Putting it back on suspend might help in some cases but then you get to > pick your way through the documentation minefield, deal with device > changes while in non AHCI mode, figure out how to set up registers the > BIOS didn't etc. Yeah, no, that would just be horrible. I doubt it happens in practice, though. I bet the normal PCI/AHCI resume will just do the right thing. But I haven't tried.. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 16 Jan 2012 11:47:12 -0800 Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Mon, Jan 16, 2012 at 11:42 AM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: > > > > There are two sets of problems with that > > > > The first is that quite a few machines crap themselves if you do this > > because nobody has ever tested things like the IRQ routing or the > > firmware on suspend/resume when they find the hardwared controller has > > gone for a walk. > > I think this could trivially be solved at least on these kinds of > Apple machines by just making it a config option, and just not doing > it if it doesn't work. Some kernel command line to say "ahci=force" or > whatever. There are patches for this from way back including a white list of devices that were tested Just ping Matthew about http://www.codon.org.uk/~mjg59/tmp/ahci_quirk_cleanup.diff > That doesn't solve the "user never even realized" problem, but at > least it gives the user the possibility of solving the "crap firmware > doesn't even allow this". For the "memory found over 4GB" case it may even be worth printing a warning. > > Putting it back on suspend might help in some cases but then you get to > > pick your way through the documentation minefield, deal with device > > changes while in non AHCI mode, figure out how to set up registers the > > BIOS didn't etc. > > Yeah, no, that would just be horrible. I doubt it happens in practice, > though. I bet the normal PCI/AHCI resume will just do the right > thing. But I haven't tried.. Some BIOSes go off and try and re-unlock the drive with the bios password etc. They get most unhappy when the controller isn't in the mode they expected. Matthew used a white list for some he tested which didn't have problems. Most netbooks seem ok with it. Alan -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 16, 2012 at 11:29 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > and I do think that resume used to be faster in v3.2 than it is even > with your patch. But maybe that's some rose-colored glasses. > > I'll recompile an old kernel to check. Confirmed. Plain v3.2 really *is* faster than current git, even current git with your patch. In v3.2 I get this: [ 92.035600] ata_piix 0000:00:1f.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19 [ 92.035610] ata_piix 0000:00:1f.2: setting latency timer to 64 [ 92.036213] sd 0:0:0:0: [sda] Starting disk [ 93.060471] ata2.00: failed to resume link (SControl 0) [ 93.379963] ata1.01: failed to resume link (SControl 0) [ 93.535802] ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 93.535815] ata1.01: SATA link down (SStatus 0 SControl 0) [ 93.543968] ata1.00: ACPI cmd ef/03:46:00:00:00:a0 (SET FEATURES) filtered out [ 93.544747] ata1.00: configured for UDMA/100 [ 93.916453] PM: resume of devices complete after 1883.974 msecs [ 93.916677] PM: Finishing wakeup. [ 94.086800] ata2.01: failed to resume link (SControl 0) [ 94.098408] ata2.00: SATA link down (SStatus 4 SControl 0) [ 94.098429] ata2.01: SATA link down (SStatus 0 SControl 0) while current git (with your patch) gives me [ 108.115373] ata_piix 0000:00:1f.2: setting latency timer to 64 [ 109.142010] ata2.00: failed to resume link (SControl 0) [ 109.462004] ata1.01: failed to resume link (SControl 0) [ 109.618065] ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 109.618078] ata1.01: SATA link down (SStatus 0 SControl 0) [ 109.626242] ata1.00: ACPI cmd ef/03:46:00:00:00:a0 (SET FEATURES) filtered out [ 109.627060] ata1.00: configured for UDMA/100 [ 109.627240] sd 0:0:0:0: [sda] Starting disk [ 110.170015] ata2.01: failed to resume link (SControl 0) [ 110.181480] ata2.00: SATA link down (SStatus 4 SControl 0) [ 110.181496] ata2.01: SATA link down (SStatus 0 SControl 0) [ 110.182124] PM: resume of devices complete after 2066.945 msecs [ 110.224533] PM: Finishing wakeup. so old kernels used to be a tiny bit faster despite not doing that async thing (still slower than I'd like: I'd think that we should be able to resume devices in less than a second, but I don't know where all the time goes) Linus -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 16, 2012 at 11:54 AM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: > > Just ping Matthew about > > http://www.codon.org.uk/~mjg59/tmp/ahci_quirk_cleanup.diff Hmm. This with just a setup option to enable it might well be good enough. No need for whitelists/blacklists, since we couldn't maintain those sanely anyway. Matthew? Googling shows that there's a few old bugreports for this too (big surprise: macs): https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/96692 and various other ones.. Linus Linus -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 16, 2012 at 12:02 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: >> >> http://www.codon.org.uk/~mjg59/tmp/ahci_quirk_cleanup.diff > > Hmm. This with just a setup option to enable it might well be good > enough. No need for whitelists/blacklists, since we couldn't maintain > those sanely anyway. That patch doesn't work for me. I get a panic in ahci_enable_ahci, looks like 'mmio' is bogus there (attempting to dereference 0x01ffe6). I notice that the PIIX device doesn't have any MMIO at all, it's all PIO. Maybe there is some "need to enable MMIO thing" too. Or maybe I used the wrong PCI ID when I added it to the quirk table. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/16/2012 03:21 PM, Linus Torvalds wrote: > On Mon, Jan 16, 2012 at 12:02 PM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: >>> >>> http://www.codon.org.uk/~mjg59/tmp/ahci_quirk_cleanup.diff >> >> Hmm. This with just a setup option to enable it might well be good >> enough. No need for whitelists/blacklists, since we couldn't maintain >> those sanely anyway. > > That patch doesn't work for me. I get a panic in ahci_enable_ahci, > looks like 'mmio' is bogus there (attempting to dereference 0x01ffe6). > > I notice that the PIIX device doesn't have any MMIO at all, it's all > PIO. Maybe there is some "need to enable MMIO thing" too. Or maybe I > used the wrong PCI ID when I added it to the quirk table. Part of the problem with force-enable is that the MMIO BAR may need a value, and not have it. That, and an expectations mismatch between BIOS and kernel WRT mode (IDE/AHCI) has always been the reason why 'ahci=force' never made it in. It just never seemed to be reliable broadly. I'm happy with anything we can get working reliably, even with the obvious proviso that 'ahci=force' would be a default-off, user-enabled option. It's always been motherboard/BIOS issues that got in the way. Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 16, 2012 at 12:02:32PM -0800, Linus Torvalds wrote: > On Mon, Jan 16, 2012 at 11:54 AM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: > > > > Just ping Matthew about > > > > http://www.codon.org.uk/~mjg59/tmp/ahci_quirk_cleanup.diff > > Hmm. This with just a setup option to enable it might well be good > enough. No need for whitelists/blacklists, since we couldn't maintain > those sanely anyway. > > Matthew? Last time I tested it it wasn't dealing well with setting up combination mode for devices with optical media, and I don't have much hardware lying around which still uses piix. But if I can find some time I'll dig something up and see if it still works. Or, if anyone else wants to do the same... > Googling shows that there's a few old bugreports for this too (big > surprise: macs): Macs will come up with ahci if you boot via EFI rather than the BIOS compatibility module.
On Mon, Jan 16, 2012 at 1:26 PM, Matthew Garrett <mjg59@srcf.ucam.org> wrote: > > Macs will come up with ahci if you boot via EFI rather than the BIOS > compatibility module. Apparently not this one. Or there is something else subtly wrong. I use rEFIt to boot Linux on this thing, which should mean that it booted as an EFI image, even if the kernel itself is not natively using EFI (or even built to care). But maybe rEFIt does something else than I expected it to do, so who knows.. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 16, 2012 at 01:34:53PM -0800, Linus Torvalds wrote: > On Mon, Jan 16, 2012 at 1:26 PM, Matthew Garrett <mjg59@srcf.ucam.org> wrote: > > > > Macs will come up with ahci if you boot via EFI rather than the BIOS > > compatibility module. > > Apparently not this one. Or there is something else subtly wrong. I > use rEFIt to boot Linux on this thing, which should mean that it > booted as an EFI image, even if the kernel itself is not natively > using EFI (or even built to care). refit is an EFI app, but it's usually used to trigger a BIOS-based boot.
> Part of the problem with force-enable is that the MMIO BAR may need a > value, and not have it. That, and an expectations mismatch between BIOS That's fine - we can happily assign one at boot time. Alan -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 16, 2012 at 3:54 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: >> Part of the problem with force-enable is that the MMIO BAR may need a >> value, and not have it. That, and an expectations mismatch between BIOS > > That's fine - we can happily assign one at boot time. Actually, I think the reason that Matthews patch doesn't work for me is that on my device, BAR #5 really is a _port_ BAR. I didn't play with it much - busy merging and looking at various other issues - but I'm starting to wonder whether maybe that 8086:1c01 chip doesn't support AHCI at all. Or maybe it does something differently. Other reports of this have BAR#5 either clear, or an MMIO BAR. That Region 5: I/O ports at ffe0 [size=16] looks really odd. Matthew's patch uses pci_assign_resource(pdev, 5); but if it is a PIO region, that won't help anything.. That may be why it oopses for me - the AHCI driver is trying to use a PIO address as an MMIO one with readl(). Linus -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 16, 2012 at 04:02:31PM -0800, Linus Torvalds wrote: > On Mon, Jan 16, 2012 at 3:54 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: > >> Part of the problem with force-enable is that the MMIO BAR may need a > >> value, and not have it. That, and an expectations mismatch between BIOS > > > > That's fine - we can happily assign one at boot time. > > Actually, I think the reason that Matthews patch doesn't work for me > is that on my device, BAR #5 really is a _port_ BAR. > > I didn't play with it much - busy merging and looking at various other > issues - but I'm starting to wonder whether maybe that 8086:1c01 chip > doesn't support AHCI at all. Or maybe it does something differently. > > Other reports of this have BAR#5 either clear, or an MMIO BAR. That > > Region 5: I/O ports at ffe0 [size=16] > > looks really odd. Matthew's patch uses > > pci_assign_resource(pdev, 5); > > but if it is a PIO region, that won't help anything.. More recent ICHs may need different setup here. The datasheet should have enough to figure it out, but I can take a look once I get back from LCA.
On Mon, 2012-01-16 at 11:55 -0800, Linus Torvalds wrote: > On Mon, Jan 16, 2012 at 11:29 AM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > and I do think that resume used to be faster in v3.2 than it is even > > with your patch. But maybe that's some rose-colored glasses. > > > > I'll recompile an old kernel to check. > > Confirmed. > > Plain v3.2 really *is* faster than current git, even current git with > your patch. > > In v3.2 I get this: > > [ 92.035600] ata_piix 0000:00:1f.2: PCI INT B -> GSI 19 (level, > low) -> IRQ 19 > [ 92.035610] ata_piix 0000:00:1f.2: setting latency timer to 64 > [ 92.036213] sd 0:0:0:0: [sda] Starting disk > [ 93.060471] ata2.00: failed to resume link (SControl 0) > [ 93.379963] ata1.01: failed to resume link (SControl 0) > [ 93.535802] ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 300) > [ 93.535815] ata1.01: SATA link down (SStatus 0 SControl 0) > [ 93.543968] ata1.00: ACPI cmd ef/03:46:00:00:00:a0 (SET FEATURES) > filtered out > [ 93.544747] ata1.00: configured for UDMA/100 > [ 93.916453] PM: resume of devices complete after 1883.974 msecs > [ 93.916677] PM: Finishing wakeup. > [ 94.086800] ata2.01: failed to resume link (SControl 0) > [ 94.098408] ata2.00: SATA link down (SStatus 4 SControl 0) > [ 94.098429] ata2.01: SATA link down (SStatus 0 SControl 0) > > while current git (with your patch) gives me > > [ 108.115373] ata_piix 0000:00:1f.2: setting latency timer to 64 > [ 109.142010] ata2.00: failed to resume link (SControl 0) > [ 109.462004] ata1.01: failed to resume link (SControl 0) > [ 109.618065] ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 300) > [ 109.618078] ata1.01: SATA link down (SStatus 0 SControl 0) > [ 109.626242] ata1.00: ACPI cmd ef/03:46:00:00:00:a0 (SET FEATURES) > filtered out > [ 109.627060] ata1.00: configured for UDMA/100 > [ 109.627240] sd 0:0:0:0: [sda] Starting disk > [ 110.170015] ata2.01: failed to resume link (SControl 0) > [ 110.181480] ata2.00: SATA link down (SStatus 4 SControl 0) > [ 110.181496] ata2.01: SATA link down (SStatus 0 SControl 0) > [ 110.182124] PM: resume of devices complete after 2066.945 msecs > [ 110.224533] PM: Finishing wakeup. > > so old kernels used to be a tiny bit faster despite not doing that > async thing (still slower than I'd like: I'd think that we should be No, old kernel did that async thing. Old kernel embedded the ata port suspend/resume code in host suspend/resume code and host has async suspend enabled. I split the ata port suspend/resume code out. ata: add ata port system PM callbacks http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=5ef4108 But ata port async suspend was not enabled in above patch. So I think this is a regression. Thanks, Lin Ming > able to resume devices in less than a second, but I don't know where > all the time goes) > > Linus -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 16, 2012 at 9:16 PM, Lin Ming <ming.m.lin@intel.com> wrote: > > But ata port async suspend was not enabled in above patch. > So I think this is a regression. Ok. Jeff? Should I take that patch directly, or will you take it? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/17/2012 12:19 AM, Linus Torvalds wrote: > On Mon, Jan 16, 2012 at 9:16 PM, Lin Ming<ming.m.lin@intel.com> wrote: >> >> But ata port async suspend was not enabled in above patch. >> So I think this is a regression. > > Ok. Jeff? Should I take that patch directly, or will you take it? Coming today, with at least one other change... still keeping an eye out for possible PM problems on older non-AHCI SATA hardware. Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/16/2012 02:55 PM, Linus Torvalds wrote: > so old kernels used to be a tiny bit faster despite not doing that > async thing (still slower than I'd like: I'd think that we should be > able to resume devices in less than a second, but I don't know where > all the time goes) In libata specifically, we remain peppered liberally with a bunch of msleep()'s all over the place, particularly when dealing with SATA phys. 10ms here, 20ms there, 200ms in sata_link_resume(), it all adds up when you're talking about fast boot. For most hardware, we can scan ports in parallel, which gets you past the biggest annoyances, but not all of them. Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/ata/libata-transport.c b/drivers/ata/libata-transport.c index 9a7f0ea..74aaee3 100644 --- a/drivers/ata/libata-transport.c +++ b/drivers/ata/libata-transport.c @@ -291,6 +291,7 @@ int ata_tport_add(struct device *parent, goto tport_err; } + device_enable_async_suspend(dev); pm_runtime_set_active(dev); pm_runtime_enable(dev);