Message ID | 20210326124326.21163-1-pali@kernel.org |
---|---|
State | New |
Headers | show |
Series | PCI: Disallow retraining link for Atheros QCA98xx chips on non-Gen1 PCIe bridges | expand |
Pali Rohár <pali@kernel.org> writes: > Atheros QCA9880 and QCA9890 chips do not behave after a bus reset and also > after retrain link when PCIe bridge is not in GEN1 mode at 2.5 GT/s speed. > The device will throw a Link Down error and config space is not accessible > again. Retrain link can be called only when using GEN1 PCIe bridge or when > PCIe bridge has forced link speed to 2.5 GT/s via PCI_EXP_LNKCTL2 register. > > This issue was reproduced with more Compex WLE900VX cards (QCA9880 based) > on Armada 385 with pci-mvebu.c driver and also on Armada 3720 with > pci-aardvark.c driver. Also this issue was reproduced with some "noname" > card with QCA9890 WiFi chip on Armada 3720. All problematic cards with > these QCA chips have PCI device id 0x003c. > > Tests showed that other WiFi cards based on AR93xx (PCI device id 0x0030) > and AR9287 (PCI device id 0x002e) chips do not have these problems. > > To workaround this issue, this change introduces a new PCI quirk called > PCI_DEV_FLAGS_NO_RETRAIN_LINK_WHEN_NOT_GEN1 for PCI device id 0x003c. > > When this quirk is set then kernel disallows triggering PCI_EXP_LNKCTL_RL > bit in config space of PCIe Bridge in case PCIe Bridge is capable of higher > speed than 2.5 GT/s and higher speed is already allowed. When PCIe Bridge > has accessible LNKCTL2 register then kernel tries to force target link > speed via PCI_EXP_LNKCTL2_TLS* bits to 2.5 GT/s. After this change it is > possible to trigger PCI_EXP_LNKCTL_RL bit without causing issues on > problematic Atheros QCA98xx cards. > > Currently only PCIe ASPM kernel code triggers this PCI_EXP_LNKCTL_RL bit, > so quirk check is added only into pcie/aspm.c file. > > Signed-off-by: Pali Rohár <pali@kernel.org> > Reported-by: Toke Høiland-Jørgensen <toke@redhat.com> > Tested-by: Marek Behún <kabel@kernel.org> > Link: https://lore.kernel.org/linux-pci/87h7l8axqp.fsf@toke.dk/ > Cc: stable@vger.kernel.org # c80851f6ce63a ("PCI: Add > PCI_EXP_LNKCTL2_TLS* macros") Thanks! Tested-by: Toke Høiland-Jørgensen <toke@redhat.com>
Hi Pali, Thank you for sending the patch over! [...] > +static int pcie_change_tls_to_gen1(struct pci_dev *parent) Just a nitpick, so feel free to ignore it. I would just call the variable "dev" as we pass a pointer to a particular device, but it does not matter as much, so I am leaving this to you. [...] > + if (ret == 0) { You prefer this style over "if (!ret)"? Just asking in the view of the style that seem to be preferred in the code base at the moment. > + /* Verify that new value was really set */ > + pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, ®16); > + if ((reg16 & PCI_EXP_LNKCTL2_TLS) != PCI_EXP_LNKCTL2_TLS_2_5GT) > + ret = -EINVAL; I am wondering about this verification - did you have a case where the device would not properly set its capability, or accept the write and do nothing? > + if (ret != 0) I think "if (ret)" would be fine to use here, unless you prefer being more explicit. See my question about style above. > static bool pcie_retrain_link(struct pcie_link_state *link) > { > struct pci_dev *parent = link->pdev; > unsigned long end_jiffies; > u16 reg16; > + u32 reg32; > + > + /* Check if link is capable of higher speed than 2.5 GT/s and needs quirk */ > + pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, ®32); > + if ((reg32 & PCI_EXP_LNKCAP_SLS) > PCI_EXP_LNKCAP_SLS_2_5GB) { I wonder if moving this check to pcie_change_tls_to_gen1() would make more sense? It would then make this function a little cleaner. What do you think? [...] > +static void quirk_no_bus_reset_and_no_retrain_link(struct pci_dev *dev) > +{ > + dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET | PCI_DEV_FLAGS_NO_RETRAIN_LINK_WHEN_NOT_GEN1; > +} [...] I know that the style has been changed to allow 100 characters width and that checkpatch.pl now also does not warn about line length, as per commit bdc48fa11e46 ("checkpatch/coding-style: deprecate 80-column warning"), but I think Bjorn still prefers 80 characters, thus this line above might have to be aligned. Krzysztof
Hello! On Saturday 27 March 2021 01:14:10 Krzysztof Wilczyński wrote: > Hi Pali, > > Thank you for sending the patch over! > > [...] > > +static int pcie_change_tls_to_gen1(struct pci_dev *parent) > > Just a nitpick, so feel free to ignore it. I would just call the > variable "dev" as we pass a pointer to a particular device, but it does > not matter as much, so I am leaving this to you. I called it 'parent' because it is called 'parent' also in caller function. Link consists of two devices, so 'dev' could be ambiguous. > [...] > > + if (ret == 0) { > > You prefer this style over "if (!ret)"? Just asking in the view of the > style that seem to be preferred in the code base at the moment. I can change this to 'if (!ret)' if needed, no problem. I use 'if (!val)' mostly for boolean and pointer variables. If variable can contain more integer values then I lot of times I use '=='. > > + /* Verify that new value was really set */ > > + pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, ®16); > > + if ((reg16 & PCI_EXP_LNKCTL2_TLS) != PCI_EXP_LNKCTL2_TLS_2_5GT) > > + ret = -EINVAL; > > I am wondering about this verification - did you have a case where the > device would not properly set its capability, or accept the write and do > nothing? I do not know any real device which is doing this thing. But this issue can happen with kernel's PCIe emulated root bridge: drivers/pci/pci-bridge-emul.c Drivers which are using this emulated root bridge (and both pci-mvebu.c and pci-aardvark.c are using it!) do not have to implement callback for every read and every write operation of every register. Note that both pci-mvebu.c and pci-aardvark.c currently does not implement access to HW register PCI_EXP_LNKCTL2. So currently it is not possible to set PCI_EXP_LNKCTL2_TLS_2_5GT. And above code correctly fails and disallow ASPM code to retrain link. I have some WIP patches which implement LNKCAP2, LNKCTL2 and LNKSTA2 read/write callbacks on emulated bridge for pci-mvebu.c and pci-aardvark.c. And I have tested that with those WIP patches ASPM code can correctly switch link to 2.5GT/s and enable ASPM. > > + if (ret != 0) > > I think "if (ret)" would be fine to use here, unless you prefer being > more explicit. See my question about style above. > > > static bool pcie_retrain_link(struct pcie_link_state *link) > > { > > struct pci_dev *parent = link->pdev; > > unsigned long end_jiffies; > > u16 reg16; > > + u32 reg32; > > + > > + /* Check if link is capable of higher speed than 2.5 GT/s and needs quirk */ > > + pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, ®32); > > + if ((reg32 & PCI_EXP_LNKCAP_SLS) > PCI_EXP_LNKCAP_SLS_2_5GB) { > > I wonder if moving this check to pcie_change_tls_to_gen1() would make > more sense? It would then make this function a little cleaner. What do > you think? Ok, no problem. But then function needs to be renamed. Any idea how should be this function called? > [...] > > +static void quirk_no_bus_reset_and_no_retrain_link(struct pci_dev *dev) > > +{ > > + dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET | PCI_DEV_FLAGS_NO_RETRAIN_LINK_WHEN_NOT_GEN1; > > +} > [...] > > I know that the style has been changed to allow 100 characters width and > that checkpatch.pl now also does not warn about line length, as per > commit bdc48fa11e46 ("checkpatch/coding-style: deprecate 80-column > warning"), but I think Bjorn still prefers 80 characters, thus this line > above might have to be aligned. Ok! If needed I can reformat patch to 80 characters per line.
On Sat, 27 Mar 2021 14:29:59 +0100 Pali Rohár <pali@kernel.org> wrote: > I can change this to 'if (!ret)' if needed, no problem. > > I use 'if (!val)' mostly for boolean and pointer variables. If > variable can contain more integer values then I lot of times I use > '=='. Comparing integer varibales with explicit literals is sensible, but if a function returning integer returns 0 on success and negative value on error, Linux kernel has a tradition of using just if (!ret) or if (ret) Marek
On Saturday 27 March 2021 15:42:13 Marek Behún wrote: > On Sat, 27 Mar 2021 14:29:59 +0100 > Pali Rohár <pali@kernel.org> wrote: > > > I can change this to 'if (!ret)' if needed, no problem. > > > > I use 'if (!val)' mostly for boolean and pointer variables. If > > variable can contain more integer values then I lot of times I use > > '=='. > > Comparing integer varibales with explicit literals is sensible, but > if a function returning integer returns 0 on success and negative value > on error, Linux kernel has a tradition of using just > if (!ret) > or > if (ret) > > Marek Ok, I will change it.
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index ac0557a305af..ea5bdf6107f6 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -192,11 +192,54 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist) link->clkpm_disable = blacklist ? 1 : 0; } +static int pcie_change_tls_to_gen1(struct pci_dev *parent) +{ + u16 reg16; + u32 reg32; + int ret; + + /* Check if link speed can be forced to 2.5 GT/s */ + pcie_capability_read_dword(parent, PCI_EXP_LNKCAP2, ®32); + if (!(reg32 & PCI_EXP_LNKCAP2_SLS_2_5GB)) { + pci_err(parent, "ASPM: Bridge does not support changing Link Speed to 2.5 GT/s\n"); + return -EOPNOTSUPP; + } + + /* Force link speed to 2.5 GT/s */ + ret = pcie_capability_clear_and_set_word(parent, PCI_EXP_LNKCTL2, + PCI_EXP_LNKCTL2_TLS, + PCI_EXP_LNKCTL2_TLS_2_5GT); + if (ret == 0) { + /* Verify that new value was really set */ + pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, ®16); + if ((reg16 & PCI_EXP_LNKCTL2_TLS) != PCI_EXP_LNKCTL2_TLS_2_5GT) + ret = -EINVAL; + } + + if (ret != 0) + pci_err(parent, "ASPM: Changing Target Link Speed to 2.5 GT/s failed: %d\n", ret); + + return ret; +} + static bool pcie_retrain_link(struct pcie_link_state *link) { struct pci_dev *parent = link->pdev; unsigned long end_jiffies; u16 reg16; + u32 reg32; + + if (link->downstream->dev_flags & PCI_DEV_FLAGS_NO_RETRAIN_LINK_WHEN_NOT_GEN1) { + /* Check if link is capable of higher speed than 2.5 GT/s and needs quirk */ + pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, ®32); + if ((reg32 & PCI_EXP_LNKCAP_SLS) > PCI_EXP_LNKCAP_SLS_2_5GB) { + if (pcie_change_tls_to_gen1(parent) != 0) { + pci_err(parent, "ASPM: Retrain Link at higher speed is disallowed by quirk\n"); + return false; + } + pci_info(parent, "ASPM: Target Link Speed changed to 2.5 GT/s due to quirk\n"); + } + } pcie_capability_read_word(parent, PCI_EXP_LNKCTL, ®16); reg16 |= PCI_EXP_LNKCTL_RL; diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 653660e3ba9e..8ff690c7679d 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -3558,6 +3558,11 @@ static void quirk_no_bus_reset(struct pci_dev *dev) dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET; } +static void quirk_no_bus_reset_and_no_retrain_link(struct pci_dev *dev) +{ + dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET | PCI_DEV_FLAGS_NO_RETRAIN_LINK_WHEN_NOT_GEN1; +} + /* * Some Atheros AR9xxx and QCA988x chips do not behave after a bus reset. * The device will throw a Link Down error on AER-capable systems and @@ -3567,10 +3572,18 @@ static void quirk_no_bus_reset(struct pci_dev *dev) */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0030, quirk_no_bus_reset); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0032, quirk_no_bus_reset); -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x003c, quirk_no_bus_reset); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0033, quirk_no_bus_reset); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0034, quirk_no_bus_reset); +/* + * Atheros QCA9880 and QCA9890 chips do not behave after a bus reset and also + * after retrain link when PCIe bridge is not in GEN1 mode at 2.5 GT/s speed. + * The device will throw a Link Down error and config space is not accessible + * again. Retrain link can be called only when using GEN1 PCIe bridge or when + * PCIe bridge has forced link speed to 2.5 GT/s via PCI_EXP_LNKCTL2 register. + */ +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x003c, quirk_no_bus_reset_and_no_retrain_link); + /* * Root port on some Cavium CN8xxx chips do not successfully complete a bus * reset when used with certain child devices. After the reset, config diff --git a/include/linux/pci.h b/include/linux/pci.h index 86c799c97b77..fdbf7254e4ab 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -227,6 +227,8 @@ enum pci_dev_flags { PCI_DEV_FLAGS_NO_FLR_RESET = (__force pci_dev_flags_t) (1 << 10), /* Don't use Relaxed Ordering for TLPs directed at this device */ PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11), + /* Don't Retrain Link for device when bridge is not in GEN1 mode */ + PCI_DEV_FLAGS_NO_RETRAIN_LINK_WHEN_NOT_GEN1 = (__force pci_dev_flags_t) (1 << 12), }; enum pci_irq_reroute_variant {