From patchwork Mon Mar 14 18:19:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dimitri John Ledkov X-Patchwork-Id: 1605240 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=canonical.com header.i=@canonical.com header.a=rsa-sha256 header.s=20210705 header.b=gKr1BNkC; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4KHPvS14xtz9sFq for ; Tue, 15 Mar 2022 05:19:30 +1100 (AEDT) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1nTpHs-0003Do-Vh; Mon, 14 Mar 2022 18:19:20 +0000 Received: from smtp-relay-internal-1.internal ([10.131.114.114] helo=smtp-relay-internal-1.canonical.com) by huckleberry.canonical.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1nTpHr-0003Dg-0D for kernel-team@lists.ubuntu.com; Mon, 14 Mar 2022 18:19:19 +0000 Received: from mail-wm1-f71.google.com (mail-wm1-f71.google.com [209.85.128.71]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp-relay-internal-1.canonical.com (Postfix) with ESMTPS id C055740024 for ; Mon, 14 Mar 2022 18:19:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=canonical.com; s=20210705; t=1647281958; bh=moJlY1v1QkQMZauiKZkc6JjB5o16AuDzD5fEHGUuZZs=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=gKr1BNkCrlczeov2GCA1srIwMygbC9Z2qlhkNw9B+Nl8yrsmvasoyd+E4IL2K/3xA HpDfSuh2xzPSnRRND0tWfna32wgAyYb7Zy87LVt7PSNYJ2WxAlu2b/t3PPUtRdDEFV 2dn2f7e57H1qVTUiu2UblPgkOhIcHB5xvVaj3By7wtkSY/QisYW8MvvNq+bAp+cVrQ ZqIokHDMgytrtjQp0TfgXqn9X14kBsiqKMkZTQekzBsmdPyoC6IOtcneRcZ330hlC4 DD5OtFUC+BOhE4BD54dHbaKn5RIr88x0pLrYl41bs9enfnOGJzHklXRpZl9JtiDcf1 X7yS+q1zD2/cA== Received: by mail-wm1-f71.google.com with SMTP id 14-20020a05600c104e00b003897a167353so14181wmx.8 for ; Mon, 14 Mar 2022 11:19:18 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:subject:date:message-id:mime-version :content-transfer-encoding; bh=moJlY1v1QkQMZauiKZkc6JjB5o16AuDzD5fEHGUuZZs=; b=dk/D1q6z7Jy4XRSCcZSBkZlWU6NpBkoTjNSGXrvIXkzTf1emtqcbu7IlBJs3+OzRmJ s/rOzjtSkXdDsIHR5jnsD2W3Pwfa22N525JdsLIL3OCWp5zWBd+HDuLQTvCzf6ke0tIJ NtdSPPTRgERIvB/X+GM2C9Xd6pzCOrT+lnbGzkEDKxmpmdB+WxLjevhdzftGPZuVbhQ5 KK8lr1gBkZuYGtFlbyKHJXzoRBXAM7iEA8zJvddqWtmV50Wrq15w+cCCVb5w4Fh4L0TG VJlfI4m3XZUCYxcY9d+F2iYq8HmTYspvZ6U95BbWIACUpqqri8IGM0jKYRm78IuCxjTp uh0A== X-Gm-Message-State: AOAM532mIzmqb6PpR6OxvT8ZCrpukeTPieqjoGFg0e0CAjr1GizuT4ty d61LPNOhmBS1LyUS2ltoTB3kZEZcrpjsGGx1OPLnyW7KYQUIGSe0t8BdAKgqUuYK3icuUMGc1w+ DAe77WBogdLO+4D6w1cRXZJqxF7/WiAFZgPWV7IuIRw== X-Received: by 2002:adf:efc6:0:b0:1f1:e397:44b with SMTP id i6-20020adfefc6000000b001f1e397044bmr17736243wrp.298.1647281958046; Mon, 14 Mar 2022 11:19:18 -0700 (PDT) X-Google-Smtp-Source: ABdhPJx5X9WV6sjmdvbScUJ5ELnTUY0KDLqQMfAroJQx42c43RyISE1UzL3uXJjAJjvRDhZDmPY5hw== X-Received: by 2002:adf:efc6:0:b0:1f1:e397:44b with SMTP id i6-20020adfefc6000000b001f1e397044bmr17736222wrp.298.1647281957531; Mon, 14 Mar 2022 11:19:17 -0700 (PDT) Received: from localhost ([2a01:4b00:85fd:d700:9346:8fa3:de8e:a321]) by smtp.gmail.com with ESMTPSA id f17-20020adffcd1000000b001edbf438d83sm13562705wrs.32.2022.03.14.11.19.16 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 14 Mar 2022 11:19:17 -0700 (PDT) From: Dimitri John Ledkov To: kernel-team@lists.ubuntu.com Subject: [J/linux-riscv][PATCH] UBUNTU: SAUCE: pci: Work around ASMedia ASM2824 PCIe link training failures Date: Mon, 14 Mar 2022 18:19:12 +0000 Message-Id: <20220314181912.26934-1-dimitri.ledkov@canonical.com> X-Mailer: git-send-email 2.32.0 MIME-Version: 1.0 X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: "Maciej W. Rozycki" Attempt to handle cases with a downstream port of the ASMedia ASM2824 PCIe switch where link training never completes and the link continues switching between speeds indefinitely with the data link layer never reaching the active state. It has been observed with a downstream port of the ASMedia ASM2824 Gen 3 switch wired to the upstream port of the Pericom PI7C9X2G304 Gen 2 switch, using a Delock Riser Card PCI Express x1 > 2 x PCIe x1 device, P/N 41433, wired to a SiFive HiFive Unmatched board. In this setup the switches are supposed to negotiate the link speed of preferably 5.0GT/s, falling back to 2.5GT/s. However the link continues oscillating between the two speeds, at the rate of 34-35 times per second, with link training reported repeatedly active ~84% of the time, e.g.: 02:03.0 PCI bridge [0604]: ASMedia Technology Inc. ASM2824 PCIe Gen3 Packet Switch [1b21:2824] (rev 01) (prog-if 00 [Normal decode]) [...] Bus: primary=02, secondary=05, subordinate=05, sec-latency=0 [...] Capabilities: [80] Express (v2) Downstream Port (Slot+), MSI 00 [...] LnkSta: Speed 5GT/s (downgraded), Width x1 (ok) TrErr- Train+ SlotClk+ DLActive- BWMgmt+ ABWMgmt- [...] LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis+, Selectable De-emphasis: -3.5dB Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB [...] Forcibly limiting the target link speed to 2.5GT/s with the upstream ASM2824 device makes the two switches communicate correctly however: 02:03.0 PCI bridge [0604]: ASMedia Technology Inc. ASM2824 PCIe Gen3 Packet Switch [1b21:2824] (rev 01) (prog-if 00 [Normal decode]) [...] Bus: primary=02, secondary=05, subordinate=09, sec-latency=0 [...] Capabilities: [80] Express (v2) Downstream Port (Slot+), MSI 00 [...] LnkSta: Speed 2.5GT/s (downgraded), Width x1 (ok) TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt- [...] LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis+, Selectable De-emphasis: -3.5dB Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB [...] and then: 05:00.0 PCI bridge [0604]: Pericom Semiconductor PI7C9X2G304 EL/SL PCIe2 3-Port/4-Lane Packet Switch [12d8:2304] (rev 05) (prog-if 00 [Normal decode]) [...] Bus: primary=05, secondary=06, subordinate=09, sec-latency=0 [...] Capabilities: [c0] Express (v2) Upstream Port, MSI 00 [...] LnkSta: Speed 2.5GT/s (downgraded), Width x1 (downgraded) TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- [...] LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB [...] Removing the speed restriction afterwards makes the two devices switch to 5.0GT/s then. Make use of these observations then and detect the inability to train the link, by checking for the Data Link Layer Link Active status bit implemented by the ASM2824 being off while the Link Bandwidth Management Status indicating that hardware has changed the link speed or width in an attempt to correct unreliable link operation. Restrict the speed to 2.5GT/s then with the Target Link Speed field, request a retrain and wait 200ms for the data link to go up. If this turns out successful, then lift the restriction, letting the devices negotiate a higher speed. Also check for a 2.5GT/s speed restriction the firmware may have already arranged and lift it too with ports that already report their data link being up. Signed-off-by: Maciej W. Rozycki BugLink: https://bugs.launchpad.net/bugs/1964796 Link: https://lore.kernel.org/all/alpine.DEB.2.21.2202010240190.58572@angie.orcam.me.uk/raw Signed-off-by: Dimitri John Ledkov Acked-by: Tim Gardner --- Did a test build, and test boot on Unmatched. Not much else I can test, as I don't actually have PCIe cards that I can connect to my board. I don't see any issues with NVMe, and the pci bridge is visible and appears to be operational. Maybe I should try get a useful PCIe card for my Unmatched. I.e. some old nvidia card supported with open source drivers maybe?! drivers/pci/quirks.c | 98 +++++++++++++++++++++++++++++++++++++++++ include/linux/pci_ids.h | 1 + 2 files changed, 99 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 8e35273673..8f76ccd133 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -12,6 +12,7 @@ * file, where their drivers can use them. */ +#include #include #include #include @@ -5906,3 +5907,100 @@ static void nvidia_ion_ahci_fixup(struct pci_dev *pdev) pdev->dev_flags |= PCI_DEV_FLAGS_HAS_MSI_MASKING; } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NVIDIA, 0x0ab8, nvidia_ion_ahci_fixup); + +/* + * Retrain the link of a downstream PCIe port by hand if necessary. + * + * This is needed at least where a downstream port of the ASMedia ASM2824 + * Gen 3 switch is wired to the upstream port of the Pericom PI7C9X2G304 + * Gen 2 switch, and observed with the Delock Riser Card PCI Express x1 > + * 2 x PCIe x1 device, P/N 41433, plugged into the SiFive HiFive Unmatched + * board. + * + * In such a configuration the switches are supposed to negotiate the link + * speed of preferably 5.0GT/s, falling back to 2.5GT/s. However the link + * continues switching between the two speeds indefinitely and the data + * link layer never reaches the active state, with link training reported + * repeatedly active ~84% of the time. Forcing the target link speed to + * 2.5GT/s with the upstream ASM2824 device makes the two switches talk to + * each other correctly however. And more interestingly retraining with a + * higher target link speed afterwards lets the two successfully negotiate + * 5.0GT/s. + * + * With the ASM2824 we can rely on the otherwise optional Data Link Layer + * Link Active status bit and in the failed link training scenario it will + * be off along with the Link Bandwidth Management Status indicating that + * hardware has changed the link speed or width in an attempt to correct + * unreliable link operation. For a port that has been left unconnected + * both bits will be clear. So use this information to detect the problem + * rather than polling the Link Training bit and watching out for flips or + * at least the active status. + * + * Restrict the speed to 2.5GT/s then with the Target Link Speed field, + * request a retrain and wait 200ms for the data link to go up. If this + * turns out successful, then lift the restriction, letting the devices + * negotiate a higher speed. Also check for a 2.5GT/s speed restriction + * the firmware may have already arranged and lift it too with ports that + * already report their data link being up. + */ +static void pcie_downstream_link_retrain(struct pci_dev *dev) +{ + u16 lnksta, lnkctl2; + u8 pos; + + pos = pci_find_capability(dev, PCI_CAP_ID_EXP); + WARN_ON(!pos); + if (!pos) + return; + + pci_read_config_word(dev, pos + PCI_EXP_LNKCTL2, &lnkctl2); + pci_read_config_word(dev, pos + PCI_EXP_LNKSTA, &lnksta); + if ((lnksta & (PCI_EXP_LNKSTA_LBMS | PCI_EXP_LNKSTA_DLLLA)) == + PCI_EXP_LNKSTA_LBMS) { + unsigned long timeout; + u16 lnkctl; + + pci_info(dev, "broken device, retraining non-functional downstream link at 2.5GT/s...\n"); + + pci_read_config_word(dev, pos + PCI_EXP_LNKCTL, &lnkctl); + lnkctl |= PCI_EXP_LNKCTL_RL; + lnkctl2 &= ~PCI_EXP_LNKCTL2_TLS; + lnkctl2 |= PCI_EXP_LNKCTL2_TLS_2_5GT; + pci_write_config_word(dev, pos + PCI_EXP_LNKCTL2, lnkctl2); + pci_write_config_word(dev, pos + PCI_EXP_LNKCTL, lnkctl); + + timeout = jiffies + msecs_to_jiffies(200); + do { + pci_read_config_word(dev, pos + PCI_EXP_LNKSTA, + &lnksta); + if (lnksta & PCI_EXP_LNKSTA_DLLLA) + break; + usleep_range(10000, 20000); + } while (time_before(jiffies, timeout)); + + pci_info(dev, "retraining %s!\n", + lnksta & PCI_EXP_LNKSTA_DLLLA ? + "succeeded" : "failed"); + } + + if ((lnksta & PCI_EXP_LNKSTA_DLLLA) && + (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT) { + u32 lnkcap; + u16 lnkctl; + + pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n"); + pci_read_config_word(dev, pos + PCI_EXP_LNKCTL, &lnkctl); + pci_read_config_dword(dev, pos + PCI_EXP_LNKCAP, &lnkcap); + lnkctl |= PCI_EXP_LNKCTL_RL; + lnkctl2 &= ~PCI_EXP_LNKCTL2_TLS; + lnkctl2 |= lnkcap & PCI_EXP_LNKCAP_SLS; + pci_write_config_word(dev, pos + PCI_EXP_LNKCTL2, lnkctl2); + pci_write_config_word(dev, pos + PCI_EXP_LNKCTL, lnkctl); + } +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ASMEDIA, + PCI_DEVICE_ID_ASMEDIA_ASM2824, + pcie_downstream_link_retrain); +DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_ASMEDIA, + PCI_DEVICE_ID_ASMEDIA_ASM2824, + pcie_downstream_link_retrain); diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h index 011f2f1ea5..810052d576 100644 --- a/include/linux/pci_ids.h +++ b/include/linux/pci_ids.h @@ -2562,6 +2562,7 @@ #define PCI_SUBDEVICE_ID_QEMU 0x1100 #define PCI_VENDOR_ID_ASMEDIA 0x1b21 +#define PCI_DEVICE_ID_ASMEDIA_ASM2824 0x2824 #define PCI_VENDOR_ID_REDHAT 0x1b36