Patch Detail
get:
Show a patch.
patch:
Update a patch (partial update; only the supplied fields change).
put:
Update a patch (full update).
GET /api/patches/725708/?format=api
{ "id": 725708, "url": "http://patchwork.ozlabs.org/api/patches/725708/?format=api", "web_url": "http://patchwork.ozlabs.org/project/intel-wired-lan/patch/64ac2d0b-7685-4adb-a0e4-2ab7bfd6975e@linux.vnet.ibm.com/", "project": { "id": 46, "url": "http://patchwork.ozlabs.org/api/projects/46/?format=api", "name": "Intel Wired Ethernet development", "link_name": "intel-wired-lan", "list_id": "intel-wired-lan.osuosl.org", "list_email": "intel-wired-lan@osuosl.org", "web_url": "", "scm_url": "", "webscm_url": "", "list_archive_url": "", "list_archive_url_format": "", "commit_url_format": "" }, "msgid": "<64ac2d0b-7685-4adb-a0e4-2ab7bfd6975e@linux.vnet.ibm.com>", "list_archive_url": null, "date": "2017-02-08T16:31:58", "name": "i40e: driver can't probe device (capabilities discovery error)", "commit_ref": null, "pull_url": null, "state": "rfc", "archived": false, "hash": "4c0b308b9f61dc4e7ffaf0e1c8ffd9c6ea6cfba0", "submitter": { "id": 67066, "url": "http://patchwork.ozlabs.org/api/people/67066/?format=api", "name": "Guilherme G. 
Piccoli", "email": "gpiccoli@linux.vnet.ibm.com" }, "delegate": { "id": 68, "url": "http://patchwork.ozlabs.org/api/users/68/?format=api", "username": "jtkirshe", "first_name": "Jeff", "last_name": "Kirsher", "email": "jeffrey.t.kirsher@intel.com" }, "mbox": "http://patchwork.ozlabs.org/project/intel-wired-lan/patch/64ac2d0b-7685-4adb-a0e4-2ab7bfd6975e@linux.vnet.ibm.com/mbox/", "series": [], "comments": "http://patchwork.ozlabs.org/api/patches/725708/comments/", "check": "pending", "checks": "http://patchwork.ozlabs.org/api/patches/725708/checks/", "tags": {}, "related": [], "headers": { "Return-Path": "<intel-wired-lan-bounces@lists.osuosl.org>", "X-Original-To": [ "incoming@patchwork.ozlabs.org", "intel-wired-lan@lists.osuosl.org" ], "Delivered-To": [ "patchwork-incoming@bilbo.ozlabs.org", "intel-wired-lan@lists.osuosl.org" ], "Received": [ "from hemlock.osuosl.org (smtp2.osuosl.org [140.211.166.133])\n\t(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3vJRZD6284z9s7x\n\tfor <incoming@patchwork.ozlabs.org>;\n\tThu, 9 Feb 2017 03:32:20 +1100 (AEDT)", "from localhost (localhost [127.0.0.1])\n\tby hemlock.osuosl.org (Postfix) with ESMTP id 1140889FBC;\n\tWed, 8 Feb 2017 16:32:19 +0000 (UTC)", "from hemlock.osuosl.org ([127.0.0.1])\n\tby localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024)\n\twith ESMTP id qGpZc9+xX+Sr; Wed, 8 Feb 2017 16:32:17 +0000 (UTC)", "from ash.osuosl.org (ash.osuosl.org [140.211.166.34])\n\tby hemlock.osuosl.org (Postfix) with ESMTP id 2D6B289ED3;\n\tWed, 8 Feb 2017 16:32:17 +0000 (UTC)", "from whitealder.osuosl.org (smtp1.osuosl.org [140.211.166.138])\n\tby ash.osuosl.org (Postfix) with ESMTP id E88BA1BFC46\n\tfor <intel-wired-lan@lists.osuosl.org>;\n\tWed, 8 Feb 2017 16:32:15 +0000 (UTC)", "from localhost (localhost [127.0.0.1])\n\tby whitealder.osuosl.org (Postfix) with ESMTP id E4E0081F69\n\tfor <intel-wired-lan@lists.osuosl.org>;\n\tWed, 8 
Feb 2017 16:32:15 +0000 (UTC)", "from whitealder.osuosl.org ([127.0.0.1])\n\tby localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024)\n\twith ESMTP id rdZ9DpH2dW-r for <intel-wired-lan@lists.osuosl.org>;\n\tWed, 8 Feb 2017 16:32:15 +0000 (UTC)", "from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com\n\t[148.163.156.1])\n\tby whitealder.osuosl.org (Postfix) with ESMTPS id 5CEC781F48\n\tfor <intel-wired-lan@lists.osuosl.org>;\n\tWed, 8 Feb 2017 16:32:13 +0000 (UTC)", "from pps.filterd (m0098396.ppops.net [127.0.0.1])\n\tby mx0a-001b2d01.pphosted.com (8.16.0.20/8.16.0.20) with SMTP id\n\tv18GTCXj084981\n\tfor <intel-wired-lan@lists.osuosl.org>; Wed, 8 Feb 2017 11:32:12 -0500", "from e24smtp01.br.ibm.com (e24smtp01.br.ibm.com [32.104.18.85])\n\tby mx0a-001b2d01.pphosted.com with ESMTP id 28g3uvh85n-1\n\t(version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT)\n\tfor <intel-wired-lan@lists.osuosl.org>;\n\tWed, 08 Feb 2017 11:32:12 -0500", "from localhost\n\tby e24smtp01.br.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use\n\tOnly! Violators will be prosecuted\n\tfor <intel-wired-lan@lists.osuosl.org> from\n\t<gpiccoli@linux.vnet.ibm.com>; Wed, 8 Feb 2017 14:32:09 -0200", "from d24dlp02.br.ibm.com (9.18.248.206)\n\tby e24smtp01.br.ibm.com (10.172.0.143) with IBM ESMTP SMTP Gateway:\n\tAuthorized Use Only! 
Violators will be prosecuted; \n\tWed, 8 Feb 2017 14:32:08 -0200", "from d24relay02.br.ibm.com (d24relay02.br.ibm.com [9.18.232.42])\n\tby d24dlp02.br.ibm.com (Postfix) with ESMTP id 12F581DC006E\n\tfor <intel-wired-lan@lists.osuosl.org>;\n\tWed, 8 Feb 2017 11:32:09 -0500 (EST)", "from d24av04.br.ibm.com (d24av04.br.ibm.com [9.8.31.97])\n\tby d24relay02.br.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id\n\tv18GW7e428573720\n\tfor <intel-wired-lan@lists.osuosl.org>; Wed, 8 Feb 2017 14:32:07 -0200", "from d24av04.br.ibm.com (localhost [127.0.0.1])\n\tby d24av04.br.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id\n\tv18GW7Tt019712\n\tfor <intel-wired-lan@lists.osuosl.org>; Wed, 8 Feb 2017 14:32:07 -0200", "from [9.80.235.48] ([9.80.235.48])\n\tby d24av04.br.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id\n\tv18GW3f5019675; Wed, 8 Feb 2017 14:32:04 -0200" ], "X-Virus-Scanned": [ "amavisd-new at osuosl.org", "amavisd-new at osuosl.org" ], "X-Greylist": "domain auto-whitelisted by SQLgrey-1.7.6", "From": "\"Guilherme G. 
Piccoli\" <gpiccoli@linux.vnet.ibm.com>", "To": "\"intel-wired-lan@lists.osuosl.org\" <intel-wired-lan@lists.osuosl.org>", "Date": "Wed, 8 Feb 2017 14:31:58 -0200", "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101\n\tThunderbird/45.7.0", "MIME-Version": "1.0", "Content-Type": "multipart/mixed;\n\tboundary=\"------------0DB768661D23E6149FC4FA92\"", "X-TM-AS-MML": "disable", "X-Content-Scanned": "Fidelis XPS MAILER", "x-cbid": "17020816-1523-0000-0000-0000027D4083", "X-IBM-AV-DETECTION": "SAVI=unused REMOTE=unused XFE=unused", "x-cbparentid": "17020816-1524-0000-0000-00002A116255", "Message-Id": "<64ac2d0b-7685-4adb-a0e4-2ab7bfd6975e@linux.vnet.ibm.com>", "X-Proofpoint-Virus-Version": "vendor=fsecure engine=2.50.10432:, ,\n\tdefinitions=2017-02-08_10:, , signatures=0", "X-Proofpoint-Spam-Details": "rule=outbound_notspam policy=outbound score=0\n\tspamscore=0 suspectscore=0\n\tmalwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam\n\tadjust=0 reason=mlx scancount=1 engine=8.0.1-1612050000\n\tdefinitions=main-1702080161", "Cc": "Murilo pIO <muvic@linux.vnet.ibm.com>, maurosr@linux.vnet.ibm.com,\n\tnetdev <netdev@vger.kernel.org>, gpiccoli@linux.vnet.ibm.com,\n\tBrian King <brking@linux.vnet.ibm.com>", "Subject": "[Intel-wired-lan] i40e: driver can't probe device (capabilities\n\tdiscovery error)", "X-BeenThere": "intel-wired-lan@lists.osuosl.org", "X-Mailman-Version": "2.1.18-1", "Precedence": "list", "List-Id": "Intel Wired Ethernet Linux Kernel Driver Development\n\t<intel-wired-lan.lists.osuosl.org>", "List-Unsubscribe": "<http://lists.osuosl.org/mailman/options/intel-wired-lan>, \n\t<mailto:intel-wired-lan-request@lists.osuosl.org?subject=unsubscribe>", "List-Archive": "<http://lists.osuosl.org/pipermail/intel-wired-lan/>", "List-Post": "<mailto:intel-wired-lan@lists.osuosl.org>", "List-Help": "<mailto:intel-wired-lan-request@lists.osuosl.org?subject=help>", "List-Subscribe": 
"<http://lists.osuosl.org/mailman/listinfo/intel-wired-lan>, \n\t<mailto:intel-wired-lan-request@lists.osuosl.org?subject=subscribe>", "Errors-To": "intel-wired-lan-bounces@lists.osuosl.org", "Sender": "\"Intel-wired-lan\" <intel-wired-lan-bounces@lists.osuosl.org>" }, "content": "Recently we had a sudden fail on Intel XL710 adapter, in which the i40e\ndriver is not able to probe the device anymore - it fails right on the\nbeginning of the probe process, on discovery capabilities procedure. We\nobserved the following messages on kernel (v4.10-rc7) log:\n\n\ni40e: Intel(R) Ethernet Connection XL710 Network Driver - version 1.6.25-k\ni40e: Copyright (c) 2013 - 2014 Intel Corporation.\ni40e 0002:01:00.0: Using 64-bit DMA iommu bypass\ni40e 0002:01:00.0: fw 5.1.40981 api 1.5 nvm 5.03 0x80002469 1.1313.0\ni40e 0002:01:00.0: capability discovery failed, err OK aq_err\nI40E_AQ_RC_EMODE\ni40e 0002:01:00.1: Using 64-bit DMA iommu bypass\ni40e 0002:01:00.1: fw 5.1.40981 api 1.5 nvm 5.03 0x80002469 1.1313.0\ni40e 0002:01:00.1: capability discovery failed, err OK aq_err\nI40E_AQ_RC_EMODE\n\n<and the same messages on functions .2 and .3 too>\n\n\nWe were able to \"revive\" the adapter using one of the following 2\nprocedures:\n\ni) PowerPC systems have a feature called EEH, that is a PCI slot reset\nin essence. It's something in HW/PHB level, so the mechanism does a slot\nreset, that can be a PCI Hot Reset or Fundamental Reset (PERST).\n\nThe 1st way to recover the adapter was to inject an error on this slot\nand forcing a called \"hotplug recovery\". Basically, we removed the\nadapter from the PCI core (echo 1 >\n/sys/bus/pci/devices/0002:01:00.*/remove), then we froze the PHB\ntransactions (using a debug facility on powerpc kernel) and then we do a\nrescan on PCI bus (echo 1 > /sys/bus/pci/rescan).\n\nThis led to Hot Reset on slot, and adapter recovered fine, i40e driver\nwas able to complete the probe procedure. 
I can provide full logs if\ndesired.\nAlthough I think this is too hacky way...\n\nii) With the attached patch, we were able to \"partially\" circumvent the\nissue. Basically, the probe procedure worked fine to all device\nfunctions, but on function 3 we failed in eeprom check - the following\nmessages were observed in the kernel log:\n\n[29.1126] i40e 0002:01:00.3: Using 64-bit DMA iommu bypass\n[32.3530] i40e 0002:01:00.3: fw 5.1.40981 api 1.5 nvm 5.03 0x24695003\n192.0.63\n[32.8441] i40e 0002:01:00.3: eeprom check failed (-2), Tx/Rx traffic\ndisabled\n[32.8583] i40e 0002:01:00.3: MAC address: 0c:c4:7a:89:f1:c3\n[32.8712] i40e 0002:01:00.3: MSI-X vector limit reached, attempting to\nredistribute vectors\n[32.9765] i40e 0002:01:00.3: Added LAN device PF3 bus=0x00 func=0x03\n[32.9766] i40e 0002:01:00.3: PCI-Express: Speed 8.0GT/s Width x8\n[32.9867] i40e 0002:01:00.3: Features: PF-id[3] VFs: 32 VSIs: 34 QP: 119\nRSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA\n\n\nAll the other 3 functions presented the same messages except the eeprom\ncheck failed.\nI'm aware the patch needs some rework (in my understanding, the logic\nworks only to a single adapter, because we need a global reset only in\none function of the adapter. But the patch logic fails if we have more\nthan 1 physical adapter on machine. It's just a draft/RFC version for now).\n--\n\nSo, I'd like to request help/feedback from you regarding what's going\non. I'm not sure the root cause of the sudden adapter failure. In one\nday it was fine, and in the other, after a machine reboot, it entered in\nthis odd state. We have 2 machines presenting this behavior and 5 others\nthat are fine.\n\nIs there a way to clear this bad state on the adapter, like a special\nreset (or even a jumper that we should play physically)? I tried EMP\nreset too, but seems it's not allowed for some reason (perhaps only in\nNVM update mode? Not sure). 
Also, any pointers on how to understand the\nroot cause are welcome.\nThanks in advance,\n\n\nGuilherme", "diff": ">From 1a49e453816dbab747788b87f9d03edc978cb50b Mon Sep 17 00:00:00 2001\nFrom: \"Guilherme G. Piccoli\" <gpiccoli@linux.vnet.ibm.com>\nDate: Tue, 7 Feb 2017 17:38:04 -0200\nSubject: [PATCH] i40: force global reset on adapter probe\n\nDevice might experience a bad state on probe time, making impossible\nto the function i40e_probe() to successfully complete.\n\nIn these cases, for example we observed the following message in\nkernel log:\n\n [22.6397] i40e 0002:01:00.0: capability discovery failed, err OK aq_err I40E_AQ_RC_EMODE\n\nThis patch forces a global reset to happen on driver probe to avoid\nthe issue.\n\nSigned-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>\n---\n drivers/net/ethernet/intel/i40e/i40e_main.c | 7 ++++++-\n 1 file changed, 6 insertions(+), 1 deletion(-)\n\ndiff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c\nindex ad4cf63..f686c4a 100644\n--- a/drivers/net/ethernet/intel/i40e/i40e_main.c\n+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c\n@@ -10928,7 +10928,7 @@ static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent)\n \tstatic u16 pfs_found;\n \tu16 wol_nvm_bits;\n \tu16 link_status;\n-\tint err;\n+\tint err, globr_probe = 1;\n \tu32 val;\n \tu32 i;\n \tu8 set_fc_aq_fail;\n@@ -11009,6 +11009,11 @@ static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent)\n \tif (debug < -1)\n \t\tpf->hw.debug_mask = debug;\n \n+\tif (globr_probe) {\n+\t\ti40e_do_reset_safe(pf, BIT(__I40E_GLOBAL_RESET_REQUESTED));\n+\t\tglobr_probe = 0;\n+\t}\n+\n \t/* do a special CORER for clearing PXE mode once at init */\n \tif (hw->revision_id == 0 &&\n \t (rd32(hw, I40E_GLLAN_RCTL_0) & I40E_GLLAN_RCTL_0_PXE_MODE_MASK)) {\n-- \n2.7.4\n\n\n\n", "prefixes": [] }