Patch Detail
GET /api/patches/1326503/?format=api
{ "id": 1326503, "url": "http://patchwork.ozlabs.org/api/patches/1326503/?format=api", "web_url": "http://patchwork.ozlabs.org/project/linuxppc-dev/patch/20200710052340.737567-7-oohall@gmail.com/", "project": { "id": 2, "url": "http://patchwork.ozlabs.org/api/projects/2/?format=api", "name": "Linux PPC development", "link_name": "linuxppc-dev", "list_id": "linuxppc-dev.lists.ozlabs.org", "list_email": "linuxppc-dev@lists.ozlabs.org", "web_url": "https://github.com/linuxppc/wiki/wiki", "scm_url": "https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git", "webscm_url": "https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/", "list_archive_url": "https://lore.kernel.org/linuxppc-dev/", "list_archive_url_format": "https://lore.kernel.org/linuxppc-dev/{}/", "commit_url_format": "https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?id={}" }, "msgid": "<20200710052340.737567-7-oohall@gmail.com>", "list_archive_url": "https://lore.kernel.org/linuxppc-dev/20200710052340.737567-7-oohall@gmail.com/", "date": "2020-07-10T05:23:31", "name": "[06/15] powerpc/powernv/sriov: Explain how SR-IOV works on PowerNV", "commit_ref": null, "pull_url": null, "state": "changes-requested", "archived": false, "hash": "5e5542d4800abc8a313b034b40531631302344aa", "submitter": { "id": 68108, "url": "http://patchwork.ozlabs.org/api/people/68108/?format=api", "name": "Oliver O'Halloran", "email": "oohall@gmail.com" }, "delegate": null, "mbox": "http://patchwork.ozlabs.org/project/linuxppc-dev/patch/20200710052340.737567-7-oohall@gmail.com/mbox/", "series": [ { "id": 188782, "url": "http://patchwork.ozlabs.org/api/series/188782/?format=api", "web_url": "http://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=188782", "date": "2020-07-10T05:23:26", "name": "[01/15] powernv/pci: Add pci_bus_to_pnvhb() helper", "version": 1, "mbox": "http://patchwork.ozlabs.org/series/188782/mbox/" } ], "comments": 
"http://patchwork.ozlabs.org/api/patches/1326503/comments/", "check": "pending", "checks": "http://patchwork.ozlabs.org/api/patches/1326503/checks/", "tags": {}, "related": [], "headers": { "Return-Path": "\n <linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>", "X-Original-To": [ "patchwork-incoming@ozlabs.org", "linuxppc-dev@lists.ozlabs.org" ], "Delivered-To": [ "patchwork-incoming@ozlabs.org", "linuxppc-dev@lists.ozlabs.org" ], "Received": [ "from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange X25519 server-signature RSA-PSS (4096 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 4B322R5260z9s1x\n\tfor <patchwork-incoming@ozlabs.org>; Fri, 10 Jul 2020 15:40:47 +1000 (AEST)", "from bilbo.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 4B322R4FFXzDrJn\n\tfor <patchwork-incoming@ozlabs.org>; Fri, 10 Jul 2020 15:40:47 +1000 (AEST)", "from mail-wm1-x341.google.com (mail-wm1-x341.google.com\n [IPv6:2a00:1450:4864:20::341])\n (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest\n SHA256)\n (No client certificate requested)\n by lists.ozlabs.org (Postfix) with ESMTPS id 4B31gL2mSczDrJn\n for <linuxppc-dev@lists.ozlabs.org>; Fri, 10 Jul 2020 15:24:13 +1000 (AEST)", "by mail-wm1-x341.google.com with SMTP id q15so4551316wmj.2\n for <linuxppc-dev@lists.ozlabs.org>; Thu, 09 Jul 2020 22:24:13 -0700 (PDT)", "from 192-168-1-18.tpgi.com.au ([220.240.245.68])\n by smtp.gmail.com with ESMTPSA id 92sm9090941wrr.96.2020.07.09.22.24.07\n (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);\n Thu, 09 Jul 2020 22:24:09 -0700 (PDT)" ], "Authentication-Results": [ "ozlabs.org;\n dmarc=pass (p=none dis=none) header.from=gmail.com", "ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) 
header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256\n header.s=20161025 header.b=TjzZwRCl;\n\tdkim-atps=neutral", "lists.ozlabs.org; spf=pass (sender SPF authorized)\n smtp.mailfrom=gmail.com (client-ip=2a00:1450:4864:20::341;\n helo=mail-wm1-x341.google.com; envelope-from=oohall@gmail.com;\n receiver=<UNKNOWN>)", "lists.ozlabs.org;\n dmarc=pass (p=none dis=none) header.from=gmail.com", "lists.ozlabs.org; dkim=pass (2048-bit key;\n unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256\n header.s=20161025 header.b=TjzZwRCl; dkim-atps=neutral" ], "DKIM-Signature": "v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;\n h=from:to:cc:subject:date:message-id:in-reply-to:references\n :mime-version:content-transfer-encoding;\n bh=3rI/j03qnMWqqmLPfrr2nACml6hNFpxMMQGfEK+4cKM=;\n b=TjzZwRClz3OoOmqVSYZyK5Knm/IXzScy5Khw7N6pf0suWF1xjZ9IhzYPUdRRk8nC/C\n OME2NxYlu8cQepRtKAf+eWRcxKEWeCw6HCG8LsvOP+UbYTeXEh72dAkWg2t6HwlQSeRo\n q/DnkJ9+ICIZXyYZ8oA121dCYKhBIwssaqgJ23o8aIPkxr85tG1m9VJaNw0TeC4AJNnB\n TFTYXrZBCkxiEjgQ4oMOxiCg6rGkUPOVfLWDPRThGv0qL2B5rKa2C8SsjaON52V0Sd8b\n vLJ+2+GHJkf6ThMCq/diQxngToG2J7IyGrIiAeatzDQVvs90h4N7FYoFLSJw8AJ6Puny\n 7tkQ==", "X-Google-DKIM-Signature": "v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=1e100.net; s=20161025;\n h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to\n :references:mime-version:content-transfer-encoding;\n bh=3rI/j03qnMWqqmLPfrr2nACml6hNFpxMMQGfEK+4cKM=;\n b=WxH0MQhkW+ywO3qhDVl2xntMK7hbXHGeZ1OS3o0fp18rg0i+nxdBC5vkOf/reN96lK\n oF6ogyBZXq6pf/o4YAVBFJM2cOHEDv2EEkcHMxJO86UBmV+luQBBFf3Qh+fjqfu5xADT\n 6pHfGFz5B4XmvwVMTt8txf1gGJm7oF/jdIDDNSB+05rBZsFc53fd3wMgXBpLRWRD3hg9\n q5bqUW2s4KXbC92VLVXReOWUgkdf7Ze1Bvmd2y4YI3BBfDiMDjQkrzlYhSjZyIiMKeU8\n 8JNpwC3vgyKj5CpkM1AD5lgBJBphZu3pUZnL7t71+BC1ZddHt8hwx2422KCeV9FsGcj9\n l0Vw==", "X-Gm-Message-State": "AOAM531FxlOAglyMVF/DZ0l0nLJUn0u37SLB/886KMBtfzf+Gf0IEx2B\n AAZa3I5sRRabUBW0VDz9MJMp5Ht5qGQ=", "X-Google-Smtp-Source": "\n 
ABdhPJyO9LO7c3q0zL0zk6HB0crtQFc/Q63eB5AQBztfxnM2lxYrtlP6c/y2vJ/hf8RmNz1Am9lGdQ==", "X-Received": "by 2002:a1c:ac81:: with SMTP id\n v123mr3162665wme.159.1594358649769;\n Thu, 09 Jul 2020 22:24:09 -0700 (PDT)", "From": "Oliver O'Halloran <oohall@gmail.com>", "To": "linuxppc-dev@lists.ozlabs.org", "Subject": "[PATCH 06/15] powerpc/powernv/sriov: Explain how SR-IOV works on\n PowerNV", "Date": "Fri, 10 Jul 2020 15:23:31 +1000", "Message-Id": "<20200710052340.737567-7-oohall@gmail.com>", "X-Mailer": "git-send-email 2.26.2", "In-Reply-To": "<20200710052340.737567-1-oohall@gmail.com>", "References": "<20200710052340.737567-1-oohall@gmail.com>", "MIME-Version": "1.0", "Content-Transfer-Encoding": "8bit", "X-BeenThere": "linuxppc-dev@lists.ozlabs.org", "X-Mailman-Version": "2.1.29", "Precedence": "list", "List-Id": "Linux on PowerPC Developers Mail List <linuxppc-dev.lists.ozlabs.org>", "List-Unsubscribe": "<https://lists.ozlabs.org/options/linuxppc-dev>,\n <mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>", "List-Archive": "<http://lists.ozlabs.org/pipermail/linuxppc-dev/>", "List-Post": "<mailto:linuxppc-dev@lists.ozlabs.org>", "List-Help": "<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>", "List-Subscribe": "<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n <mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>", "Cc": "Oliver O'Halloran <oohall@gmail.com>", "Errors-To": "linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org", "Sender": "\"Linuxppc-dev\"\n <linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>" }, "content": "SR-IOV support on PowerNV is a byzantine maze of hooks. I have no idea\nhow anyone is supposed to know how it works except through a lot of\nstuffering. 
Write up some docs about the overall story to help out\nthe next sucker^Wperson who needs to tinker with it.\n\nSigned-off-by: Oliver O'Halloran <oohall@gmail.com>\n---\n arch/powerpc/platforms/powernv/pci-sriov.c | 130 +++++++++++++++++++++\n 1 file changed, 130 insertions(+)", "diff": "diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c b/arch/powerpc/platforms/powernv/pci-sriov.c\nindex 080ea39f5a83..f4c74ab1284d 100644\n--- a/arch/powerpc/platforms/powernv/pci-sriov.c\n+++ b/arch/powerpc/platforms/powernv/pci-sriov.c\n@@ -12,6 +12,136 @@\n /* for pci_dev_is_added() */\n #include \"../../../../drivers/pci/pci.h\"\n \n+/*\n+ * The majority of the complexity in supporting SR-IOV on PowerNV comes from\n+ * the need to put the MMIO space for each VF into a separate PE. Internally\n+ * the PHB maps MMIO addresses to a specific PE using the \"Memory BAR Table\".\n+ * The MBT historically only applied to the 64bit MMIO window of the PHB\n+ * so it's common to see it referred to as the \"M64BT\".\n+ *\n+ * An MBT entry stores the mapped range as a <base>,<mask> pair. This forces\n+ * the address range that we want to map to be power-of-two sized and aligned.\n+ * For conventional PCI devices this isn't really an issue since PCI device BARs\n+ * have the same requirement.\n+ *\n+ * For an SR-IOV BAR things are a little more awkward since size and alignment\n+ * are not coupled. The alignment is set based on the per-VF BAR size, but\n+ * the total BAR area is: number-of-vfs * per-vf-size. The number of VFs\n+ * isn't necessarily a power of two, so neither is the total size. To fix that\n+ * we need to finesse (read: hack) the Linux BAR allocator so that it will\n+ * allocate the SR-IOV BARs in a way that lets us map them using the MBT.\n+ *\n+ * The changes to size and alignment that we need to make depend on the \"mode\"\n+ * of MBT entry that we use. 
We only support SR-IOV on PHB3 (IODA2) and above,\n+ * so as a baseline we can assume that we have the following BAR modes\n+ * available:\n+ *\n+ * NB: $PE_COUNT is the number of PEs that the PHB supports.\n+ *\n+ * a) A segmented BAR that splits the mapped range into $PE_COUNT equally sized\n+ * segments. The n'th segment is mapped to the n'th PE.\n+ * b) An un-segmented BAR that maps the whole address range to a specific PE.\n+ *\n+ *\n+ * We prefer to use mode a) since it only requires one MBT entry per SR-IOV BAR.\n+ * For comparison, b) requires one entry per-VF per-BAR, or:\n+ * (num-vfs * num-sriov-bars) in total. To use a) we need the size of each segment\n+ * to equal the size of the per-VF BAR area. So:\n+ *\n+ *\tnew_size = per-vf-size * number-of-PEs\n+ *\n+ * The alignment for the SR-IOV BAR also needs to be changed from per-vf-size\n+ * to \"new_size\", calculated above. Implementing this is a convoluted process\n+ * which requires several hooks in the PCI core:\n+ *\n+ * 1. In pcibios_add_device() we call pnv_pci_ioda_fixup_iov().\n+ *\n+ * At this point the device has been probed and the device's BARs are sized,\n+ * but no resource allocations have been done. The SR-IOV BARs are sized\n+ * based on the maximum number of VFs supported by the device and we need\n+ * to increase that to new_size.\n+ *\n+ * 2. Later, when Linux actually assigns resources it tries to make the resource\n+ * allocations for each PCI bus as compact as possible. As a part of that it\n+ * sorts the BARs on a bus by their required alignment, which is calculated\n+ * using pci_resource_alignment().\n+ *\n+ * For IOV resources this goes:\n+ * pci_resource_alignment()\n+ * pci_sriov_resource_alignment()\n+ * pcibios_sriov_resource_alignment()\n+ * pnv_pci_iov_resource_alignment()\n+ *\n+ * Our hook overrides the default alignment, equal to the per-vf-size, with\n+ * new_size computed above.\n+ *\n+ * 3. 
When userspace enables VFs for a device:\n+ *\n+ * sriov_enable()\n+ * pcibios_sriov_enable()\n+ * pnv_pcibios_sriov_enable()\n+ *\n+ * This is where we actually allocate PE numbers for each VF and set up the\n+ * MBT mapping for each SR-IOV BAR. In steps 1) and 2) we set up an \"arena\"\n+ * where each MBT segment is equal in size to the VF BAR so we can shift\n+ * around the actual SR-IOV BAR location within this arena. We need this\n+ * ability because the PE space is shared by all devices on the same PHB.\n+ * When using mode a) described above, segment 0 maps to PE#0, which might\n+ * already be in use by another device on the PHB.\n+ *\n+ * As a result we need to allocate a contiguous range of PE numbers, then shift\n+ * the address programmed into the SR-IOV BAR of the PF so that the address\n+ * of VF0 matches up with the segment corresponding to the first allocated\n+ * PE number. This is handled in pnv_pci_vf_resource_shift().\n+ *\n+ * Once all that is done we return to the PCI core which then enables VFs,\n+ * scans them and creates pci_devs for each. The init process for a VF is\n+ * largely the same as a normal device, but the VF is inserted into the IODA\n+ * PE that we allocated for it rather than the PE associated with the bus.\n+ *\n+ * 4. When userspace disables VFs we unwind the above in\n+ * pnv_pcibios_sriov_disable(). Fortunately this is relatively simple since\n+ * we don't need to validate anything, just tear down the mappings and\n+ * move the SR-IOV resource back to its \"proper\" location.\n+ *\n+ * That's how mode a) works. In theory mode b) (single PE mapping) is less work\n+ * since we can map each individual VF with a separate BAR. However, there are a\n+ * few limitations:\n+ *\n+ * 1) For IODA2 mode b) has a minimum alignment requirement of 32MB. This makes\n+ * it only usable for devices with very large per-VF BARs. Such devices are\n+ * similar to Big Foot. 
They definitely exist, but I've never seen one.\n+ *\n+ * 2) The number of MBT entries that we have is limited. PHB3 and PHB4 only have\n+ * 16 in total and some are needed for other things. Most SR-IOV capable network cards can support\n+ * more than 16 VFs on each port.\n+ *\n+ * We use b) when using a) would use more than 1/4 of the entire 64 bit MMIO\n+ * window of the PHB.\n+ *\n+ *\n+ *\n+ * PHB4 (IODA3) added a few new features that would be useful for SR-IOV. It\n+ * allowed the MBT to map 32bit MMIO space in addition to 64bit, which allows\n+ * us to support SR-IOV BARs in the 32bit MMIO window. This is useful since\n+ * the Linux BAR allocation will place any BAR marked as non-prefetchable into\n+ * the non-prefetchable bridge window, which is 32bit only. It also added two\n+ * new modes:\n+ *\n+ * c) A segmented BAR similar to a), but each segment can be individually\n+ * mapped to any PE. This matches how the 32bit MMIO window worked on\n+ * IODA1&2.\n+ *\n+ * d) A segmented BAR with 8, 64, or 128 segments. This works similarly to a),\n+ * but with fewer segments and configurable base PE.\n+ *\n+ * i.e. The n'th segment maps to the (n + base)'th PE.\n+ *\n+ * The base PE is also required to be a multiple of the window size.\n+ *\n+ * Unfortunately, the OPAL API doesn't currently (as of skiboot v6.6) allow us\n+ * to exploit any of the IODA3 features.\n+ */\n \n static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)\n {\n", "prefixes": [ "06/15" ] }