[{"id":1785073,"web_url":"http://patchwork.ozlabs.org/comment/1785073/","msgid":"<20171011233243.GX25517@bhelgaas-glaptop.roam.corp.google.com>","list_archive_url":null,"date":"2017-10-11T23:32:43","subject":"Re: [PATCH 3/7] PCI: Distribute available buses to hotplug capable\n\tPCIe downstream ports","submitter":{"id":67298,"url":"http://patchwork.ozlabs.org/api/people/67298/","name":"Bjorn Helgaas","email":"helgaas@kernel.org"},"content":"On Tue, Sep 26, 2017 at 05:17:16PM +0300, Mika Westerberg wrote:\n> System BIOS sometimes allocates extra bus space for hotplug capable PCIe\n> root/downstream ports. This space is needed if the device plugged to the\n> port will have more hotplug capable downstream ports. A good example of\n> this is Thunderbolt. Each Thunderbolt device contains a PCIe switch and\n> one or more hotplug capable PCIe downstream ports where the daisy chain\n> can be extended.\n> \n> Currently Linux only allocates minimal bus space to make sure all the\n> enumerated devices barely fit there. The BIOS reserved extra space is\n> not taken into consideration at all. Because of this we run out of bus\n> space pretty quickly when more PCIe devices are attached to hotplug\n> downstream ports in order to extend the chain.\n> \n> This modifies PCI core so that we distribute the available BIOS\n> allocated bus space equally between hotplug capable PCIe downstream\n> ports to make sure there is enough bus space for extending the\n> hierarchy later on.\n\nI think this makes sense in general.  It's a fairly complicated patch,\nso my comments here are just a first pass.\n\nWhy do you limit it to PCIe?  Isn't it conceivable that one could\nhot-add a conventional PCI card that contained a bridge leading to\nanother hotplug slot?  E.g., a PCI card with PCMCIA slot or something\non it?\n\n> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>\n> ---\n>  drivers/pci/hotplug-pci.c |  13 +++++-\n>  drivers/pci/probe.c       | 114 ++++++++++++++++++++++++++++++++++++++--------\n>  include/linux/pci.h       |  19 ++++++--\n>  3 files changed, 123 insertions(+), 23 deletions(-)\n> \n> diff --git a/drivers/pci/hotplug-pci.c b/drivers/pci/hotplug-pci.c\n> index c68366cee6b7..deb06fe22b26 100644\n> --- a/drivers/pci/hotplug-pci.c\n> +++ b/drivers/pci/hotplug-pci.c\n> @@ -8,6 +8,7 @@ int pci_hp_add_bridge(struct pci_dev *dev)\n>  {\n>  \tstruct pci_bus *parent = dev->bus;\n>  \tint pass, busnr, start = parent->busn_res.start;\n> +\tunsigned int available_buses = 0;\n>  \tint end = parent->busn_res.end;\n>  \n>  \tfor (busnr = start; busnr <= end; busnr++) {\n> @@ -19,8 +20,18 @@ int pci_hp_add_bridge(struct pci_dev *dev)\n>  \t\t\t\tpci_name(dev));\n>  \t\treturn -1;\n>  \t}\n> +\n> +\t/*\n> +\t * In case of PCIe the hierarchy can be extended through hotplug\n> +\t * capable downstream ports. Distribute the available bus\n> +\t * numbers between them to make extending the chain possible.\n> +\t */\n> +\tif (pci_is_pcie(dev))\n> +\t\tavailable_buses = end - busnr;\n> +\n>  \tfor (pass = 0; pass < 2; pass++)\n> -\t\tbusnr = pci_scan_bridge(parent, dev, busnr, pass);\n> +\t\tbusnr = pci_scan_bridge_extend(parent, dev, busnr,\n> +\t\t\t\t\t       available_buses, pass);\n>  \tif (!dev->subordinate)\n>  \t\treturn -1;\n>  \n> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c\n> index f285cd74088e..ae0bf3c807f5 100644\n> --- a/drivers/pci/probe.c\n> +++ b/drivers/pci/probe.c\n> @@ -960,6 +960,17 @@ static void pci_enable_crs(struct pci_dev *pdev)\n>  }\n>  \n>  /*\n> + * pci_scan_bridge_extend() - Scan buses behind a bridge\n> + * @bus: Parent bus the bridge is on\n> + * @dev: Bridge itself\n> + * @max: Starting subordinate number of buses behind this bridge\n> + * @available_buses: Total number of buses available for this bridge and\n> + *\t\t     the devices below. After the minimal bus space has\n> + *\t\t     been allocated the remaining buses will be\n> + *\t\t     distributed equally between hotplug capable bridges.\n> + * @pass: Either %0 (scan already configured bridges) or %1 (scan bridges\n> + *        that need to be reconfigured.\n> + *\n>   * If it's a bridge, configure it and scan the bus behind it.\n>   * For CardBus bridges, we don't scan behind as the devices will\n>   * be handled by the bridge driver itself.\n> @@ -969,7 +980,8 @@ static void pci_enable_crs(struct pci_dev *pdev)\n>   * them, we proceed to assigning numbers to the remaining buses in\n>   * order to avoid overlaps between old and new bus numbers.\n>   */\n> -int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass)\n> +int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev, int max,\n> +\t\t\t   unsigned int available_buses, int pass)\n>  {\n>  \tstruct pci_bus *child;\n>  \tint is_cardbus = (dev->hdr_type == PCI_HEADER_TYPE_CARDBUS);\n> @@ -1080,6 +1092,9 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass)\n>  \t\t\t\t\t\tbus->busn_res.end);\n>  \t\t}\n>  \t\tmax++;\n> +\t\tif (available_buses)\n> +\t\t\tavailable_buses--;\n> +\n>  \t\tbuses = (buses & 0xff000000)\n>  \t\t      | ((unsigned int)(child->primary)     <<  0)\n>  \t\t      | ((unsigned int)(child->busn_res.start)   <<  8)\n> @@ -1101,7 +1116,7 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass)\n>  \n>  \t\tif (!is_cardbus) {\n>  \t\t\tchild->bridge_ctl = bctl;\n> -\t\t\tmax = pci_scan_child_bus(child);\n> +\t\t\tmax = pci_scan_child_bus_extend(child, available_buses);\n>  \t\t} else {\n>  \t\t\t/*\n>  \t\t\t * For CardBus bridges, we leave 4 bus numbers\n> @@ -1169,7 +1184,7 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass)\n>  \n>  \treturn max;\n>  }\n> -EXPORT_SYMBOL(pci_scan_bridge);\n> +EXPORT_SYMBOL(pci_scan_bridge_extend);\n>  \n>  /*\n>   * Read interrupt line and base address registers.\n> @@ -2397,9 +2412,24 @@ void __weak pcibios_fixup_bus(struct pci_bus *bus)\n>         /* nothing to do, expected to be removed in the future */\n>  }\n>  \n> -unsigned int pci_scan_child_bus(struct pci_bus *bus)\n> +/**\n> + * pci_scan_child_bus_extend() - Scan devices below a bus\n> + * @bus: Bus to scan for devices\n> + * @available_buses: Total number of buses available (%0 does not try to\n> + *\t\t     extend beyond the minimal)\n\nWhat does \"%0\" mean?  Is that kernel-doc for something?  0?\n\n> + * Scans devices below @bus including subordinate buses. Returns new\n> + * subordinate number including all the found devices. Passing\n> + * @available_buses causes the remaining bus space to be distributed\n> + * equally between hotplug capable bridges to allow future extension of\n> + * the hierarchy.\n> + */\n> +unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,\n> +\t\t\t\t       unsigned int available_buses)\n>  {\n> -\tunsigned int devfn, pass, max = bus->busn_res.start;\n> +\tunsigned int used_buses, hotplug_bridges = 0;\n> +\tunsigned int start = bus->busn_res.start;\n> +\tunsigned int devfn, cmax, max = start;\n>  \tstruct pci_dev *dev;\n>  \n>  \tdev_dbg(&bus->dev, \"scanning bus\\n\");\n> @@ -2409,7 +2439,8 @@ unsigned int pci_scan_child_bus(struct pci_bus *bus)\n>  \t\tpci_scan_slot(bus, devfn);\n>  \n>  \t/* Reserve buses for SR-IOV capability. */\n> -\tmax += pci_iov_bus_range(bus);\n> +\tused_buses = pci_iov_bus_range(bus);\n> +\tmax += used_buses;\n>  \n>  \t/*\n>  \t * After performing arch-dependent fixup of the bus, look behind\n> @@ -2421,23 +2452,68 @@ unsigned int pci_scan_child_bus(struct pci_bus *bus)\n>  \t\tbus->is_added = 1;\n>  \t}\n>  \n> -\tfor (pass = 0; pass < 2; pass++)\n> -\t\tlist_for_each_entry(dev, &bus->devices, bus_list) {\n> -\t\t\tif (pci_is_bridge(dev))\n> -\t\t\t\tmax = pci_scan_bridge(bus, dev, max, pass);\n\nThanks for getting rid of this \"for (pass = 0; ...)\" loop.  Much nicer\nto just have it spelled out.  It might even be worth pulling this\nlooping change into a separate patch to simplify *this* patch a little\nbit.\n\nI'd be happy if you did the same for the similar loop in\npci_hp_add_bridge().\n\n> +\t/*\n> +\t * First pass. Bridges that are already configured. We don't touch\n> +\t * these unless they are misconfigured (which we will do in second\n> +\t * pass).\n> +\t */\n> +\tlist_for_each_entry(dev, &bus->devices, bus_list) {\n> +\t\tif (pci_is_bridge(dev)) {\n> +\t\t\t/*\n> +\t\t\t * Calculate how many hotplug bridges there are on\n> +\t\t\t * this bus. We will distribute the additional\n> +\t\t\t * available buses between them.\n> +\t\t\t */\n> +\t\t\tif (dev->is_hotplug_bridge)\n> +\t\t\t\thotplug_bridges++;\n\nMaybe pull this out into a third list traversal, since it's not\nrelated to the \"scan bridge\" functionality?\n\n> +\n> +\t\t\tcmax = max;\n> +\t\t\tmax = pci_scan_bridge_extend(bus, dev, max, 0, 0);\n> +\t\t\tused_buses += cmax - max;\n> +\t\t}\n> +\t}\n\nI'm trying to relate the \"Bridges that are already configured\" comment\nto this code.  I don't see any test here for whether the bridge is\nalready configured.  Oh, I see -- the last parameter (0) to\npci_scan_bridge_extend() means this is the first pass, and it\nbasically just checks for secondary and subordinate being set.\nI thought \"first pass\" was related to the list_for_each_entry() loop\nhere, but it's actually related to the internals of\npci_scan_bridge_extend().  I think maybe rewording the comment can\naddress this.\n\n\n> +\t/* Second pass. Bridges that need to be configured. */\n> +\tlist_for_each_entry(dev, &bus->devices, bus_list) {\n> +\t\tif (pci_is_bridge(dev)) {\n> +\t\t\tunsigned int buses = 0;\n> +\n> +\t\t\tif (pcie_upstream_port(dev)) {\n> +\t\t\t\t/* Upstream port gets all available buses */\n> +\t\t\t\tbuses = available_buses;\n\nI guess this relies on the assumption that there can only be one\nupstream port on a bus?  Is that true?  Typically a switch only has a\nfunction 0 upstream port, but something niggles at me like the spec\nadmits the possibility of a switch with multiple functions of upstream\nports?  I don't know where that is right now (or if it exists), but\nI'll try to find it.\n\n> +\t\t\t} else if (dev->is_hotplug_bridge) {\n> +\t\t\t\t/*\n> +\t\t\t\t * Distribute the extra buses between\n> +\t\t\t\t * hotplug bridges if any.\n> +\t\t\t\t */\n> +\t\t\t\tbuses = available_buses / hotplug_bridges;\n> +\t\t\t\tbuses = min(buses, available_buses - used_buses);\n> +\t\t\t}\n> +\n> +\t\t\tcmax = max;\n> +\t\t\tmax = pci_scan_bridge_extend(bus, dev, cmax, buses, 1);\n> +\t\t\tused_buses += max - cmax;\n>  \t\t}\n> +\t}\n>  \n>  \t/*\n>  \t * Make sure a hotplug bridge has at least the minimum requested\n> -\t * number of buses.\n> +\t * number of buses but allow it to grow up to the maximum available\n> +\t * bus number of there is room.\n>  \t */\n> -\tif (bus->self && bus->self->is_hotplug_bridge && pci_hotplug_bus_size) {\n> -\t\tif (max - bus->busn_res.start < pci_hotplug_bus_size - 1)\n> -\t\t\tmax = bus->busn_res.start + pci_hotplug_bus_size - 1;\n> -\n> -\t\t/* Do not allocate more buses than we have room left */\n> -\t\tif (max > bus->busn_res.end)\n> -\t\t\tmax = bus->busn_res.end;\n> +\tif (bus->self && bus->self->is_hotplug_bridge) {\n> +\t\tused_buses = max_t(unsigned int, available_buses,\n> +\t\t\t\t   pci_hotplug_bus_size - 1);\n> +\t\tif (max - start < used_buses) {\n> +\t\t\tmax = start + used_buses;\n> +\n> +\t\t\t/* Do not allocate more buses than we have room left */\n> +\t\t\tif (max > bus->busn_res.end)\n> +\t\t\t\tmax = bus->busn_res.end;\n> +\n> +\t\t\tdev_dbg(&bus->dev, \"%pR extended by %#02x\\n\",\n> +\t\t\t\t&bus->busn_res, max - start);\n> +\t\t}\n>  \t}\n>  \n>  \t/*\n> @@ -2450,7 +2526,7 @@ unsigned int pci_scan_child_bus(struct pci_bus *bus)\n>  \tdev_dbg(&bus->dev, \"bus scan returning with max=%02x\\n\", max);\n>  \treturn max;\n>  }\n> -EXPORT_SYMBOL_GPL(pci_scan_child_bus);\n> +EXPORT_SYMBOL_GPL(pci_scan_child_bus_extend);\n\nDoes this need to be exported?  The only callers I see are\npci_scan_bridge_extend() (already in the same module) and\npci_scan_child_bus() (now an inline in linux/pci.h).\n\nI'd rather move pci_scan_child_bus() back to probe.c and make\npci_scan_child_bus_extend() static.\n\nSame with pci_scan_bridge_extend(), although that looks harder because\nit's called from hotplug-pci.c.  Really, I'm not sure hotplug-pci.c\ndeserves to exist.  I think the whole file (one function) could be\nfolded into pci/probe.c (as a separate patch all by itself).\n\n>  /**\n>   * pcibios_root_bridge_prepare - Platform-specific host bridge setup.\n> diff --git a/include/linux/pci.h b/include/linux/pci.h\n> index 4397692be538..c9b34c2de0fb 100644\n> --- a/include/linux/pci.h\n> +++ b/include/linux/pci.h\n> @@ -909,7 +909,14 @@ static inline void pci_dev_assign_slot(struct pci_dev *dev) { }\n>  int pci_scan_slot(struct pci_bus *bus, int devfn);\n>  struct pci_dev *pci_scan_single_device(struct pci_bus *bus, int devfn);\n>  void pci_device_add(struct pci_dev *dev, struct pci_bus *bus);\n> -unsigned int pci_scan_child_bus(struct pci_bus *bus);\n> +unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,\n> +\t\t\t\t       unsigned int available_buses);\n> +\n> +static inline unsigned int pci_scan_child_bus(struct pci_bus *bus)\n> +{\n> +\treturn pci_scan_child_bus_extend(bus, 0);\n> +}\n> +\n>  void pci_bus_add_device(struct pci_dev *dev);\n>  void pci_read_bridge_bases(struct pci_bus *child);\n>  struct resource *pci_find_parent_resource(const struct pci_dev *dev,\n> @@ -1292,8 +1299,14 @@ int pci_add_dynid(struct pci_driver *drv,\n>  \t\t  unsigned long driver_data);\n>  const struct pci_device_id *pci_match_id(const struct pci_device_id *ids,\n>  \t\t\t\t\t struct pci_dev *dev);\n> -int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max,\n> -\t\t    int pass);\n> +int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev, int max,\n> +\t\t\t   unsigned int available_buses, int pass);\n> +\n> +static inline int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev,\n> +\t\t\t\t  int max, int pass)\n> +{\n> +\treturn pci_scan_bridge_extend(bus, dev, max, 0, pass);\n> +}\n>  \n>  void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),\n>  \t\t  void *userdata);\n> -- \n> 2.14.1\n>","headers":{"Return-Path":"<linux-pci-owner@vger.kernel.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":"patchwork-incoming@bilbo.ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=linux-pci-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","mail.kernel.org;\n\tdmarc=none (p=none dis=none) header.from=kernel.org","mail.kernel.org;\n\tspf=none smtp.mailfrom=helgaas@kernel.org"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3yC9JL3SyXz9sPr\n\tfor <incoming@patchwork.ozlabs.org>;\n\tThu, 12 Oct 2017 10:32:50 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1750959AbdJKXcs (ORCPT <rfc822;incoming@patchwork.ozlabs.org>);\n\tWed, 11 Oct 2017 19:32:48 -0400","from mail.kernel.org ([198.145.29.99]:41190 \"EHLO mail.kernel.org\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1750718AbdJKXcr (ORCPT <rfc822;linux-pci@vger.kernel.org>);\n\tWed, 11 Oct 2017 19:32:47 -0400","from localhost (unknown [69.71.4.159])\n\t(using TLSv1.2 with cipher DHE-RSA-AES128-SHA (128/128 bits))\n\t(No client certificate requested)\n\tby mail.kernel.org (Postfix) with ESMTPSA id 9D261218CA;\n\tWed, 11 Oct 2017 23:32:46 +0000 (UTC)"],"DMARC-Filter":"OpenDMARC Filter v1.3.2 mail.kernel.org 9D261218CA","Date":"Wed, 11 Oct 2017 18:32:43 -0500","From":"Bjorn Helgaas <helgaas@kernel.org>","To":"Mika Westerberg <mika.westerberg@linux.intel.com>","Cc":"Bjorn Helgaas <bhelgaas@google.com>, Ashok Raj <ashok.raj@intel.com>,\n\tKeith Busch <keith.busch@intel.com>,\n\t\"Rafael J . Wysocki\" <rafael.j.wysocki@intel.com>,\n\tLukas Wunner <lukas@wunner.de>, Michael Jamet <michael.jamet@intel.com>,\n\tYehezkel Bernat <yehezkel.bernat@intel.com>,\n\tMario.Limonciello@dell.com, linux-pci@vger.kernel.org,\n\tlinux-kernel@vger.kernel.org","Subject":"Re: [PATCH 3/7] PCI: Distribute available buses to hotplug capable\n\tPCIe downstream ports","Message-ID":"<20171011233243.GX25517@bhelgaas-glaptop.roam.corp.google.com>","References":"<20170926141720.25067-1-mika.westerberg@linux.intel.com>\n\t<20170926141720.25067-4-mika.westerberg@linux.intel.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20170926141720.25067-4-mika.westerberg@linux.intel.com>","User-Agent":"Mutt/1.5.21 (2010-09-15)","Sender":"linux-pci-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<linux-pci.vger.kernel.org>","X-Mailing-List":"linux-pci@vger.kernel.org"}},{"id":1785260,"web_url":"http://patchwork.ozlabs.org/comment/1785260/","msgid":"<063D6719AE5E284EB5DD2968C1650D6DD0092407@AcuExch.aculab.com>","list_archive_url":null,"date":"2017-10-12T09:50:31","subject":"RE: [PATCH 3/7] PCI: Distribute available buses to hotplug capable\n\tPCIe downstream ports","submitter":{"id":6689,"url":"http://patchwork.ozlabs.org/api/people/6689/","name":"David Laight","email":"David.Laight@ACULAB.COM"},"content":"From: Bjorn Helgaas\n> Sent: 12 October 2017 00:33\n> Why do you limit it to PCIe?  Isn't it conceivable that one could\n> hot-add a conventional PCI card that contained a bridge leading to\n> another hotplug slot?  E.g., a PCI card with PCMCIA slot or something\n> on it?\n\nThere are (maybe were!) cardbus cards that contained PCI bridges to\nconnect to expansion chassis.\nBus enumeration was always a problem unless they were connected\nat boot time.\n\n\tDavid","headers":{"Return-Path":"<linux-pci-owner@vger.kernel.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":"patchwork-incoming@bilbo.ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=linux-pci-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3yCR1T01jMz9t2V\n\tfor <incoming@patchwork.ozlabs.org>;\n\tThu, 12 Oct 2017 20:50:53 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1752437AbdJLJun convert rfc822-to-8bit (ORCPT\n\t<rfc822;incoming@patchwork.ozlabs.org>);\n\tThu, 12 Oct 2017 05:50:43 -0400","from smtp-out6.electric.net ([192.162.217.192]:51885 \"EHLO\n\tsmtp-out6.electric.net\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1752345AbdJLJuk (ORCPT\n\t<rfc822; linux-pci@vger.kernel.org>); Thu, 12 Oct 2017 05:50:40 -0400","from 1e2a8K-0000Vv-TY by out6a.electric.net with emc1-ok (Exim\n\t4.87) (envelope-from <David.Laight@ACULAB.COM>)\n\tid 1e2a8L-0000mC-Vi; Thu, 12 Oct 2017 02:50:29 -0700","by emcmailer; Thu, 12 Oct 2017 02:50:29 -0700","from [156.67.243.126] (helo=AcuExch.aculab.com)\n\tby out6a.electric.net with esmtps (TLSv1:AES128-SHA:128)\n\t(Exim 4.87) (envelope-from <David.Laight@ACULAB.COM>)\n\tid 1e2a8K-0000Vv-TY; Thu, 12 Oct 2017 02:50:28 -0700","from ACUEXCH.Aculab.com ([::1]) by AcuExch.aculab.com ([::1]) with\n\tmapi id 14.03.0123.003; Thu, 12 Oct 2017 10:50:31 +0100"],"From":"David Laight <David.Laight@ACULAB.COM>","To":"'Bjorn Helgaas' <helgaas@kernel.org>,\n\tMika Westerberg <mika.westerberg@linux.intel.com>","CC":"Bjorn Helgaas <bhelgaas@google.com>, Ashok Raj <ashok.raj@intel.com>,\n\tKeith Busch <keith.busch@intel.com>,\n\t\"Rafael J . Wysocki\" <rafael.j.wysocki@intel.com>,\n\tLukas Wunner <lukas@wunner.de>, Michael Jamet <michael.jamet@intel.com>,\n\tYehezkel Bernat <yehezkel.bernat@intel.com>,\n\t\"Mario.Limonciello@dell.com\" <Mario.Limonciello@dell.com>,\n\t\"linux-pci@vger.kernel.org\" <linux-pci@vger.kernel.org>,\n\t\"linux-kernel@vger.kernel.org\" <linux-kernel@vger.kernel.org>","Subject":"RE: [PATCH 3/7] PCI: Distribute available buses to hotplug capable\n\tPCIe downstream ports","Thread-Topic":"[PATCH 3/7] PCI: Distribute available buses to hotplug capable\n\tPCIe downstream ports","Thread-Index":"AQHTQulDOc5IuD1uEEeGLIZqMD9dPqLf+LOw","Date":"Thu, 12 Oct 2017 09:50:31 +0000","Message-ID":"<063D6719AE5E284EB5DD2968C1650D6DD0092407@AcuExch.aculab.com>","References":"<20170926141720.25067-1-mika.westerberg@linux.intel.com>\n\t<20170926141720.25067-4-mika.westerberg@linux.intel.com>\n\t<20171011233243.GX25517@bhelgaas-glaptop.roam.corp.google.com>","In-Reply-To":"<20171011233243.GX25517@bhelgaas-glaptop.roam.corp.google.com>","Accept-Language":"en-GB, en-US","Content-Language":"en-US","X-MS-Has-Attach":"","X-MS-TNEF-Correlator":"","x-originating-ip":"[10.202.99.200]","Content-Type":"text/plain; charset=\"Windows-1252\"","Content-Transfer-Encoding":"8BIT","MIME-Version":"1.0","X-Outbound-IP":"156.67.243.126","X-Env-From":"David.Laight@ACULAB.COM","X-Proto":"esmtps","X-Revdns":"","X-HELO":"AcuExch.aculab.com","X-TLS":"TLSv1:AES128-SHA:128","X-Authenticated_ID":"","X-PolicySMART":"3396946, 3397078","X-Virus-Status":["Scanned by VirusSMART (c)","Scanned by VirusSMART (s)"],"Sender":"linux-pci-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<linux-pci.vger.kernel.org>","X-Mailing-List":"linux-pci@vger.kernel.org"}},{"id":1785429,"web_url":"http://patchwork.ozlabs.org/comment/1785429/","msgid":"<20171012124703.GD2761@lahna.fi.intel.com>","list_archive_url":null,"date":"2017-10-12T12:47:03","subject":"Re: [PATCH 3/7] PCI: Distribute available buses to hotplug capable\n\tPCIe downstream ports","submitter":{"id":14534,"url":"http://patchwork.ozlabs.org/api/people/14534/","name":"Mika Westerberg","email":"mika.westerberg@linux.intel.com"},"content":"On Wed, Oct 11, 2017 at 06:32:43PM -0500, Bjorn Helgaas wrote:\n> On Tue, Sep 26, 2017 at 05:17:16PM +0300, Mika Westerberg wrote:\n> > System BIOS sometimes allocates extra bus space for hotplug capable PCIe\n> > root/downstream ports. This space is needed if the device plugged to the\n> > port will have more hotplug capable downstream ports. A good example of\n> > this is Thunderbolt. Each Thunderbolt device contains a PCIe switch and\n> > one or more hotplug capable PCIe downstream ports where the daisy chain\n> > can be extended.\n> > \n> > Currently Linux only allocates minimal bus space to make sure all the\n> > enumerated devices barely fit there. The BIOS reserved extra space is\n> > not taken into consideration at all. Because of this we run out of bus\n> > space pretty quickly when more PCIe devices are attached to hotplug\n> > downstream ports in order to extend the chain.\n> > \n> > This modifies PCI core so that we distribute the available BIOS\n> > allocated bus space equally between hotplug capable PCIe downstream\n> > ports to make sure there is enough bus space for extending the\n> > hierarchy later on.\n> \n> I think this makes sense in general.  It's a fairly complicated patch,\n> so my comments here are just a first pass.\n\nThanks for the comments!\n\n> Why do you limit it to PCIe?  Isn't it conceivable that one could\n> hot-add a conventional PCI card that contained a bridge leading to\n> another hotplug slot?  E.g., a PCI card with PCMCIA slot or something\n> on it?\n\nI guess this could be generalized to such configurations but I wanted to\nrestrict this with PCIe for a couple of reasons. Firstly, I'm able to\nactually test this ;-) Second, the rules in PCIe are quite simple,\nwhenever you have hotplug bridge (downstream port with a hotplug\ncapability set) you distribute the available bus space with it. With a\nconventional PCI it is not so clear (at least to me).\n\n> > Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>\n> > ---\n> >  drivers/pci/hotplug-pci.c |  13 +++++-\n> >  drivers/pci/probe.c       | 114 ++++++++++++++++++++++++++++++++++++++--------\n> >  include/linux/pci.h       |  19 ++++++--\n> >  3 files changed, 123 insertions(+), 23 deletions(-)\n> > \n> > diff --git a/drivers/pci/hotplug-pci.c b/drivers/pci/hotplug-pci.c\n> > index c68366cee6b7..deb06fe22b26 100644\n> > --- a/drivers/pci/hotplug-pci.c\n> > +++ b/drivers/pci/hotplug-pci.c\n> > @@ -8,6 +8,7 @@ int pci_hp_add_bridge(struct pci_dev *dev)\n> >  {\n> >  \tstruct pci_bus *parent = dev->bus;\n> >  \tint pass, busnr, start = parent->busn_res.start;\n> > +\tunsigned int available_buses = 0;\n> >  \tint end = parent->busn_res.end;\n> >  \n> >  \tfor (busnr = start; busnr <= end; busnr++) {\n> > @@ -19,8 +20,18 @@ int pci_hp_add_bridge(struct pci_dev *dev)\n> >  \t\t\t\tpci_name(dev));\n> >  \t\treturn -1;\n> >  \t}\n> > +\n> > +\t/*\n> > +\t * In case of PCIe the hierarchy can be extended through hotplug\n> > +\t * capable downstream ports. Distribute the available bus\n> > +\t * numbers between them to make extending the chain possible.\n> > +\t */\n> > +\tif (pci_is_pcie(dev))\n> > +\t\tavailable_buses = end - busnr;\n> > +\n> >  \tfor (pass = 0; pass < 2; pass++)\n> > -\t\tbusnr = pci_scan_bridge(parent, dev, busnr, pass);\n> > +\t\tbusnr = pci_scan_bridge_extend(parent, dev, busnr,\n> > +\t\t\t\t\t       available_buses, pass);\n> >  \tif (!dev->subordinate)\n> >  \t\treturn -1;\n> >  \n> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c\n> > index f285cd74088e..ae0bf3c807f5 100644\n> > --- a/drivers/pci/probe.c\n> > +++ b/drivers/pci/probe.c\n> > @@ -960,6 +960,17 @@ static void pci_enable_crs(struct pci_dev *pdev)\n> >  }\n> >  \n> >  /*\n> > + * pci_scan_bridge_extend() - Scan buses behind a bridge\n> > + * @bus: Parent bus the bridge is on\n> > + * @dev: Bridge itself\n> > + * @max: Starting subordinate number of buses behind this bridge\n> > + * @available_buses: Total number of buses available for this bridge and\n> > + *\t\t     the devices below. After the minimal bus space has\n> > + *\t\t     been allocated the remaining buses will be\n> > + *\t\t     distributed equally between hotplug capable bridges.\n> > + * @pass: Either %0 (scan already configured bridges) or %1 (scan bridges\n> > + *        that need to be reconfigured.\n> > + *\n> >   * If it's a bridge, configure it and scan the bus behind it.\n> >   * For CardBus bridges, we don't scan behind as the devices will\n> >   * be handled by the bridge driver itself.\n> > @@ -969,7 +980,8 @@ static void pci_enable_crs(struct pci_dev *pdev)\n> >   * them, we proceed to assigning numbers to the remaining buses in\n> >   * order to avoid overlaps between old and new bus numbers.\n> >   */\n> > -int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass)\n> > +int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev, int max,\n> > +\t\t\t   unsigned int available_buses, int pass)\n> >  {\n> >  \tstruct pci_bus *child;\n> >  \tint is_cardbus = (dev->hdr_type == PCI_HEADER_TYPE_CARDBUS);\n> > @@ -1080,6 +1092,9 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass)\n> >  \t\t\t\t\t\tbus->busn_res.end);\n> >  \t\t}\n> >  \t\tmax++;\n> > +\t\tif (available_buses)\n> > +\t\t\tavailable_buses--;\n> > +\n> >  \t\tbuses = (buses & 0xff000000)\n> >  \t\t      | ((unsigned int)(child->primary)     <<  0)\n> >  \t\t      | ((unsigned int)(child->busn_res.start)   <<  8)\n> > @@ -1101,7 +1116,7 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass)\n> >  \n> >  \t\tif (!is_cardbus) {\n> >  \t\t\tchild->bridge_ctl = bctl;\n> > -\t\t\tmax = pci_scan_child_bus(child);\n> > +\t\t\tmax = pci_scan_child_bus_extend(child, available_buses);\n> >  \t\t} else {\n> >  \t\t\t/*\n> >  \t\t\t * For CardBus bridges, we leave 4 bus numbers\n> > @@ -1169,7 +1184,7 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass)\n> >  \n> >  \treturn max;\n> >  }\n> > -EXPORT_SYMBOL(pci_scan_bridge);\n> > +EXPORT_SYMBOL(pci_scan_bridge_extend);\n> >  \n> >  /*\n> >   * Read interrupt line and base address registers.\n> > @@ -2397,9 +2412,24 @@ void __weak pcibios_fixup_bus(struct pci_bus *bus)\n> >         /* nothing to do, expected to be removed in the future */\n> >  }\n> >  \n> > -unsigned int pci_scan_child_bus(struct pci_bus *bus)\n> > +/**\n> > + * pci_scan_child_bus_extend() - Scan devices below a bus\n> > + * @bus: Bus to scan for devices\n> > + * @available_buses: Total number of buses available (%0 does not try to\n> > + *\t\t     extend beyond the minimal)\n> \n> What does \"%0\" mean?  Is that kernel-doc for something?  0?\n\n\"%\" means constant in kernel-doc comments. I've been using it with\nnumbers as well as with NULL and so on.\n\n> > + * Scans devices below @bus including subordinate buses. Returns new\n> > + * subordinate number including all the found devices. Passing\n> > + * @available_buses causes the remaining bus space to be distributed\n> > + * equally between hotplug capable bridges to allow future extension of\n> > + * the hierarchy.\n> > + */\n> > +unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,\n> > +\t\t\t\t       unsigned int available_buses)\n> >  {\n> > -\tunsigned int devfn, pass, max = bus->busn_res.start;\n> > +\tunsigned int used_buses, hotplug_bridges = 0;\n> > +\tunsigned int start = bus->busn_res.start;\n> > +\tunsigned int devfn, cmax, max = start;\n> >  \tstruct pci_dev *dev;\n> >  \n> >  \tdev_dbg(&bus->dev, \"scanning bus\\n\");\n> > @@ -2409,7 +2439,8 @@ unsigned int pci_scan_child_bus(struct pci_bus *bus)\n> >  \t\tpci_scan_slot(bus, devfn);\n> >  \n> >  \t/* Reserve buses for SR-IOV capability. */\n> > -\tmax += pci_iov_bus_range(bus);\n> > +\tused_buses = pci_iov_bus_range(bus);\n> > +\tmax += used_buses;\n> >  \n> >  \t/*\n> >  \t * After performing arch-dependent fixup of the bus, look behind\n> > @@ -2421,23 +2452,68 @@ unsigned int pci_scan_child_bus(struct pci_bus *bus)\n> >  \t\tbus->is_added = 1;\n> >  \t}\n> >  \n> > -\tfor (pass = 0; pass < 2; pass++)\n> > -\t\tlist_for_each_entry(dev, &bus->devices, bus_list) {\n> > -\t\t\tif (pci_is_bridge(dev))\n> > -\t\t\t\tmax = pci_scan_bridge(bus, dev, max, pass);\n> \n> Thanks for getting rid of this \"for (pass = 0; ...)\" loop.  Much nicer\n> to just have it spelled out.  It might even be worth pulling this\n> looping change into a separate patch to simplify *this* patch a little\n> bit.\n\nOK, I'll make a separate patch of it.\n\n> I'd be happy if you did the same for the similar loop in\n> pci_hp_add_bridge().\n\nSure, I'll do the same for pci_hp_add_bridge().\n\n> > +\t/*\n> > +\t * First pass. Bridges that are already configured. We don't touch\n> > +\t * these unless they are misconfigured (which we will do in second\n> > +\t * pass).\n> > +\t */\n> > +\tlist_for_each_entry(dev, &bus->devices, bus_list) {\n> > +\t\tif (pci_is_bridge(dev)) {\n> > +\t\t\t/*\n> > +\t\t\t * Calculate how many hotplug bridges there are on\n> > +\t\t\t * this bus. We will distribute the additional\n> > +\t\t\t * available buses between them.\n> > +\t\t\t */\n> > +\t\t\tif (dev->is_hotplug_bridge)\n> > +\t\t\t\thotplug_bridges++;\n> \n> Maybe pull this out into a third list traversal, since it's not\n> related to the \"scan bridge\" functionality?\n\nOK.\n\n> > +\n> > +\t\t\tcmax = max;\n> > +\t\t\tmax = pci_scan_bridge_extend(bus, dev, max, 0, 0);\n> > +\t\t\tused_buses += cmax - max;\n> > +\t\t}\n> > +\t}\n> \n> I'm trying to relate the \"Bridges that are already configured\" comment\n> to this code.  I don't see any test here for whether the bridge is\n> already configured.  Oh, I see -- the last parameter (0) to\n> pci_scan_bridge_extend() means this is the first pass, and it\n> basically just checks for secondary and subordinate being set.\n> I thought \"first pass\" was related to the list_for_each_entry() loop\n> here, but it's actually related to the internals of\n> pci_scan_bridge_extend().  I think maybe rewording the comment can\n> address this.\n\nI'll update the comment.\n\n> > +\t/* Second pass. Bridges that need to be configured. */\n> > +\tlist_for_each_entry(dev, &bus->devices, bus_list) {\n> > +\t\tif (pci_is_bridge(dev)) {\n> > +\t\t\tunsigned int buses = 0;\n> > +\n> > +\t\t\tif (pcie_upstream_port(dev)) {\n> > +\t\t\t\t/* Upstream port gets all available buses */\n> > +\t\t\t\tbuses = available_buses;\n> \n> I guess this relies on the assumption that there can only be one\n> upstream port on a bus?  Is that true?  Typically a switch only has a\n> function 0 upstream port, but something niggles at me like the spec\n> admits the possibility of a switch with multiple functions of upstream\n> ports?  I don't know where that is right now (or if it exists), but\n> I'll try to find it.\n\nMy understanding is that there can be only one upstream port on a bus.\nThat said I looked at the spec again and there is this in chapter 7.3.1\nof PCIe 3.1 spec:\n\n  Switches, and components wishing to incorporate more than eight\n  Functions at their Upstream Port, are permitted to implement one or\n  more “virtual switches” represented by multiple Type 1 (PCI-PCI\n  Bridge) Configuration Space headers as illustrated in Figure 7-2.\n  These virtual switches serve to allow fan-out beyond eight Functions.\n  Since Switch Downstream Ports are permitted to appear on any Device\n  Number, in this case all address information fields (Bus, Device, and\n  Function Numbers) must be completely decoded to access the correct\n  register. Any Configuration Request targeting an unimplemented Bus,\n  Device, or Function must return a Completion with Unsupported Request\n  Completion Status.\n\nNot sure what it actually means, though. A \"virtual switch\" to me says\nit is a switch with one upstream port and multiple downstream ports,\njust like normal switch. Is this what you meant? Do you understand it so\nthat there can be multiple upstream ports connected to a bus?\n\n> > +\t\t\t} else if (dev->is_hotplug_bridge) {\n> > +\t\t\t\t/*\n> > +\t\t\t\t * Distribute the extra buses between\n> > +\t\t\t\t * hotplug bridges if any.\n> > +\t\t\t\t */\n> > +\t\t\t\tbuses = available_buses / hotplug_bridges;\n> > +\t\t\t\tbuses = min(buses, available_buses - used_buses);\n> > +\t\t\t}\n> > +\n> > +\t\t\tcmax = max;\n> > +\t\t\tmax = pci_scan_bridge_extend(bus, dev, cmax, buses, 1);\n> > +\t\t\tused_buses += max - cmax;\n> >  \t\t}\n> > +\t}\n> >  \n> >  \t/*\n> >  \t * Make sure a hotplug bridge has at least the minimum requested\n> > -\t * number of buses.\n> > +\t * number of buses but allow it to grow up to the maximum available\n> > +\t * bus number of there is room.\n> >  \t */\n> > -\tif (bus->self && bus->self->is_hotplug_bridge && pci_hotplug_bus_size) {\n> > -\t\tif (max - bus->busn_res.start < pci_hotplug_bus_size - 1)\n> > -\t\t\tmax = bus->busn_res.start + pci_hotplug_bus_size - 1;\n> > -\n> > -\t\t/* Do not allocate more buses than we have room left */\n> > -\t\tif (max > bus->busn_res.end)\n> > -\t\t\tmax = bus->busn_res.end;\n> > +\tif (bus->self && bus->self->is_hotplug_bridge) {\n> > +\t\tused_buses = max_t(unsigned int, available_buses,\n> > +\t\t\t\t   pci_hotplug_bus_size - 1);\n> > +\t\tif (max - start < used_buses) {\n> > +\t\t\tmax = start + used_buses;\n> > +\n> > +\t\t\t/* Do not allocate more buses than we have room left */\n> > +\t\t\tif (max > bus->busn_res.end)\n> > +\t\t\t\tmax = bus->busn_res.end;\n> > +\n> > +\t\t\tdev_dbg(&bus->dev, \"%pR extended by %#02x\\n\",\n> > +\t\t\t\t&bus->busn_res, max - start);\n> > +\t\t}\n> >  \t}\n> >  \n> >  \t/*\n> > @@ -2450,7 +2526,7 @@ unsigned int pci_scan_child_bus(struct pci_bus *bus)\n> >  \tdev_dbg(&bus->dev, \"bus scan returning with max=%02x\\n\", max);\n> >  \treturn max;\n> >  }\n> > -EXPORT_SYMBOL_GPL(pci_scan_child_bus);\n> > +EXPORT_SYMBOL_GPL(pci_scan_child_bus_extend);\n> \n> Does this need to be exported?  The only callers I see are\n> pci_scan_bridge_extend() (already in the same module) and\n> pci_scan_child_bus() (now an inline in linux/pci.h).\n> \n> I'd rather move pci_scan_child_bus() back to probe.c and make\n> pci_scan_child_bus_extend() static.\n\nOK.\n\n> Same with pci_scan_bridge_extend(), although that looks harder because\n> it's called from hotplug-pci.c.  Really, I'm not sure hotplug-pci.c\n> deserves to exist.  I think the whole file (one function) could be\n> folded into pci/probe.c (as a separate patch all by itself).\n\nNo problem, I'll move it to drivers/pci/probe.c and make\npci_scan_bridge_extend() static instead.","headers":{"Return-Path":"<linux-pci-owner@vger.kernel.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":"patchwork-incoming@bilbo.ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=linux-pci-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3yCVx92ZgGz9t2r\n\tfor <incoming@patchwork.ozlabs.org>;\n\tThu, 12 Oct 2017 23:47:25 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1752843AbdJLMrM (ORCPT <rfc822;incoming@patchwork.ozlabs.org>);\n\tThu, 12 Oct 2017 08:47:12 -0400","from mga03.intel.com ([134.134.136.65]:39185 \"EHLO mga03.intel.com\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1752392AbdJLMrK (ORCPT <rfc822;linux-pci@vger.kernel.org>);\n\tThu, 12 Oct 2017 08:47:10 -0400","from fmsmga002.fm.intel.com ([10.253.24.26])\n\tby orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;\n\t12 Oct 2017 05:47:09 -0700","from lahna.fi.intel.com (HELO lahna) ([10.237.72.157])\n\tby fmsmga002.fm.intel.com with SMTP; 12 Oct 2017 05:47:04 -0700","by lahna (sSMTP sendmail emulation);\n\tThu, 12 Oct 2017 15:47:03 +0300"],"X-ExtLoop1":"1","X-IronPort-AV":"E=Sophos;i=\"5.43,366,1503385200\"; d=\"scan'208\";a=\"1230003853\"","Date":"Thu, 12 Oct 2017 15:47:03 +0300","From":"Mika Westerberg <mika.westerberg@linux.intel.com>","To":"Bjorn Helgaas <helgaas@kernel.org>","Cc":"Bjorn Helgaas <bhelgaas@google.com>, Ashok Raj <ashok.raj@intel.com>,\n\tKeith Busch <keith.busch@intel.com>,\n\t\"Rafael J . Wysocki\" <rafael.j.wysocki@intel.com>,\n\tLukas Wunner <lukas@wunner.de>, Michael Jamet <michael.jamet@intel.com>,\n\tYehezkel Bernat <yehezkel.bernat@intel.com>,\n\tMario.Limonciello@dell.com, linux-pci@vger.kernel.org,\n\tlinux-kernel@vger.kernel.org","Subject":"Re: [PATCH 3/7] PCI: Distribute available buses to hotplug capable\n\tPCIe downstream ports","Message-ID":"<20171012124703.GD2761@lahna.fi.intel.com>","References":"<20170926141720.25067-1-mika.westerberg@linux.intel.com>\n\t<20170926141720.25067-4-mika.westerberg@linux.intel.com>\n\t<20171011233243.GX25517@bhelgaas-glaptop.roam.corp.google.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Disposition":"inline","Content-Transfer-Encoding":"8bit","In-Reply-To":"<20171011233243.GX25517@bhelgaas-glaptop.roam.corp.google.com>","Organization":"Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo","User-Agent":"Mutt/1.9.0 (2017-09-02)","Sender":"linux-pci-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<linux-pci.vger.kernel.org>","X-Mailing-List":"linux-pci@vger.kernel.org"}},{"id":1785733,"web_url":"http://patchwork.ozlabs.org/comment/1785733/","msgid":"<20171012183223.GY25517@bhelgaas-glaptop.roam.corp.google.com>","list_archive_url":null,"date":"2017-10-12T18:32:23","subject":"Re: [PATCH 3/7] PCI: Distribute available buses to hotplug capable\n\tPCIe downstream ports","submitter":{"id":67298,"url":"http://patchwork.ozlabs.org/api/people/67298/","name":"Bjorn Helgaas","email":"helgaas@kernel.org"},"content":"On Thu, Oct 12, 2017 at 03:47:03PM +0300, Mika Westerberg wrote:\n> On Wed, Oct 11, 2017 at 06:32:43PM -0500, Bjorn Helgaas wrote:\n> > On Tue, Sep 26, 2017 at 05:17:16PM +0300, Mika Westerberg wrote:\n> > > System BIOS sometimes allocates extra bus space for hotplug capable PCIe\n> > > root/downstream ports. This space is needed if the device plugged to the\n> > > port will have more hotplug capable downstream ports. A good example of\n> > > this is Thunderbolt. Each Thunderbolt device contains a PCIe switch and\n> > > one or more hotplug capable PCIe downstream ports where the daisy chain\n> > > can be extended.\n> > > \n> > > Currently Linux only allocates minimal bus space to make sure all the\n> > > enumerated devices barely fit there. The BIOS reserved extra space is\n> > > not taken into consideration at all. Because of this we run out of bus\n> > > space pretty quickly when more PCIe devices are attached to hotplug\n> > > downstream ports in order to extend the chain.\n> > > \n> > > This modifies PCI core so that we distribute the available BIOS\n> > > allocated bus space equally between hotplug capable PCIe downstream\n> > > ports to make sure there is enough bus space for extending the\n> > > hierarchy later on.\n> > \n> > I think this makes sense in general.  It's a fairly complicated patch,\n> > so my comments here are just a first pass.\n> \n> Thanks for the comments!\n> \n> > Why do you limit it to PCIe?  Isn't it conceivable that one could\n> > hot-add a conventional PCI card that contained a bridge leading to\n> > another hotplug slot?  E.g., a PCI card with PCMCIA slot or something\n> > on it?\n> \n> I guess this could be generalized to such configurations but I wanted to\n> restrict this with PCIe for a couple of reasons. Firstly, I'm able to\n> actually test this ;-) Second, the rules in PCIe are quite simple,\n> whenever you have hotplug bridge (downstream port with a hotplug\n> capability set) you distribute the available bus space with it. With a\n> conventional PCI it is not so clear (at least to me).\n\nYou're testing dev->is_hotplug_bridge, which I think is the right\napproach.  It happens that we currently only set that for PCIe bridges\nwith PCI_EXP_SLTCAP_HPC and for some ACPI cases (and a quirk for one\ndevice).  But in principle we could and probably should set it if we\ncan identify a conventional PCI hotplug bridge.  So my suggestion is\nto just remove the explicit PCIe tests.\n\n> > > +\t/* Second pass. Bridges that need to be configured. */\n> > > +\tlist_for_each_entry(dev, &bus->devices, bus_list) {\n> > > +\t\tif (pci_is_bridge(dev)) {\n> > > +\t\t\tunsigned int buses = 0;\n> > > +\n> > > +\t\t\tif (pcie_upstream_port(dev)) {\n> > > +\t\t\t\t/* Upstream port gets all available buses */\n> > > +\t\t\t\tbuses = available_buses;\n> > \n> > I guess this relies on the assumption that there can only be one\n> > upstream port on a bus?  Is that true?  Typically a switch only has a\n> > function 0 upstream port, but something niggles at me like the spec\n> > admits the possibility of a switch with multiple functions of upstream\n> > ports?  I don't know where that is right now (or if it exists), but\n> > I'll try to find it.\n> \n> My understanding is that there can be only one upstream port on a bus.\n> That said I looked at the spec again and there is this in chapter 7.3.1\n> of PCIe 3.1 spec:\n> \n>   Switches, and components wishing to incorporate more than eight\n>   Functions at their Upstream Port, are permitted to implement one or\n>   more “virtual switches” represented by multiple Type 1 (PCI-PCI\n>   Bridge) Configuration Space headers as illustrated in Figure 7-2.\n>   These virtual switches serve to allow fan-out beyond eight Functions.\n>   Since Switch Downstream Ports are permitted to appear on any Device\n>   Number, in this case all address information fields (Bus, Device, and\n>   Function Numbers) must be completely decoded to access the correct\n>   register. Any Configuration Request targeting an unimplemented Bus,\n>   Device, or Function must return a Completion with Unsupported Request\n>   Completion Status.\n> \n> Not sure what it actually means, though. A \"virtual switch\" to me says\n> it is a switch with one upstream port and multiple downstream ports,\n> just like normal switch. Is this what you meant? Do you understand it so\n> that there can be multiple upstream ports connected to a bus?\n\nI agree with you; I think that section is just saying that if a\ncomponent needs more then eight functions, it can incorporate a\nswitch, so it could have one upstream port, one internal logical bus,\nup to 32 * 8 = 256 downstream ports on that logical bus, and 8\nendpoints below each downstream port.  Of course, there wouldn't be\nenough bus number space for all that.  But I don't think this is\ntalking about a multifunction switch upstream port.\n\nAnyway, I think you're right that there can only be one upstream port\non a bus, because an upstream port contains link management stuff, and\nit wouldn't make sense to have two upstream ports trying to manage the\nsame end of a single link.\n\nBut I would really like to remove the PCIe-specific nature of this\ntest somehow so it could work on a conventional PCI topology.\n\nBjorn","headers":{"Return-Path":"<linux-pci-owner@vger.kernel.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":"patchwork-incoming@bilbo.ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=linux-pci-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","mail.kernel.org;\n\tdmarc=none (p=none dis=none) header.from=kernel.org","mail.kernel.org;\n\tspf=none smtp.mailfrom=helgaas@kernel.org"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3yCfbc6SpXz9sNc\n\tfor <incoming@patchwork.ozlabs.org>;\n\tFri, 13 Oct 2017 05:32:44 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1753000AbdJLSc0 (ORCPT <rfc822;incoming@patchwork.ozlabs.org>);\n\tThu, 12 Oct 2017 14:32:26 -0400","from mail.kernel.org ([198.145.29.99]:54990 \"EHLO mail.kernel.org\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1751996AbdJLScY (ORCPT <rfc822;linux-pci@vger.kernel.org>);\n\tThu, 12 Oct 2017 14:32:24 -0400","from localhost (unknown [69.55.156.165])\n\t(using TLSv1.2 with cipher DHE-RSA-AES128-SHA (128/128 bits))\n\t(No client certificate requested)\n\tby mail.kernel.org (Postfix) with ESMTPSA id 2E34E21903;\n\tThu, 12 Oct 2017 18:32:24 +0000 (UTC)"],"DMARC-Filter":"OpenDMARC Filter v1.3.2 mail.kernel.org 2E34E21903","Date":"Thu, 12 Oct 2017 13:32:23 -0500","From":"Bjorn Helgaas <helgaas@kernel.org>","To":"Mika Westerberg <mika.westerberg@linux.intel.com>","Cc":"Bjorn Helgaas <bhelgaas@google.com>, Ashok Raj <ashok.raj@intel.com>,\n\tKeith Busch <keith.busch@intel.com>,\n\t\"Rafael J . Wysocki\" <rafael.j.wysocki@intel.com>,\n\tLukas Wunner <lukas@wunner.de>, Michael Jamet <michael.jamet@intel.com>,\n\tYehezkel Bernat <yehezkel.bernat@intel.com>,\n\tMario.Limonciello@dell.com, linux-pci@vger.kernel.org,\n\tlinux-kernel@vger.kernel.org","Subject":"Re: [PATCH 3/7] PCI: Distribute available buses to hotplug capable\n\tPCIe downstream ports","Message-ID":"<20171012183223.GY25517@bhelgaas-glaptop.roam.corp.google.com>","References":"<20170926141720.25067-1-mika.westerberg@linux.intel.com>\n\t<20170926141720.25067-4-mika.westerberg@linux.intel.com>\n\t<20171011233243.GX25517@bhelgaas-glaptop.roam.corp.google.com>\n\t<20171012124703.GD2761@lahna.fi.intel.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Disposition":"inline","Content-Transfer-Encoding":"8bit","In-Reply-To":"<20171012124703.GD2761@lahna.fi.intel.com>","User-Agent":"Mutt/1.5.21 (2010-09-15)","Sender":"linux-pci-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<linux-pci.vger.kernel.org>","X-Mailing-List":"linux-pci@vger.kernel.org"}},{"id":1786174,"web_url":"http://patchwork.ozlabs.org/comment/1786174/","msgid":"<20171013102618.GA2761@lahna.fi.intel.com>","list_archive_url":null,"date":"2017-10-13T10:26:18","subject":"Re: [PATCH 3/7] PCI: Distribute available buses to hotplug capable\n\tPCIe downstream ports","submitter":{"id":14534,"url":"http://patchwork.ozlabs.org/api/people/14534/","name":"Mika Westerberg","email":"mika.westerberg@linux.intel.com"},"content":"On Thu, Oct 12, 2017 at 01:32:23PM -0500, Bjorn Helgaas wrote:\n> On Thu, Oct 12, 2017 at 03:47:03PM +0300, Mika Westerberg wrote:\n> > On Wed, Oct 11, 2017 at 06:32:43PM -0500, Bjorn Helgaas wrote:\n> > > On Tue, Sep 26, 2017 at 05:17:16PM +0300, Mika Westerberg wrote:\n> > > > System BIOS sometimes allocates extra bus space for hotplug capable PCIe\n> > > > root/downstream ports. This space is needed if the device plugged to the\n> > > > port will have more hotplug capable downstream ports. A good example of\n> > > > this is Thunderbolt. Each Thunderbolt device contains a PCIe switch and\n> > > > one or more hotplug capable PCIe downstream ports where the daisy chain\n> > > > can be extended.\n> > > > \n> > > > Currently Linux only allocates minimal bus space to make sure all the\n> > > > enumerated devices barely fit there. The BIOS reserved extra space is\n> > > > not taken into consideration at all. Because of this we run out of bus\n> > > > space pretty quickly when more PCIe devices are attached to hotplug\n> > > > downstream ports in order to extend the chain.\n> > > > \n> > > > This modifies PCI core so that we distribute the available BIOS\n> > > > allocated bus space equally between hotplug capable PCIe downstream\n> > > > ports to make sure there is enough bus space for extending the\n> > > > hierarchy later on.\n> > > \n> > > I think this makes sense in general.  It's a fairly complicated patch,\n> > > so my comments here are just a first pass.\n> > \n> > Thanks for the comments!\n> > \n> > > Why do you limit it to PCIe?  Isn't it conceivable that one could\n> > > hot-add a conventional PCI card that contained a bridge leading to\n> > > another hotplug slot?  E.g., a PCI card with PCMCIA slot or something\n> > > on it?\n> > \n> > I guess this could be generalized to such configurations but I wanted to\n> > restrict this with PCIe for a couple of reasons. Firstly, I'm able to\n> > actually test this ;-) Second, the rules in PCIe are quite simple,\n> > whenever you have hotplug bridge (downstream port with a hotplug\n> > capability set) you distribute the available bus space with it. With a\n> > conventional PCI it is not so clear (at least to me).\n> \n> You're testing dev->is_hotplug_bridge, which I think is the right\n> approach.  It happens that we currently only set that for PCIe bridges\n> with PCI_EXP_SLTCAP_HPC and for some ACPI cases (and a quirk for one\n> device).  But in principle we could and probably should set it if we\n> can identify a conventional PCI hotplug bridge.  So my suggestion is\n> to just remove the explicit PCIe tests.\n\nFair enough :)\n\n> > > > +\t/* Second pass. Bridges that need to be configured. */\n> > > > +\tlist_for_each_entry(dev, &bus->devices, bus_list) {\n> > > > +\t\tif (pci_is_bridge(dev)) {\n> > > > +\t\t\tunsigned int buses = 0;\n> > > > +\n> > > > +\t\t\tif (pcie_upstream_port(dev)) {\n> > > > +\t\t\t\t/* Upstream port gets all available buses */\n> > > > +\t\t\t\tbuses = available_buses;\n> > > \n> > > I guess this relies on the assumption that there can only be one\n> > > upstream port on a bus?  Is that true?  Typically a switch only has a\n> > > function 0 upstream port, but something niggles at me like the spec\n> > > admits the possibility of a switch with multiple functions of upstream\n> > > ports?  I don't know where that is right now (or if it exists), but\n> > > I'll try to find it.\n> > \n> > My understanding is that there can be only one upstream port on a bus.\n> > That said I looked at the spec again and there is this in chapter 7.3.1\n> > of PCIe 3.1 spec:\n> > \n> >   Switches, and components wishing to incorporate more than eight\n> >   Functions at their Upstream Port, are permitted to implement one or\n> >   more “virtual switches” represented by multiple Type 1 (PCI-PCI\n> >   Bridge) Configuration Space headers as illustrated in Figure 7-2.\n> >   These virtual switches serve to allow fan-out beyond eight Functions.\n> >   Since Switch Downstream Ports are permitted to appear on any Device\n> >   Number, in this case all address information fields (Bus, Device, and\n> >   Function Numbers) must be completely decoded to access the correct\n> >   register. Any Configuration Request targeting an unimplemented Bus,\n> >   Device, or Function must return a Completion with Unsupported Request\n> >   Completion Status.\n> > \n> > Not sure what it actually means, though. A \"virtual switch\" to me says\n> > it is a switch with one upstream port and multiple downstream ports,\n> > just like normal switch. Is this what you meant? Do you understand it so\n> > that there can be multiple upstream ports connected to a bus?\n> \n> I agree with you; I think that section is just saying that if a\n> component needs more then eight functions, it can incorporate a\n> switch, so it could have one upstream port, one internal logical bus,\n> up to 32 * 8 = 256 downstream ports on that logical bus, and 8\n> endpoints below each downstream port.  Of course, there wouldn't be\n> enough bus number space for all that.  But I don't think this is\n> talking about a multifunction switch upstream port.\n> \n> Anyway, I think you're right that there can only be one upstream port\n> on a bus, because an upstream port contains link management stuff, and\n> it wouldn't make sense to have two upstream ports trying to manage the\n> same end of a single link.\n> \n> But I would really like to remove the PCIe-specific nature of this\n> test somehow so it could work on a conventional PCI topology.\n\nI think we can change the test to check if the bus has only one\nnon-hotplug bridge and assing resources to it then instead of explictly\nchecking for PCIe upstream port.","headers":{"Return-Path":"<linux-pci-owner@vger.kernel.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":"patchwork-incoming@bilbo.ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=linux-pci-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3yD3y60LJWz9sRV\n\tfor <incoming@patchwork.ozlabs.org>;\n\tFri, 13 Oct 2017 21:35:09 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751564AbdJMKfH (ORCPT <rfc822;incoming@patchwork.ozlabs.org>);\n\tFri, 13 Oct 2017 06:35:07 -0400","from mga02.intel.com ([134.134.136.20]:61925 \"EHLO mga02.intel.com\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1751468AbdJMKfH (ORCPT <rfc822;linux-pci@vger.kernel.org>);\n\tFri, 13 Oct 2017 06:35:07 -0400","from fmsmga003.fm.intel.com ([10.253.24.29])\n\tby orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;\n\t13 Oct 2017 03:35:06 -0700","from lahna.fi.intel.com (HELO lahna) ([10.237.72.157])\n\tby FMSMGA003.fm.intel.com with SMTP; 13 Oct 2017 03:35:02 -0700","by lahna (sSMTP sendmail emulation);\n\tFri, 13 Oct 2017 13:26:18 +0300"],"X-ExtLoop1":"1","X-IronPort-AV":"E=Sophos;i=\"5.43,370,1503385200\"; d=\"scan'208\";a=\"909589736\"","Date":"Fri, 13 Oct 2017 13:26:18 +0300","From":"Mika Westerberg <mika.westerberg@linux.intel.com>","To":"Bjorn Helgaas <helgaas@kernel.org>","Cc":"Bjorn Helgaas <bhelgaas@google.com>, Ashok Raj <ashok.raj@intel.com>,\n\tKeith Busch <keith.busch@intel.com>,\n\t\"Rafael J . Wysocki\" <rafael.j.wysocki@intel.com>,\n\tLukas Wunner <lukas@wunner.de>, Michael Jamet <michael.jamet@intel.com>,\n\tYehezkel Bernat <yehezkel.bernat@intel.com>,\n\tMario.Limonciello@dell.com, linux-pci@vger.kernel.org,\n\tlinux-kernel@vger.kernel.org","Subject":"Re: [PATCH 3/7] PCI: Distribute available buses to hotplug capable\n\tPCIe downstream ports","Message-ID":"<20171013102618.GA2761@lahna.fi.intel.com>","References":"<20170926141720.25067-1-mika.westerberg@linux.intel.com>\n\t<20170926141720.25067-4-mika.westerberg@linux.intel.com>\n\t<20171011233243.GX25517@bhelgaas-glaptop.roam.corp.google.com>\n\t<20171012124703.GD2761@lahna.fi.intel.com>\n\t<20171012183223.GY25517@bhelgaas-glaptop.roam.corp.google.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Disposition":"inline","Content-Transfer-Encoding":"8bit","In-Reply-To":"<20171012183223.GY25517@bhelgaas-glaptop.roam.corp.google.com>","Organization":"Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo","User-Agent":"Mutt/1.9.0 (2017-09-02)","Sender":"linux-pci-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<linux-pci.vger.kernel.org>","X-Mailing-List":"linux-pci@vger.kernel.org"}}]