diff mbox

[v6,1/8] PCI: Recognize Thunderbolt devices

Message ID fee5154ef747e34684f316a9edf1ec7f635fd0f8.1486913733.git.lukas@wunner.de
State Changes Requested
Headers show

Commit Message

Lukas Wunner Feb. 12, 2017, 4:07 p.m. UTC
Detect on device probe whether a PCI device is part of a Thunderbolt
daisy chain (i.e. it is either part of a Thunderbolt controller or part
of the hierarchy below a Thunderbolt controller).

Intel uses a Vendor-Specific Extended Capability (VSEC) with ID 0x1234
on devices belonging to a Thunderbolt controller.  Detect presence of
this VSEC on the device itself or on one of its parents and cache it in
a newly added is_thunderbolt bit in struct pci_dev.

The necessity arises from the following:

* To power off Thunderbolt controllers on Macs when nothing is plugged
  in, we need to allow runtime PM on their PCIe ports in
  pci_bridge_d3_possible().  For this we need a way to recognize them.

* Dual GPU MacBook Pros introduced 2011+ can no longer switch external
  DisplayPort ports between GPUs.  (They're no longer just used for DP
  but have become combined DP/Thunderbolt ports.)  The driver to switch
  the ports, drivers/platform/x86/apple-gmux.c, needs to detect presence
  of a Thunderbolt controller and, if found, keep external ports
  permanently switched to the discrete GPU.

* If an external Thunderbolt GPU is connected to a dual GPU laptop (Mac
  or not), that GPU is currently registered with vga_switcheroo even
  though it can neither drive the laptop's panel nor be powered off by
  the platform.  To vga_switcheroo it will appear as if two discrete
  GPUs are present.  As a result, when the external GPU is runtime
  suspended, vga_switcheroo will cut power to the internal discrete GPU
  which may not be runtime suspended at all at this moment.  The
  solution is to not register external GPUs with vga_switcheroo, which
  necessitates a way to recognize if they're part of a Thunderbolt daisy
  chain.

Cc: Andreas Noever <andreas.noever@gmail.com>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Cc: Amir Levy <amir.jer.levy@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/pci/pci.h   |  2 ++
 drivers/pci/probe.c | 32 ++++++++++++++++++++++++++++++++
 include/linux/pci.h |  1 +
 3 files changed, 35 insertions(+)

Comments

Bjorn Helgaas Feb. 17, 2017, 3:29 p.m. UTC | #1
[+cc other recipients of cover letter]

Hi Lukas,

On Sun, Feb 12, 2017 at 05:07:45PM +0100, Lukas Wunner wrote:
> Detect on device probe whether a PCI device is part of a Thunderbolt
> daisy chain (i.e. it is either part of a Thunderbolt controller or part
> of the hierarchy below a Thunderbolt controller).

My problem with this is that "is_thunderbolt" is not well-defined.
The PCI specs lay out the common vocabulary and framework for this
area.  To keep the code maintainable, we need to connect things back
to the specs somehow.

For example, it might make sense to have a bit that means "this device
can generate PME from D3hot", because PME and D3hot are part of that
common understanding of how PCI works, and we can use that information
to design the software.

An "is_thunderbolt" bit doesn't have any of that context.  It doesn't
say anything about how the device behaves, so I can't evaluate the
code that uses it.

A secondary issue is that I think whatever bit you define here should
only be set for the actual Thunderbolt device itself.  The bit will
presumably tell us something about how that particular device works.
This is an enumeration-time thing and could be done either as in this
path or as a quirk, since it only affects one device.

If runtime code needs to know whether any upstream bridge is
Thunderbolt, it can always search up the hierarchy.  I think that
would improve readability because the software model would map more
closely to the hardware situation.

> Intel uses a Vendor-Specific Extended Capability (VSEC) with ID 0x1234
> on devices belonging to a Thunderbolt controller.  Detect presence of
> this VSEC on the device itself or on one of its parents and cache it in
> a newly added is_thunderbolt bit in struct pci_dev.
> 
> The necessity arises from the following:
> 
> * To power off Thunderbolt controllers on Macs when nothing is plugged
>   in, we need to allow runtime PM on their PCIe ports in
>   pci_bridge_d3_possible().  For this we need a way to recognize them.
> 
> * Dual GPU MacBook Pros introduced 2011+ can no longer switch external
>   DisplayPort ports between GPUs.  (They're no longer just used for DP
>   but have become combined DP/Thunderbolt ports.)  The driver to switch
>   the ports, drivers/platform/x86/apple-gmux.c, needs to detect presence
>   of a Thunderbolt controller and, if found, keep external ports
>   permanently switched to the discrete GPU.
> 
> * If an external Thunderbolt GPU is connected to a dual GPU laptop (Mac
>   or not), that GPU is currently registered with vga_switcheroo even
>   though it can neither drive the laptop's panel nor be powered off by
>   the platform.  To vga_switcheroo it will appear as if two discrete
>   GPUs are present.  As a result, when the external GPU is runtime
>   suspended, vga_switcheroo will cut power to the internal discrete GPU
>   which may not be runtime suspended at all at this moment.  The
>   solution is to not register external GPUs with vga_switcheroo, which
>   necessitates a way to recognize if they're part of a Thunderbolt daisy
>   chain.
> 
> Cc: Andreas Noever <andreas.noever@gmail.com>
> Cc: Tomas Winkler <tomas.winkler@intel.com>
> Cc: Amir Levy <amir.jer.levy@intel.com>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> ---
>  drivers/pci/pci.h   |  2 ++
>  drivers/pci/probe.c | 32 ++++++++++++++++++++++++++++++++
>  include/linux/pci.h |  1 +
>  3 files changed, 35 insertions(+)
> 
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index cb17db242f30..45c2b8144911 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -3,6 +3,8 @@
>  
>  #define PCI_FIND_CAP_TTL	48
>  
> +#define PCI_VSEC_ID_INTEL_TBT	0x1234	/* Thunderbolt */
> +
>  extern const unsigned char pcie_link_speed[];
>  
>  bool pcie_cap_has_lnkctl(const struct pci_dev *dev);
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 204960e70333..2fcbc535786e 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1208,6 +1208,35 @@ void set_pcie_hotplug_bridge(struct pci_dev *pdev)
>  		pdev->is_hotplug_bridge = 1;
>  }
>  
> +static void set_pcie_thunderbolt(struct pci_dev *dev)
> +{
> +	struct pci_dev *parent = dev;
> +	int vsec = 0;
> +	u32 header;
> +
> +	/* Is the device part of a Thunderbolt controller? */
> +	while ((vsec = pci_find_next_ext_capability(dev, vsec,
> +						    PCI_EXT_CAP_ID_VNDR))) {
> +		pci_read_config_dword(dev, vsec + PCI_VNDR_HEADER, &header);
> +		if (dev->vendor == PCI_VENDOR_ID_INTEL &&
> +		    PCI_VNDR_HEADER_ID(header) == PCI_VSEC_ID_INTEL_TBT) {
> +			dev->is_thunderbolt = 1;
> +			return;
> +		}
> +	}
> +
> +	/*
> +	 * Is the device attached with Thunderbolt?  Walk upwards and check for
> +	 * each encountered bridge if it's part of a Thunderbolt controller.
> +	 * Reaching the host bridge means dev is soldered to the mainboard.
> +	 */
> +	while ((parent = pci_upstream_bridge(parent)))
> +		if (parent->is_thunderbolt) {
> +			dev->is_thunderbolt = 1;
> +			return;
> +		}
> +}
> +
>  /**
>   * pci_ext_cfg_is_aliased - is ext config space just an alias of std config?
>   * @dev: PCI device
> @@ -1360,6 +1389,9 @@ int pci_setup_device(struct pci_dev *dev)
>  	/* need to have dev->class ready */
>  	dev->cfg_size = pci_cfg_space_size(dev);
>  
> +	/* need to have dev->cfg_size ready */
> +	set_pcie_thunderbolt(dev);
> +
>  	/* "Unknown power state" */
>  	dev->current_state = PCI_UNKNOWN;
>  
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index e2d1a124216a..3c775e8498f1 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -358,6 +358,7 @@ struct pci_dev {
>  	unsigned int	is_virtfn:1;
>  	unsigned int	reset_fn:1;
>  	unsigned int    is_hotplug_bridge:1;
> +	unsigned int	is_thunderbolt:1; /* part of Thunderbolt daisy chain */
>  	unsigned int    __aer_firmware_first_valid:1;
>  	unsigned int	__aer_firmware_first:1;
>  	unsigned int	broken_intx_masking:1;
> -- 
> 2.11.0
>
Lukas Wunner Feb. 18, 2017, 9:12 a.m. UTC | #2
On Fri, Feb 17, 2017 at 09:29:03AM -0600, Bjorn Helgaas wrote:
> On Sun, Feb 12, 2017 at 05:07:45PM +0100, Lukas Wunner wrote:
> > Detect on device probe whether a PCI device is part of a Thunderbolt
> > daisy chain (i.e. it is either part of a Thunderbolt controller or part
> > of the hierarchy below a Thunderbolt controller).
> 
> My problem with this is that "is_thunderbolt" is not well-defined.
> The PCI specs lay out the common vocabulary and framework for this
> area.  To keep the code maintainable, we need to connect things back
> to the specs somehow.
> 
> For example, it might make sense to have a bit that means "this device
> can generate PME from D3hot", because PME and D3hot are part of that
> common understanding of how PCI works, and we can use that information
> to design the software.
> 
> An "is_thunderbolt" bit doesn't have any of that context.  It doesn't
> say anything about how the device behaves, so I can't evaluate the
> code that uses it.

No, this *does* connect back to the spec, the Thunderbolt spec to be
specific, except that spec is not available publicly.  (I assume it's
available under NDA.)  FWIW the PCI SIG doesn't make its specs freely
available either, only to members or for $$$.  As Andreas Noever has
pointed out before, there is plenty of precedent for including
(reverse-engineered) code in the kernel for stuff that isn't public.

The rationale for this bit is that devices on a Thunderbolt daisy chain
differ from plain PCIe devices and as such require special treatment.

Thunderbolt encapsulates PCIe and other protocols such as DisplayPort
into Thunderbolt packets and tunnels them over a network of Converged I/O
switches, so what's transmitted on the wire differs significantly from
plain PCIe.  That should already make the necessity for special treatment
obvious.  I've named three use cases for the is_thunderbolt bit in the
commit message of this patch but expect there will be more in the future.
E.g. Apple has a traffic prioritization mechanism on the Converged I/O
layer which is afforded to specific devices on the PCIe layer above,
see US 9,015,384:
http://pimg-fpiw.uspto.gov/fdd/84/153/090/0.pdf


> A secondary issue is that I think whatever bit you define here should
> only be set for the actual Thunderbolt device itself.  The bit will
> presumably tell us something about how that particular device works.

I don't quite follow, the bit *does* tell us something about how the
device works, doesn't it?  That it's situated on a Thunderbolt daisy
chain, I've clearly spelled that out in the commit message as well as
in a code comment of this patch.


> This is an enumeration-time thing and could be done either as in this
> path or as a quirk, since it only affects one device.

You're contradicting yourself by suggesting to use a quirk here:

Back in 2014 an Intel engineer tried to add a PCI quirk with a long
list of Thunderbolt devices and none other than you objected that
you're "really not happy about checking a list of device IDs to identify
Thunderbolt devices.  Surely there's a better way, because a list
like this has to be updated regularly."
https://patchwork.ozlabs.org/patch/409994/

The better way you were asking for back then is to probe for presence
of the Thunderbolt VSEC, which is what this patch does.  Yet you're
still not happy.  I don't get it.

FWIW I've asked an Intel engineer for details on the contents of the
VSEC but he declined to comment:
https://lkml.org/lkml/2017/1/11/153


> If runtime code needs to know whether any upstream bridge is
> Thunderbolt, it can always search up the hierarchy.  I think that
> would improve readability because the software model would map more
> closely to the hardware situation.

That would be more expensive than just checking a bit that is searched
only once and then cached.  We have is_hotplug_bridge for just eight
places where we check presence of hotplug capability, why can't we have
is_thunderbolt?  Beats me.

Thanks,

Lukas

> 
> > Intel uses a Vendor-Specific Extended Capability (VSEC) with ID 0x1234
> > on devices belonging to a Thunderbolt controller.  Detect presence of
> > this VSEC on the device itself or on one of its parents and cache it in
> > a newly added is_thunderbolt bit in struct pci_dev.
> > 
> > The necessity arises from the following:
> > 
> > * To power off Thunderbolt controllers on Macs when nothing is plugged
> >   in, we need to allow runtime PM on their PCIe ports in
> >   pci_bridge_d3_possible().  For this we need a way to recognize them.
> > 
> > * Dual GPU MacBook Pros introduced 2011+ can no longer switch external
> >   DisplayPort ports between GPUs.  (They're no longer just used for DP
> >   but have become combined DP/Thunderbolt ports.)  The driver to switch
> >   the ports, drivers/platform/x86/apple-gmux.c, needs to detect presence
> >   of a Thunderbolt controller and, if found, keep external ports
> >   permanently switched to the discrete GPU.
> > 
> > * If an external Thunderbolt GPU is connected to a dual GPU laptop (Mac
> >   or not), that GPU is currently registered with vga_switcheroo even
> >   though it can neither drive the laptop's panel nor be powered off by
> >   the platform.  To vga_switcheroo it will appear as if two discrete
> >   GPUs are present.  As a result, when the external GPU is runtime
> >   suspended, vga_switcheroo will cut power to the internal discrete GPU
> >   which may not be runtime suspended at all at this moment.  The
> >   solution is to not register external GPUs with vga_switcheroo, which
> >   necessitates a way to recognize if they're part of a Thunderbolt daisy
> >   chain.
> > 
> > Cc: Andreas Noever <andreas.noever@gmail.com>
> > Cc: Tomas Winkler <tomas.winkler@intel.com>
> > Cc: Amir Levy <amir.jer.levy@intel.com>
> > Signed-off-by: Lukas Wunner <lukas@wunner.de>
> > ---
> >  drivers/pci/pci.h   |  2 ++
> >  drivers/pci/probe.c | 32 ++++++++++++++++++++++++++++++++
> >  include/linux/pci.h |  1 +
> >  3 files changed, 35 insertions(+)
> > 
> > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > index cb17db242f30..45c2b8144911 100644
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -3,6 +3,8 @@
> >  
> >  #define PCI_FIND_CAP_TTL	48
> >  
> > +#define PCI_VSEC_ID_INTEL_TBT	0x1234	/* Thunderbolt */
> > +
> >  extern const unsigned char pcie_link_speed[];
> >  
> >  bool pcie_cap_has_lnkctl(const struct pci_dev *dev);
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index 204960e70333..2fcbc535786e 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -1208,6 +1208,35 @@ void set_pcie_hotplug_bridge(struct pci_dev *pdev)
> >  		pdev->is_hotplug_bridge = 1;
> >  }
> >  
> > +static void set_pcie_thunderbolt(struct pci_dev *dev)
> > +{
> > +	struct pci_dev *parent = dev;
> > +	int vsec = 0;
> > +	u32 header;
> > +
> > +	/* Is the device part of a Thunderbolt controller? */
> > +	while ((vsec = pci_find_next_ext_capability(dev, vsec,
> > +						    PCI_EXT_CAP_ID_VNDR))) {
> > +		pci_read_config_dword(dev, vsec + PCI_VNDR_HEADER, &header);
> > +		if (dev->vendor == PCI_VENDOR_ID_INTEL &&
> > +		    PCI_VNDR_HEADER_ID(header) == PCI_VSEC_ID_INTEL_TBT) {
> > +			dev->is_thunderbolt = 1;
> > +			return;
> > +		}
> > +	}
> > +
> > +	/*
> > +	 * Is the device attached with Thunderbolt?  Walk upwards and check for
> > +	 * each encountered bridge if it's part of a Thunderbolt controller.
> > +	 * Reaching the host bridge means dev is soldered to the mainboard.
> > +	 */
> > +	while ((parent = pci_upstream_bridge(parent)))
> > +		if (parent->is_thunderbolt) {
> > +			dev->is_thunderbolt = 1;
> > +			return;
> > +		}
> > +}
> > +
> >  /**
> >   * pci_ext_cfg_is_aliased - is ext config space just an alias of std config?
> >   * @dev: PCI device
> > @@ -1360,6 +1389,9 @@ int pci_setup_device(struct pci_dev *dev)
> >  	/* need to have dev->class ready */
> >  	dev->cfg_size = pci_cfg_space_size(dev);
> >  
> > +	/* need to have dev->cfg_size ready */
> > +	set_pcie_thunderbolt(dev);
> > +
> >  	/* "Unknown power state" */
> >  	dev->current_state = PCI_UNKNOWN;
> >  
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index e2d1a124216a..3c775e8498f1 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -358,6 +358,7 @@ struct pci_dev {
> >  	unsigned int	is_virtfn:1;
> >  	unsigned int	reset_fn:1;
> >  	unsigned int    is_hotplug_bridge:1;
> > +	unsigned int	is_thunderbolt:1; /* part of Thunderbolt daisy chain */
> >  	unsigned int    __aer_firmware_first_valid:1;
> >  	unsigned int	__aer_firmware_first:1;
> >  	unsigned int	broken_intx_masking:1;
> > -- 
> > 2.11.0
> >
Bjorn Helgaas Feb. 19, 2017, 4:27 a.m. UTC | #3
On Sat, Feb 18, 2017 at 10:12:24AM +0100, Lukas Wunner wrote:
> On Fri, Feb 17, 2017 at 09:29:03AM -0600, Bjorn Helgaas wrote:
> > On Sun, Feb 12, 2017 at 05:07:45PM +0100, Lukas Wunner wrote:
> > > Detect on device probe whether a PCI device is part of a Thunderbolt
> > > daisy chain (i.e. it is either part of a Thunderbolt controller or part
> > > of the hierarchy below a Thunderbolt controller).
> > 
> > My problem with this is that "is_thunderbolt" is not well-defined.
> > The PCI specs lay out the common vocabulary and framework for this
> > area.  To keep the code maintainable, we need to connect things back
> > to the specs somehow.
> > 
> > For example, it might make sense to have a bit that means "this device
> > can generate PME from D3hot", because PME and D3hot are part of that
> > common understanding of how PCI works, and we can use that information
> > to design the software.
> > 
> > An "is_thunderbolt" bit doesn't have any of that context.  It doesn't
> > say anything about how the device behaves, so I can't evaluate the
> > code that uses it.
> 
> No, this *does* connect back to the spec, the Thunderbolt spec to be
> specific, except that spec is not available publicly.  (I assume it's
> available under NDA.)  FWIW the PCI SIG doesn't make its specs freely
> available either, only to members or for $$$.  As Andreas Noever has
> pointed out before, there is plenty of precedent for including
> (reverse-engineered) code in the kernel for stuff that isn't public.

I'm not objecting to the fact that the Thunderbolt spec is not public.
What I want to know is specifically how the Thunderbolt bridge behaves
differently than a plain old PCIe bridge.  I don't even want to know
*all* the differences.  You're proposing to make the PCI core work a
slightly differently based on this bit, and in order to maintain that
in the future, we need to know the details of *why* we need to do
things differently.

Maybe D3hot means something different, maybe PME works differently,
maybe hotplug interrupts are signaled differently, I dunno.  If you
want us to treat these devices differently, we have to know *why* so
we can tell whether future changes in other areas also need to handle
them differently.

This might mean several bits instead of one single "is_thunderbolt"
bit, because there may be several differences that affect different
parts of the PCI core.

> The rationale for this bit is that devices on a Thunderbolt daisy chain
> differ from plain PCIe devices and as such require special treatment.
> 
> Thunderbolt encapsulates PCIe and other protocols such as DisplayPort
> into Thunderbolt packets and tunnels them over a network of Converged I/O
> switches, so what's transmitted on the wire differs significantly from
> plain PCIe.  That should already make the necessity for special treatment
> obvious.  I've named three use cases for the is_thunderbolt bit in the
> commit message of this patch but expect there will be more in the future.
> E.g. Apple has a traffic prioritization mechanism on the Converged I/O
> layer which is afforded to specific devices on the PCIe layer above,
> see US 9,015,384:
> http://pimg-fpiw.uspto.gov/fdd/84/153/090/0.pdf
> 
> 
> > A secondary issue is that I think whatever bit you define here should
> > only be set for the actual Thunderbolt device itself.  The bit will
> > presumably tell us something about how that particular device works.
> 
> I don't quite follow, the bit *does* tell us something about how the
> device works, doesn't it?  That it's situated on a Thunderbolt daisy
> chain, I've clearly spelled that out in the commit message as well as
> in a code comment of this patch.

No, it doesn't actually tell us anything about how it works.  The
commit message tells us some *consequences*, but it doesn't say
anything about the PCIe-level features that *lead* to those
consequences.

That means we can't reason about how Thunderbolt devices will work in
different scenarios or future platforms.

> > This is an enumeration-time thing and could be done either as in this
> > path or as a quirk, since it only affects one device.
> 
> You're contradicting yourself by suggesting to use a quirk here:
> 
> Back in 2014 an Intel engineer tried to add a PCI quirk with a long
> list of Thunderbolt devices and none other than you objected that
> you're "really not happy about checking a list of device IDs to identify
> Thunderbolt devices.  Surely there's a better way, because a list
> like this has to be updated regularly."
> https://patchwork.ozlabs.org/patch/409994/
> 
> The better way you were asking for back then is to probe for presence
> of the Thunderbolt VSEC, which is what this patch does.  Yet you're
> still not happy.  I don't get it.

I was wrong to suggest the possibility of a quirk, sorry.  You're
right: looking for the VSEC is a much better route, and that doesn't
fit well in a quirk because the quirk would have to run for all
devices.

> > If runtime code needs to know whether any upstream bridge is
> > Thunderbolt, it can always search up the hierarchy.  I think that
> > would improve readability because the software model would map more
> > closely to the hardware situation.
> 
> That would be more expensive than just checking a bit that is searched
> only once and then cached.  We have is_hotplug_bridge for just eight
> places where we check presence of hotplug capability, why can't we have
> is_thunderbolt?  Beats me.

Searching up the tree is more expensive, but it looks like we only
check it in enumeration-type situations, so I doubt this is a
performance issue.

is_hotplug_bridge is not quite comparable.  For one thing, it can be
set three ways: when PCI_EXP_SLTCAP contains PCI_EXP_SLTCAP_HPC, by a
quirk, and by acpiphp.  It maybe *should* also be set by other hotplug
drivers.  I don't think it would be practical to make an
is_hotplug_bridge() function to look up all those cases.

Bjorn
Lukas Wunner Feb. 20, 2017, 11:49 a.m. UTC | #4
On Sat, Feb 18, 2017 at 10:27:45PM -0600, Bjorn Helgaas wrote:
> On Sat, Feb 18, 2017 at 10:12:24AM +0100, Lukas Wunner wrote:
> > On Fri, Feb 17, 2017 at 09:29:03AM -0600, Bjorn Helgaas wrote:
> > > On Sun, Feb 12, 2017 at 05:07:45PM +0100, Lukas Wunner wrote:
> > > > Detect on device probe whether a PCI device is part of a Thunderbolt
> > > > daisy chain (i.e. it is either part of a Thunderbolt controller or part
> > > > of the hierarchy below a Thunderbolt controller).
> > > 
> > > My problem with this is that "is_thunderbolt" is not well-defined.
> > > The PCI specs lay out the common vocabulary and framework for this
> > > area.  To keep the code maintainable, we need to connect things back
> > > to the specs somehow.
> > > 
> > > For example, it might make sense to have a bit that means "this device
> > > can generate PME from D3hot", because PME and D3hot are part of that
> > > common understanding of how PCI works, and we can use that information
> > > to design the software.
> > > 
> > > An "is_thunderbolt" bit doesn't have any of that context.  It doesn't
> > > say anything about how the device behaves, so I can't evaluate the
> > > code that uses it.
> > 
> > No, this *does* connect back to the spec, the Thunderbolt spec to be
> > specific, except that spec is not available publicly.  (I assume it's
> > available under NDA.)  FWIW the PCI SIG doesn't make its specs freely
> > available either, only to members or for $$$.  As Andreas Noever has
> > pointed out before, there is plenty of precedent for including
> > (reverse-engineered) code in the kernel for stuff that isn't public.
> 
> I'm not objecting to the fact that the Thunderbolt spec is not public.
> What I want to know is specifically how the Thunderbolt bridge behaves
> differently than a plain old PCIe bridge.  I don't even want to know
> *all* the differences.  You're proposing to make the PCI core work a
> slightly differently based on this bit, and in order to maintain that
> in the future, we need to know the details of *why* we need to do
> things differently.

Okay, I think I'm starting to understand what you're driving at.

Thunderbolt tunnels PCIe transparently, so in principle there should be
no difference behaviour-wise.

As far as the PCI core is concerned, I'm only using is_thunderbolt to
whitelist Thunderbolt ports for runtime PM in patch [2/8]. (This just
concerns like a dozen chips whose behaviour is well understood, hence
enabling runtime PM should be safe.)

The other two upcoming use cases merely concern whether a device is on
a Thunderbolt daisy chain (vs. soldered to the mainboard), and to detect
presence of a Thunderbolt controller in the machine (their PCI devices
use PCI_CLASS_SYSTEM_OTHER and PCI_CLASS_BRIDGE_PCI, which is not
sufficiently unique to identify Thunderbolt controllers).

I decided to put the is_thunderbolt flag in struct pci_dev because any
PCI device might end up on a Thunderbolt daisy chain.  The purpose of
the bit is merely to cache that status, it does not signify that the
device suffers from some particular PCI quirk.


> Maybe D3hot means something different, maybe PME works differently,
> maybe hotplug interrupts are signaled differently, I dunno.  If you
> want us to treat these devices differently, we have to know *why* so
> we can tell whether future changes in other areas also need to handle
> them differently.

This all works fine.  Once a Thunderbolt tunnel has been set up, the
hotplug port on the PCIe switch integrated into the Thunderbolt
controller signals "Card present" and "Link up" interrupts.  On surprise
removal of an attached device, the Presence Detect and Link State bits
are cleared and an interrupt is signaled for each of them.

There's a quirk wherein Thunderbolt controllers claim to support Command
Complete interrupts but they never send them, I don't think that's
intentional but rather a hardware bug. The kernel deals gracefully with
that though, no special treatment necessary.


> > > If runtime code needs to know whether any upstream bridge is
> > > Thunderbolt, it can always search up the hierarchy.  I think that
> > > would improve readability because the software model would map more
> > > closely to the hardware situation.
> > 
> > That would be more expensive than just checking a bit that is searched
> > only once and then cached.  We have is_hotplug_bridge for just eight
> > places where we check presence of hotplug capability, why can't we have
> > is_thunderbolt?  Beats me.
> 
> Searching up the tree is more expensive, but it looks like we only
> check it in enumeration-type situations, so I doubt this is a
> performance issue.

In patch [3/8] of v5 of this series, I used the is_thunderbolt bit in
pci_dev_check_d3cold(), which is not only executed during ->probe and
->remove but also whenver the D3cold status of a device is changed via
sysfs.

However I dropped that patch and the remaining use cases are indeed
limited to ->probe paths, so I no longer feel strongly about avoiding
walking up the hierarchy.

The set_pcie_thunderbolt() function in this commit essentially does
two things:  (a) Detect presence of the Thunderbolt VSEC on a device
and (b) walking up the hierarchy to detect whether that VSEC is present
on a parent.

Do you want me to set the is_thunderbolt bit only on devices belonging
to a Thunderbolt controller and use a separate function to walk up the
hierarchy?

Or do you want me to dispose of the is_thunderbolt bit altogether and
use another function to detect presence of the VSEC?  Searching for a
capability can require lots of non-posted transactions, so keeping
is_thunderbolt would seem reasonable to me.

I could probably add these as inlines to include/linux/pci.h, similar
to how acpi_find_root_bridge_handle() in include/linux/pci-acpi.h walks
up the hierarchy.

Thanks,

Lukas
Bjorn Helgaas Feb. 21, 2017, 10:55 p.m. UTC | #5
On Mon, Feb 20, 2017 at 12:49:28PM +0100, Lukas Wunner wrote:
> On Sat, Feb 18, 2017 at 10:27:45PM -0600, Bjorn Helgaas wrote:
> > On Sat, Feb 18, 2017 at 10:12:24AM +0100, Lukas Wunner wrote:
> > > On Fri, Feb 17, 2017 at 09:29:03AM -0600, Bjorn Helgaas wrote:
> > > > On Sun, Feb 12, 2017 at 05:07:45PM +0100, Lukas Wunner wrote:
> > > > > Detect on device probe whether a PCI device is part of a Thunderbolt
> > > > > daisy chain (i.e. it is either part of a Thunderbolt controller or part
> > > > > of the hierarchy below a Thunderbolt controller).
> > > > 
> > > > My problem with this is that "is_thunderbolt" is not well-defined.
> > > > The PCI specs lay out the common vocabulary and framework for this
> > > > area.  To keep the code maintainable, we need to connect things back
> > > > to the specs somehow.
> > > > 
> > > > For example, it might make sense to have a bit that means "this device
> > > > can generate PME from D3hot", because PME and D3hot are part of that
> > > > common understanding of how PCI works, and we can use that information
> > > > to design the software.
> > > > 
> > > > An "is_thunderbolt" bit doesn't have any of that context.  It doesn't
> > > > say anything about how the device behaves, so I can't evaluate the
> > > > code that uses it.
> > > 
> > > No, this *does* connect back to the spec, the Thunderbolt spec to be
> > > specific, except that spec is not available publicly.  (I assume it's
> > > available under NDA.)  FWIW the PCI SIG doesn't make its specs freely
> > > available either, only to members or for $$$.  As Andreas Noever has
> > > pointed out before, there is plenty of precedent for including
> > > (reverse-engineered) code in the kernel for stuff that isn't public.
> > 
> > I'm not objecting to the fact that the Thunderbolt spec is not public.
> > What I want to know is specifically how the Thunderbolt bridge behaves
> > differently than a plain old PCIe bridge.  I don't even want to know
> > *all* the differences.  You're proposing to make the PCI core work a
> > slightly differently based on this bit, and in order to maintain that
> > in the future, we need to know the details of *why* we need to do
> > things differently.
> 
> Okay, I think I'm starting to understand what you're driving at.
> 
> Thunderbolt tunnels PCIe transparently, so in principle there should be
> no difference behaviour-wise.
> 
> As far as the PCI core is concerned, I'm only using is_thunderbolt to
> whitelist Thunderbolt ports for runtime PM in patch [2/8]. (This just
> concerns like a dozen chips whose behaviour is well understood, hence
> enabling runtime PM should be safe.)

If there's no known PM-related behavior difference between Thunderbolt
and other PCIe ports, I'm hesitant to enable runtime PM only for
Thunderbolt, especially since the proposal also enables runtime PM for
non-Thunderbolt devices that happen to be downstream from a
Thunderbolt port.

If we think Thunderbolt ports and non-Thunderbolt ports work the same
with respect to PM, we should operate them the same way to simplify
the code and improve test coverage.

In other words, if there are problems with new functionality, I would
prefer not to just exclude the hardware platforms with the problems,
especially when it's the huge generic class of "all non-Thunderbolt
PCIe ports".  I'd rather figure out the problems so we can enable the
functionality everywhere (possibly with quirks for known defective
platforms).

> The other two upcoming use cases merely concern whether a device is on
> a Thunderbolt daisy chain (vs. soldered to the mainboard), and to detect
> presence of a Thunderbolt controller in the machine (their PCI devices
> use PCI_CLASS_SYSTEM_OTHER and PCI_CLASS_BRIDGE_PCI, which is not
> sufficiently unique to identify Thunderbolt controllers).

If I understand correctly, this is another case where there's no
actual functional difference because of Thunderbolt, but apple-gmux
would use the fact that a GPU is connected via Thunderbolt to infer
something about which GPUs vga_switcheroo can switch between.

Since the PCI core doesn't care at all about this, could apple-gmux
figure this out on its own by looking for the VSEC capability itself?
It seems like doing it closer to where it's used would make things
more understandable.

> I decided to put the is_thunderbolt flag in struct pci_dev because any
> PCI device might end up on a Thunderbolt daisy chain.  The purpose of
> the bit is merely to cache that status, it does not signify that the
> device suffers from some particular PCI quirk.

Assuming we need it, having it in struct pci_dev is fine.  There's no
point in looking up the VSEC capability more than once.

> > Maybe D3hot means something different, maybe PME works differently,
> > maybe hotplug interrupts are signaled differently, I dunno.  If you
> > want us to treat these devices differently, we have to know *why* so
> > we can tell whether future changes in other areas also need to handle
> > them differently.
> 
> This all works fine.  Once a Thunderbolt tunnel has been set up, the
> hotplug port on the PCIe switch integrated into the Thunderbolt
> controller signals "Card present" and "Link up" interrupts.  On surprise
> removal of an attached device, the Presence Detect and Link State bits
> are cleared and an interrupt is signaled for each of them.
> 
> There's a quirk wherein Thunderbolt controllers claim to support Command
> Complete interrupts but they never send them, I don't think that's
> intentional but rather a hardware bug. The kernel deals gracefully with
> that though, no special treatment necessary.

Many Intel ports have a similar issue (erratum CF118, see 3461a068661c
("PCI: pciehp: Wait for hotplug command completion lazily")) where
they generate Command Completion interrupts for some commands but not
all.

I think pciehp should work even without those interrupts, though we
emit timeout warnings (which maybe could be toned down or omitted).

> > > > If runtime code needs to know whether any upstream bridge is
> > > > Thunderbolt, it can always search up the hierarchy.  I think that
> > > > would improve readability because the software model would map more
> > > > closely to the hardware situation.
> > > 
> > > That would be more expensive than just checking a bit that is searched
> > > only once and then cached.  We have is_hotplug_bridge for just eight
> > > places where we check presence of hotplug capability, why can't we have
> > > is_thunderbolt?  Beats me.
> > 
> > Searching up the tree is more expensive, but it looks like we only
> > check it in enumeration-type situations, so I doubt this is a
> > performance issue.
> 
> In patch [3/8] of v5 of this series, I used the is_thunderbolt bit in
> pci_dev_check_d3cold(), which is not only executed during ->probe and
> ->remove but also whenver the D3cold status of a device is changed via
> sysfs.
> 
> However I dropped that patch and the remaining use cases are indeed
> limited to ->probe paths, so I no longer feel strongly about avoiding
> walking up the hierarchy.
> 
> The set_pcie_thunderbolt() function in this commit essentially does
> two things:  (a) Detect presence of the Thunderbolt VSEC on a device
> and (b) walking up the hierarchy to detect whether that VSEC is present
> on a parent.
> 
> Do you want me to set the is_thunderbolt bit only on devices belonging
> to a Thunderbolt controller and use a separate function to walk up the
> hierarchy?

Let's figure out what information we need and then figure out where to
store it.  If Thunderbolt devices have a functional difference we need
to know about, struct pci_dev seems like a good place to store that.

Bjorn
diff mbox

Patch

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index cb17db242f30..45c2b8144911 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -3,6 +3,8 @@ 
 
 #define PCI_FIND_CAP_TTL	48
 
+#define PCI_VSEC_ID_INTEL_TBT	0x1234	/* Thunderbolt */
+
 extern const unsigned char pcie_link_speed[];
 
 bool pcie_cap_has_lnkctl(const struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 204960e70333..2fcbc535786e 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1208,6 +1208,35 @@  void set_pcie_hotplug_bridge(struct pci_dev *pdev)
 		pdev->is_hotplug_bridge = 1;
 }
 
+static void set_pcie_thunderbolt(struct pci_dev *dev)
+{
+	struct pci_dev *parent = dev;
+	int vsec = 0;
+	u32 header;
+
+	/* Is the device part of a Thunderbolt controller? */
+	while ((vsec = pci_find_next_ext_capability(dev, vsec,
+						    PCI_EXT_CAP_ID_VNDR))) {
+		pci_read_config_dword(dev, vsec + PCI_VNDR_HEADER, &header);
+		if (dev->vendor == PCI_VENDOR_ID_INTEL &&
+		    PCI_VNDR_HEADER_ID(header) == PCI_VSEC_ID_INTEL_TBT) {
+			dev->is_thunderbolt = 1;
+			return;
+		}
+	}
+
+	/*
+	 * Is the device attached with Thunderbolt?  Walk upwards and check for
+	 * each encountered bridge if it's part of a Thunderbolt controller.
+	 * Reaching the host bridge means dev is soldered to the mainboard.
+	 */
+	while ((parent = pci_upstream_bridge(parent)))
+		if (parent->is_thunderbolt) {
+			dev->is_thunderbolt = 1;
+			return;
+		}
+}
+
 /**
  * pci_ext_cfg_is_aliased - is ext config space just an alias of std config?
  * @dev: PCI device
@@ -1360,6 +1389,9 @@  int pci_setup_device(struct pci_dev *dev)
 	/* need to have dev->class ready */
 	dev->cfg_size = pci_cfg_space_size(dev);
 
+	/* need to have dev->cfg_size ready */
+	set_pcie_thunderbolt(dev);
+
 	/* "Unknown power state" */
 	dev->current_state = PCI_UNKNOWN;
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e2d1a124216a..3c775e8498f1 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -358,6 +358,7 @@  struct pci_dev {
 	unsigned int	is_virtfn:1;
 	unsigned int	reset_fn:1;
 	unsigned int    is_hotplug_bridge:1;
+	unsigned int	is_thunderbolt:1; /* part of Thunderbolt daisy chain */
 	unsigned int    __aer_firmware_first_valid:1;
 	unsigned int	__aer_firmware_first:1;
 	unsigned int	broken_intx_masking:1;