mbox series

[v3,0/2] IB/hfi1: Cleanup PCIe link configuration

Message ID 20180417002825.2737-1-fred@fredlawl.com
Headers show
Series IB/hfi1: Cleanup PCIe link configuration | expand

Message

Frederick Lawler April 17, 2018, 12:28 a.m. UTC
The IB/hfi1 driver uses custom macros to configure the target link speed. Some 
macros were previously replaced, but not fully. This patch series addresses the
configuration inconsistencies by adding PCI_EXP_LNKCTL2_TLS* macros to PCI,
and then use them in the following IB/hfi1 patch.

V3: 
	* PCI: Add PCI_EXP_LNKCTL2_TLS_* macros
	* Drop patch to use extract_speed() in do_pcie_gen3_transition()
V2:
	* s/LINK/LNK

Frederick Lawler (2):
  PCI: Add PCI_EXP_LNKCTL2_TLS* macros
  IB/hfi1: Replace custom hfi1 macros with PCIe macros

 drivers/infiniband/hw/hfi1/pcie.c | 24 ++++++++----------------
 include/uapi/linux/pci_regs.h     |  5 +++++
 2 files changed, 13 insertions(+), 16 deletions(-)

Comments

Christoph Hellwig April 17, 2018, 6:53 a.m. UTC | #1
On Mon, Apr 16, 2018 at 07:28:23PM -0500, Frederick Lawler wrote:
> The IB/hfi1 driver uses custom macros to configure the target link speed. Some 
> macros were previously replaced, but not fully. This patch series addresses the
> configuration inconsistencies by adding PCI_EXP_LNKCTL2_TLS* macros to PCI,
> and then use them in the following IB/hfi1 patch.

Btw, why is the driver configuring the PCIe link speed?  Isn't this
something we should be handling in the PCI core?
Bjorn Helgaas April 17, 2018, 2:32 p.m. UTC | #2
[+cc Bartlomiej, who recently changed do_pcie_gen3_transition(), LKML]

On Mon, Apr 16, 2018 at 11:53:43PM -0700, Christoph Hellwig wrote:
> On Mon, Apr 16, 2018 at 07:28:23PM -0500, Frederick Lawler wrote:
> > The IB/hfi1 driver uses custom macros to configure the target link speed. Some 
> > macros were previously replaced, but not fully. This patch series addresses the
> > configuration inconsistencies by adding PCI_EXP_LNKCTL2_TLS* macros to PCI,
> > and then use them in the following IB/hfi1 patch.
> 
> Btw, why is the driver configuring the PCIe link speed?  Isn't this
> something we should be handling in the PCI core?

Good question.  I think this sort of code definitely should not be in
drivers unless it's to work around some kind of defect in the device.

I think the intent of the spec is that neither the OS nor the driver
has to deal with link training and the link should train to the
highest speed supported by both ends.  For example, PCIe r4.0, sec 1.2
says

  Initialization – During hardware initialization, each PCI Express
  Link is set up following a negotiation of Lane widths and frequency
  of operation by the two agents at each end of the Link. No firmware
  or operating system software is involved.

Maybe the Intel guys can comment on why HFI needs this code.
Presumably they didn't write it just for fun, so I assume there's
some reason.

Bjorn
Marciniszyn, Mike April 17, 2018, 3:56 p.m. UTC | #3
> 
> Btw, why is the driver configuring the PCIe link speed?  Isn't this
> something we should be handling in the PCI core?

The device comes out of reset at the 5GT/s speed. The driver downloads device firmware, programs PCIe registers, and co-ordinates the transition to 8GT/s.

This recipe is device specific and is therefore implemented in the hfi1 driver built on top of PCI core functions and macros.

Mike
Bjorn Helgaas April 17, 2018, 5:19 p.m. UTC | #4
On Tue, Apr 17, 2018 at 03:56:13PM +0000, Marciniszyn, Mike wrote:
> > 
> > Btw, why is the driver configuring the PCIe link speed?  Isn't this
> > something we should be handling in the PCI core?
> 
> The device comes out of reset at the 5GT/s speed. The driver
> downloads device firmware, programs PCIe registers, and co-ordinates
> the transition to 8GT/s.
> 
> This recipe is device specific and is therefore implemented in the
> hfi1 driver built on top of PCI core functions and macros.

Do you think this behavior conforms to the spec, or is this a
workaround for a hardware erratum?

I don't think it's feasible to have every driver deal with this level
of PCIe detail.  Do you have to do something similar in the Windows
driver?

Is there something we can do in the PCI core to do the
reconfiguration?  If you can't go to 8GT/s before doing some
device-specific initialization, could the driver do that setup and
then use a generic PCI core interface to reconfigure the link?

Or maybe if the driver finds the device at 5GT/s, it could download
the firmware and reset the device.  Would the device then negotiate at
8GT/s?

Bjorn
Dennis Dalessandro April 18, 2018, 9:53 p.m. UTC | #5
On 4/17/2018 10:32 AM, Bjorn Helgaas wrote:
> [+cc Bartlomiej, who recently changed do_pcie_gen3_transition(), LKML]
> 
> On Mon, Apr 16, 2018 at 11:53:43PM -0700, Christoph Hellwig wrote:
>> On Mon, Apr 16, 2018 at 07:28:23PM -0500, Frederick Lawler wrote:
>>> The IB/hfi1 driver uses custom macros to configure the target link speed. Some
>>> macros were previously replaced, but not fully. This patch series addresses the
>>> configuration inconsistencies by adding PCI_EXP_LNKCTL2_TLS* macros to PCI,
>>> and then use them in the following IB/hfi1 patch.
>>
>> Btw, why is the driver configuring the PCIe link speed?  Isn't this
>> something we should be handling in the PCI core?
> 
> Good question.  I think this sort of code definitely should not be in
> drivers unless it's to work around some kind of defect in the device.
> 
> I think the intent of the spec is that neither the OS nor the driver
> has to deal with link training and the link should train to the
> highest speed supported by both ends.  For example, PCIe r4.0, sec 1.2
> says
> 
>    Initialization – During hardware initialization, each PCI Express
>    Link is set up following a negotiation of Lane widths and frequency
>    of operation by the two agents at each end of the Link. No firmware
>    or operating system software is involved.
> 
> Maybe the Intel guys can comment on why HFI needs this code.
> Presumably they didn't write it just for fun, so I assume there's
> some reason.

Yeah there was a reason. It just escapes me what it was right now. Let 
me find out and I'll update.

-Denny
Dennis Dalessandro April 20, 2018, 5:03 p.m. UTC | #6
On 4/17/2018 1:19 PM, Bjorn Helgaas wrote:
> On Tue, Apr 17, 2018 at 03:56:13PM +0000, Marciniszyn, Mike wrote:
>>>
>>> Btw, why is the driver configuring the PCIe link speed?  Isn't this
>>> something we should be handling in the PCI core?
>>
>> The device comes out of reset at the 5GT/s speed. The driver
>> downloads device firmware, programs PCIe registers, and co-ordinates
>> the transition to 8GT/s.
>>
>> This recipe is device specific and is therefore implemented in the
>> hfi1 driver built on top of PCI core functions and macros.
> 
> Do you think this behavior conforms to the spec, or is this a
> workaround for a hardware erratum?

Can't speculate as to why, but this is just the way this hardware works.

> I don't think it's feasible to have every driver deal with this level
> of PCIe detail.  Do you have to do something similar in the Windows
> driver?

No Windows driver.

> Is there something we can do in the PCI core to do the
> reconfiguration?  If you can't go to 8GT/s before doing some
> device-specific initialization, could the driver do that setup and
> then use a generic PCI core interface to reconfigure the link?
> 
> Or maybe if the driver finds the device at 5GT/s, it could download
> the firmware and reset the device.  Would the device then negotiate at
> 8GT/s?

Yes. In fact this is what we should be doing. We use the SBR to trigger 
renegotiation of the link to 8GT/s.

-Denny
Doug Ledford April 27, 2018, 4:14 p.m. UTC | #7
On Mon, 2018-04-16 at 19:28 -0500, Frederick Lawler wrote:
> The IB/hfi1 driver uses custom macros to configure the target link speed. Some 
> macros were previously replaced, but not fully. This patch series addresses the
> configuration inconsistencies by adding PCI_EXP_LNKCTL2_TLS* macros to PCI,
> and then use them in the following IB/hfi1 patch.
> 
> V3: 
> 	* PCI: Add PCI_EXP_LNKCTL2_TLS_* macros
> 	* Drop patch to use extract_speed() in do_pcie_gen3_transition()
> V2:
> 	* s/LINK/LNK
> 
> Frederick Lawler (2):
>   PCI: Add PCI_EXP_LNKCTL2_TLS* macros
>   IB/hfi1: Replace custom hfi1 macros with PCIe macros
> 
>  drivers/infiniband/hw/hfi1/pcie.c | 24 ++++++++----------------
>  include/uapi/linux/pci_regs.h     |  5 +++++
>  2 files changed, 13 insertions(+), 16 deletions(-)
> 

Where do we stand on this series?  People talked about what the core
might be able to do, or how the driver might be able to do things a
little differently, but I didn't hear a clear conclusion on whether or
not to take this series as is (it seems to be a sensible cleanup and
possibly worthwhile even if we end up changing the hfi1 driver to do a
firmware updated followed by a device reset to get the same results). 
And more importantly as far as I'm concerned, is this sitting on my
plate or on the PCI people's plate?
Bjorn Helgaas April 27, 2018, 5:58 p.m. UTC | #8
On Mon, Apr 16, 2018 at 07:28:23PM -0500, Frederick Lawler wrote:
> The IB/hfi1 driver uses custom macros to configure the target link speed. Some 
> macros were previously replaced, but not fully. This patch series addresses the
> configuration inconsistencies by adding PCI_EXP_LNKCTL2_TLS* macros to PCI,
> and then use them in the following IB/hfi1 patch.
> 
> V3: 
> 	* PCI: Add PCI_EXP_LNKCTL2_TLS_* macros
> 	* Drop patch to use extract_speed() in do_pcie_gen3_transition()
> V2:
> 	* s/LINK/LNK
> 
> Frederick Lawler (2):
>   PCI: Add PCI_EXP_LNKCTL2_TLS* macros
>   IB/hfi1: Replace custom hfi1 macros with PCIe macros
> 
>  drivers/infiniband/hw/hfi1/pcie.c | 24 ++++++++----------------
>  include/uapi/linux/pci_regs.h     |  5 +++++
>  2 files changed, 13 insertions(+), 16 deletions(-)

I applied these, with Michael's reviewed-by on the second, to
pci/misc for v4.18, thanks!

Ideally we could pull some of this link speed management code into the
PCI core someday, but I think these patches are useful as-is.

Oh, I did s/PCI_EXP_LNKCTL2_TLS_2_5GB/PCI_EXP_LNKCTL2_TLS_2_5GT/
because the link speeds are actually defined in terms of GT/s, not
GB/s.
Doug Ledford April 27, 2018, 6:18 p.m. UTC | #9
On Fri, 2018-04-27 at 12:58 -0500, Bjorn Helgaas wrote:
> On Mon, Apr 16, 2018 at 07:28:23PM -0500, Frederick Lawler wrote:
> > The IB/hfi1 driver uses custom macros to configure the target link speed. Some 
> > macros were previously replaced, but not fully. This patch series addresses the
> > configuration inconsistencies by adding PCI_EXP_LNKCTL2_TLS* macros to PCI,
> > and then use them in the following IB/hfi1 patch.
> > 
> > V3: 
> > 	* PCI: Add PCI_EXP_LNKCTL2_TLS_* macros
> > 	* Drop patch to use extract_speed() in do_pcie_gen3_transition()
> > V2:
> > 	* s/LINK/LNK
> > 
> > Frederick Lawler (2):
> >   PCI: Add PCI_EXP_LNKCTL2_TLS* macros
> >   IB/hfi1: Replace custom hfi1 macros with PCIe macros
> > 
> >  drivers/infiniband/hw/hfi1/pcie.c | 24 ++++++++----------------
> >  include/uapi/linux/pci_regs.h     |  5 +++++
> >  2 files changed, 13 insertions(+), 16 deletions(-)
> 
> I applied these, with Michael's reviewed-by on the second, to
> pci/misc for v4.18, thanks!
> 
> Ideally we could pull some of this link speed management code into the
> PCI core someday, but I think these patches are useful as-is.
> 
> Oh, I did s/PCI_EXP_LNKCTL2_TLS_2_5GB/PCI_EXP_LNKCTL2_TLS_2_5GT/
> because the link speeds are actually defined in terms of GT/s, not
> GB/s.

Thanks :-)