
[GIT,PULL] PCI fixes for v5.7

Message ID 20200423173955.GA193359@google.com
State New

Pull-request

git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git tags/pci-v5.7-fixes-1

Message

Bjorn Helgaas April 23, 2020, 5:39 p.m. UTC
PCI fixes:

  - Workaround Apex TPU class code issue that prevents resource
    assignment (Bjorn Helgaas)

  - Update MAINTAINERS to add Rob Herring for native PCI controller
    drivers (Lorenzo Pieralisi)


The following changes since commit 8f3d9f354286745c751374f5f1fcafee6b3f3136:

  Linux 5.7-rc1 (2020-04-12 12:35:55 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git tags/pci-v5.7-fixes-1

for you to fetch changes up to ef46738cc47adb6f70d548c03bd44508f18e14a5:

  MAINTAINERS: Add Rob Herring and remove Andy Murray as PCI reviewers (2020-04-22 10:53:37 -0500)

----------------------------------------------------------------
pci-v5.7-fixes-1

----------------------------------------------------------------
Bjorn Helgaas (1):
      PCI: Move Apex Edge TPU class quirk to fix BAR assignment

Lorenzo Pieralisi (1):
      MAINTAINERS: Add Rob Herring and remove Andy Murray as PCI reviewers

 MAINTAINERS                          | 2 +-
 drivers/pci/quirks.c                 | 7 +++++++
 drivers/staging/gasket/apex_driver.c | 7 -------
 3 files changed, 8 insertions(+), 8 deletions(-)

Comments

Linus Torvalds April 23, 2020, 6:22 p.m. UTC | #1
On Thu, Apr 23, 2020 at 10:40 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
>   - Workaround Apex TPU class code issue that prevents resource
>     assignment (Bjorn Helgaas)

Hmm.

I have no objections to that patch, but I do wonder if it might not be
better to try to actually assign the resource at enable_resource time?

Put another way: if I read the situation correctly, what happened is
that the hardware is broken and doesn't have the proper class code,
and so the resource is not initially assigned at all. But then the
driver matches on the device ID, and tries to use the device, and then
we get into trouble at pci_enable_resources().

But is there any reason we don't just at least try to do
pci_assign_resource() at that point? Yeah, because we didn't do it at
bus scanning, maybe there's no room for it, but that's what we do for
the PCI ROM resources (which I think we also don't claim by default)
when drivers ask to map them.

The pci/rom.c code does

        /* assign the ROM an address if it doesn't have one */
        if (res->parent == NULL && pci_assign_resource(pdev, PCI_ROM_RESOURCE))
                return NULL;

could we perhaps do the same in enable_resource?

Your patch is obviously much better for an -rc kernel, so this is more
of a longer-term "wouldn't it be less fragile to ..." query.

Alternatively, maybe we should do resource assignment even for
PCI_CLASS_NOT_DEFINED?

                     Linus
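
The enable-time fallback suggested above can be sketched in user space. This is only a toy model under stated assumptions: struct model_resource, struct model_dev, struct model_window and every model_* function are hypothetical stand-ins, not kernel APIs, and the "window" is reduced to a byte counter. The only thing borrowed from the thread is the pattern quoted from pci/rom.c: if the resource has no parent, try to assign it, and fail only when the assignment fails.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's resource and
 * device structures; only the fields this sketch needs. */
struct model_resource {
    struct model_resource *parent;  /* NULL means "not assigned yet" */
    unsigned long size;
};

struct model_dev {
    struct model_resource bar;
};

/* Toy address space: a root window with a fixed amount of free room. */
struct model_window {
    struct model_resource root;
    unsigned long free;
};

/* Stand-in for pci_assign_resource(): 0 on success, -1 if the window
 * has no room left for the resource. */
static int model_assign_resource(struct model_window *win,
                                 struct model_resource *res)
{
    if (res->size > win->free)
        return -1;
    win->free -= res->size;
    res->parent = &win->root;
    return 0;
}

/* Behaviour described in the thread: enabling a device whose resource
 * was never assigned simply fails. */
static int model_enable_strict(const struct model_dev *dev)
{
    return dev->bar.parent ? 0 : -1;
}

/* Suggested alternative: try a late assignment first, mirroring the
 * ROM code's "assign an address if it doesn't have one" check. */
static int model_enable_with_fallback(struct model_window *win,
                                      struct model_dev *dev)
{
    assert(win != NULL && dev != NULL);
    if (dev->bar.parent == NULL && model_assign_resource(win, &dev->bar))
        return -1;
    return 0;
}
```

In this model a device skipped at scan time fails model_enable_strict() but passes model_enable_with_fallback() as long as the window still has room, which is the "maybe there's no room for it" caveat from the message above.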

pr-tracker-bot@kernel.org April 23, 2020, 8:50 p.m. UTC | #2
The pull request you sent on Thu, 23 Apr 2020 12:39:55 -0500:

> git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git tags/pci-v5.7-fixes-1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/25b1fa8dfb3fe2578c04a077953b13c534f30902

Thank you!

Bjorn Helgaas April 24, 2020, 3:23 a.m. UTC | #3
On Thu, Apr 23, 2020 at 11:22:20AM -0700, Linus Torvalds wrote:
> On Thu, Apr 23, 2020 at 10:40 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> >   - Workaround Apex TPU class code issue that prevents resource
> >     assignment (Bjorn Helgaas)
> 
> Hmm.
> 
> I have no objections to that patch, but I do wonder if it might not be
> better to try to actually assign the resource at enable_resource time?
> 
> Put another way: if I read the situation correctly, what happened is
> that the hardware is broken and doesn't have the proper class code,
> and so the resource is not initially assigned at all. But then the
> driver matches on the device ID, and tries to use the device, and then
> we get into trouble at pci_enable_resources().

Exactly.

> But is there any reason we don't just at least try to do
> pci_assign_resource() at that point? Yeah, because we didn't do it at
> bus scanning, maybe there's no room for it, but that's what we do for
> the PCI ROM resources (which I think we also don't claim by default)
> when drivers ask to map them.

That might make sense, but I think we should be consistent with the
checking __dev_sort_resources() does, e.g., skipping
PCI_CLASS_NOT_DEFINED, or at least understand why it's safe to be
different.

> The pci/rom.c code does
> 
>         /* assign the ROM an address if it doesn't have one */
>         if (res->parent == NULL && pci_assign_resource(pdev, PCI_ROM_RESOURCE))
>                 return NULL;
> 
> could we perhaps do the same in enable_resource?
> 
> Your patch is obviously much better for an -rc kernel, so this is more
> of a longer-term "wouldn't it be less fragile to ..." query.
> 
> Alternatively, maybe we should do resource assignment even for
> PCI_CLASS_NOT_DEFINED?

Yeah.  I don't know the history of why we skip PCI_CLASS_NOT_DEFINED.
I did consider about the fact that we're skipping it, to make it
easier to debug next time.

PCI_CLASS_NOT_DEFINED is supposed to be for devices built before the
Class Code field was defined.  That note is at least as old as PCI 2.2
from 1998, so there shouldn't be *that* many of those devices left.

Bjorn
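
The class-based skip discussed above, and the way a class fixup gets a device past it, can be illustrated with a small user-space model. The assumptions are loud ones: struct toy_dev and the toy_* functions are hypothetical, the check models only the PCI_CLASS_NOT_DEFINED case named in the thread (the real __dev_sort_resources() may test more than that), and the choice of PCI_CLASS_SYSTEM_OTHER as the replacement class is an assumption about the quirk, since the thread does not say which class it sets. The numeric values are the ones from include/linux/pci_ids.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Class values (base class and sub-class) as in include/linux/pci_ids.h. */
#define PCI_CLASS_NOT_DEFINED   0x0000
#define PCI_CLASS_SYSTEM_OTHER  0x0880

/* Hypothetical stand-in for struct pci_dev: just the 24-bit class code
 * (base class, sub-class, programming interface). */
struct toy_dev {
    uint32_t class;
};

/* True when scan-time resource assignment would skip this device.
 * Models only the PCI_CLASS_NOT_DEFINED check named in the thread. */
static bool toy_skips_assignment(const struct toy_dev *dev)
{
    uint16_t class = dev->class >> 8;   /* drop the prog-if byte */

    return class == PCI_CLASS_NOT_DEFINED;
}

/* A class fixup in the spirit of the quirk: give the device a defined
 * class so the check above no longer skips it. The replacement class
 * used here is an assumption, not taken from the thread. */
static void toy_fixup_class(struct toy_dev *dev)
{
    dev->class = ((uint32_t)PCI_CLASS_SYSTEM_OTHER << 8) | dev->class;
}
```

A device that reports class 0x000000 is skipped by toy_skips_assignment() until toy_fixup_class() has run. That ordering is why such a fixup has to happen before resource assignment rather than when the driver binds.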

Bjorn Helgaas April 24, 2020, 3:55 a.m. UTC | #4
On Thu, Apr 23, 2020 at 10:23:05PM -0500, Bjorn Helgaas wrote:
> Yeah.  I don't know the history of why we skip PCI_CLASS_NOT_DEFINED.
> I did consider about the fact that we're skipping it, to make it
> easier to debug next time.

I did consider *warning* about ...

Luís Mendes April 24, 2020, 5:21 p.m. UTC | #5
I think a "warning" would be of great value, as it would make it easy
to identify the root cause of such issues quickly.

On Fri, Apr 24, 2020 at 4:55 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Apr 23, 2020 at 10:23:05PM -0500, Bjorn Helgaas wrote:
> > Yeah.  I don't know the history of why we skip PCI_CLASS_NOT_DEFINED.
> > I did consider about the fact that we're skipping it, to make it
> > easier to debug next time.
>
> I did consider *warning* about ...