diff mbox series

[1/3] powernv/iov: Ensure the pdn for VFs always contains a valid PE number

Message ID 20190930020848.25767-2-oohall@gmail.com (mailing list archive)
State Superseded
Headers show
Series [1/3] powernv/iov: Ensure the pdn for VFs always contains a valid PE number | expand

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success Successfully applied on branch next (6edfc6487b474fe01857dc3f1a9cd701bb9b21c8)
snowpatch_ozlabs/checkpatch success total: 0 errors, 0 warnings, 0 checks, 68 lines checked

Commit Message

Oliver O'Halloran Sept. 30, 2019, 2:08 a.m. UTC
On PowerNV we use the pcibios_sriov_enable() hook to do two things:

1. Create a pci_dn structure for each of the VFs, and
2. Configure the PHB's internal BARs that map MMIO ranges to PEs
   so that each VF has it's own PE. Note that the PE also determines
   the IOMMU table the HW uses for the device.

Currently we do not set the pe_number field of the pci_dn immediately after
assigning the PE number for the VF that it represents. Instead, we do that
in a fixup (see pnv_pci_dma_dev_setup) which is run inside the
pcibios_add_device() hook which is run prior to adding the device to the
bus.

On PowerNV we add the device to it's IOMMU group using a bus notifier and
in order for this to work the PE number needs to be known when the bus
notifier is run. This works today since the PE number is set in the fixup
which runs before adding the device to the bus. However, if we want to move
the fixup to a later stage this will break.

We can fix this by setting the pdn->pe_number inside of
pcibios_sriov_enable(). There's no good to avoid this since we already have
all the required information at that point, so... do that. Moving this
earlier does cause two problems:

1. We trip the WARN_ON() in the fixup code, and
2. The EEH core will clear pdn->pe_number while recovering VFs.

The only justification for either of these is a comment in eeh_rmv_device()
suggesting that pdn->pe_number *must* be set to IODA_INVALID_PE in order
for the VF to be scanned. However, this comment appears to have no basis in
reality so just delete it.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
Can't get rid of the fixup entirely since we need it to set the
ioda_pe->pdev back-pointer. I'll look at killing that another time.
---
 arch/powerpc/kernel/eeh_driver.c          |  6 ------
 arch/powerpc/platforms/powernv/pci-ioda.c | 19 +++++++++++++++----
 arch/powerpc/platforms/powernv/pci.c      |  4 ----
 3 files changed, 15 insertions(+), 14 deletions(-)

Comments

Bjorn Helgaas Sept. 30, 2019, 5:09 p.m. UTC | #1
On Mon, Sep 30, 2019 at 12:08:46PM +1000, Oliver O'Halloran wrote:

This is all powerpc, so I assume Michael will handle this.  Just
random things I noticed; ignore if they don't make sense:

> On PowerNV we use the pcibios_sriov_enable() hook to do two things:
> 
> 1. Create a pci_dn structure for each of the VFs, and
> 2. Configure the PHB's internal BARs that map MMIO ranges to PEs
>    so that each VF has it's own PE. Note that the PE also determines

s/it's/its/

>    the IOMMU table the HW uses for the device.
> 
> Currently we do not set the pe_number field of the pci_dn immediately after
> assigning the PE number for the VF that it represents. Instead, we do that
> in a fixup (see pnv_pci_dma_dev_setup) which is run inside the
> pcibios_add_device() hook which is run prior to adding the device to the
> bus.
> 
> On PowerNV we add the device to it's IOMMU group using a bus notifier and

s/it's/its/

> in order for this to work the PE number needs to be known when the bus
> notifier is run. This works today since the PE number is set in the fixup
> which runs before adding the device to the bus. However, if we want to move
> the fixup to a later stage this will break.
> 
> We can fix this by setting the pdn->pe_number inside of
> pcibios_sriov_enable(). There's no good to avoid this since we already have

s/no good/no good reason/ ?

Not quite sure what "this" refers to ... "no good reason to avoid
setting pdn->pe_number in pcibios_sriov_enable()"?  The double
negative makes it a little hard to parse.

> all the required information at that point, so... do that. Moving this
> earlier does cause two problems:
> 
> 1. We trip the WARN_ON() in the fixup code, and
> 2. The EEH core will clear pdn->pe_number while recovering VFs.
> 
> The only justification for either of these is a comment in eeh_rmv_device()
> suggesting that pdn->pe_number *must* be set to IODA_INVALID_PE in order
> for the VF to be scanned. However, this comment appears to have no basis in
> reality so just delete it.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> Can't get rid of the fixup entirely since we need it to set the
> ioda_pe->pdev back-pointer. I'll look at killing that another time.
> ---
>  arch/powerpc/kernel/eeh_driver.c          |  6 ------
>  arch/powerpc/platforms/powernv/pci-ioda.c | 19 +++++++++++++++----
>  arch/powerpc/platforms/powernv/pci.c      |  4 ----
>  3 files changed, 15 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index d9279d0..7955fba 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -541,12 +541,6 @@ static void eeh_rmv_device(struct eeh_dev *edev, void *userdata)
>  
>  		pci_iov_remove_virtfn(edev->physfn, pdn->vf_index);
>  		edev->pdev = NULL;
> -
> -		/*
> -		 * We have to set the VF PE number to invalid one, which is
> -		 * required to plug the VF successfully.
> -		 */
> -		pdn->pe_number = IODA_INVALID_PE;
>  #endif
>  		if (rmv_data)
>  			list_add(&edev->rmv_entry, &rmv_data->removed_vf_list);
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 5e3172d..70508b3 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1558,6 +1558,10 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  
>  	/* Reserve PE for each VF */
>  	for (vf_index = 0; vf_index < num_vfs; vf_index++) {
> +		int vf_devfn = pci_iov_virtfn_devfn(pdev, vf_index);
> +		int vf_bus = pci_iov_virtfn_bus(pdev, vf_index);
> +		struct pci_dn *vf_pdn;
> +
>  		if (pdn->m64_single_mode)
>  			pe_num = pdn->pe_num_map[vf_index];
>  		else
> @@ -1570,13 +1574,11 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  		pe->pbus = NULL;
>  		pe->parent_dev = pdev;
>  		pe->mve_number = -1;
> -		pe->rid = (pci_iov_virtfn_bus(pdev, vf_index) << 8) |
> -			   pci_iov_virtfn_devfn(pdev, vf_index);
> +		pe->rid = (vf_bus << 8) | vf_devfn;
>  
>  		pe_info(pe, "VF %04d:%02d:%02d.%d associated with PE#%x\n",

Not related to *this* patch, but this looks like maybe it's supposed
to match the pci_name(), e.g., "%04x:%02x:%02x.%d" from
pci_setup_device()?  If so, the "%04d:%02d:%02d" here will be
confusing since the decimal & hex won't always match.

>  			hose->global_number, pdev->bus->number,

Consider pci_domain_nr(bus) instead of hose->global_number?  It would
be nice if you had the pci_dev * for each VF so you could just use
pci_name(vf) instead of all this domain/bus/PCI_SLOT/FUNC.

> -			PCI_SLOT(pci_iov_virtfn_devfn(pdev, vf_index)),
> -			PCI_FUNC(pci_iov_virtfn_devfn(pdev, vf_index)), pe_num);
> +			PCI_SLOT(vf_devfn), PCI_FUNC(vf_devfn), pe_num);
>  
>  		if (pnv_ioda_configure_pe(phb, pe)) {
>  			/* XXX What do we do here ? */
> @@ -1590,6 +1592,15 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  		list_add_tail(&pe->list, &phb->ioda.pe_list);
>  		mutex_unlock(&phb->ioda.pe_list_mutex);
>  
> +		/* associate this pe to it's pdn */
> +		list_for_each_entry(vf_pdn, &pdn->parent->child_list, list) {
> +			if (vf_pdn->busno == vf_bus &&
> +			    vf_pdn->devfn == vf_devfn) {
> +				vf_pdn->pe_number = pe_num;
> +				break;
> +			}
> +		}
> +
>  		pnv_pci_ioda2_setup_dma_pe(phb, pe);
>  #ifdef CONFIG_IOMMU_API
>  		iommu_register_group(&pe->table_group,
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index 2825d00..b7761e2 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -816,16 +816,12 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
>  	struct pnv_phb *phb = hose->private_data;
>  #ifdef CONFIG_PCI_IOV
>  	struct pnv_ioda_pe *pe;
> -	struct pci_dn *pdn;
>  
>  	/* Fix the VF pdn PE number */
>  	if (pdev->is_virtfn) {
> -		pdn = pci_get_pdn(pdev);
> -		WARN_ON(pdn->pe_number != IODA_INVALID_PE);
>  		list_for_each_entry(pe, &phb->ioda.pe_list, list) {
>  			if (pe->rid == ((pdev->bus->number << 8) |
>  			    (pdev->devfn & 0xff))) {
> -				pdn->pe_number = pe->pe_number;
>  				pe->pdev = pdev;
>  				break;
>  			}
> -- 
> 2.9.5
>
Oliver O'Halloran Oct. 1, 2019, 1:33 a.m. UTC | #2
On Tue, Oct 1, 2019 at 3:09 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Mon, Sep 30, 2019 at 12:08:46PM +1000, Oliver O'Halloran wrote:
>
> This is all powerpc, so I assume Michael will handle this.  Just
> random things I noticed; ignore if they don't make sense:
>
> > On PowerNV we use the pcibios_sriov_enable() hook to do two things:
> >
> > 1. Create a pci_dn structure for each of the VFs, and
> > 2. Configure the PHB's internal BARs that map MMIO ranges to PEs
> >    so that each VF has it's own PE. Note that the PE also determines
>
> s/it's/its/
>
> >    the IOMMU table the HW uses for the device.
> >
> > Currently we do not set the pe_number field of the pci_dn immediately after
> > assigning the PE number for the VF that it represents. Instead, we do that
> > in a fixup (see pnv_pci_dma_dev_setup) which is run inside the
> > pcibios_add_device() hook which is run prior to adding the device to the
> > bus.
> >
> > On PowerNV we add the device to it's IOMMU group using a bus notifier and
>
> s/it's/its/
>
> > in order for this to work the PE number needs to be known when the bus
> > notifier is run. This works today since the PE number is set in the fixup
> > which runs before adding the device to the bus. However, if we want to move
> > the fixup to a later stage this will break.
> >
> > We can fix this by setting the pdn->pe_number inside of
> > pcibios_sriov_enable(). There's no good to avoid this since we already have
>
> s/no good/no good reason/ ?
>
> Not quite sure what "this" refers to ... "no good reason to avoid
> setting pdn->pe_number in pcibios_sriov_enable()"?  The double
> negative makes it a little hard to parse.

I agree it's a bit vague, I'll re-word it.

> > all the required information at that point, so... do that. Moving this
> > earlier does cause two problems:
> >
> > 1. We trip the WARN_ON() in the fixup code, and
> > 2. The EEH core will clear pdn->pe_number while recovering VFs.
> >
> > The only justification for either of these is a comment in eeh_rmv_device()
> > suggesting that pdn->pe_number *must* be set to IODA_INVALID_PE in order
> > for the VF to be scanned. However, this comment appears to have no basis in
> > reality so just delete it.
> >
> > Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> > ---
> > Can't get rid of the fixup entirely since we need it to set the
> > ioda_pe->pdev back-pointer. I'll look at killing that another time.
> > ---
> >  arch/powerpc/kernel/eeh_driver.c          |  6 ------
> >  arch/powerpc/platforms/powernv/pci-ioda.c | 19 +++++++++++++++----
> >  arch/powerpc/platforms/powernv/pci.c      |  4 ----
> >  3 files changed, 15 insertions(+), 14 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> > index d9279d0..7955fba 100644
> > --- a/arch/powerpc/kernel/eeh_driver.c
> > +++ b/arch/powerpc/kernel/eeh_driver.c
> > @@ -541,12 +541,6 @@ static void eeh_rmv_device(struct eeh_dev *edev, void *userdata)
> >
> >               pci_iov_remove_virtfn(edev->physfn, pdn->vf_index);
> >               edev->pdev = NULL;
> > -
> > -             /*
> > -              * We have to set the VF PE number to invalid one, which is
> > -              * required to plug the VF successfully.
> > -              */
> > -             pdn->pe_number = IODA_INVALID_PE;
> >  #endif
> >               if (rmv_data)
> >                       list_add(&edev->rmv_entry, &rmv_data->removed_vf_list);
> > diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> > index 5e3172d..70508b3 100644
> > --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> > +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> > @@ -1558,6 +1558,10 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
> >
> >       /* Reserve PE for each VF */
> >       for (vf_index = 0; vf_index < num_vfs; vf_index++) {
> > +             int vf_devfn = pci_iov_virtfn_devfn(pdev, vf_index);
> > +             int vf_bus = pci_iov_virtfn_bus(pdev, vf_index);
> > +             struct pci_dn *vf_pdn;
> > +
> >               if (pdn->m64_single_mode)
> >                       pe_num = pdn->pe_num_map[vf_index];
> >               else
> > @@ -1570,13 +1574,11 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
> >               pe->pbus = NULL;
> >               pe->parent_dev = pdev;
> >               pe->mve_number = -1;
> > -             pe->rid = (pci_iov_virtfn_bus(pdev, vf_index) << 8) |
> > -                        pci_iov_virtfn_devfn(pdev, vf_index);
> > +             pe->rid = (vf_bus << 8) | vf_devfn;
> >
> >               pe_info(pe, "VF %04d:%02d:%02d.%d associated with PE#%x\n",
>
> Not related to *this* patch, but this looks like maybe it's supposed
> to match the pci_name(), e.g., "%04x:%02x:%02x.%d" from
> pci_setup_device()?  If so, the "%04d:%02d:%02d" here will be
> confusing since the decimal & hex won't always match.

That looks plain wrong. I'll send a separate patch to fix it.

> >                       hose->global_number, pdev->bus->number,
>
> Consider pci_domain_nr(bus) instead of hose->global_number?  It would
> be nice if you had the pci_dev * for each VF so you could just use
> pci_name(vf) instead of all this domain/bus/PCI_SLOT/FUNC.

Unfortunately, we don't have pci_devs for the VFs when
pcibios_sriov_enable() is called. On powernv (and pseries) we only
permit config accesses to a BDF when a pci_dn exists for that BDF
because the platform code assumes that one will exist. As a result we
can't scan the VFs until after pcibios_sriov_enable() is called since
that's where pci_dn's are created for the VFs. I'm working on removing
the use of pci_dn from powernv entirely though. Once that's done we
should revisit whether any of this infrastructure is necessary...

Oliver
Alexey Kardashevskiy Oct. 1, 2019, 5:39 a.m. UTC | #3
On 30/09/2019 12:08, Oliver O'Halloran wrote:
> On PowerNV we use the pcibios_sriov_enable() hook to do two things:
> 
> 1. Create a pci_dn structure for each of the VFs, and
> 2. Configure the PHB's internal BARs that map MMIO ranges to PEs
>    so that each VF has it's own PE. Note that the PE also determines
>    the IOMMU table the HW uses for the device.
> 
> Currently we do not set the pe_number field of the pci_dn immediately after
> assigning the PE number for the VF that it represents. Instead, we do that
> in a fixup (see pnv_pci_dma_dev_setup) which is run inside the
> pcibios_add_device() hook which is run prior to adding the device to the
> bus.
> 
> On PowerNV we add the device to it's IOMMU group using a bus notifier and
> in order for this to work the PE number needs to be known when the bus
> notifier is run. This works today since the PE number is set in the fixup
> which runs before adding the device to the bus. However, if we want to move
> the fixup to a later stage this will break.
> 
> We can fix this by setting the pdn->pe_number inside of
> pcibios_sriov_enable(). There's no good to avoid this since we already have
> all the required information at that point, so... do that. Moving this
> earlier does cause two problems:
> 
> 1. We trip the WARN_ON() in the fixup code, and
> 2. The EEH core will clear pdn->pe_number while recovering VFs.
> 
> The only justification for either of these is a comment in eeh_rmv_device()
> suggesting that pdn->pe_number *must* be set to IODA_INVALID_PE in order
> for the VF to be scanned. However, this comment appears to have no basis in
> reality so just delete it.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>


Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>



> ---
> Can't get rid of the fixup entirely since we need it to set the
> ioda_pe->pdev back-pointer. I'll look at killing that another time.
> ---
>  arch/powerpc/kernel/eeh_driver.c          |  6 ------
>  arch/powerpc/platforms/powernv/pci-ioda.c | 19 +++++++++++++++----
>  arch/powerpc/platforms/powernv/pci.c      |  4 ----
>  3 files changed, 15 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index d9279d0..7955fba 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -541,12 +541,6 @@ static void eeh_rmv_device(struct eeh_dev *edev, void *userdata)
>  
>  		pci_iov_remove_virtfn(edev->physfn, pdn->vf_index);
>  		edev->pdev = NULL;
> -
> -		/*
> -		 * We have to set the VF PE number to invalid one, which is
> -		 * required to plug the VF successfully.
> -		 */
> -		pdn->pe_number = IODA_INVALID_PE;
>  #endif
>  		if (rmv_data)
>  			list_add(&edev->rmv_entry, &rmv_data->removed_vf_list);
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 5e3172d..70508b3 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1558,6 +1558,10 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  
>  	/* Reserve PE for each VF */
>  	for (vf_index = 0; vf_index < num_vfs; vf_index++) {
> +		int vf_devfn = pci_iov_virtfn_devfn(pdev, vf_index);
> +		int vf_bus = pci_iov_virtfn_bus(pdev, vf_index);
> +		struct pci_dn *vf_pdn;
> +
>  		if (pdn->m64_single_mode)
>  			pe_num = pdn->pe_num_map[vf_index];
>  		else
> @@ -1570,13 +1574,11 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  		pe->pbus = NULL;
>  		pe->parent_dev = pdev;
>  		pe->mve_number = -1;
> -		pe->rid = (pci_iov_virtfn_bus(pdev, vf_index) << 8) |
> -			   pci_iov_virtfn_devfn(pdev, vf_index);
> +		pe->rid = (vf_bus << 8) | vf_devfn;
>  
>  		pe_info(pe, "VF %04d:%02d:%02d.%d associated with PE#%x\n",
>  			hose->global_number, pdev->bus->number,
> -			PCI_SLOT(pci_iov_virtfn_devfn(pdev, vf_index)),
> -			PCI_FUNC(pci_iov_virtfn_devfn(pdev, vf_index)), pe_num);
> +			PCI_SLOT(vf_devfn), PCI_FUNC(vf_devfn), pe_num);
>  
>  		if (pnv_ioda_configure_pe(phb, pe)) {
>  			/* XXX What do we do here ? */
> @@ -1590,6 +1592,15 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  		list_add_tail(&pe->list, &phb->ioda.pe_list);
>  		mutex_unlock(&phb->ioda.pe_list_mutex);
>  
> +		/* associate this pe to it's pdn */
> +		list_for_each_entry(vf_pdn, &pdn->parent->child_list, list) {
> +			if (vf_pdn->busno == vf_bus &&
> +			    vf_pdn->devfn == vf_devfn) {
> +				vf_pdn->pe_number = pe_num;
> +				break;
> +			}
> +		}
> +
>  		pnv_pci_ioda2_setup_dma_pe(phb, pe);
>  #ifdef CONFIG_IOMMU_API
>  		iommu_register_group(&pe->table_group,
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index 2825d00..b7761e2 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -816,16 +816,12 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
>  	struct pnv_phb *phb = hose->private_data;
>  #ifdef CONFIG_PCI_IOV
>  	struct pnv_ioda_pe *pe;
> -	struct pci_dn *pdn;
>  
>  	/* Fix the VF pdn PE number */
>  	if (pdev->is_virtfn) {
> -		pdn = pci_get_pdn(pdev);
> -		WARN_ON(pdn->pe_number != IODA_INVALID_PE);
>  		list_for_each_entry(pe, &phb->ioda.pe_list, list) {
>  			if (pe->rid == ((pdev->bus->number << 8) |
>  			    (pdev->devfn & 0xff))) {
> -				pdn->pe_number = pe->pe_number;
>  				pe->pdev = pdev;
>  				break;
>  			}
>
Michael Ellerman Oct. 11, 2019, 8:27 a.m. UTC | #4
"Oliver O'Halloran" <oohall@gmail.com> writes:
> On Tue, Oct 1, 2019 at 3:09 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>> On Mon, Sep 30, 2019 at 12:08:46PM +1000, Oliver O'Halloran wrote:
>> This is all powerpc, so I assume Michael will handle this.  Just
>> random things I noticed; ignore if they don't make sense:
>>
>> > On PowerNV we use the pcibios_sriov_enable() hook to do two things:
>> >
>> > 1. Create a pci_dn structure for each of the VFs, and
>> > 2. Configure the PHB's internal BARs that map MMIO ranges to PEs
>> >    so that each VF has it's own PE. Note that the PE also determines
>>
>> s/it's/its/
>>
>> >    the IOMMU table the HW uses for the device.
>> >
>> > Currently we do not set the pe_number field of the pci_dn immediately after
>> > assigning the PE number for the VF that it represents. Instead, we do that
>> > in a fixup (see pnv_pci_dma_dev_setup) which is run inside the
>> > pcibios_add_device() hook which is run prior to adding the device to the
>> > bus.
>> >
>> > On PowerNV we add the device to it's IOMMU group using a bus notifier and
>>
>> s/it's/its/
>>
>> > in order for this to work the PE number needs to be known when the bus
>> > notifier is run. This works today since the PE number is set in the fixup
>> > which runs before adding the device to the bus. However, if we want to move
>> > the fixup to a later stage this will break.
>> >
>> > We can fix this by setting the pdn->pe_number inside of
>> > pcibios_sriov_enable(). There's no good to avoid this since we already have
>>
>> s/no good/no good reason/ ?
>>
>> Not quite sure what "this" refers to ... "no good reason to avoid
>> setting pdn->pe_number in pcibios_sriov_enable()"?  The double
>> negative makes it a little hard to parse.
>
> I agree it's a bit vague, I'll re-word it.

So I'm expecting a v2?

cheers
diff mbox series

Patch

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index d9279d0..7955fba 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -541,12 +541,6 @@  static void eeh_rmv_device(struct eeh_dev *edev, void *userdata)
 
 		pci_iov_remove_virtfn(edev->physfn, pdn->vf_index);
 		edev->pdev = NULL;
-
-		/*
-		 * We have to set the VF PE number to invalid one, which is
-		 * required to plug the VF successfully.
-		 */
-		pdn->pe_number = IODA_INVALID_PE;
 #endif
 		if (rmv_data)
 			list_add(&edev->rmv_entry, &rmv_data->removed_vf_list);
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 5e3172d..70508b3 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1558,6 +1558,10 @@  static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 
 	/* Reserve PE for each VF */
 	for (vf_index = 0; vf_index < num_vfs; vf_index++) {
+		int vf_devfn = pci_iov_virtfn_devfn(pdev, vf_index);
+		int vf_bus = pci_iov_virtfn_bus(pdev, vf_index);
+		struct pci_dn *vf_pdn;
+
 		if (pdn->m64_single_mode)
 			pe_num = pdn->pe_num_map[vf_index];
 		else
@@ -1570,13 +1574,11 @@  static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 		pe->pbus = NULL;
 		pe->parent_dev = pdev;
 		pe->mve_number = -1;
-		pe->rid = (pci_iov_virtfn_bus(pdev, vf_index) << 8) |
-			   pci_iov_virtfn_devfn(pdev, vf_index);
+		pe->rid = (vf_bus << 8) | vf_devfn;
 
 		pe_info(pe, "VF %04d:%02d:%02d.%d associated with PE#%x\n",
 			hose->global_number, pdev->bus->number,
-			PCI_SLOT(pci_iov_virtfn_devfn(pdev, vf_index)),
-			PCI_FUNC(pci_iov_virtfn_devfn(pdev, vf_index)), pe_num);
+			PCI_SLOT(vf_devfn), PCI_FUNC(vf_devfn), pe_num);
 
 		if (pnv_ioda_configure_pe(phb, pe)) {
 			/* XXX What do we do here ? */
@@ -1590,6 +1592,15 @@  static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 		list_add_tail(&pe->list, &phb->ioda.pe_list);
 		mutex_unlock(&phb->ioda.pe_list_mutex);
 
+		/* associate this pe to it's pdn */
+		list_for_each_entry(vf_pdn, &pdn->parent->child_list, list) {
+			if (vf_pdn->busno == vf_bus &&
+			    vf_pdn->devfn == vf_devfn) {
+				vf_pdn->pe_number = pe_num;
+				break;
+			}
+		}
+
 		pnv_pci_ioda2_setup_dma_pe(phb, pe);
 #ifdef CONFIG_IOMMU_API
 		iommu_register_group(&pe->table_group,
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 2825d00..b7761e2 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -816,16 +816,12 @@  void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
 	struct pnv_phb *phb = hose->private_data;
 #ifdef CONFIG_PCI_IOV
 	struct pnv_ioda_pe *pe;
-	struct pci_dn *pdn;
 
 	/* Fix the VF pdn PE number */
 	if (pdev->is_virtfn) {
-		pdn = pci_get_pdn(pdev);
-		WARN_ON(pdn->pe_number != IODA_INVALID_PE);
 		list_for_each_entry(pe, &phb->ioda.pe_list, list) {
 			if (pe->rid == ((pdev->bus->number << 8) |
 			    (pdev->devfn & 0xff))) {
-				pdn->pe_number = pe->pe_number;
 				pe->pdev = pdev;
 				break;
 			}