
PCI: iproc: fix resource allocation for BCMA PCIe

Message ID 1484265521-13497-1-git-send-email-aospan@netup.ru
State Not Applicable

Commit Message

Abylay Ospan Jan. 12, 2017, 11:58 p.m. UTC
A resource allocated on the stack was saved by 'devm_request_resource' into
the global 'iomem_resource', but became invalid after 'iproc_pcie_bcma_probe'
exited. The global 'iomem_resource' was therefore poisoned. This may cause a
kernel crash or a registration failure for a second PCIe bridge.

Tested on a Broadcom NorthStar machine ('Edgecore ECW7220-L') with two PCIe wifi
adapters (b43 BCM4331 and ath10k QCA988X).

Signed-off-by: Abylay Ospan <aospan@netup.ru>
---
 drivers/pci/host/pcie-iproc-bcma.c | 18 ++++++++----------
 drivers/pci/host/pcie-iproc.h      |  2 ++
 2 files changed, 10 insertions(+), 10 deletions(-)

Comments

Ray Jui Jan. 13, 2017, 12:40 a.m. UTC | #1
Hi Abylay,

On 1/12/2017 3:58 PM, Abylay Ospan wrote:
> Resource allocated on stack was saved by 'devm_request_resource' to
> global 'iomem_resource' but become invalid after 'iproc_pcie_bcma_probe' exit.
> So the global 'iomem_resource' was poisoned. This may cause kernel crash
> or second PCIe bridge registration failure.
> 
> Tested on Broadcom NorthStar machine ('Edgecore ECW7220-L') with two PCIe wifi
> adapters (b43 BCM4331 and ath10k QCA988X).
> 
> Signed-off-by: Abylay Ospan <aospan@netup.ru>

I have not yet looked into this in great detail. But if what you
claim is true, do we have the same problem with the other PCIe host
drivers that have their resources allocated on the stack and call
'devm_request_resource' to save them?

Thanks,

Ray

> ---
>  drivers/pci/host/pcie-iproc-bcma.c | 18 ++++++++----------
>  drivers/pci/host/pcie-iproc.h      |  2 ++
>  2 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/pci/host/pcie-iproc-bcma.c b/drivers/pci/host/pcie-iproc-bcma.c
> index bd4c9ec..28f9b89 100644
> --- a/drivers/pci/host/pcie-iproc-bcma.c
> +++ b/drivers/pci/host/pcie-iproc-bcma.c
> @@ -44,8 +44,6 @@ static int iproc_pcie_bcma_probe(struct bcma_device *bdev)
>  {
>  	struct device *dev = &bdev->dev;
>  	struct iproc_pcie *pcie;
> -	LIST_HEAD(res);
> -	struct resource res_mem;
>  	int ret;
>  
>  	pcie = devm_kzalloc(dev, sizeof(*pcie), GFP_KERNEL);
> @@ -62,21 +60,21 @@ static int iproc_pcie_bcma_probe(struct bcma_device *bdev)
>  	}
>  
>  	pcie->base_addr = bdev->addr;
> +	INIT_LIST_HEAD(&pcie->resources);
>  
> -	res_mem.start = bdev->addr_s[0];
> -	res_mem.end = bdev->addr_s[0] + SZ_128M - 1;
> -	res_mem.name = "PCIe MEM space";
> -	res_mem.flags = IORESOURCE_MEM;
> -	pci_add_resource(&res, &res_mem);
> +	pcie->res_mem.start = bdev->addr_s[0];
> +	pcie->res_mem.end = bdev->addr_s[0] + SZ_128M - 1;
> +	pcie->res_mem.name = "PCIe MEM space";
> +	pcie->res_mem.flags = IORESOURCE_MEM;
> +	pcie->res_mem.child = NULL;
> +	pci_add_resource(&pcie->resources, &pcie->res_mem);
>  
>  	pcie->map_irq = iproc_pcie_bcma_map_irq;
>  
> -	ret = iproc_pcie_setup(pcie, &res);
> +	ret = iproc_pcie_setup(pcie, &pcie->resources);
>  	if (ret)
>  		dev_err(dev, "PCIe controller setup failed\n");
>  
> -	pci_free_resource_list(&res);
> -
>  	bcma_set_drvdata(bdev, pcie);
>  	return ret;
>  }
> diff --git a/drivers/pci/host/pcie-iproc.h b/drivers/pci/host/pcie-iproc.h
> index 04fed8e..866d649 100644
> --- a/drivers/pci/host/pcie-iproc.h
> +++ b/drivers/pci/host/pcie-iproc.h
> @@ -105,6 +105,8 @@ struct iproc_pcie {
>  
>  	bool need_msi_steer;
>  	struct iproc_msi *msi;
> +	struct resource res_mem;
> +	struct list_head resources;
>  };
>  
>  int iproc_pcie_setup(struct iproc_pcie *pcie, struct list_head *res);
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Abylay Ospan Jan. 13, 2017, 5:32 a.m. UTC | #2
Hi Ray,

I'm not sure, but it looks that way. The following drivers do the same allocation on the stack:

drivers/pci/host/pcie-designware.c
drivers/pci/host/pcie-rockchip.c
drivers/pci/host/pcie-xilinx.c
drivers/pci/host/pcie-xilinx-nwl.c
drivers/pci/host/pci-host-common.c
 -> drivers/pci/host/pci-thunder-ecam.c
 -> drivers/pci/host/pci-thunder-pem.c
 -> drivers/pci/host/pci-host-generic.c
drivers/pci/host/pci-versatile.c
drivers/pci/host/pci-xgene.c

The other drivers look OK, but this needs double-checking.

Below is a more detailed description of the problem.

The problem is visible when a second PCIe bridge registers (but it can
cause a kernel crash even with only one PCIe bridge, because a dangling
pointer is introduced into 'iomem_resource' anyway). Here is my dump of
the global 'iomem_resource' list:

after the first PCIe bridge is registered:
[    3.578650] iomem_resource child=cb039d40 name=PCIe MEM space start=08000000
[    3.585811] iomem_resource child=cb41da80 name=serial start=18000300
[    3.592267] iomem_resource child=cb15ce80 name=serial start=18000400
[    3.598719] iomem_resource child=cb57df00 name=nand start=18028000
[    3.605001] iomem_resource child=cb57dc80 name=iproc-ext start=18028f00
[    3.611723] iomem_resource child=cb57da00 name=iproc-idm start=1811a408
[    3.618433] iomem_resource child=cbfffe80 name=System RAM start=80000000

This dump looks good, but before the second PCIe bridge is registered the same dump looks broken:
[    3.669225] iomem_resource child=cb039d40 name=PCIe MEM space start=40000000
[    3.676395] iomem_resource child=cb5e3410 name=bcma0:8 start=cb0f6e10
[    3.682948] iomem_resource child=cb061080 name= start=c1604b04

and the second PCIe registration fails with:
[    3.694207] pcie_iproc_bcma bcma0:8: resource collision: [mem
0x40000000-0x47ffffff] conflicts with PCIe MEM space [mem
0x40000000-0x47ffffff]
[    3.707024] pcie_iproc_bcma bcma0:8: PCIe controller setup failed

The address 0xcb039d40 in these dumps was allocated on the stack and is
not valid after 'iproc_pcie_bcma_probe' exits.

The proposed patch fixes this issue.

2017-01-12 19:40 GMT-05:00 Ray Jui <ray.jui@broadcom.com>:
> Hi Abylay,
>
> On 1/12/2017 3:58 PM, Abylay Ospan wrote:
>> Resource allocated on stack was saved by 'devm_request_resource' to
>> global 'iomem_resource' but become invalid after 'iproc_pcie_bcma_probe' exit.
>> So the global 'iomem_resource' was poisoned. This may cause kernel crash
>> or second PCIe bridge registration failure.
>>
>> Tested on Broadcom NorthStar machine ('Edgecore ECW7220-L') with two PCIe wifi
>> adapters (b43 BCM4331 and ath10k QCA988X).
>>
>> Signed-off-by: Abylay Ospan <aospan@netup.ru>
>
> I have not yet looked into this in great details. But if what you
> claimed is true, do we have the same problem with multiple PCIe host
> drivers that all have their resource allocated on the stack and have
> 'devm_request_resource' called to save it?
>
> Thanks,
>
> Ray
>
>> ---
>>  drivers/pci/host/pcie-iproc-bcma.c | 18 ++++++++----------
>>  drivers/pci/host/pcie-iproc.h      |  2 ++
>>  2 files changed, 10 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/pci/host/pcie-iproc-bcma.c b/drivers/pci/host/pcie-iproc-bcma.c
>> index bd4c9ec..28f9b89 100644
>> --- a/drivers/pci/host/pcie-iproc-bcma.c
>> +++ b/drivers/pci/host/pcie-iproc-bcma.c
>> @@ -44,8 +44,6 @@ static int iproc_pcie_bcma_probe(struct bcma_device *bdev)
>>  {
>>       struct device *dev = &bdev->dev;
>>       struct iproc_pcie *pcie;
>> -     LIST_HEAD(res);
>> -     struct resource res_mem;
>>       int ret;
>>
>>       pcie = devm_kzalloc(dev, sizeof(*pcie), GFP_KERNEL);
>> @@ -62,21 +60,21 @@ static int iproc_pcie_bcma_probe(struct bcma_device *bdev)
>>       }
>>
>>       pcie->base_addr = bdev->addr;
>> +     INIT_LIST_HEAD(&pcie->resources);
>>
>> -     res_mem.start = bdev->addr_s[0];
>> -     res_mem.end = bdev->addr_s[0] + SZ_128M - 1;
>> -     res_mem.name = "PCIe MEM space";
>> -     res_mem.flags = IORESOURCE_MEM;
>> -     pci_add_resource(&res, &res_mem);
>> +     pcie->res_mem.start = bdev->addr_s[0];
>> +     pcie->res_mem.end = bdev->addr_s[0] + SZ_128M - 1;
>> +     pcie->res_mem.name = "PCIe MEM space";
>> +     pcie->res_mem.flags = IORESOURCE_MEM;
>> +     pcie->res_mem.child = NULL;
>> +     pci_add_resource(&pcie->resources, &pcie->res_mem);
>>
>>       pcie->map_irq = iproc_pcie_bcma_map_irq;
>>
>> -     ret = iproc_pcie_setup(pcie, &res);
>> +     ret = iproc_pcie_setup(pcie, &pcie->resources);
>>       if (ret)
>>               dev_err(dev, "PCIe controller setup failed\n");
>>
>> -     pci_free_resource_list(&res);
>> -
>>       bcma_set_drvdata(bdev, pcie);
>>       return ret;
>>  }
>> diff --git a/drivers/pci/host/pcie-iproc.h b/drivers/pci/host/pcie-iproc.h
>> index 04fed8e..866d649 100644
>> --- a/drivers/pci/host/pcie-iproc.h
>> +++ b/drivers/pci/host/pcie-iproc.h
>> @@ -105,6 +105,8 @@ struct iproc_pcie {
>>
>>       bool need_msi_steer;
>>       struct iproc_msi *msi;
>> +     struct resource res_mem;
>> +     struct list_head resources;
>>  };
>>
>>  int iproc_pcie_setup(struct iproc_pcie *pcie, struct list_head *res);
>>
Bjorn Helgaas Jan. 28, 2017, 8:44 p.m. UTC | #3
On Fri, Jan 13, 2017 at 02:58:41AM +0300, Abylay Ospan wrote:
> Resource allocated on stack was saved by 'devm_request_resource' to
> global 'iomem_resource' but become invalid after 'iproc_pcie_bcma_probe' exit.
> So the global 'iomem_resource' was poisoned. This may cause kernel crash
> or second PCIe bridge registration failure.
> 
> Tested on Broadcom NorthStar machine ('Edgecore ECW7220-L') with two PCIe wifi
> adapters (b43 BCM4331 and ath10k QCA988X).
> 
> Signed-off-by: Abylay Ospan <aospan@netup.ru>

Hi Abylay, there was some follow-up discussion and I couldn't tell
what the ultimate resolution was.  Please repost this with any acks
you have if this is still necessary.

> ---
>  drivers/pci/host/pcie-iproc-bcma.c | 18 ++++++++----------
>  drivers/pci/host/pcie-iproc.h      |  2 ++
>  2 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/pci/host/pcie-iproc-bcma.c b/drivers/pci/host/pcie-iproc-bcma.c
> index bd4c9ec..28f9b89 100644
> --- a/drivers/pci/host/pcie-iproc-bcma.c
> +++ b/drivers/pci/host/pcie-iproc-bcma.c
> @@ -44,8 +44,6 @@ static int iproc_pcie_bcma_probe(struct bcma_device *bdev)
>  {
>  	struct device *dev = &bdev->dev;
>  	struct iproc_pcie *pcie;
> -	LIST_HEAD(res);
> -	struct resource res_mem;
>  	int ret;
>  
>  	pcie = devm_kzalloc(dev, sizeof(*pcie), GFP_KERNEL);
> @@ -62,21 +60,21 @@ static int iproc_pcie_bcma_probe(struct bcma_device *bdev)
>  	}
>  
>  	pcie->base_addr = bdev->addr;
> +	INIT_LIST_HEAD(&pcie->resources);
>  
> -	res_mem.start = bdev->addr_s[0];
> -	res_mem.end = bdev->addr_s[0] + SZ_128M - 1;
> -	res_mem.name = "PCIe MEM space";
> -	res_mem.flags = IORESOURCE_MEM;
> -	pci_add_resource(&res, &res_mem);
> +	pcie->res_mem.start = bdev->addr_s[0];
> +	pcie->res_mem.end = bdev->addr_s[0] + SZ_128M - 1;
> +	pcie->res_mem.name = "PCIe MEM space";
> +	pcie->res_mem.flags = IORESOURCE_MEM;
> +	pcie->res_mem.child = NULL;
> +	pci_add_resource(&pcie->resources, &pcie->res_mem);
>  
>  	pcie->map_irq = iproc_pcie_bcma_map_irq;
>  
> -	ret = iproc_pcie_setup(pcie, &res);
> +	ret = iproc_pcie_setup(pcie, &pcie->resources);
>  	if (ret)
>  		dev_err(dev, "PCIe controller setup failed\n");
>  
> -	pci_free_resource_list(&res);
> -
>  	bcma_set_drvdata(bdev, pcie);
>  	return ret;
>  }
> diff --git a/drivers/pci/host/pcie-iproc.h b/drivers/pci/host/pcie-iproc.h
> index 04fed8e..866d649 100644
> --- a/drivers/pci/host/pcie-iproc.h
> +++ b/drivers/pci/host/pcie-iproc.h
> @@ -105,6 +105,8 @@ struct iproc_pcie {
>  
>  	bool need_msi_steer;
>  	struct iproc_msi *msi;
> +	struct resource res_mem;
> +	struct list_head resources;
>  };
>  
>  int iproc_pcie_setup(struct iproc_pcie *pcie, struct list_head *res);
> -- 
> 2.7.4
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Bjorn Helgaas Feb. 8, 2017, 8:48 p.m. UTC | #4
[+cc Shawn, Wenrui for rockchip error path issue]

Hi Abylay,

On Sun, Jan 29, 2017 at 09:40:38PM -0500, Abylay Ospan wrote:
> Hi Bjorn,
> 
> Yes, this is still relevant. I haven't seen any other patches/follow-ups about
> this problem (have I missed some?).
> Seems like this issue affects following drivers too:
> 
> drivers/pci/host/pcie-designware.c
> drivers/pci/host/pcie-rockchip.c
> drivers/pci/host/pcie-xilinx.c
> drivers/pci/host/pcie-xilinx-nwl.c
> drivers/pci/host/pci-host-common.c
>  -> drivers/pci/host/pci-thunder-ecam.c
>  -> drivers/pci/host/pci-thunder-pem.c
>  -> drivers/pci/host/pci-host-generic.c
> drivers/pci/host/pci-versatile.c
> drivers/pci/host/pci-xgene.c
> 
> So I think we need a more general fix for this issue. Maybe inside
> 'iproc_pcie_setup' (or deeper, in 'devm_request_pci_bus_resources'), or
> fix all the drivers listed above. Which is better?
> 
> If it's better to fix all these drivers, then my original patch still
> applies and needs to be approved. If it's better to make a more general
> fix, then please let me know - I will prepare one.

This looks like an object lifetime issue.  The struct resource for
each host bridge window must live as long as the host bridge itself,
i.e., until we free the window list in pci_release_host_bridge_dev().

This is a problem in iproc_pcie_bcma_probe(), because the "res_mem"
window is on the stack and is deallocated when iproc_pcie_bcma_probe()
returns.

I didn't look at all the drivers you listed above, but I think the
first three are OK in this respect:

  dw_pcie_host_init
    LIST_HEAD(res)                            # on stack
    of_pci_get_host_bridge_resources(np, 0, 0xff, &res, &pp->io_base)
      res = kzalloc()                         # different "res" from above!
      pci_add_resource_offset(resources, res, ...)
    devm_request_pci_bus_resources(&pdev->dev, &res)
    pci_scan_root_bus(pp->dev, pp->root_bus_nr, &dw_pcie_ops, pp, &res)
    error:
      pci_free_resource_list(&res)
    
  rockchip_pcie_probe
    LIST_HEAD(res)                            # on stack
    of_pci_get_host_bridge_resources(dev->of_node, 0, 0xff, &res, &io_base)
    devm_request_pci_bus_resources(dev, &res)
    pci_scan_root_bus(&pdev->dev, 0, &rockchip_pcie_ops, rockchip, &res)
    
  xilinx_pcie_probe
    LIST_HEAD(res)                            # on stack
    of_pci_get_host_bridge_resources(dev->of_node, 0, 0xff, &res, &iobase)
    devm_request_pci_bus_resources(dev, &res)
    pci_create_root_bus(dev, 0, &xilinx_pcie_ops, port, &res)
    error:
      pci_free_resource_list(&res)

The *LIST_HEAD* is on the stack, but the individual resources are
allocated by kzalloc inside of_pci_get_host_bridge_resources(), and
pci_create_root_bus_msi() links the "res" list into its own
bridge->windows list.

In rockchip_pcie_probe(), we are missing the pci_free_resource_list()
that we should have in the error path, but that's a slightly different
problem.

If there are other drivers with this problem, I think each driver
should be fixed, similar to this patch for iproc.

> 2017-01-28 15:44 GMT-05:00 Bjorn Helgaas <helgaas@kernel.org>:
> 
> > On Fri, Jan 13, 2017 at 02:58:41AM +0300, Abylay Ospan wrote:
> > > Resource allocated on stack was saved by 'devm_request_resource' to
> > > global 'iomem_resource' but become invalid after 'iproc_pcie_bcma_probe'
> > exit.
> > > So the global 'iomem_resource' was poisoned. This may cause kernel crash
> > > or second PCIe bridge registration failure.
> > >
> > > Tested on Broadcom NorthStar machine ('Edgecore ECW7220-L') with two
> > PCIe wifi
> > > adapters (b43 BCM4331 and ath10k QCA988X).
> > >
> > > Signed-off-by: Abylay Ospan <aospan@netup.ru>
> > ...

> > > ---
> > >  drivers/pci/host/pcie-iproc-bcma.c | 18 ++++++++----------
> > >  drivers/pci/host/pcie-iproc.h      |  2 ++
> > >  2 files changed, 10 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/drivers/pci/host/pcie-iproc-bcma.c
> > b/drivers/pci/host/pcie-iproc-bcma.c
> > > index bd4c9ec..28f9b89 100644
> > > --- a/drivers/pci/host/pcie-iproc-bcma.c
> > > +++ b/drivers/pci/host/pcie-iproc-bcma.c
> > > @@ -44,8 +44,6 @@ static int iproc_pcie_bcma_probe(struct bcma_device
> > *bdev)
> > >  {
> > >       struct device *dev = &bdev->dev;
> > >       struct iproc_pcie *pcie;
> > > -     LIST_HEAD(res);
> > > -     struct resource res_mem;
> > >       int ret;
> > >
> > >       pcie = devm_kzalloc(dev, sizeof(*pcie), GFP_KERNEL);
> > > @@ -62,21 +60,21 @@ static int iproc_pcie_bcma_probe(struct bcma_device
> > *bdev)
> > >       }
> > >
> > >       pcie->base_addr = bdev->addr;
> > > +     INIT_LIST_HEAD(&pcie->resources);
> > >
> > > -     res_mem.start = bdev->addr_s[0];
> > > -     res_mem.end = bdev->addr_s[0] + SZ_128M - 1;
> > > -     res_mem.name = "PCIe MEM space";
> > > -     res_mem.flags = IORESOURCE_MEM;
> > > -     pci_add_resource(&res, &res_mem);
> > > +     pcie->res_mem.start = bdev->addr_s[0];
> > > +     pcie->res_mem.end = bdev->addr_s[0] + SZ_128M - 1;
> > > +     pcie->res_mem.name = "PCIe MEM space";
> > > +     pcie->res_mem.flags = IORESOURCE_MEM;
> > > +     pcie->res_mem.child = NULL;
> > > +     pci_add_resource(&pcie->resources, &pcie->res_mem);
> > >
> > >       pcie->map_irq = iproc_pcie_bcma_map_irq;
> > >
> > > -     ret = iproc_pcie_setup(pcie, &res);
> > > +     ret = iproc_pcie_setup(pcie, &pcie->resources);
> > >       if (ret)
> > >               dev_err(dev, "PCIe controller setup failed\n");
> > >
> > > -     pci_free_resource_list(&res);
> > > -
> > >       bcma_set_drvdata(bdev, pcie);
> > >       return ret;
> > >  }
> > > diff --git a/drivers/pci/host/pcie-iproc.h
> > b/drivers/pci/host/pcie-iproc.h
> > > index 04fed8e..866d649 100644
> > > --- a/drivers/pci/host/pcie-iproc.h
> > > +++ b/drivers/pci/host/pcie-iproc.h
> > > @@ -105,6 +105,8 @@ struct iproc_pcie {
> > >
> > >       bool need_msi_steer;
> > >       struct iproc_msi *msi;
> > > +     struct resource res_mem;
> > > +     struct list_head resources;

This looks good to me, but I don't think it's necessary to keep the
list_head in the struct iproc_pcie.  It should be safe to use
"LIST_HEAD(res)" on the stack like the other drivers do.  Can you
verify that and get an ack from Ray, Scott, or Jon?

> > >  };
> > >
> > >  int iproc_pcie_setup(struct iproc_pcie *pcie, struct list_head *res);
> > > --
> > > 2.7.4
> > >
> > >
> > > _______________________________________________
> > > linux-arm-kernel mailing list
> > > linux-arm-kernel@lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >
> 
> 
> 
> -- 
> Abylay Ospan,
> NetUP Inc.
> http://www.netup.tv
Abylay Ospan Feb. 8, 2017, 10:07 p.m. UTC | #5
Hi Bjorn,

I have checked the first driver on the list
(drivers/pci/host/pcie-designware.c). It seems that we supply the same
stack-allocated 'res' to 'devm_request_pci_bus_resources' (the actual
insertion of this pointer into 'iomem_resource' is done inside
'__request_resource'). This 'res' is not changed inside
'of_pci_get_host_bridge_resources'.
I don't have these platforms on hand and cannot test on real
hardware (to verify 100%), but from reading this code I believe the
problem exists.

Here is a summary of the flow of 'res' to show the problem:

pcie-designware.c:
  LIST_HEAD(res);
  ret = of_pci_get_host_bridge_resources(np, 0, 0xff, &res,
&pp->io_base); <--- 'res' not changing here
  ret = devm_request_pci_bus_resources(&pdev->dev, &res);

  drivers/pci/bus.c:
    err = devm_request_resource(dev, parent, res);

    kernel/resource.c:
      conflict = request_resource_conflict(root, new);
        conflict = __request_resource(root, new);
          *p = new;  <--- here we introduce the stack-allocated res into
the global 'iomem_resource'


Please check and correct me if I'm wrong.

>
>   dw_pcie_host_init
>     LIST_HEAD(res)                            # on stack
>     of_pci_get_host_bridge_resources(np, 0, 0xff, &res, &pp->io_base)
>       res = kzalloc()                         # different "res" from above!
>       pci_add_resource_offset(resources, res, ...)
>     devm_request_pci_bus_resources(&pdev->dev, &res)
>     pci_scan_root_bus(pp->dev, pp->root_bus_nr, &dw_pcie_ops, pp, &res)
>     error:
>       pci_free_resource_list(&res)

> This looks good to me, but I don't think it's necessary to keep the
> list_head in the struct iproc_pcie.  It should be safe to use
> "LIST_HEAD(res)" on the stack like the other drivers do.  Can you
> verify that and get an ack from Ray, Scott, or Jon?

If my investigation above is true, then we need to keep 'res' alive the
whole time we are working with the driver (or find another way to fix
this issue).
Bjorn Helgaas Feb. 8, 2017, 10:27 p.m. UTC | #6
On Wed, Feb 08, 2017 at 05:07:27PM -0500, Abylay Ospan wrote:
> Hi Bjorn,
> 
> I have checked first listed driver
> (drivers/pci/host/pcie-designware.c). Seems like into
> 'devm_request_pci_bus_resources' we supply same stack allocated 'res'
> (actual insert of this pointer to 'iomem_resource' was done inside
> '__request_resource'). This 'res' is not changed inside
> 'of_pci_get_host_bridge_resources'.
> I don't have this platforms on hand and cannot test it on real
> hadrware (to 100% verify). But investigating this code I see that the
> problem exist.
> 
> Here is a summary of flow for 'res' to show the problem:
> 
> pcie-designware.c:
>   LIST_HEAD(res);
>   ret = of_pci_get_host_bridge_resources(np, 0, 0xff, &res,
> &pp->io_base); <--- 'res' not changing here
>   ret = devm_request_pci_bus_resources(&pdev->dev, &res);
> 
>   drivers/pci/bus.c:
>     err = devm_request_resource(dev, parent, res);
> 
>     kernel/resource.c:
>       conflict = request_resource_conflict(root, new);
>         conflict = __request_resource(root, new);
>           *p = new;  <--- here we introduce stack allocated res into
> global 'iomem_resource'
> 
> 
> Please check and correct me if i'm wrong ?

The "res" in dw_pcie_host_init() is a list_head (not a struct
resource) and is on the stack.

When we call of_pci_get_host_bridge_resources(), we pass a pointer
("&res") to the empty list.  It kzallocs a struct resource for the bus
range and more for any bridge windows, and adds them to the list.

When we call devm_request_pci_bus_resources(), we pass a pointer
("&res") to the list, which is no longer empty.  It iterates through
the list and calls devm_request_resource() for each resource.  Inside
devm_request_pci_bus_resources(), "res" is the pointer to the resource
(not the list_head), and this resource is the one we kzalloc'd above.

When devm_request_resource() calls request_resource_conflict(), it
passes that pointer to the kzalloc'd resource (the pointer is called
"new" in devm_request_resource()).

So when __request_resource() assigns "*p = new", it is copying a
pointer to a kzalloc'd struct resource.

This is certainly a twisty maze of similar names for different things,
but I think it is OK if the list_head is on the stack as long as the
struct resources are kzalloc'd.

> >   dw_pcie_host_init
> >     LIST_HEAD(res)                            # on stack
> >     of_pci_get_host_bridge_resources(np, 0, 0xff, &res, &pp->io_base)
> >       res = kzalloc()                         # different "res" from above!
> >       pci_add_resource_offset(resources, res, ...)
> >     devm_request_pci_bus_resources(&pdev->dev, &res)
> >     pci_scan_root_bus(pp->dev, pp->root_bus_nr, &dw_pcie_ops, pp, &res)
> >     error:
> >       pci_free_resource_list(&res)
> 
> > This looks good to me, but I don't think it's necessary to keep the
> > list_head in the struct iproc_pcie.  It should be safe to use
> > "LIST_HEAD(res)" on the stack like the other drivers do.  Can you
> > verify that and get an ack from Ray, Scott, or Jon?
> 
> if my investigation above is true then we need to keep 'res' all the
> time we working with the driver (or find another way to fix this
> issue).
> 
> -- 
> Abylay Ospan,
> NetUP Inc.
> http://www.netup.tv
Abylay Ospan Feb. 8, 2017, 10:39 p.m. UTC | #7
Yes, you're right. Now it's clear.

Thanks !

2017-02-08 17:27 GMT-05:00 Bjorn Helgaas <helgaas@kernel.org>:
> On Wed, Feb 08, 2017 at 05:07:27PM -0500, Abylay Ospan wrote:
>> Hi Bjorn,
>>
>> I have checked first listed driver
>> (drivers/pci/host/pcie-designware.c). Seems like into
>> 'devm_request_pci_bus_resources' we supply same stack allocated 'res'
>> (actual insert of this pointer to 'iomem_resource' was done inside
>> '__request_resource'). This 'res' is not changed inside
>> 'of_pci_get_host_bridge_resources'.
>> I don't have this platforms on hand and cannot test it on real
>> hadrware (to 100% verify). But investigating this code I see that the
>> problem exist.
>>
>> Here is a summary of flow for 'res' to show the problem:
>>
>> pcie-designware.c:
>>   LIST_HEAD(res);
>>   ret = of_pci_get_host_bridge_resources(np, 0, 0xff, &res,
>> &pp->io_base); <--- 'res' not changing here
>>   ret = devm_request_pci_bus_resources(&pdev->dev, &res);
>>
>>   drivers/pci/bus.c:
>>     err = devm_request_resource(dev, parent, res);
>>
>>     kernel/resource.c:
>>       conflict = request_resource_conflict(root, new);
>>         conflict = __request_resource(root, new);
>>           *p = new;  <--- here we introduce stack allocated res into
>> global 'iomem_resource'
>>
>>
>> Please check and correct me if i'm wrong ?
>
> The "res" in dw_pcie_host_init() is a list_head (not a struct
> resource) and is on the stack.
>
> When we call of_pci_get_host_bridge_resources(), we pass a pointer
> ("&res") to the empty list.  It kzallocs a struct resource for the bus
> range and more for any bridge windows, and adds them to the list.
>
> When we call devm_request_pci_bus_resources(), we pass a pointer
> ("&res") to the list, which is no longer empty.  It iterates through
> the list and calls devm_request_resource() for each resource.  Inside
> devm_request_pci_bus_resources(), "res" is the pointer to the resource
> (not the list_head), and this resource is the one we kzalloc'd above.
>
> When devm_request_resource() calls request_resource_conflict(), it
> passes that pointer to the kzalloc'd resource (the pointer is called
> "new" in devm_request_resource()).)
>
> So when __request_resource() assigns "*p = new", it is copying a
> pointer to a kzalloc'd struct resource.
>
> This is certainly a twisty maze of similar names for different things,
> but I think it is OK if the list_head is on the stack as long as the
> struct resources are kzalloc'd.
>
>> >   dw_pcie_host_init
>> >     LIST_HEAD(res)                            # on stack
>> >     of_pci_get_host_bridge_resources(np, 0, 0xff, &res, &pp->io_base)
>> >       res = kzalloc()                         # different "res" from above!
>> >       pci_add_resource_offset(resources, res, ...)
>> >     devm_request_pci_bus_resources(&pdev->dev, &res)
>> >     pci_scan_root_bus(pp->dev, pp->root_bus_nr, &dw_pcie_ops, pp, &res)
>> >     error:
>> >       pci_free_resource_list(&res)
>>
>> > This looks good to me, but I don't think it's necessary to keep the
>> > list_head in the struct iproc_pcie.  It should be safe to use
>> > "LIST_HEAD(res)" on the stack like the other drivers do.  Can you
>> > verify that and get an ack from Ray, Scott, or Jon?
>>
>> if my investigation above is true then we need to keep 'res' all the
>> time we working with the driver (or find another way to fix this
>> issue).
>>
>> --
>> Abylay Ospan,
>> NetUP Inc.
>> http://www.netup.tv

Patch

diff --git a/drivers/pci/host/pcie-iproc-bcma.c b/drivers/pci/host/pcie-iproc-bcma.c
index bd4c9ec..28f9b89 100644
--- a/drivers/pci/host/pcie-iproc-bcma.c
+++ b/drivers/pci/host/pcie-iproc-bcma.c
@@ -44,8 +44,6 @@  static int iproc_pcie_bcma_probe(struct bcma_device *bdev)
 {
 	struct device *dev = &bdev->dev;
 	struct iproc_pcie *pcie;
-	LIST_HEAD(res);
-	struct resource res_mem;
 	int ret;
 
 	pcie = devm_kzalloc(dev, sizeof(*pcie), GFP_KERNEL);
@@ -62,21 +60,21 @@  static int iproc_pcie_bcma_probe(struct bcma_device *bdev)
 	}
 
 	pcie->base_addr = bdev->addr;
+	INIT_LIST_HEAD(&pcie->resources);
 
-	res_mem.start = bdev->addr_s[0];
-	res_mem.end = bdev->addr_s[0] + SZ_128M - 1;
-	res_mem.name = "PCIe MEM space";
-	res_mem.flags = IORESOURCE_MEM;
-	pci_add_resource(&res, &res_mem);
+	pcie->res_mem.start = bdev->addr_s[0];
+	pcie->res_mem.end = bdev->addr_s[0] + SZ_128M - 1;
+	pcie->res_mem.name = "PCIe MEM space";
+	pcie->res_mem.flags = IORESOURCE_MEM;
+	pcie->res_mem.child = NULL;
+	pci_add_resource(&pcie->resources, &pcie->res_mem);
 
 	pcie->map_irq = iproc_pcie_bcma_map_irq;
 
-	ret = iproc_pcie_setup(pcie, &res);
+	ret = iproc_pcie_setup(pcie, &pcie->resources);
 	if (ret)
 		dev_err(dev, "PCIe controller setup failed\n");
 
-	pci_free_resource_list(&res);
-
 	bcma_set_drvdata(bdev, pcie);
 	return ret;
 }
diff --git a/drivers/pci/host/pcie-iproc.h b/drivers/pci/host/pcie-iproc.h
index 04fed8e..866d649 100644
--- a/drivers/pci/host/pcie-iproc.h
+++ b/drivers/pci/host/pcie-iproc.h
@@ -105,6 +105,8 @@  struct iproc_pcie {
 
 	bool need_msi_steer;
 	struct iproc_msi *msi;
+	struct resource res_mem;
+	struct list_head resources;
 };
 
 int iproc_pcie_setup(struct iproc_pcie *pcie, struct list_head *res);