Patchwork [02/11] PCI: Try to allocate mem64 above 4G at first

Submitter Bjorn Helgaas
Date May 25, 2012, 4:36 a.m.
Message ID <20120525043651.GA1391@google.com>
Permalink /patch/161243/
State Not Applicable

Comments

Bjorn Helgaas - May 25, 2012, 4:36 a.m.
On Wed, May 23, 2012 at 11:40:46AM -0700, Yinghai Lu wrote:
> On Wed, May 23, 2012 at 10:30 AM, Yinghai Lu <yinghai@kernel.org> wrote:
> > On Wed, May 23, 2012 at 8:57 AM, Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> >> On Tue, May 22, 2012 at 11:34 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> >>> and will fall back to below 4g if it can not find any above 4g.
> >>
> >> Has this been tested on 32-bit machines without PAE? There might be
> >> things that just happen to work because their allocations were always
> >> done bottom-up.
> >
> > Good point. that problem should be addressed at first before this patch.
> 
> Just checked code for 32bit machines without PAE.
> 
> when X86_PAE is not set, phys_addr_t aka resource_size_t will be 32bit.
> so in drivers/pci/bus.c::pci_bus_alloc_resource_fit()
> will have bottom to 0.
>     resource_size_t bottom = PCIBIOS_MAX_MEM_32 + 1ULL;
> also in arch/x86/kernel/setup.c::setup_arch()
>    iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1;
> will have iomem_resource.end to 0xffffffff
> 
> when X86_PAE is set, but CPU does not support PAE.
> phys_addr_t aka resource_size_t will be 32bit.

I think you meant phys_addr_t and resource_size_t will be *64* bit
when X86_PAE is set.  Obvious to you, but quite confusing to non-x86
experts like me :)

> so in drivers/pci/bus.c::pci_bus_alloc_resource_fit()
> will have bottom to 4g.
>     resource_size_t bottom = PCIBIOS_MAX_MEM_32 + 1ULL;
> but
> in arch/x86/kernel/setup.c::setup_arch()
>    iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1;
> will have iomem_resource.end to 0xffffffff, because x86_phys_bits is 32 when PAE
> is not detected in arch/x86/kernel/cpu/common.c::get_cpu_cap.
> that mean first try will fail, so it will go to second try with bottom to 0.
> 
> so both case are safe with this patch.

I don't really like the dependency on PCIBIOS_MAX_MEM_32 + 1ULL
overflowing to zero -- that means the reader has to know what the
value of PCIBIOS_MAX_MEM_32 is, and things would break in non-obvious
ways if we changed it.

What do you think of a patch like the following?  It makes it
explicit that we can only allocate space the CPU can address.

commit feded2ae21d6160292726ccd5128080d42395be4
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Thu May 24 22:15:26 2012 -0600

    PCI: try to allocate 64-bit resources above 4GB
    
    If we have a 64-bit resource, try to allocate it above 4GB first.  If that
    fails, either because there's no space or the CPU can't address space above
    4GB (iomem_resource.end is the highest address the CPU supports), we'll
    fall back to allocating space below 4GB.

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu - May 25, 2012, 5:53 p.m.
On Thu, May 24, 2012 at 9:36 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Wed, May 23, 2012 at 11:40:46AM -0700, Yinghai Lu wrote:
>> On Wed, May 23, 2012 at 10:30 AM, Yinghai Lu <yinghai@kernel.org> wrote:
>> > On Wed, May 23, 2012 at 8:57 AM, Linus Torvalds
>> > <torvalds@linux-foundation.org> wrote:
>> >> On Tue, May 22, 2012 at 11:34 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>> >>> and will fall back to below 4g if it can not find any above 4g.
>> >>
>> >> Has this been tested on 32-bit machines without PAE? There might be
>> >> things that just happen to work because their allocations were always
>> >> done bottom-up.
>> >
>> > Good point. that problem should be addressed at first before this patch.
>>
>> Just checked code for 32bit machines without PAE.
>>
>> when X86_PAE is not set, phys_addr_t aka resource_size_t will be 32bit.
>> so in drivers/pci/bus.c::pci_bus_alloc_resource_fit()
>> will have bottom to 0.
>>     resource_size_t bottom = PCIBIOS_MAX_MEM_32 + 1ULL;
>> also in arch/x86/kernel/setup.c::setup_arch()
>>    iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1;
>> will have iomem_resource.end to 0xffffffff
>>
>> when X86_PAE is set, but CPU does not support PAE.
>> phys_addr_t aka resource_size_t will be 32bit.
>
> I think you meant phys_addr_t and resource_size_t will be *64* bit
> when X86_PAE is set.  Obvious to you, but quite confusing to non-x86
> experts like me :)
>
>> so in drivers/pci/bus.c::pci_bus_alloc_resource_fit()
>> will have bottom to 4g.
>>     resource_size_t bottom = PCIBIOS_MAX_MEM_32 + 1ULL;
>> but
>> in arch/x86/kernel/setup.c::setup_arch()
>>    iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1;
>> will have iomem_resource.end to 0xffffffff, because x86_phys_bits is 32 when PAE
>> is not detected in arch/x86/kernel/cpu/common.c::get_cpu_cap.
>> that mean first try will fail, so it will go to second try with bottom to 0.
>>
>> so both case are safe with this patch.
>
> I don't really like the dependency on PCIBIOS_MAX_MEM_32 + 1ULL
> overflowing to zero -- that means the reader has to know what the
> value of PCIBIOS_MAX_MEM_32 is, and things would break in non-obvious
> ways if we changed it.
>
> What do you think of a patch like the following?  It makes it
> explicit that we can only allocate space the CPU can address.
>
> commit feded2ae21d6160292726ccd5128080d42395be4
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Thu May 24 22:15:26 2012 -0600
>
>    PCI: try to allocate 64-bit resources above 4GB
>
>    If we have a 64-bit resource, try to allocate it above 4GB first.  If that
>    fails, either because there's no space or the CPU can't address space above
>    4GB (iomem_resource.end is the highest address the CPU supports), we'll
>    fall back to allocating space below 4GB.
>
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index 4ce5ef2..2c56693 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -121,14 +121,18 @@ pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res,
>  {
>        int i, ret = -ENOMEM;
>        struct resource *r;
> -       resource_size_t max = -1;
> +       resource_size_t start = 0;
> +       resource_size_t end = PCIBIOS_MAX_MEM_32;
>
>        type_mask |= IORESOURCE_IO | IORESOURCE_MEM;
>
> -       /* don't allocate too high if the pref mem doesn't support 64bit*/
> -       if (!(res->flags & IORESOURCE_MEM_64))
> -               max = PCIBIOS_MAX_MEM_32;
> +       /* If this is a 64-bit resource, prefer space above 4GB */
> +       if (res->flags & IORESOURCE_MEM_64) {
> +               start = PCIBIOS_MAX_MEM_32 + 1ULL;
> +               end = iomem_resource.end;

But here we still have PCIBIOS_MAX_MEM_32 + 1ULL, which will still
overflow to 0 when resource_size_t is 32-bit.

Also, because all MMIO is already within iomem_resource, we don't need to
specify it as the limit; we can keep using -1 as max and avoid
referring to the global iomem_resource here.

So this version behaves the same as the old version; it just reverses the check on
           res->flags & IORESOURCE_MEM_64

> +       }
>
> +again:
>        pci_bus_for_each_resource(bus, r, i) {
>                if (!r)
>                        continue;
> @@ -145,12 +149,18 @@ pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res,
>
>                /* Ok, try it out.. */
>                ret = allocate_resource(r, res, size,
> -                                       r->start ? : min,
> -                                       max, align,
> +                                       max(start, r->start ? : min),
> +                                       end, align,
>                                        alignf, alignf_data);
>                if (ret == 0)
> -                       break;
> +                       return 0;
>        }
> +
> +       if (start != 0) {
> +               start = 0;
> +               goto again;
> +       }
> +
>        return ret;
>  }
>
Yinghai Lu - May 25, 2012, 6:39 p.m.
On Fri, May 25, 2012 at 10:53 AM, Yinghai Lu <yinghai@kernel.org> wrote:
>> I don't really like the dependency on PCIBIOS_MAX_MEM_32 + 1ULL
>> overflowing to zero -- that means the reader has to know what the
>> value of PCIBIOS_MAX_MEM_32 is, and things would break in non-obvious
>> ways if we changed it.
>>

Please check whether the attached one is clearer.

It makes max and bottom depend only on _MEM resources, not on the default case.

-       if (!(res->flags & IORESOURCE_MEM_64))
-               max = PCIBIOS_MAX_MEM_32;
+       if (res->flags & IORESOURCE_MEM) {
+               if (!(res->flags & IORESOURCE_MEM_64))
+                       max = PCIBIOS_MAX_MEM_32;
+               else if (PCIBIOS_MAX_MEM_32 != -1)
+                       bottom = (resource_size_t)(1ULL<<32);
+       }

This still will not affect other arches.


Thanks

Yinghai

Patch

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 4ce5ef2..2c56693 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -121,14 +121,18 @@  pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res,
 {
 	int i, ret = -ENOMEM;
 	struct resource *r;
-	resource_size_t max = -1;
+	resource_size_t start = 0;
+	resource_size_t end = PCIBIOS_MAX_MEM_32;
 
 	type_mask |= IORESOURCE_IO | IORESOURCE_MEM;
 
-	/* don't allocate too high if the pref mem doesn't support 64bit*/
-	if (!(res->flags & IORESOURCE_MEM_64))
-		max = PCIBIOS_MAX_MEM_32;
+	/* If this is a 64-bit resource, prefer space above 4GB */
+	if (res->flags & IORESOURCE_MEM_64) {
+		start = PCIBIOS_MAX_MEM_32 + 1ULL;
+		end = iomem_resource.end;
+	}
 
+again:
 	pci_bus_for_each_resource(bus, r, i) {
 		if (!r)
 			continue;
@@ -145,12 +149,18 @@  pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res,
 
 		/* Ok, try it out.. */
 		ret = allocate_resource(r, res, size,
-					r->start ? : min,
-					max, align,
+					max(start, r->start ? : min),
+					end, align,
 					alignf, alignf_data);
 		if (ret == 0)
-			break;
+			return 0;
 	}
+
+	if (start != 0) {
+		start = 0;
+		goto again;
+	}
+
 	return ret;
 }