Patchwork [02/11] PCI: Try to allocate mem64 above 4G at first

Submitter Yinghai Lu
Date May 23, 2012, 6:34 a.m.
Message ID <1337754877-19759-3-git-send-email-yinghai@kernel.org>
Permalink /patch/160870/
State Rejected

Comments

Yinghai Lu - May 23, 2012, 6:34 a.m.
and will fall back to below 4g if it can not find any above 4g.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 drivers/pci/bus.c |   16 +++++++++++++---
 1 files changed, 13 insertions(+), 3 deletions(-)
Linus Torvalds - May 23, 2012, 3:57 p.m.
On Tue, May 22, 2012 at 11:34 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> and will fall back to below 4g if it can not find any above 4g.

Has this been tested on 32-bit machines without PAE? There might be
things that just happen to work because their allocations were always
done bottom-up.

Or do we have something else that protects us from the "oops, we can't
actually *map* those pages"?

                       Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu - May 23, 2012, 5:30 p.m.
On Wed, May 23, 2012 at 8:57 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, May 22, 2012 at 11:34 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>> and will fall back to below 4g if it can not find any above 4g.
>
> Has this been tested on 32-bit machines without PAE? There might be
> things that just happen to work because their allocations were always
> done bottom-up.

Good point. That problem should be addressed first, before this patch.

>
> Or do we have something else that protects us from the "oops, we can't
> actually *map* those pages"?

Steven tested on his setup.

I tested some Infiniband cards.

Thanks

Yinghai
Yinghai Lu - May 23, 2012, 6:40 p.m.
On Wed, May 23, 2012 at 10:30 AM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Wed, May 23, 2012 at 8:57 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> On Tue, May 22, 2012 at 11:34 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>>> and will fall back to below 4g if it can not find any above 4g.
>>
>> Has this been tested on 32-bit machines without PAE? There might be
>> things that just happen to work because their allocations were always
>> done bottom-up.
>
> Good point. That problem should be addressed first, before this patch.

I just checked the code for 32-bit machines without PAE.

When X86_PAE is not set, phys_addr_t (aka resource_size_t) is 32 bits wide,
so in drivers/pci/bus.c::pci_bus_alloc_resource_fit()
    resource_size_t bottom = PCIBIOS_MAX_MEM_32 + 1ULL;
truncates to 0, and bottom is 0.
Also, in arch/x86/kernel/setup.c::setup_arch()
   iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1;
sets iomem_resource.end to 0xffffffff.

When X86_PAE is set but the CPU does not support PAE,
phys_addr_t (aka resource_size_t) is 64 bits wide,
so in drivers/pci/bus.c::pci_bus_alloc_resource_fit()
    resource_size_t bottom = PCIBIOS_MAX_MEM_32 + 1ULL;
sets bottom to 4G.
But in arch/x86/kernel/setup.c::setup_arch()
   iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1;
sets iomem_resource.end to 0xffffffff, because x86_phys_bits is 32 when PAE
is not detected in arch/x86/kernel/cpu/common.c::get_cpu_cap().
That means the first try will fail, so it will go to the second try with
bottom set to 0.

So both cases are safe with this patch.

Thanks

Yinghai
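
The arithmetic in the analysis above can be checked in user space. The
following is only a sketch: it assumes PCIBIOS_MAX_MEM_32 is 0xffffffff (the
x86 value), uses uint32_t and uint64_t as stand-ins for a 32-bit and a 64-bit
resource_size_t, and the helper names are hypothetical, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the x86 definition; not taken from this patch. */
#define PCIBIOS_MAX_MEM_32 0xffffffffULL

/* "bottom" when resource_size_t is 32 bits wide (X86_PAE not set). */
static uint32_t bottom_32bit(void)
{
	/* 0x100000000 does not fit in 32 bits, so the conversion wraps to 0. */
	return (uint32_t)(PCIBIOS_MAX_MEM_32 + 1ULL);
}

/* "bottom" when resource_size_t is 64 bits wide (X86_PAE set). */
static uint64_t bottom_64bit(void)
{
	return PCIBIOS_MAX_MEM_32 + 1ULL;
}

/* Same expression as the iomem_resource.end assignment in setup_arch(). */
static uint64_t iomem_end(unsigned int phys_bits)
{
	assert(phys_bits < 64);	/* a shift by 64 or more would be undefined */
	return (1ULL << phys_bits) - 1;
}

/*
 * The "above 4G" pass can only find space if bottom is non-zero and
 * still lies inside the addressable range.
 */
static int first_pass_can_succeed(uint64_t bottom, uint64_t end)
{
	return bottom != 0 && bottom <= end;
}
```

With x86_phys_bits at 32, iomem_end() gives 0xffffffff, so a 4G bottom lies
outside the range and only the second pass (bottom set to 0) can allocate.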

Patch

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 4ce5ef2..2429f1f 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -122,13 +122,17 @@  pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res,
 	int i, ret = -ENOMEM;
 	struct resource *r;
 	resource_size_t max = -1;
+	resource_size_t bottom = PCIBIOS_MAX_MEM_32 + 1ULL;
 
 	type_mask |= IORESOURCE_IO | IORESOURCE_MEM;
 
 	/* don't allocate too high if the pref mem doesn't support 64bit*/
-	if (!(res->flags & IORESOURCE_MEM_64))
+	if (!(res->flags & IORESOURCE_MEM_64)) {
 		max = PCIBIOS_MAX_MEM_32;
+		bottom = 0;
+	}
 
+again:
 	pci_bus_for_each_resource(bus, r, i) {
 		if (!r)
 			continue;
@@ -145,12 +149,18 @@  pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res,
 
 		/* Ok, try it out.. */
 		ret = allocate_resource(r, res, size,
-					r->start ? : min,
+					max(bottom, r->start ? : min),
 					max, align,
 					alignf, alignf_data);
 		if (ret == 0)
-			break;
+			return 0;
 	}
+
+	if (bottom != 0) {
+		bottom = 0;
+		goto again;
+	}
+
 	return ret;
 }
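
The two-pass behaviour of the patch (try above 4G first, then retry from 0)
can be modelled outside the kernel. This is a simplified sketch, not the
kernel implementation: struct window, try_window(), alloc_two_pass(),
ALLOC_FAIL and the example arrays are all hypothetical, alignment is ignored,
and each window is treated as entirely free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FOUR_GIB   0x100000000ULL
#define MAX_MEM_32 0xffffffffULL	/* assumed value of PCIBIOS_MAX_MEM_32 */
#define ALLOC_FAIL UINT64_MAX		/* hypothetical failure sentinel */

/* Hypothetical stand-in for a free bus window [start, end], inclusive. */
struct window {
	uint64_t start;
	uint64_t end;
};

/* Example data: one window below 4G and one above it. */
static const struct window example_both[] = {
	{ 0xe0000000ULL, 0xefffffffULL },
	{ 0x100000000ULL, 0x1ffffffffULL },
};
static const struct window example_low_only[] = {
	{ 0xe0000000ULL, 0xefffffffULL },
};

/* Place `size` bytes inside one window, clamped to [bottom, max]. */
static uint64_t try_window(const struct window *w, uint64_t bottom,
			   uint64_t max, uint64_t size)
{
	uint64_t lo = w->start > bottom ? w->start : bottom;
	uint64_t hi = w->end < max ? w->end : max;

	assert(size > 0);
	if (lo > hi || hi - lo < size - 1)
		return ALLOC_FAIL;
	return lo;
}

/* First pass starts at 4G for 64-bit resources; second pass starts at 0. */
static uint64_t alloc_two_pass(const struct window *w, size_t n,
			       int is_mem64, uint64_t size)
{
	uint64_t max = is_mem64 ? UINT64_MAX : MAX_MEM_32;
	uint64_t bottom = is_mem64 ? FOUR_GIB : 0;
	uint64_t addr;
	size_t i;

again:
	for (i = 0; i < n; i++) {
		addr = try_window(&w[i], bottom, max, size);
		if (addr != ALLOC_FAIL)
			return addr;
	}

	if (bottom != 0) {
		/* Nothing above 4G: retry over the whole range. */
		bottom = 0;
		goto again;
	}
	return ALLOC_FAIL;
}
```

In this model a 64-bit resource lands at 0x100000000 when a high window
exists and falls back to 0xe0000000 when only the low window is available,
which mirrors the fallback described in the commit message.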