Patchwork [2/2] exec: make address spaces 64-bit wide

login
register
mail settings
Submitter Paolo Bonzini
Date Nov. 7, 2013, 4:14 p.m.
Message ID <1383840877-2861-3-git-send-email-pbonzini@redhat.com>
Download mbox | patch
Permalink /patch/289403/
State New
Headers show

Comments

Paolo Bonzini - Nov. 7, 2013, 4:14 p.m.
As an alternative to commit 818f86b (exec: limit system memory
size, 2013-11-04) let's just make all address spaces 64-bit wide.
This eliminates problems with phys_page_find ignoring bits above
TARGET_PHYS_ADDR_SPACE_BITS and address_space_translate_internal
consequently messing up the computations.

In Luiz's reported crash, at startup gdb attempts to read from address
0xffffffffffffffe6 to 0xffffffffffffffff inclusive.  The region it gets
is the newly introduced master abort region, which is as big as the PCI
address space (see pci_bus_init).  Due to a typo that's only 2^63-1,
not 2^64.  But we get it anyway because phys_page_find ignores the upper
bits of the physical address.  In address_space_translate_internal then

    diff = int128_sub(section->mr->size, int128_make64(addr));
    *plen = int128_get64(int128_min(diff, int128_make64(*plen)));
Michael S. Tsirkin - Nov. 10, 2013, 10:31 a.m.
On Thu, Nov 07, 2013 at 05:14:37PM +0100, Paolo Bonzini wrote:
> As an alternative to commit 818f86b (exec: limit system memory
> size, 2013-11-04) let's just make all address spaces 64-bit wide.
> This eliminates problems with phys_page_find ignoring bits above
> TARGET_PHYS_ADDR_SPACE_BITS and address_space_translate_internal
> consequently messing up the computations.
> 
> In Luiz's reported crash, at startup gdb attempts to read from address
> 0xffffffffffffffe6 to 0xffffffffffffffff inclusive.  The region it gets
> is the newly introduced master abort region, which is as big as the PCI
> address space (see pci_bus_init).  Due to a typo that's only 2^63-1,
> not 2^64.  But we get it anyway because phys_page_find ignores the upper
> bits of the physical address.  In address_space_translate_internal then
> 
>     diff = int128_sub(section->mr->size, int128_make64(addr));
>     *plen = int128_get64(int128_min(diff, int128_make64(*plen)));
> 
> diff becomes negative, and int128_get64 booms.
> 
> The size of the PCI address space region should be fixed anyway.
> 
> Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

So this causes a 12% performance regression on some TCG
tests, I think we should look into a smarter
datastructure to solve the issues.

> ---
>  exec.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index 9e2fc4b..d5ce3da 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -89,7 +89,7 @@ struct PhysPageEntry {
>  };
>  
>  /* Size of the L2 (and L3, etc) page tables.  */
> -#define ADDR_SPACE_BITS TARGET_PHYS_ADDR_SPACE_BITS
> +#define ADDR_SPACE_BITS 64
>  
>  #define P_L2_BITS 10
>  #define P_L2_SIZE (1 << P_L2_BITS)
> @@ -1750,11 +1750,7 @@ static void memory_map_init(void)
>  {
>      system_memory = g_malloc(sizeof(*system_memory));
>  
> -    assert(ADDR_SPACE_BITS <= 64);
> -
> -    memory_region_init(system_memory, NULL, "system",
> -                       ADDR_SPACE_BITS == 64 ?
> -                       UINT64_MAX : (0x1ULL << ADDR_SPACE_BITS));
> +    memory_region_init(system_memory, NULL, "system", UINT64_MAX);
>      address_space_init(&address_space_memory, system_memory, "memory");
>  
>      system_io = g_malloc(sizeof(*system_io));
> -- 
> 1.8.4.2
>
Paolo Bonzini - Nov. 11, 2013, 10:15 a.m.
Il 10/11/2013 11:31, Michael S. Tsirkin ha scritto:
> > Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> So this causes a 12% performance regression on some TCG
> tests, I think we should look into a smarter
> datastructure to solve the issues.

It causes a 12% performance regression in a single testcase where KVM
has a 150x performance regression.  This says a lot about the relevance
of the testcase.

In any case, I have patches to avoid the regression.  For 1.7 we can
just revert the patches, for 1.8 we can apply this patch together with
the optimizations that avoid introducing a regression.

Paolo

Patch

diff becomes negative, and int128_get64 booms.

The size of the PCI address space region should be fixed anyway.

Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/exec.c b/exec.c
index 9e2fc4b..d5ce3da 100644
--- a/exec.c
+++ b/exec.c
@@ -89,7 +89,7 @@  struct PhysPageEntry {
 };
 
 /* Size of the L2 (and L3, etc) page tables.  */
-#define ADDR_SPACE_BITS TARGET_PHYS_ADDR_SPACE_BITS
+#define ADDR_SPACE_BITS 64
 
 #define P_L2_BITS 10
 #define P_L2_SIZE (1 << P_L2_BITS)
@@ -1750,11 +1750,7 @@  static void memory_map_init(void)
 {
     system_memory = g_malloc(sizeof(*system_memory));
 
-    assert(ADDR_SPACE_BITS <= 64);
-
-    memory_region_init(system_memory, NULL, "system",
-                       ADDR_SPACE_BITS == 64 ?
-                       UINT64_MAX : (0x1ULL << ADDR_SPACE_BITS));
+    memory_region_init(system_memory, NULL, "system", UINT64_MAX);
     address_space_init(&address_space_memory, system_memory, "memory");
 
     system_io = g_malloc(sizeof(*system_io));