Patchwork [v2] linux-user: Fix 32-on-64 mmap for x86_64

login
register
mail settings
Submitter Alexander Graf
Date Nov. 24, 2011, 11:43 p.m.
Message ID <1322178221-8481-1-git-send-email-agraf@suse.de>
Download mbox | patch
Permalink /patch/127628/
State New
Headers show

Comments

Alexander Graf - Nov. 24, 2011, 11:43 p.m.
When running a 32 bit guest on a 64 bit host, we can run into trouble while
calling the host's mmap() because it could potentially give us a 64 bit
return value which the guest can't interpret.

There are 2 ways of dealing with this:

  1) Only do MAP_FIXED mmap calls and implement our own vm management in QEMU
  2) Tell the kernel that we only want mappings in the lower 32 bits

Way 1 is very involved and hard to do. It's been advocated forever now but
nobody sat down to actually implement it.

Way 2 is easy. It's what this patch does. However, it only works on x86_64
because that's the only platform implementing the MAP_32BIT flag. Since most
people are on x86_64 though, I think it's a good enough compromise for now
though

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v1 -> v2:

  - make prettier by just wrapping mmap in linux-user/mmap.c
---
 linux-user/mmap.c |   46 +++++++++++++++++++++++++++++++++-------------
 1 files changed, 33 insertions(+), 13 deletions(-)
Peter Maydell - Nov. 25, 2011, 1:06 p.m.
On 24 November 2011 23:43, Alexander Graf <agraf@suse.de> wrote:
> ---
>
> v1 -> v2:
>
>  - make prettier by just wrapping mmap in linux-user/mmap.c

Hmm. I prefer the non-wrapped version :-)
In particular, qemu_mmap() implies that (like other qemu_foo
wrappers) this is a portability wrapper that should be used for
all mmap calls. But actually we only want to apply MAP_32BIT
for those mmap()s which are mmapping guest memory requests.
So just having an extra flag in the cases where we need the
flag seems more straightforward.

I'd prefer a

#if defined(MAP_32BIT) && defined(__x86_64__) && (TARGET_LONG_BITS == 32)
#define QEMU_MAP_32BIT MAP_32BIT
#else
#define QEMU_MAP_32BIT 0
#endif

and then use QEMU_MAP_32BIT. (That way it's obvious when you're
looking at the mmap() calls that they're not using the host's
MAP_32BIT but something that might be different.)

-- PMM
Alexander Graf - Dec. 19, 2011, 1:09 p.m.
On 25.11.2011, at 14:06, Peter Maydell wrote:

> On 24 November 2011 23:43, Alexander Graf <agraf@suse.de> wrote:
>> ---
>> 
>> v1 -> v2:
>> 
>>  - make prettier by just wrapping mmap in linux-user/mmap.c
> 
> Hmm. I prefer the non-wrapped version :-)
> In particular, qemu_mmap() implies that (like other qemu_foo
> wrappers) this is a portability wrapper that should be used for
> all mmap calls. But actually we only want to apply MAP_32BIT
> for those mmap()s which are mmapping guest memory requests.
> So just having an extra flag in the cases where we need the
> flag seems more straightforward.
> 
> I'd prefer a
> 
> #if defined(MAP_32BIT) && defined(__x86_64__) && (TARGET_LONG_BITS == 32)
> #define QEMU_MAP_32BIT MAP_32BIT
> #else
> #define QEMU_MAP_32BIT 0
> #endif
> 
> and then use QEMU_MAP_32BIT. (That way it's obvious when you're
> looking at the mmap() calls that they're not using the host's
> MAP_32BIT but something that might be different.)

This patch is actually wrong and we should rather make proper -R values the default instead of relying on MAP_32BIT.


Alex
陳韋任 - Dec. 20, 2011, 7:46 a.m.
> This patch is actually wrong and we should rather make proper -R values the default instead of relying on MAP_32BIT.
                                                         ^^^^^^^^^^^^^^^^
  I don't understand what "make proper -R values" means. Where/how we can apply
"-R"?

Regards,
chenwj
Alexander Graf - Dec. 20, 2011, 4:12 p.m.
On 20.12.2011, at 08:46, 陳韋任 wrote:

>> This patch is actually wrong and we should rather make proper -R values the default instead of relying on MAP_32BIT.
>                                                         ^^^^^^^^^^^^^^^^
>  I don't understand what "make proper -R values" means. Where/how we can apply
> "-R"?

Please see my other mail about this:

  http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg01697.html

Alex
陳韋任 - Dec. 21, 2011, 2:43 a.m.
> >> This patch is actually wrong and we should rather make proper -R values the default instead of relying on MAP_32BIT.
> >                                                         ^^^^^^^^^^^^^^^^
> >  I don't understand what "make proper -R values" means. Where/how we can apply
> > "-R"?
> 
> Please see my other mail about this:
> 
>   http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg01697.html

  Ah, "-R" should mean

    -R size           reserve size bytes for guest virtual address space

  right?

Regards,
chenwj
Alexander Graf - Dec. 21, 2011, 6:19 a.m.
On 21.12.2011, at 03:43, 陳韋任 <chenwj@iis.sinica.edu.tw> wrote:

>>>> This patch is actually wrong and we should rather make proper -R values the default instead of relying on MAP_32BIT.
>>>                                                        ^^^^^^^^^^^^^^^^
>>> I don't understand what "make proper -R values" means. Where/how we can apply
>>> "-R"?
>> 
>> Please see my other mail about this:
>> 
>>  http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg01697.html
> 
>  Ah, "-R" should mean
> 
>    -R size           reserve size bytes for guest virtual address space

Yes, that's what I was referring to :)

Alex

Patch

diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 994c02b..b6f0fbf 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -33,6 +33,26 @@ 
 
 //#define DEBUG_MMAP
 
+/*
+ * On x86_64 we can tell mmap that we only want to map within the first 32
+ * bits to not get pointers that potentially exceed the return size. Without
+ * this flag set mmap will eventually break for users when running 32-on-64.
+ *
+ * However, Linux doesn't implement this for non-x86_64 systems. So we have
+ * to safeguard the flag and just ignore it on other architectures. At least
+ * we fixed the "common case" this way :).
+ *
+ * - agraf
+ */
+static void *qemu_mmap(void *addr, size_t length, int prot, int flags, int fd,
+                       off_t offset)
+{
+#if defined(MAP_32BIT) && defined(__x86_64__) && (TARGET_LONG_BITS == 32)
+    flags |= MAP_32BIT;
+#endif
+    return mmap(addr, length, prot, flags, fd, offset);
+}
+
 #if defined(CONFIG_USE_NPTL)
 static pthread_mutex_t mmap_mutex = PTHREAD_MUTEX_INITIALIZER;
 static __thread int mmap_lock_count;
@@ -168,8 +188,8 @@  static int mmap_frag(abi_ulong real_start,
 
     if (prot1 == 0) {
         /* no page was there, so we allocate one */
-        void *p = mmap(host_start, qemu_host_page_size, prot,
-                       flags | MAP_ANONYMOUS, -1, 0);
+        void *p = qemu_mmap(host_start, qemu_host_page_size, prot,
+                            flags | MAP_ANONYMOUS, -1, 0);
         if (p == MAP_FAILED)
             return -1;
         prot1 = prot;
@@ -291,8 +311,8 @@  abi_ulong mmap_find_vma(abi_ulong start, abi_ulong size)
          *  - mremap() with MREMAP_FIXED flag
          *  - shmat() with SHM_REMAP flag
          */
-        ptr = mmap(g2h(addr), size, PROT_NONE,
-                   MAP_ANONYMOUS|MAP_PRIVATE|MAP_NORESERVE, -1, 0);
+        ptr = qemu_mmap(g2h(addr), size, PROT_NONE,
+                        MAP_ANONYMOUS|MAP_PRIVATE|MAP_NORESERVE, -1, 0);
 
         /* ENOMEM, if host address space has no memory */
         if (ptr == MAP_FAILED) {
@@ -453,15 +473,15 @@  abi_long target_mmap(abi_ulong start, abi_ulong len, int prot,
         /* Note: we prefer to control the mapping address. It is
            especially important if qemu_host_page_size >
            qemu_real_host_page_size */
-        p = mmap(g2h(mmap_start),
-                 host_len, prot, flags | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
+        p = qemu_mmap(g2h(mmap_start),
+                      host_len, prot, flags | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
         if (p == MAP_FAILED)
             goto fail;
         /* update start so that it points to the file position at 'offset' */
         host_start = (unsigned long)p;
         if (!(flags & MAP_ANONYMOUS)) {
-            p = mmap(g2h(mmap_start), len, prot, 
-                     flags | MAP_FIXED, fd, host_offset);
+            p = qemu_mmap(g2h(mmap_start), len, prot,
+                          flags | MAP_FIXED, fd, host_offset);
             host_start += offset - host_offset;
         }
         start = h2g(host_start);
@@ -546,8 +566,8 @@  abi_long target_mmap(abi_ulong start, abi_ulong len, int prot,
                 offset1 = 0;
             else
                 offset1 = offset + real_start - start;
-            p = mmap(g2h(real_start), real_end - real_start,
-                     prot, flags, fd, offset1);
+            p = qemu_mmap(g2h(real_start), real_end - real_start,
+                          prot, flags, fd, offset1);
             if (p == MAP_FAILED)
                 goto fail;
         }
@@ -602,9 +622,9 @@  static void mmap_reserve(abi_ulong start, abi_ulong size)
             real_end -= qemu_host_page_size;
     }
     if (real_start != real_end) {
-        mmap(g2h(real_start), real_end - real_start, PROT_NONE,
-                 MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE | MAP_NORESERVE,
-                 -1, 0);
+        qemu_mmap(g2h(real_start), real_end - real_start, PROT_NONE,
+                  MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE | MAP_NORESERVE,
+                  -1, 0);
     }
 }