Patchwork linux-user mmap bug

login
register
mail settings
Submitter Edgar Iglesias
Date May 21, 2010, 1:28 p.m.
Message ID <20100521132817.GA8021@edde.se.axis.com>
Download mbox | patch
Permalink /patch/53155/
State New
Headers show

Comments

Edgar Iglesias - May 21, 2010, 1:28 p.m.
Hi

I ran into an mmap problem linux-user emulating CRIS (32bit) on x86_64 hosts.
Guest asks for a non fixed mmap, QEMU tries the mmap but the kernel returns a
high 64bit address. QEMU notices that it wont fit in the guests 32bit ptr size
and retries with a low address but doesn't set the MAP_FIXED flag.

Was something like the following patch the intended behaviour or did I
missunderstand something? (it fixes my problem at least...)

Cheers

commit 96fd8e3fdedb697ba249f32245751a28979c3fab
Author: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Date:   Fri May 21 15:22:11 2010 +0200

    linux-user: Set MAP_FIXED for mmap address fixups.
    
    Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
Richard Henderson - May 21, 2010, 4:38 p.m.
On 05/21/2010 06:28 AM, Edgar E. Iglesias wrote:
>          ptr = mmap(g2h(addr), size, PROT_NONE,
> -                   MAP_ANONYMOUS|MAP_PRIVATE|MAP_NORESERVE, -1, 0);
> +                   /* When the kernel returns addresses that the guest
> +                      cannot use we might need to fallback to fixed
> +                      allocations.  */
> +                   (addr ? MAP_FIXED : 0)
> +                   | MAP_ANONYMOUS|MAP_PRIVATE|MAP_NORESERVE, -1, 0);

NACK.  We are in fact probing for a free address in this loop,
so you don't know that the address being tested is in fact free.

I have a patch series that attempts to clean this up, but it 
isn't quite optimal.  I'll post it for reference, however.


r~
Richard Henderson - May 21, 2010, 5:39 p.m.
On 05/21/2010 09:38 AM, Richard Henderson wrote:
> I have a patch series that attempts to clean this up, but it 
> isn't quite optimal.  I'll post it for reference, however.

Bah, the patch sequence no longer applies since Paul removed PAGE_RESERVED.


r~
Edgar Iglesias - May 24, 2010, 2:57 p.m.
On Fri, May 21, 2010 at 10:39:20AM -0700, Richard Henderson wrote:
> On 05/21/2010 09:38 AM, Richard Henderson wrote:
> > I have a patch series that attempts to clean this up, but it 
> > isn't quite optimal.  I'll post it for reference, however.
> 
> Bah, the patch sequence no longer applies since Paul removed PAGE_RESERVED.

I took a look at the code again and I dont really understand how the
particular case when we get a high address from the kernel while
mmap_min_addr is busy case is supposed to work :/
In fact, for CRIS it never works on my host.

I get a high address that doesnt fit, then the code tries to go back
to mmap_min_addr but if that was busy my kernel just returns the
same addr we got the first try and mmap_find_vma gives -1 due to
repeat.

I changed it locally to keep scanning after a wrap until we succeed to
allocate a chunk or rewrap (SLOW) but at least I can run dynamically
linked CRIS programs again.

Cheers
Richard Henderson - May 24, 2010, 3:45 p.m.
On 05/24/2010 07:57 AM, Edgar E. Iglesias wrote:
> I took a look at the code again and I dont really understand how the
> particular case when we get a high address from the kernel while
> mmap_min_addr is busy case is supposed to work :/
> In fact, for CRIS it never works on my host.

Indeed, there are many cases for which it doesn't work for the Alpha
target either.

> I changed it locally to keep scanning after a wrap until we succeed to
> allocate a chunk or rewrap (SLOW) but at least I can run dynamically
> linked CRIS programs again.

Yep.  My hack had been similar, except that I used the PageDesc tree
to help speed things up.  But PageDesc is hardly an ideal data structure
in which to search, since it quickly devolves into a linear search of
the address space.

Probably the easiest real fix is to re-read /proc/self/maps each time
the mmap_next_start guess fails and the kernel's returned address is
out of range.

Another is using the MMAP_32BIT flag on x86-64 host whenever a 31-bit
address is appropriate for the guest.  E.g. mips32, where architecturally
the high half of the address space is reserved for kernel mode.

See 
  http://www.mail-archive.com/qemu-devel@nongnu.org/msg28924.html
for more ideas on the subject.



r~
Edgar Iglesias - May 25, 2010, 9:19 a.m.
On Mon, May 24, 2010 at 08:45:31AM -0700, Richard Henderson wrote:
> On 05/24/2010 07:57 AM, Edgar E. Iglesias wrote:
> > I took a look at the code again and I dont really understand how the
> > particular case when we get a high address from the kernel while
> > mmap_min_addr is busy case is supposed to work :/
> > In fact, for CRIS it never works on my host.
> 
> Indeed, there are many cases for which it doesn't work for the Alpha
> target either.

Ye, what puzzled me was that if I am not completely senile, CRIS apps
used to emulate on my x86_64 host not so long ago :)


> > I changed it locally to keep scanning after a wrap until we succeed to
> > allocate a chunk or rewrap (SLOW) but at least I can run dynamically
> > linked CRIS programs again.
> 
> Yep.  My hack had been similar, except that I used the PageDesc tree
> to help speed things up.  But PageDesc is hardly an ideal data structure
> in which to search, since it quickly devolves into a linear search of
> the address space.
> 
> Probably the easiest real fix is to re-read /proc/self/maps each time
> the mmap_next_start guess fails and the kernel's returned address is
> out of range.
> 
> Another is using the MMAP_32BIT flag on x86-64 host whenever a 31-bit
> address is appropriate for the guest.  E.g. mips32, where architecturally
> the high half of the address space is reserved for kernel mode.


MAP_32BIT sounds good as long as guest_base is not used. When used I
guess we'd need to fallback to something else anyway..

Maybe these issues are something too look more at during the bug day? :)

In the meantime, I've patched the cris git to use the MAP_32BIT and
to fallback to a super ugly and slow linear scan..

Thanks again for the help,
Cheers


> See 
>   http://www.mail-archive.com/qemu-devel@nongnu.org/msg28924.html
> for more ideas on the subject.
> 
> 
> 
> r~

Patch

diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 6a1d933..5308fe1 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -304,7 +304,11 @@  abi_ulong mmap_find_vma(abi_ulong start, abi_ulong size)
          *  - shmat() with SHM_REMAP flag
          */
         ptr = mmap(g2h(addr), size, PROT_NONE,
-                   MAP_ANONYMOUS|MAP_PRIVATE|MAP_NORESERVE, -1, 0);
+                   /* When the kernel returns addresses that the guest
+                      cannot use we might need to fallback to fixed
+                      allocations.  */
+                   (addr ? MAP_FIXED : 0)
+                   | MAP_ANONYMOUS|MAP_PRIVATE|MAP_NORESERVE, -1, 0);
 
         /* ENOMEM, if host address space has no memory */
         if (ptr == MAP_FAILED) {