diff mbox

qemu git head 20100323 on FreeBSD - qemu-devel port update for testing

Message ID 20100330191629.GA95521@triton8.kn-bremen.de
State New
Headers show

Commit Message

Juergen Lock March 30, 2010, 7:16 p.m. UTC
On Tue, Mar 30, 2010 at 09:04:28PM +0300, Blue Swirl wrote:
> On 3/25/10, Juergen Lock <nox@jelal.kn-bremen.de> wrote:
> > Hi!
> >
> >   Now that qemu git head works again (thanx Aurelien! :) I've finished
> >  the FreeBSD qemu-devel port update patch/shar that made me uncover
> >  the bug:
> >         http://people.freebsd.org/~nox/qemu/qemu-devel-20100323.patch
> >  resp.
> >         http://people.freebsd.org/~nox/qemu/qemu-devel-20100323.shar
> >
> >   This also adds a few misc fixes (that I'll submit on the qemu list
> >  seperately), I have...
> >
> >  . Fixed the FreeBSD executable path detection to work without /proc
> >   mounted (it usually isn't on FreeBSD), so you now no longer have to
> >   pass the path to the pc-bios dir with -L if you run qemu out of the
> >   build dir when another version is installed, like,
> >         work/qemu-snapshot-20100323_20/i386-softmmu/qemu ...
> >
> >   (files/patch-vl.c in the shar/patch)
> >
> >  . Fixed some more bsd-user bugs so all of i386-bsd-user, x86_64-bsd-user,
> >   and sparc64-bsd-user now run for me again on FreeBSD stable/8 amd64.
> >   (I didn't test sparc-bsd-user as I only tried -bsd freebsd and FreeBSD
> >   doesn't run on 32bit sparc.) - Yes bsd-user still needs more work but
> >   at least simple exectuables run.
> >
> >   (files/patch-bsd-user-mmap.c, files/patch-exec.c)
> >
> >  . Fixed the bsd-user host page protection code for FreeBSD hosts
> >   (using kinfo_getvmmap(3) on FeeBSD >= 7.x and /compat/linux/proc
> >   on older FreeBSD.)
> >
> >   (files/patch-bsd-user-linproc)
> >
> >  . Fixed some compilation warnings and a missing #include.
> >
> >   (files/patch-qemu-char.c, files/patch-qemu-timer.c)
> >
> 
> Thanks, applied all except exec.c one.

Oh, is there something wrong with it?  You mean this one, right?

Subject: [PATCH] Avoid page_set_flags() assert in qemu-user host page
 protection code
Message-ID: <20100325211421.GA52572@triton8.kn-bremen.de>
[...]

 I first tried to replace the endaddr in the !h2g_valid(endaddr) case with
	((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1
if TARGET_ABI_BITS > L1_MAP_ADDR_SPACE_BITS (which comes from the condition
of the assert in page_set_flags() that was triggered on the ~0ul value),
but that caused the qemu process to grow into swap and made the box
usuable when that code was reached and I had to kill qemu.  (The box has
8 GB RAM.)  And so I thought just leaving that page range unprotected
if only the start address is valid was the lesser evil...

 ..and thanx for committing the other ones, :)
	Juergen

Comments

Richard Henderson March 30, 2010, 7:42 p.m. UTC | #1
On 03/30/2010 12:16 PM, Juergen Lock wrote:
>  I first tried to replace the endaddr in the !h2g_valid(endaddr) case with
> 	((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1
> if TARGET_ABI_BITS > L1_MAP_ADDR_SPACE_BITS (which comes from the condition
> of the assert in page_set_flags() that was triggered on the ~0ul value),
> but that caused the qemu process to grow into swap and made the box
> usuable when that code was reached and I had to kill qemu.  (The box has
> 8 GB RAM.)  And so I thought just leaving that page range unprotected
> if only the start address is valid was the lesser evil...

What's are the real arguments to the page_set_flags that causes things
to go into swap?  I can't imagine the range really being so large that
it causes massive allocation within that function...


r~
Blue Swirl March 30, 2010, 7:54 p.m. UTC | #2
On 3/30/10, Juergen Lock <nox@jelal.kn-bremen.de> wrote:
> On Tue, Mar 30, 2010 at 09:04:28PM +0300, Blue Swirl wrote:
>  > On 3/25/10, Juergen Lock <nox@jelal.kn-bremen.de> wrote:
>  > > Hi!
>  > >
>  > >   Now that qemu git head works again (thanx Aurelien! :) I've finished
>  > >  the FreeBSD qemu-devel port update patch/shar that made me uncover
>  > >  the bug:
>  > >         http://people.freebsd.org/~nox/qemu/qemu-devel-20100323.patch
>  > >  resp.
>  > >         http://people.freebsd.org/~nox/qemu/qemu-devel-20100323.shar
>  > >
>  > >   This also adds a few misc fixes (that I'll submit on the qemu list
>  > >  seperately), I have...
>  > >
>  > >  . Fixed the FreeBSD executable path detection to work without /proc
>  > >   mounted (it usually isn't on FreeBSD), so you now no longer have to
>  > >   pass the path to the pc-bios dir with -L if you run qemu out of the
>  > >   build dir when another version is installed, like,
>  > >         work/qemu-snapshot-20100323_20/i386-softmmu/qemu ...
>  > >
>  > >   (files/patch-vl.c in the shar/patch)
>  > >
>  > >  . Fixed some more bsd-user bugs so all of i386-bsd-user, x86_64-bsd-user,
>  > >   and sparc64-bsd-user now run for me again on FreeBSD stable/8 amd64.
>  > >   (I didn't test sparc-bsd-user as I only tried -bsd freebsd and FreeBSD
>  > >   doesn't run on 32bit sparc.) - Yes bsd-user still needs more work but
>  > >   at least simple exectuables run.
>  > >
>  > >   (files/patch-bsd-user-mmap.c, files/patch-exec.c)
>  > >
>  > >  . Fixed the bsd-user host page protection code for FreeBSD hosts
>  > >   (using kinfo_getvmmap(3) on FeeBSD >= 7.x and /compat/linux/proc
>  > >   on older FreeBSD.)
>  > >
>  > >   (files/patch-bsd-user-linproc)
>  > >
>  > >  . Fixed some compilation warnings and a missing #include.
>  > >
>  > >   (files/patch-qemu-char.c, files/patch-qemu-timer.c)
>  > >
>  >
>  > Thanks, applied all except exec.c one.
>
>
> Oh, is there something wrong with it?  You mean this one, right?
>
>  Subject: [PATCH] Avoid page_set_flags() assert in qemu-user host page
>   protection code
>  Message-ID: <20100325211421.GA52572@triton8.kn-bremen.de>
>  [...]
>
>  --- a/exec.c
>  +++ b/exec.c
>  @@ -293,10 +293,13 @@ static void page_init(void)
>
>                      if (h2g_valid(endaddr)) {
>                          endaddr = h2g(endaddr);
>  +                        page_set_flags(startaddr, endaddr, PAGE_RESERVED);
>                      } else {
>  +#if TARGET_ABI_BITS <= L1_MAP_ADDR_SPACE_BITS
>                          endaddr = ~0ul;
>  +                        page_set_flags(startaddr, endaddr, PAGE_RESERVED);
>  +#endif
>                      }
>  -                    page_set_flags(startaddr, endaddr, PAGE_RESERVED);
>                  }
>              } while (!feof(f));
>
>   I first tried to replace the endaddr in the !h2g_valid(endaddr) case with
>         ((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1
>  if TARGET_ABI_BITS > L1_MAP_ADDR_SPACE_BITS (which comes from the condition
>  of the assert in page_set_flags() that was triggered on the ~0ul value),
>  but that caused the qemu process to grow into swap and made the box
>  usuable when that code was reached and I had to kill qemu.  (The box has
>  8 GB RAM.)  And so I thought just leaving that page range unprotected
>  if only the start address is valid was the lesser evil...

I was thinking something like (abi_ulong)-1 but maybe that isn't any
more correct.
Juergen Lock March 30, 2010, 8:09 p.m. UTC | #3
In article <4BB2540B.90704@twiddle.net> you write:
>On 03/30/2010 12:16 PM, Juergen Lock wrote:
>>  I first tried to replace the endaddr in the !h2g_valid(endaddr) case with
>> 	((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1
>> if TARGET_ABI_BITS > L1_MAP_ADDR_SPACE_BITS (which comes from the condition
>> of the assert in page_set_flags() that was triggered on the ~0ul value),
>> but that caused the qemu process to grow into swap and made the box
>> usuable when that code was reached and I had to kill qemu.  (The box has
>> 8 GB RAM.)  And so I thought just leaving that page range unprotected
>> if only the start address is valid was the lesser evil...
>
>What's are the real arguments to the page_set_flags that causes things
>to go into swap?  I can't imagine the range really being so large that
>it causes massive allocation within that function...

Oh sorry if that was not clear, things go into swap if I _replace_ the
endaddr ~0ul (which caused the assert) with the max value the assert
still tolerates i.e.
	((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1
which in this case seems to be 0x7fffffffffff:

#3  0x0000000060012731 in page_set_flags (start=140737488224256, 
    end=18446744073709551615, flags=32)
    at /usr/ports/emulators/qemu-devel-20100323a/work/qemu-snapshot-20100323_20/exec.c:2426
2426	    assert(end < ((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS));
(gdb) i li 2426
Line 2426 of "/usr/ports/emulators/qemu-devel-20100323a/work/qemu-snapshot-20100323_20/exec.c" starts at address 0x60012662 <page_set_flags+34>
   and ends at 0x60012675 <page_set_flags+53>.
(gdb) disassemble 0x60012662 0x60012675
Dump of assembler code from 0x60012662 to 0x60012675:
0x0000000060012662 <page_set_flags+34>:	mov    $0x7fffffffffff,%rax
						^^^^^^^^^^^^^^
0x000000006001266c <page_set_flags+44>:	cmp    %rax,%rsi
0x000000006001266f <page_set_flags+47>:	ja     0x60012718 <page_set_flags+216>
End of assembler dump.
(gdb) q

 Cheers,
	Juergen
Juergen Lock March 30, 2010, 8:17 p.m. UTC | #4
On Tue, Mar 30, 2010 at 10:54:03PM +0300, Blue Swirl wrote:
> On 3/30/10, Juergen Lock <nox@jelal.kn-bremen.de> wrote:
> > On Tue, Mar 30, 2010 at 09:04:28PM +0300, Blue Swirl wrote:
> >  > On 3/25/10, Juergen Lock <nox@jelal.kn-bremen.de> wrote:
> >  > > Hi!
> >  > >
> >  > >   Now that qemu git head works again (thanx Aurelien! :) I've finished
> >  > >  the FreeBSD qemu-devel port update patch/shar that made me uncover
> >  > >  the bug:
> >  > >         http://people.freebsd.org/~nox/qemu/qemu-devel-20100323.patch
> >  > >  resp.
> >  > >         http://people.freebsd.org/~nox/qemu/qemu-devel-20100323.shar
> >  > >
> >  > >   This also adds a few misc fixes (that I'll submit on the qemu list
> >  > >  seperately), I have...
> >  > >
> >  > >  . Fixed the FreeBSD executable path detection to work without /proc
> >  > >   mounted (it usually isn't on FreeBSD), so you now no longer have to
> >  > >   pass the path to the pc-bios dir with -L if you run qemu out of the
> >  > >   build dir when another version is installed, like,
> >  > >         work/qemu-snapshot-20100323_20/i386-softmmu/qemu ...
> >  > >
> >  > >   (files/patch-vl.c in the shar/patch)
> >  > >
> >  > >  . Fixed some more bsd-user bugs so all of i386-bsd-user, x86_64-bsd-user,
> >  > >   and sparc64-bsd-user now run for me again on FreeBSD stable/8 amd64.
> >  > >   (I didn't test sparc-bsd-user as I only tried -bsd freebsd and FreeBSD
> >  > >   doesn't run on 32bit sparc.) - Yes bsd-user still needs more work but
> >  > >   at least simple exectuables run.
> >  > >
> >  > >   (files/patch-bsd-user-mmap.c, files/patch-exec.c)
> >  > >
> >  > >  . Fixed the bsd-user host page protection code for FreeBSD hosts
> >  > >   (using kinfo_getvmmap(3) on FeeBSD >= 7.x and /compat/linux/proc
> >  > >   on older FreeBSD.)
> >  > >
> >  > >   (files/patch-bsd-user-linproc)
> >  > >
> >  > >  . Fixed some compilation warnings and a missing #include.
> >  > >
> >  > >   (files/patch-qemu-char.c, files/patch-qemu-timer.c)
> >  > >
> >  >
> >  > Thanks, applied all except exec.c one.
> >
> >
> > Oh, is there something wrong with it?  You mean this one, right?
> >
> >  Subject: [PATCH] Avoid page_set_flags() assert in qemu-user host page
> >   protection code
> >  Message-ID: <20100325211421.GA52572@triton8.kn-bremen.de>
> >  [...]
> >
> >  --- a/exec.c
> >  +++ b/exec.c
> >  @@ -293,10 +293,13 @@ static void page_init(void)
> >
> >                      if (h2g_valid(endaddr)) {
> >                          endaddr = h2g(endaddr);
> >  +                        page_set_flags(startaddr, endaddr, PAGE_RESERVED);
> >                      } else {
> >  +#if TARGET_ABI_BITS <= L1_MAP_ADDR_SPACE_BITS
> >                          endaddr = ~0ul;
> >  +                        page_set_flags(startaddr, endaddr, PAGE_RESERVED);
> >  +#endif
> >                      }
> >  -                    page_set_flags(startaddr, endaddr, PAGE_RESERVED);
> >                  }
> >              } while (!feof(f));
> >
> >   I first tried to replace the endaddr in the !h2g_valid(endaddr) case with
> >         ((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1
> >  if TARGET_ABI_BITS > L1_MAP_ADDR_SPACE_BITS (which comes from the condition
> >  of the assert in page_set_flags() that was triggered on the ~0ul value),
> >  but that caused the qemu process to grow into swap and made the box
> >  usuable when that code was reached and I had to kill qemu.  (The box has
> >  8 GB RAM.)  And so I thought just leaving that page range unprotected
> >  if only the start address is valid was the lesser evil...
> 
> I was thinking something like (abi_ulong)-1 but maybe that isn't any
> more correct.

Oh this is happening with x86_64-bsd-user on the same arch so I'd say
(abi_ulong)-1 would be the same as ~0ul (and still cause the assert.)

 Cheers,
	Juergen
Richard Henderson March 30, 2010, 8:33 p.m. UTC | #5
On 03/30/2010 01:09 PM, Juergen Lock wrote:
> Oh sorry if that was not clear, things go into swap if I _replace_ the
> endaddr ~0ul (which caused the assert) with the max value the assert
> still tolerates i.e.
> 	((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1
> which in this case seems to be 0x7fffffffffff:

Yes, I got that.  And I see from ...

> #3  0x0000000060012731 in page_set_flags (start=140737488224256, 
>     end=18446744073709551615, flags=32)

... here that the range we're reserving is

  0x7ffffffe0000 ... 0x7fffffffffff

which is a mere 128k range.  Which ought to allocate no more than
a single leaf page table (and thus N-1 pages for the N-level table).

Which doesn't answer the question of why you'd wind up running out
of memory.


r~
Juergen Lock March 30, 2010, 8:42 p.m. UTC | #6
On Tue, Mar 30, 2010 at 10:09:47PM +0200, Juergen Lock wrote:
> In article <4BB2540B.90704@twiddle.net> you write:
> >On 03/30/2010 12:16 PM, Juergen Lock wrote:
> >>  I first tried to replace the endaddr in the !h2g_valid(endaddr) case with
> >> 	((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1
> >> if TARGET_ABI_BITS > L1_MAP_ADDR_SPACE_BITS (which comes from the condition
> >> of the assert in page_set_flags() that was triggered on the ~0ul value),
> >> but that caused the qemu process to grow into swap and made the box
> >> usuable when that code was reached and I had to kill qemu.  (The box has
> >> 8 GB RAM.)  And so I thought just leaving that page range unprotected
> >> if only the start address is valid was the lesser evil...
> >
> >What's are the real arguments to the page_set_flags that causes things
> >to go into swap?  I can't imagine the range really being so large that
> >it causes massive allocation within that function...
> 
> Oh sorry if that was not clear, things go into swap if I _replace_ the
> endaddr ~0ul (which caused the assert) with the max value the assert
> still tolerates i.e.
> 	((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1
> which in this case seems to be 0x7fffffffffff:
> 
> #3  0x0000000060012731 in page_set_flags (start=140737488224256, 
>     end=18446744073709551615, flags=32)
>     at /usr/ports/emulators/qemu-devel-20100323a/work/qemu-snapshot-20100323_20/exec.c:2426
> 2426	    assert(end < ((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS));
> (gdb) i li 2426
> Line 2426 of "/usr/ports/emulators/qemu-devel-20100323a/work/qemu-snapshot-20100323_20/exec.c" starts at address 0x60012662 <page_set_flags+34>
>    and ends at 0x60012675 <page_set_flags+53>.
> (gdb) disassemble 0x60012662 0x60012675
> Dump of assembler code from 0x60012662 to 0x60012675:
> 0x0000000060012662 <page_set_flags+34>:	mov    $0x7fffffffffff,%rax
> 						^^^^^^^^^^^^^^
> 0x000000006001266c <page_set_flags+44>:	cmp    %rax,%rsi
> 0x000000006001266f <page_set_flags+47>:	ja     0x60012718 <page_set_flags+216>
> End of assembler dump.
> (gdb) q

Ok sorry about the confusion, this is a different problem, I just looked
at the value of start, it seems to be:

(gdb) p start
$2 = 0x7ffffffe0000

 So I'd say the real problem is page_set_flags() has a bug that makes
it allocate too much if the range is the last allowed page...

 Cheers,
	Juergen
Juergen Lock March 30, 2010, 8:45 p.m. UTC | #7
On Tue, Mar 30, 2010 at 01:33:15PM -0700, Richard Henderson wrote:
> On 03/30/2010 01:09 PM, Juergen Lock wrote:
> > Oh sorry if that was not clear, things go into swap if I _replace_ the
> > endaddr ~0ul (which caused the assert) with the max value the assert
> > still tolerates i.e.
> > 	((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1
> > which in this case seems to be 0x7fffffffffff:
> 
> Yes, I got that.  And I see from ...
> 
> > #3  0x0000000060012731 in page_set_flags (start=140737488224256, 
> >     end=18446744073709551615, flags=32)
> 
> ... here that the range we're reserving is
> 
>   0x7ffffffe0000 ... 0x7fffffffffff
> 
> which is a mere 128k range.  Which ought to allocate no more than
> a single leaf page table (and thus N-1 pages for the N-level table).
> 
> Which doesn't answer the question of why you'd wind up running out
> of memory.

Ah yeah our mails crossed each other :)  Yeah I don't know...

 Cheers,
	Juergen
Richard Henderson March 30, 2010, 8:56 p.m. UTC | #8
On 03/30/2010 01:42 PM, Juergen Lock wrote:
>  So I'd say the real problem is page_set_flags() has a bug that makes
> it allocate too much if the range is the last allowed page...

It doesn't, as far as I can see.  I added this range by hand to page_init
and the effect was exactly as I supposed on a linux host -- 2 pages
allocated to handle the 3-level page table.  No out of memory.


r~
Juergen Lock March 30, 2010, 9:57 p.m. UTC | #9
On Tue, Mar 30, 2010 at 10:45:48PM +0200, Juergen Lock wrote:
> On Tue, Mar 30, 2010 at 01:33:15PM -0700, Richard Henderson wrote:
> > On 03/30/2010 01:09 PM, Juergen Lock wrote:
> > > Oh sorry if that was not clear, things go into swap if I _replace_ the
> > > endaddr ~0ul (which caused the assert) with the max value the assert
> > > still tolerates i.e.
> > > 	((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1
> > > which in this case seems to be 0x7fffffffffff:
> > 
> > Yes, I got that.  And I see from ...
> > 
> > > #3  0x0000000060012731 in page_set_flags (start=140737488224256, 
> > >     end=18446744073709551615, flags=32)
> > 
> > ... here that the range we're reserving is
> > 
> >   0x7ffffffe0000 ... 0x7fffffffffff
> > 
> > which is a mere 128k range.  Which ought to allocate no more than
> > a single leaf page table (and thus N-1 pages for the N-level table).
> > 
> > Which doesn't answer the question of why you'd wind up running out
> > of memory.
> 
> Ah yeah our mails crossed each other :)  Yeah I don't know...

Ok found the problem:  Reads from /compat/linux/proc/self/maps are
broken i.e. return partial lines if (at least I think that's the reason)
the mappings change between read() calls.  (It got a line like this
after the first line that originally caused the assert:
	fe0000-800000000000 rwxp 00020000 00:00 0
)

 Well good that we now use kinfo_getvmmap(), I hope noone needs
bsd-user on FreeBSD 6.x...

 Oh well...
	Juergen
Paul Brook April 1, 2010, 11:59 a.m. UTC | #10
> Oh this is happening with x86_64-bsd-user on the same arch so I'd say
> (abi_ulong)-1 would be the same as ~0ul (and still cause the assert.)

No. These two are different when sizeof(abi_ulong) < sizeof(long).

Paul
Juergen Lock April 1, 2010, 5:06 p.m. UTC | #11
On Thu, Apr 01, 2010 at 11:59:11AM +0000, Paul Brook wrote:
> > Oh this is happening with x86_64-bsd-user on the same arch so I'd say
> > (abi_ulong)-1 would be the same as ~0ul (and still cause the assert.)
> 
> No. These two are different when sizeof(abi_ulong) < sizeof(long).

Yeah sorry I meant in this case that caused the assert. (x86_64-bsd-user
on same-arch host.)

 Cheers,
	Juergen
diff mbox

Patch

--- a/exec.c
+++ b/exec.c
@@ -293,10 +293,13 @@  static void page_init(void)
 
                     if (h2g_valid(endaddr)) {
                         endaddr = h2g(endaddr);
+                        page_set_flags(startaddr, endaddr, PAGE_RESERVED);
                     } else {
+#if TARGET_ABI_BITS <= L1_MAP_ADDR_SPACE_BITS
                         endaddr = ~0ul;
+                        page_set_flags(startaddr, endaddr, PAGE_RESERVED);
+#endif
                     }
-                    page_set_flags(startaddr, endaddr, PAGE_RESERVED);
                 }
             } while (!feof(f));