diff mbox series

oslib-posix: Use MAP_STACK in qemu_alloc_stack() on OpenBSD

Message ID 20181007153731.GA32288@humpty.home.comstyle.com
State New
Headers show
Series oslib-posix: Use MAP_STACK in qemu_alloc_stack() on OpenBSD | expand

Commit Message

Brad Smith Oct. 7, 2018, 3:37 p.m. UTC
Use MAP_STACK in qemu_alloc_stack() on OpenBSD.

Added to -current and will be in our soon to be 6.4 release.

MAP_STACK      Indicate that the mapping is used as a stack.  This
               flag must be used in combination with MAP_ANON and
               MAP_PRIVATE.

Implement MAP_STACK option for mmap().  Synchronous faults (pagefault and
syscall) confirm the stack register points at MAP_STACK memory, otherwise
SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
to create a MAP_STACK sub-region which satisfies alignment requirements.
Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
contents of the region -- there is no mprotect() equivalent operation, so
there is no MAP_STACK-adding gadget.


Signed-off-by: Brad Smith <brad@comstyle.com>

Comments

Kamil Rytarowski Oct. 9, 2018, 1:52 p.m. UTC | #1
On 07.10.2018 17:37, Brad Smith wrote:
> Use MAP_STACK in qemu_alloc_stack() on OpenBSD.
> 
> Added to -current and will be in our soon to be 6.4 release.
> 
> MAP_STACK      Indicate that the mapping is used as a stack.  This
>                flag must be used in combination with MAP_ANON and
>                MAP_PRIVATE.
> 
> Implement MAP_STACK option for mmap().  Synchronous faults (pagefault and
> syscall) confirm the stack register points at MAP_STACK memory, otherwise
> SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
> to create a MAP_STACK sub-region which satisfies alignment requirements.
> Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
> contents of the region -- there is no mprotect() equivalent operation, so
> there is no MAP_STACK-adding gadget.
> 
> 
> Signed-off-by: Brad Smith <brad@comstyle.com>
> 
> 
> diff --git a/util/oslib-posix.c b/util/oslib-posix.c
> index fbd0dc8c57..51e9a012c2 100644
> --- a/util/oslib-posix.c
> +++ b/util/oslib-posix.c
> @@ -611,7 +611,11 @@ void *qemu_alloc_stack(size_t *sz)
>      *sz += pagesz;
>  
>      ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE,
> -               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> +               MAP_PRIVATE | MAP_ANONYMOUS
> +#ifdef MAP_STACK
> +               | MAP_STACK
> +#endif
> +               , -1, 0);
>      if (ptr == MAP_FAILED) {
>          perror("failed to allocate memory for stack");
>          abort();
> 

Can we handle it differently, storing MAP_* flags in a variable:

int flags = MAP_PRIVATE | MAP_ANONYMOUS;
#ifdef MAP_STACK
flags |= MAP_STACK;
#endif

ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE, flags, -1, 0);

This way it will look nicer as we won't ifdef the middle of a function call.
Peter Maydell Oct. 9, 2018, 2:12 p.m. UTC | #2
On 9 October 2018 at 14:52, Kamil Rytarowski <n54@gmx.com> wrote:
> On 07.10.2018 17:37, Brad Smith wrote:
>> Use MAP_STACK in qemu_alloc_stack() on OpenBSD.
>>
>> Added to -current and will be in our soon to be 6.4 release.
>>
>> MAP_STACK      Indicate that the mapping is used as a stack.  This
>>                flag must be used in combination with MAP_ANON and
>>                MAP_PRIVATE.
>>
>> Implement MAP_STACK option for mmap().  Synchronous faults (pagefault and
>> syscall) confirm the stack register points at MAP_STACK memory, otherwise
>> SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
>> to create a MAP_STACK sub-region which satisfies alignment requirements.
>> Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
>> contents of the region -- there is no mprotect() equivalent operation, so
>> there is no MAP_STACK-adding gadget.
>>
>>
>> Signed-off-by: Brad Smith <brad@comstyle.com>
>>
>>
>> diff --git a/util/oslib-posix.c b/util/oslib-posix.c
>> index fbd0dc8c57..51e9a012c2 100644
>> --- a/util/oslib-posix.c
>> +++ b/util/oslib-posix.c
>> @@ -611,7 +611,11 @@ void *qemu_alloc_stack(size_t *sz)
>>      *sz += pagesz;
>>
>>      ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE,
>> -               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>> +               MAP_PRIVATE | MAP_ANONYMOUS
>> +#ifdef MAP_STACK
>> +               | MAP_STACK
>> +#endif
>> +               , -1, 0);
>>      if (ptr == MAP_FAILED) {
>>          perror("failed to allocate memory for stack");
>>          abort();
>>
>
> Can we handle it differently, storing MAP_* flags in a variable:
>
> int flags = MAP_PRIVATE | MAP_ANONYMOUS;
> #ifdef MAP_STACK
> flags |= MAP_STACK;
> #endif
>
> ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE, flags, -1, 0);
>
> This way it will look nicer as we won't ifdef the middle of a function call.

The other nice way to handle that is to have osdep.h do
#ifndef MAP_STACK
#define MAP_STACK 0
#endif
and then you can just unconditionally use MAP_STACK in your
expression for the mmap flags.


I note that Linux also defines a MAP_STACK, about which the
manpage says:
       MAP_STACK (since Linux 2.6.27)
              Allocate the mapping at an address suitable for a
process or thread  stack.
              This  flag  is currently a no-op, but is used in the
glibc threading imple‐
              mentation so that if some architectures require special
treatment for stack
              allocations, support can later be transparently
implemented for glibc.

So this patch would be opting Linux QEMU builds into whatever that
potential future behaviour change is. That sounds I guess like
it's more likely to be the right thing than the wrong thing,
but it would be useful if some Linux expert could confirm...

thanks
-- PMM
Brad Smith Oct. 9, 2018, 2:19 p.m. UTC | #3
On Tue, Oct 09, 2018 at 03:52:30PM +0200, Kamil Rytarowski wrote:
> On 07.10.2018 17:37, Brad Smith wrote:
> > Use MAP_STACK in qemu_alloc_stack() on OpenBSD.
> > 
> > Added to -current and will be in our soon to be 6.4 release.
> > 
> > MAP_STACK      Indicate that the mapping is used as a stack.  This
> >                flag must be used in combination with MAP_ANON and
> >                MAP_PRIVATE.
> > 
> > Implement MAP_STACK option for mmap().  Synchronous faults (pagefault and
> > syscall) confirm the stack register points at MAP_STACK memory, otherwise
> > SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
> > to create a MAP_STACK sub-region which satisfies alignment requirements.
> > Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
> > contents of the region -- there is no mprotect() equivalent operation, so
> > there is no MAP_STACK-adding gadget.
> > 
> > 
> > Signed-off-by: Brad Smith <brad@comstyle.com>
> > 
> > 
> > diff --git a/util/oslib-posix.c b/util/oslib-posix.c
> > index fbd0dc8c57..51e9a012c2 100644
> > --- a/util/oslib-posix.c
> > +++ b/util/oslib-posix.c
> > @@ -611,7 +611,11 @@ void *qemu_alloc_stack(size_t *sz)
> >      *sz += pagesz;
> >  
> >      ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE,
> > -               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> > +               MAP_PRIVATE | MAP_ANONYMOUS
> > +#ifdef MAP_STACK
> > +               | MAP_STACK
> > +#endif
> > +               , -1, 0);
> >      if (ptr == MAP_FAILED) {
> >          perror("failed to allocate memory for stack");
> >          abort();
> > 
> 
> Can we handle it differently, storing MAP_* flags in a variable:
> 
> int flags = MAP_PRIVATE | MAP_ANONYMOUS;
> #ifdef MAP_STACK
> flags |= MAP_STACK;
> #endif
> 
> ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE, flags, -1, 0);
> 
> This way it will look nicer as we won't ifdef the middle of a function call.

How about the following?


diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index fbd0dc8c57..f8ee349c9c 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -596,6 +596,7 @@ pid_t qemu_fork(Error **errp)
 void *qemu_alloc_stack(size_t *sz)
 {
     void *ptr, *guardpage;
+    int flags;
 #ifdef CONFIG_DEBUG_STACK_USAGE
     void *ptr2;
 #endif
@@ -610,8 +611,12 @@ void *qemu_alloc_stack(size_t *sz)
     /* allocate one extra page for the guard page */
     *sz += pagesz;
 
-    ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE,
-               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+    flags = MAP_PRIVATE | MAP_ANONYMOUS;
+#ifdef MAP_STACK
+    flags |= MAP_STACK;
+#endif
+
+    ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE, flags, -1, 0);
     if (ptr == MAP_FAILED) {
         perror("failed to allocate memory for stack");
         abort();
Kamil Rytarowski Oct. 9, 2018, 3:04 p.m. UTC | #4
On 09.10.2018 16:12, Peter Maydell wrote:
> On 9 October 2018 at 14:52, Kamil Rytarowski <n54@gmx.com> wrote:
>> On 07.10.2018 17:37, Brad Smith wrote:
>>> Use MAP_STACK in qemu_alloc_stack() on OpenBSD.
>>>
>>> Added to -current and will be in our soon to be 6.4 release.
>>>
>>> MAP_STACK      Indicate that the mapping is used as a stack.  This
>>>                flag must be used in combination with MAP_ANON and
>>>                MAP_PRIVATE.
>>>
>>> Implement MAP_STACK option for mmap().  Synchronous faults (pagefault and
>>> syscall) confirm the stack register points at MAP_STACK memory, otherwise
>>> SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
>>> to create a MAP_STACK sub-region which satisfies alignment requirements.
>>> Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
>>> contents of the region -- there is no mprotect() equivalent operation, so
>>> there is no MAP_STACK-adding gadget.
>>>
>>>
>>> Signed-off-by: Brad Smith <brad@comstyle.com>
>>>
>>>
>>> diff --git a/util/oslib-posix.c b/util/oslib-posix.c
>>> index fbd0dc8c57..51e9a012c2 100644
>>> --- a/util/oslib-posix.c
>>> +++ b/util/oslib-posix.c
>>> @@ -611,7 +611,11 @@ void *qemu_alloc_stack(size_t *sz)
>>>      *sz += pagesz;
>>>
>>>      ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE,
>>> -               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>>> +               MAP_PRIVATE | MAP_ANONYMOUS
>>> +#ifdef MAP_STACK
>>> +               | MAP_STACK
>>> +#endif
>>> +               , -1, 0);
>>>      if (ptr == MAP_FAILED) {
>>>          perror("failed to allocate memory for stack");
>>>          abort();
>>>
>>
>> Can we handle it differently, storing MAP_* flags in a variable:
>>
>> int flags = MAP_PRIVATE | MAP_ANONYMOUS;
>> #ifdef MAP_STACK
>> flags |= MAP_STACK;
>> #endif
>>
>> ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE, flags, -1, 0);
>>
>> This way it will look nicer as we won't ifdef the middle of a function call.
> 
> The other nice way to handle that is to have osdep.h do
> #ifndef MAP_STACK
> #define MAP_STACK 0
> #endif
> and then you can just unconditionally use MAP_STACK in your
> expression for the mmap flags.
> 

I assume that this is a cleaner solution.

> 
> I note that Linux also defines a MAP_STACK, about which the
> manpage says:
>        MAP_STACK (since Linux 2.6.27)
>               Allocate the mapping at an address suitable for a
> process or thread  stack.
>               This  flag  is currently a no-op, but is used in the
> glibc threading imple‐
>               mentation so that if some architectures require special
> treatment for stack
>               allocations, support can later be transparently
> implemented for glibc.
> 
> So this patch would be opting Linux QEMU builds into whatever that
> potential future behaviour change is. That sounds I guess like
> it's more likely to be the right thing than the wrong thing,
> but it would be useful if some Linux expert could confirm...
> 
> thanks
> -- PMM
> 

There is a similar description on NetBSD:

MAP_STACK

Allocate a memory segment that can be used
either for a process or thread stack.  This
currently has no effect, but its use is
reserved for architectures that might require
special treatment of that address space.
Unimplemented.

Apparently OpenBSD started to overload it for some hardening.
Brad Smith Oct. 10, 2018, 11:55 p.m. UTC | #5
On 10/9/2018 11:04 AM, Kamil Rytarowski wrote:

> On 09.10.2018 16:12, Peter Maydell wrote:
>> On 9 October 2018 at 14:52, Kamil Rytarowski <n54@gmx.com> wrote:
>>> On 07.10.2018 17:37, Brad Smith wrote:
>>>> Use MAP_STACK in qemu_alloc_stack() on OpenBSD.
>>>>
>>>> Added to -current and will be in our soon to be 6.4 release.
>>>>
>>>> MAP_STACK      Indicate that the mapping is used as a stack.  This
>>>>                 flag must be used in combination with MAP_ANON and
>>>>                 MAP_PRIVATE.
>>>>
>>>> Implement MAP_STACK option for mmap().  Synchronous faults (pagefault and
>>>> syscall) confirm the stack register points at MAP_STACK memory, otherwise
>>>> SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
>>>> to create a MAP_STACK sub-region which satisfies alignment requirements.
>>>> Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
>>>> contents of the region -- there is no mprotect() equivalent operation, so
>>>> there is no MAP_STACK-adding gadget.
>>>>
>>>>
>>>> Signed-off-by: Brad Smith <brad@comstyle.com>
>>>>
>>>>
>>>> diff --git a/util/oslib-posix.c b/util/oslib-posix.c
>>>> index fbd0dc8c57..51e9a012c2 100644
>>>> --- a/util/oslib-posix.c
>>>> +++ b/util/oslib-posix.c
>>>> @@ -611,7 +611,11 @@ void *qemu_alloc_stack(size_t *sz)
>>>>       *sz += pagesz;
>>>>
>>>>       ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE,
>>>> -               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>>>> +               MAP_PRIVATE | MAP_ANONYMOUS
>>>> +#ifdef MAP_STACK
>>>> +               | MAP_STACK
>>>> +#endif
>>>> +               , -1, 0);
>>>>       if (ptr == MAP_FAILED) {
>>>>           perror("failed to allocate memory for stack");
>>>>           abort();
>>>>
>>> Can we handle it differently, storing MAP_* flags in a variable:
>>>
>>> int flags = MAP_PRIVATE | MAP_ANONYMOUS;
>>> #ifdef MAP_STACK
>>> flags |= MAP_STACK;
>>> #endif
>>>
>>> ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE, flags, -1, 0);
>>>
>>> This way it will look nicer as we won't ifdef the middle of a function call.
>> The other nice way to handle that is to have osdep.h do
>> #ifndef MAP_STACK
>> #define MAP_STACK 0
>> #endif
>> and then you can just unconditionally use MAP_STACK in your
>> expression for the mmap flags.
>>
> I assume that this is a cleaner solution.
>
>> I note that Linux also defines a MAP_STACK, about which the
>> manpage says:
>>         MAP_STACK (since Linux 2.6.27)
>>                Allocate the mapping at an address suitable for a
>> process or thread  stack.
>>                This  flag  is currently a no-op, but is used in the
>> glibc threading imple‐
>>                mentation so that if some architectures require special
>> treatment for stack
>>                allocations, support can later be transparently
>> implemented for glibc.
>>
>> So this patch would be opting Linux QEMU builds into whatever that
>> potential future behaviour change is. That sounds I guess like
>> it's more likely to be the right thing than the wrong thing,
>> but it would be useful if some Linux expert could confirm...
>>
>> thanks
>> -- PMM
>>
> There is a similar description on NetBSD:
>
> MAP_STACK
>
> Allocate a memory segment that can be used
> either for a process or thread stack.  This
> currently has no effect, but its use is
> reserved for architectures that might require
> special treatment of that address space.
> Unimplemented.
>
> Apparently OpenBSD started to overload it for some hardening.

And from FreeBSD...

      MAP_STACK		MAP_STACK implies MAP_ANON, and	/offset/  of 0.  The/fd/
			argument must be -1 and	/prot/  must include at least
			PROT_READ and PROT_WRITE.

			This option creates a memory region that grows to at
			most/len/  bytes in size,	starting from the stack	top
			and growing down.  The stack top is the	starting
			address	returned by the	call, plus/len/  bytes.  The
			bottom of the stack at maximum growth is the starting
			address	returned by the	call.

			Stacks created with MAP_STACK automatically grow.
			Guards prevent inadvertent use of the regions into
			which those stacks can grow without requiring mapping
			the whole stack	in advance.
Peter Maydell Oct. 11, 2018, 9:36 a.m. UTC | #6
On 11 October 2018 at 00:55, Brad Smith <brad@comstyle.com> wrote:
> And from FreeBSD...
>
>      MAP_STACK MAP_STACK implies MAP_ANON, and offset of 0.  The fd
> argument must be -1 and prot must include at least
> PROT_READ and PROT_WRITE.
>
> This option creates a memory region that grows to at
> most len bytes in size, starting from the stack top
> and growing down.  The stack top is the starting
> address returned by the call, plus len bytes.  The
> bottom of the stack at maximum growth is the starting
> address returned by the call.
>
> Stacks created with MAP_STACK automatically grow.
> Guards prevent inadvertent use of the regions into
> which those stacks can grow without requiring mapping
> the whole stack in advance.

Hmm. That "automatically growing" part sounds like
behaviour we definitely do not want for our use case.
So we're going to need to make this OS-specific :-(

thanks
-- PMM
Kamil Rytarowski Oct. 11, 2018, 9:41 a.m. UTC | #7
On 11.10.2018 11:36, Peter Maydell wrote:
> On 11 October 2018 at 00:55, Brad Smith <brad@comstyle.com> wrote:
>> And from FreeBSD...
>>
>>      MAP_STACK MAP_STACK implies MAP_ANON, and offset of 0.  The fd
>> argument must be -1 and prot must include at least
>> PROT_READ and PROT_WRITE.
>>
>> This option creates a memory region that grows to at
>> most len bytes in size, starting from the stack top
>> and growing down.  The stack top is the starting
>> address returned by the call, plus len bytes.  The
>> bottom of the stack at maximum growth is the starting
>> address returned by the call.
>>
>> Stacks created with MAP_STACK automatically grow.
>> Guards prevent inadvertent use of the regions into
>> which those stacks can grow without requiring mapping
>> the whole stack in advance.
> 
> Hmm. That "automatically growing" part sounds like
> behaviour we definitely do not want for our use case.
> So we're going to need to make this OS-specific :-(
> 

I propose to restrict MAP_STACK it to OpenBSD (with a comment in the
code). Once it will be needed by someone else will be able to enable it
for other OSes.

> thanks
> -- PMM
>
Brad Smith Oct. 11, 2018, 2:25 p.m. UTC | #8
On 10/11/2018 5:41 AM, Kamil Rytarowski wrote:

> On 11.10.2018 11:36, Peter Maydell wrote:
>> On 11 October 2018 at 00:55, Brad Smith <brad@comstyle.com> wrote:
>>> And from FreeBSD...
>>>
>>>       MAP_STACK MAP_STACK implies MAP_ANON, and offset of 0.  The fd
>>> argument must be -1 and prot must include at least
>>> PROT_READ and PROT_WRITE.
>>>
>>> This option creates a memory region that grows to at
>>> most len bytes in size, starting from the stack top
>>> and growing down.  The stack top is the starting
>>> address returned by the call, plus len bytes.  The
>>> bottom of the stack at maximum growth is the starting
>>> address returned by the call.
>>>
>>> Stacks created with MAP_STACK automatically grow.
>>> Guards prevent inadvertent use of the regions into
>>> which those stacks can grow without requiring mapping
>>> the whole stack in advance.
>> Hmm. That "automatically growing" part sounds like
>> behaviour we definitely do not want for our use case.
>> So we're going to need to make this OS-specific :-(
>>
> I propose to restrict MAP_STACK it to OpenBSD (with a comment in the
> code). Once it will be needed by someone else will be able to enable it
> for other OSes.

I was going to propose doing something like that but you had replied 
before I did.
What sort of comment did you have in mind?
Kamil Rytarowski Oct. 11, 2018, 7:31 p.m. UTC | #9
On 11.10.2018 16:25, Brad Smith wrote:
> On 10/11/2018 5:41 AM, Kamil Rytarowski wrote:
> 
>> On 11.10.2018 11:36, Peter Maydell wrote:
>>> On 11 October 2018 at 00:55, Brad Smith <brad@comstyle.com> wrote:
>>>> And from FreeBSD...
>>>>
>>>>       MAP_STACK MAP_STACK implies MAP_ANON, and offset of 0.  The fd
>>>> argument must be -1 and prot must include at least
>>>> PROT_READ and PROT_WRITE.
>>>>
>>>> This option creates a memory region that grows to at
>>>> most len bytes in size, starting from the stack top
>>>> and growing down.  The stack top is the starting
>>>> address returned by the call, plus len bytes.  The
>>>> bottom of the stack at maximum growth is the starting
>>>> address returned by the call.
>>>>
>>>> Stacks created with MAP_STACK automatically grow.
>>>> Guards prevent inadvertent use of the regions into
>>>> which those stacks can grow without requiring mapping
>>>> the whole stack in advance.
>>> Hmm. That "automatically growing" part sounds like
>>> behaviour we definitely do not want for our use case.
>>> So we're going to need to make this OS-specific :-(
>>>
>> I propose to restrict MAP_STACK it to OpenBSD (with a comment in the
>> code). Once it will be needed by someone else will be able to enable it
>> for other OSes.
> 
> I was going to propose doing something like that but you had replied
> before I did.
> What sort of comment did you have in mind?
> 

Why do we want it only on OpenBSD and its either unneeded or meaning
something else on other OSes.

#ifdef __OpenBSD__
flags |= MAP_STACK;
#endif
Brad Smith Oct. 11, 2018, 9:20 p.m. UTC | #10
On Thu, Oct 11, 2018 at 09:31:23PM +0200, Kamil Rytarowski wrote:
> On 11.10.2018 16:25, Brad Smith wrote:
> > On 10/11/2018 5:41 AM, Kamil Rytarowski wrote:
> > 
> >> On 11.10.2018 11:36, Peter Maydell wrote:
> >>> On 11 October 2018 at 00:55, Brad Smith <brad@comstyle.com> wrote:
> >>>> And from FreeBSD...
> >>>>
> >>>> ?????????? MAP_STACK MAP_STACK implies MAP_ANON, and offset of 0.?? The fd
> >>>> argument must be -1 and prot must include at least
> >>>> PROT_READ and PROT_WRITE.
> >>>>
> >>>> This option creates a memory region that grows to at
> >>>> most len bytes in size, starting from the stack top
> >>>> and growing down.?? The stack top is the starting
> >>>> address returned by the call, plus len bytes.?? The
> >>>> bottom of the stack at maximum growth is the starting
> >>>> address returned by the call.
> >>>>
> >>>> Stacks created with MAP_STACK automatically grow.
> >>>> Guards prevent inadvertent use of the regions into
> >>>> which those stacks can grow without requiring mapping
> >>>> the whole stack in advance.
> >>> Hmm. That "automatically growing" part sounds like
> >>> behaviour we definitely do not want for our use case.
> >>> So we're going to need to make this OS-specific :-(
> >>>
> >> I propose to restrict MAP_STACK it to OpenBSD (with a comment in the
> >> code). Once it will be needed by someone else will be able to enable it
> >> for other OSes.
> > 
> > I was going to propose doing something like that but you had replied
> > before I did.
> > What sort of comment did you have in mind?
> > 
> 
> Why do we want it only on OpenBSD and its either unneeded or meaning
> something else on other OSes.
> 
> #ifdef __OpenBSD__
> flags |= MAP_STACK;
> #endif

How about the following?


diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index fbd0dc8c57..7814e61114 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -596,6 +596,7 @@ pid_t qemu_fork(Error **errp)
 void *qemu_alloc_stack(size_t *sz)
 {
     void *ptr, *guardpage;
+    int flags;
 #ifdef CONFIG_DEBUG_STACK_USAGE
     void *ptr2;
 #endif
@@ -610,8 +611,15 @@ void *qemu_alloc_stack(size_t *sz)
     /* allocate one extra page for the guard page */
     *sz += pagesz;
 
-    ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE,
-               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+    flags = MAP_PRIVATE | MAP_ANONYMOUS;
+#if defined(MAP_STACK) && defined(__OpenBSD__)
+    /* Only enable MAP_STACK on OpenBSD. Other OS's such
+       as Linux/FreeBSD/NetBSD have a flag with the same
+       name but have differing functionality. */
+    flags |= MAP_STACK;
+#endif
+
+    ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE, flags, -1, 0);
     if (ptr == MAP_FAILED) {
         perror("failed to allocate memory for stack");
         abort();
Kamil Rytarowski Oct. 13, 2018, 6:23 p.m. UTC | #11
On 11.10.2018 23:20, Brad Smith wrote:
> On Thu, Oct 11, 2018 at 09:31:23PM +0200, Kamil Rytarowski wrote:
>> On 11.10.2018 16:25, Brad Smith wrote:
>>> On 10/11/2018 5:41 AM, Kamil Rytarowski wrote:
>>>
>>>> On 11.10.2018 11:36, Peter Maydell wrote:
>>>>> On 11 October 2018 at 00:55, Brad Smith <brad@comstyle.com> wrote:
>>>>>> And from FreeBSD...
>>>>>>
>>>>>> ?????????? MAP_STACK MAP_STACK implies MAP_ANON, and offset of 0.?? The fd
>>>>>> argument must be -1 and prot must include at least
>>>>>> PROT_READ and PROT_WRITE.
>>>>>>
>>>>>> This option creates a memory region that grows to at
>>>>>> most len bytes in size, starting from the stack top
>>>>>> and growing down.?? The stack top is the starting
>>>>>> address returned by the call, plus len bytes.?? The
>>>>>> bottom of the stack at maximum growth is the starting
>>>>>> address returned by the call.
>>>>>>
>>>>>> Stacks created with MAP_STACK automatically grow.
>>>>>> Guards prevent inadvertent use of the regions into
>>>>>> which those stacks can grow without requiring mapping
>>>>>> the whole stack in advance.
>>>>> Hmm. That "automatically growing" part sounds like
>>>>> behaviour we definitely do not want for our use case.
>>>>> So we're going to need to make this OS-specific :-(
>>>>>
>>>> I propose to restrict MAP_STACK it to OpenBSD (with a comment in the
>>>> code). Once it will be needed by someone else will be able to enable it
>>>> for other OSes.
>>>
>>> I was going to propose doing something like that but you had replied
>>> before I did.
>>> What sort of comment did you have in mind?
>>>
>>
>> Why do we want it only on OpenBSD and its either unneeded or meaning
>> something else on other OSes.
>>
>> #ifdef __OpenBSD__
>> flags |= MAP_STACK;
>> #endif
> 
> How about the following?
> 

It looks good to me.

Reviewed-by: Kamil Rytarowski <n54@gmx.com>

> 
> diff --git a/util/oslib-posix.c b/util/oslib-posix.c
> index fbd0dc8c57..7814e61114 100644
> --- a/util/oslib-posix.c
> +++ b/util/oslib-posix.c
> @@ -596,6 +596,7 @@ pid_t qemu_fork(Error **errp)
>  void *qemu_alloc_stack(size_t *sz)
>  {
>      void *ptr, *guardpage;
> +    int flags;
>  #ifdef CONFIG_DEBUG_STACK_USAGE
>      void *ptr2;
>  #endif
> @@ -610,8 +611,15 @@ void *qemu_alloc_stack(size_t *sz)
>      /* allocate one extra page for the guard page */
>      *sz += pagesz;
>  
> -    ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE,
> -               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> +    flags = MAP_PRIVATE | MAP_ANONYMOUS;
> +#if defined(MAP_STACK) && defined(__OpenBSD__)
> +    /* Only enable MAP_STACK on OpenBSD. Other OS's such
> +       as Linux/FreeBSD/NetBSD have a flag with the same
> +       name but have differing functionality. */
> +    flags |= MAP_STACK;
> +#endif
> +
> +    ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE, flags, -1, 0);
>      if (ptr == MAP_FAILED) {
>          perror("failed to allocate memory for stack");
>          abort();
>
diff mbox series

Patch

diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index fbd0dc8c57..51e9a012c2 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -611,7 +611,11 @@  void *qemu_alloc_stack(size_t *sz)
     *sz += pagesz;
 
     ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE,
-               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+               MAP_PRIVATE | MAP_ANONYMOUS
+#ifdef MAP_STACK
+               | MAP_STACK
+#endif
+               , -1, 0);
     if (ptr == MAP_FAILED) {
         perror("failed to allocate memory for stack");
         abort();