diff mbox series

[v5] Add a test case for mmap MAP_GROWSDOWN flag

Message ID 20200911035533.30538-1-liwang@redhat.com
State Superseded
Headers show
Series [v5] Add a test case for mmap MAP_GROWSDOWN flag | expand

Commit Message

Li Wang Sept. 11, 2020, 3:55 a.m. UTC
From: pravin <pravinraghul@zilogic.com>

We assign the memory region allocated using MAP_GROWSDOWN to a thread,
as a stack, to test the effect of MAP_GROWSDOWN. This is because the
kernel only grows the memory region when the stack pointer, is within
guard page, when the guard page is touched.

  1. Map an anyonymous memory region of size X, and unmap it.
  2. Split the unmapped memory region into two.
  3. The lower memory region is left unmapped.
  4. The higher memory region is mapped for use as stack, using MAP_FIXED | MAP_GROWSDOWN.
  5. The higher memory region is provided as stack to a thread, where
     a recursive function is invoked.
  6. The stack grows beyond the allocated region, into the lower memory area.
  7. If this results in the memory region being extended, into the
     unmapped region, the test is considered to have passed.

Also, to verify that(Test2) the stack grows to within a page of the high
end of the next lower map‐ping will result in a SIGSEGV signal.

Resolves #300
Signed-off-by: Pravin Raghul S. <pravinraghul@zilogic.com>
Reviewed-by: Vijay Kumar B. <vijaykumar@zilogic.com>
Signed-off-by: Li Wang <liwang@redhat.com>
Cc: Cyril Hrubis <chrubis@suse.cz>
---
 runtest/syscalls                          |   1 +
 testcases/kernel/syscalls/mmap/.gitignore |   1 +
 testcases/kernel/syscalls/mmap/mmap18.c   | 177 ++++++++++++++++++++++
 3 files changed, 179 insertions(+)
 create mode 100644 testcases/kernel/syscalls/mmap/mmap18.c

Comments

Petr Vorel Sept. 11, 2020, 7:05 a.m. UTC | #1
Hi Li,

> From: pravin <pravinraghul@zilogic.com>

> We assign the memory region allocated using MAP_GROWSDOWN to a thread,
> as a stack, to test the effect of MAP_GROWSDOWN. This is because the
> kernel only grows the memory region when the stack pointer, is within
> guard page, when the guard page is touched.

>   1. Map an anyonymous memory region of size X, and unmap it.
>   2. Split the unmapped memory region into two.
>   3. The lower memory region is left unmapped.
>   4. The higher memory region is mapped for use as stack, using MAP_FIXED | MAP_GROWSDOWN.
>   5. The higher memory region is provided as stack to a thread, where
>      a recursive function is invoked.
>   6. The stack grows beyond the allocated region, into the lower memory area.
>   7. If this results in the memory region being extended, into the
>      unmapped region, the test is considered to have passed.

> Also, to verify that(Test2) the stack grows to within a page of the high
> end of the next lower map‐ping will result in a SIGSEGV signal.

> Resolves #300
> Signed-off-by: Pravin Raghul S. <pravinraghul@zilogic.com>
> Reviewed-by: Vijay Kumar B. <vijaykumar@zilogic.com>
> Signed-off-by: Li Wang <liwang@redhat.com>
> Cc: Cyril Hrubis <chrubis@suse.cz>

Reviewed-by: Petr Vorel <pvorel@suse.cz>
LGTM.

Just please fix using spaces instead of tabs in check_stackgrow_up() and
run_test() (I suppose your modifications to pravis's code):

mmap18.c:55: WARNING: please, no spaces at the start of a line
mmap18.c:55: WARNING: suspect code indent for conditional statements (7, 15)
mmap18.c:56: ERROR: code indent should use tabs where possible
mmap18.c:56: WARNING: please, no spaces at the start of a line
mmap18.c:57: ERROR: code indent should use tabs where possible
mmap18.c:57: WARNING: please, no spaces at the start of a line
mmap18.c:58: WARNING: please, no spaces at the start of a line
mmap18.c:60: WARNING: please, no spaces at the start of a line
mmap18.c:167: ERROR: code indent should use tabs where possible
mmap18.c:167: WARNING: please, no spaces at the start of a line
mmap18.c:169: ERROR: code indent should use tabs where possible
mmap18.c:169: WARNING: please, no spaces at the start of a line
mmap18.c:170: ERROR: code indent should use tabs where possible
mmap18.c:170: WARNING: please, no spaces at the start of a line


Kind regards,
Petr
Li Wang Sept. 11, 2020, 7:42 a.m. UTC | #2
On Fri, Sep 11, 2020 at 3:05 PM Petr Vorel <pvorel@suse.cz> wrote:

> Hi Li,
>
> > From: pravin <pravinraghul@zilogic.com>
>
> > We assign the memory region allocated using MAP_GROWSDOWN to a thread,
> > as a stack, to test the effect of MAP_GROWSDOWN. This is because the
> > kernel only grows the memory region when the stack pointer, is within
> > guard page, when the guard page is touched.
>
> >   1. Map an anyonymous memory region of size X, and unmap it.
> >   2. Split the unmapped memory region into two.
> >   3. The lower memory region is left unmapped.
> >   4. The higher memory region is mapped for use as stack, using
> MAP_FIXED | MAP_GROWSDOWN.
> >   5. The higher memory region is provided as stack to a thread, where
> >      a recursive function is invoked.
> >   6. The stack grows beyond the allocated region, into the lower memory
> area.
> >   7. If this results in the memory region being extended, into the
> >      unmapped region, the test is considered to have passed.
>
> > Also, to verify that(Test2) the stack grows to within a page of the high
> > end of the next lower map‐ping will result in a SIGSEGV signal.
>
> > Resolves #300
> > Signed-off-by: Pravin Raghul S. <pravinraghul@zilogic.com>
> > Reviewed-by: Vijay Kumar B. <vijaykumar@zilogic.com>
> > Signed-off-by: Li Wang <liwang@redhat.com>
> > Cc: Cyril Hrubis <chrubis@suse.cz>
>
> Reviewed-by: Petr Vorel <pvorel@suse.cz>
> LGTM.
>
> Just please fix using spaces instead of tabs in check_stackgrow_up() and
> run_test() (I suppose your modifications to pravis's code):
>

Sorry for the chaotic indent, I guess it was caused by my editor.

Anyway, thanks for the review, will fix it after getting Cyril's reviewing.
Cyril Hrubis Sept. 11, 2020, 1:08 p.m. UTC | #3
Hi!
> We assign the memory region allocated using MAP_GROWSDOWN to a thread,
> as a stack, to test the effect of MAP_GROWSDOWN. This is because the
> kernel only grows the memory region when the stack pointer, is within
> guard page, when the guard page is touched.
> 
>   1. Map an anyonymous memory region of size X, and unmap it.
>   2. Split the unmapped memory region into two.
>   3. The lower memory region is left unmapped.
>   4. The higher memory region is mapped for use as stack, using MAP_FIXED | MAP_GROWSDOWN.
>   5. The higher memory region is provided as stack to a thread, where
>      a recursive function is invoked.
>   6. The stack grows beyond the allocated region, into the lower memory area.
>   7. If this results in the memory region being extended, into the
>      unmapped region, the test is considered to have passed.
> 
> Also, to verify that(Test2) the stack grows to within a page of the high
> end of the next lower map???ping will result in a SIGSEGV signal.
> 
> Resolves #300
> Signed-off-by: Pravin Raghul S. <pravinraghul@zilogic.com>
> Reviewed-by: Vijay Kumar B. <vijaykumar@zilogic.com>
> Signed-off-by: Li Wang <liwang@redhat.com>
> Cc: Cyril Hrubis <chrubis@suse.cz>
> ---
>  runtest/syscalls                          |   1 +
>  testcases/kernel/syscalls/mmap/.gitignore |   1 +
>  testcases/kernel/syscalls/mmap/mmap18.c   | 177 ++++++++++++++++++++++
>  3 files changed, 179 insertions(+)
>  create mode 100644 testcases/kernel/syscalls/mmap/mmap18.c
> 
> diff --git a/runtest/syscalls b/runtest/syscalls
> index dc0ca5626..ed86bb593 100644
> --- a/runtest/syscalls
> +++ b/runtest/syscalls
> @@ -747,6 +747,7 @@ mmap14 mmap14
>  mmap15 mmap15
>  mmap16 mmap16
>  mmap17 mmap17
> +mmap18 mmap18
>  
>  modify_ldt01 modify_ldt01
>  modify_ldt02 modify_ldt02
> diff --git a/testcases/kernel/syscalls/mmap/.gitignore b/testcases/kernel/syscalls/mmap/.gitignore
> index c5c083d4b..4fd90ab5f 100644
> --- a/testcases/kernel/syscalls/mmap/.gitignore
> +++ b/testcases/kernel/syscalls/mmap/.gitignore
> @@ -16,3 +16,4 @@
>  /mmap15
>  /mmap16
>  /mmap17
> +/mmap18
> diff --git a/testcases/kernel/syscalls/mmap/mmap18.c b/testcases/kernel/syscalls/mmap/mmap18.c
> new file mode 100644
> index 000000000..b5008497d
> --- /dev/null
> +++ b/testcases/kernel/syscalls/mmap/mmap18.c
> @@ -0,0 +1,177 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Copyright (c) Zilogic Systems Pvt. Ltd., 2020
> + * Email: code@zilogic.com
> + */
> +
> +/*
> + * Test mmap() MAP_GROWSDOWN flag
> + *
> + * # Test1:
> + *   We assign the memory region allocated using MAP_GROWSDOWN to a thread,
> + *   as a stack, to test the effect of MAP_GROWSDOWN. This is because the
> + *   kernel only grows the memory region when the stack pointer, is within
> + *   guard page, when the guard page is touched.
> + *
> + *   1. Map an anyonymous memory region of size X, and unmap it.
> + *   2. Split the unmapped memory region into two.
> + *   3. The lower memory region is left unmapped.
> + *   4. The higher memory region is mapped for use as stack, using
> + *      MAP_FIXED | MAP_GROWSDOWN.
> + *   5. The higher memory region is provided as stack to a thread, where
> + *      a recursive function is invoked.
> + *   6. The stack grows beyond the allocated region, into the lower memory area.
> + *   7. If this results in the memory region being extended, into the
> + *      unmapped region, the test is considered to have passed.
> + *
> + * # Test2:
> + *   Steps mostly like Test1, but mmaping a page in the space the stack is
> + *   supposed to grow into. To verify that the stack grows to within a page
> + *   of the high end of the next lower map???ping, at which point touching
> + *   the "guard" page will result in a SIGSEGV signal.
> + */
> +
> +#include <unistd.h>
> +#include <pthread.h>
> +#include <sys/mman.h>
> +#include <sys/wait.h>
> +#include <sys/types.h>
> +#include <stdlib.h>
> +#include <stdbool.h>
> +
> +#include "tst_test.h"
> +#include "tst_safe_pthread.h"
> +
> +#define UNITS(x) ((x) * PTHREAD_STACK_MIN)
> +
> +static void *stack;
> +static long stack_size = UNITS(8);
> +
> +static bool __attribute__((noinline)) check_stackgrow_up(void)
> +{
> +	char local_var;
> +	static char *addr;
> +
> +       if (!addr) {
> +               addr = &local_var;
> +               return check_stackgrow_up();
> +       }
> +
> +       return (addr < &local_var);
> +}
> +
> +static void setup(void)
> +{
> +	if (check_stackgrow_up())
> +		tst_brk(TCONF, "Test can't be performed with stack grows up architecture");
> +}
> +
> +static void allocate_stack(size_t size)
> +{
> +	void *start;
> +
> +	/*
> +	 * Since the newer kernel does not allow a MAP_GROWSDOWN mapping to grow
> +	 * closer than 'stack_guard_gap' pages away from a preceding mapping.
> +	 * The guard then ensures that the next-highest mapped page remains more
> +	 * than 'stack_guard_gap' below the lowest stack address, and if not then
> +	 * it will trigger a segfault. So, here adding 256 pages memory spacing
> +	 * for stack growing safely.
> +	 *
> +	 * Btw, kernel default 'stack_guard_gap' size is '256 * getpagesize()'.
> +	 */
> +	long total_size = 256 * getpagesize() + size * 2;
> +
> +	start = SAFE_MMAP(NULL, total_size, PROT_READ | PROT_WRITE,
> +			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> +	SAFE_MUNMAP(start, total_size);
> +
> +	/* start                             stack
> +	 * +-----------+---------------------+----------------------+
> +	 * | 256 pages | unmapped (size)     | mapped (size)        |
> +	 * +-----------+---------------------+----------------------+
> +	 *
> +	 */
> +	stack = SAFE_MMAP((start + total_size - size), size, PROT_READ | PROT_WRITE,
> +			  MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS | MAP_GROWSDOWN,
> +			  -1, 0);

Well it's not wrong per se but as it is we do not use the pre-allocated
part of the stack at all, we directly jump for the guard page as we use
the stack pointer as a base for the pthread stack. The actual pointer
that points to the start of the region is stack - stack_size.

There is no point in adding size * 2 here. We can as well reserve 256 *
page_size + size. Then map() a single page at the end, which would be at
start + total_size - page_size and finally return start + total_size
from this function and pass that to pthread_attr_setstack().

That way it would look like:

| 256 pages | unmapped | 1 mapped page |

            | - - -  stack_size  - - - |


> +	tst_res(TINFO, "start = %p, stack = %p", start, stack);
> +}
> +
> +static __attribute__((noinline)) void *check_depth_recursive(void *limit)
> +{
> +	if ((off_t) &limit < (off_t) limit) {
> +		tst_res(TINFO, "&limit = %p, limit = %p", &limit, limit);
> +		return NULL;
> +	}
> +
> +	return check_depth_recursive(limit);
> +}
> +
> +static void grow_stack(void *stack, size_t size, void *limit)
> +{
> +	pthread_t test_thread;
> +	pthread_attr_t attr;
> +	int ret;
> +
> +	ret = pthread_attr_init(&attr);
> +	if (ret)
> +		tst_brk(TBROK, "pthread_attr_init failed during setup");
> +
> +	ret = pthread_attr_setstack(&attr, stack, size);
> +	if (ret)
> +		tst_brk(TBROK, "pthread_attr_setstack failed during setup");
> +
> +	SAFE_PTHREAD_CREATE(&test_thread, &attr, check_depth_recursive, limit);
> +	SAFE_PTHREAD_JOIN(test_thread, NULL);
> +
> +	if (stack)
> +		SAFE_MUNMAP(stack, stack_size);

I'ts a bit unexpected to unmap the stack here. I guess that unamping it
in the run_test() after the grow_stack() call would be a bit cleaner but
we would have to move the exit(0) there as well.

> +	exit(0);
> +}
> +
> +static void run_test(void)
> +{
> +	pid_t child_pid;
> +	int wstatus;
> +
> +	/* Test 1 */
> +	child_pid = SAFE_FORK();
> +	if (!child_pid) {
> +		allocate_stack(stack_size);
> +		grow_stack(stack, stack_size, stack - stack_size + UNITS(1));


> +	}
> +
> +	SAFE_WAIT(&wstatus);
> +	if (WIFEXITED(wstatus) && WEXITSTATUS(wstatus) == 0)
> +		tst_res(TPASS, "Stack grows in unmapped region");
> +	else
> +		tst_res(TFAIL, "Child: %s", tst_strstatus(wstatus));
> +
> +	/* Test 2 */
> +	child_pid = SAFE_FORK();
> +	if (!child_pid) {
> +		tst_no_corefile(0);
                  ^
		 This should go to the test setup.

> +		allocate_stack(stack_size);
> +
> +		SAFE_MMAP(stack - stack_size, UNITS(1), PROT_READ | PROT_WRITE,
> +			  MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> +
> +		/* This will cause to segment fault (SIGSEGV) */
> +		grow_stack(stack, stack_size, stack - stack_size + UNITS(1));
> +	}
> +
> +	SAFE_WAIT(&wstatus);
> +        if (WIFSIGNALED(wstatus) && WTERMSIG(wstatus) == SIGSEGV)
> +		tst_res(TPASS, "Child ended by %s as expected", tst_strsig(SIGSEGV));
> +        else
> +                tst_res(TFAIL, "Child: %s", tst_strstatus(wstatus));
> +}
> +
> +static struct tst_test test = {
> +	.setup = setup,
> +	.test_all = run_test,
> +	.forks_child = 1,
> +};
> -- 
> 2.21.1
>
Li Wang Sept. 11, 2020, 2:41 p.m. UTC | #4
On Fri, Sep 11, 2020 at 9:08 PM Cyril Hrubis <chrubis@suse.cz> wrote:

> ...
> > +static void allocate_stack(size_t size)
> > +{
> > +     void *start;
> > +
> > +     /*
> > +      * Since the newer kernel does not allow a MAP_GROWSDOWN mapping
> to grow
> > +      * closer than 'stack_guard_gap' pages away from a preceding
> mapping.
> > +      * The guard then ensures that the next-highest mapped page
> remains more
> > +      * than 'stack_guard_gap' below the lowest stack address, and if
> not then
> > +      * it will trigger a segfault. So, here adding 256 pages memory
> spacing
> > +      * for stack growing safely.
> > +      *
> > +      * Btw, kernel default 'stack_guard_gap' size is '256 *
> getpagesize()'.
> > +      */
> > +     long total_size = 256 * getpagesize() + size * 2;
> > +
> > +     start = SAFE_MMAP(NULL, total_size, PROT_READ | PROT_WRITE,
> > +                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> > +     SAFE_MUNMAP(start, total_size);
> > +
> > +     /* start                             stack
> > +      * +-----------+---------------------+----------------------+
> > +      * | 256 pages | unmapped (size)     | mapped (size)        |
> > +      * +-----------+---------------------+----------------------+
> > +      *
> > +      */
> > +     stack = SAFE_MMAP((start + total_size - size), size, PROT_READ |
> PROT_WRITE,
> > +                       MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS |
> MAP_GROWSDOWN,
> > +                       -1, 0);
>
> Well it's not wrong per se but as it is we do not use the pre-allocated
> part of the stack at all, we directly jump for the guard page as we use
>

Really? But I think the pthread_attr_setstack(&attr, stack, stack_size) will
take use of the whole stack memory in function recursive performing.
How can we say NOT use the pre-allocated stack? I fell a bit confused
about your words here.

> the stack pointer as a base for the pthread stack. The actual pointer
> that points to the start of the region is stack - stack_size.
>

If that's true, it might belong to the internal operation of stack grows,
should we take care of it in the userspace case?

> There is no point in adding size * 2 here. We can as well reserve 256 *
> page_size + size. Then map() a single page at the end, which would be at
> start + total_size - page_size and finally return start + total_size
> from this function and pass that to pthread_attr_setstack().
>

I guess that will be work, but sounds a bit stingy. Since the modern system
does not short of such memory for testing:). And if we decide to go with
this,
the code design comments above should be all rewrite.


>
> That way it would look like:
>
> | 256 pages | unmapped | 1 mapped page |
>
>             | - - -  stack_size  - - - |
>
>
> ...
> > +     SAFE_PTHREAD_CREATE(&test_thread, &attr, check_depth_recursive,
> limit);
> > +     SAFE_PTHREAD_JOIN(test_thread, NULL);
> > +
> > +     if (stack)
> > +             SAFE_MUNMAP(stack, stack_size);
>
> I'ts a bit unexpected to unmap the stack here. I guess that unamping it
> in the run_test() after the grow_stack() call would be a bit cleaner but
> we would have to move the exit(0) there as well.
>

I agree with this.


> > +     /* Test 2 */
> > +     child_pid = SAFE_FORK();
> > +     if (!child_pid) {
> > +             tst_no_corefile(0);
>                   ^
>                  This should go to the test setup.
>

Only the child_2 will get SIGSEGV, why should we move it to affect the
whole test?
Cyril Hrubis Sept. 11, 2020, 2:57 p.m. UTC | #5
Hi!
> > Well it's not wrong per se but as it is we do not use the pre-allocated
> > part of the stack at all, we directly jump for the guard page as we use
> >
> 
> Really? But I think the pthread_attr_setstack(&attr, stack, stack_size) will
> take use of the whole stack memory in function recursive performing.
> How can we say NOT use the pre-allocated stack? I fell a bit confused
> about your words here.

I've been confused as well I looked at pthread_attr_setstack() function
manual and it's expecting to get the lowest pointer of the stack. So I
suppose that the stack really started at the stack + stack_size address.
But still the code wasn't exactly right, because the lowest address of
the stack in the previous code was stack - stack_size, which would be
start of the unmapped region and the size of the stack would be 2 *
stack_size, as we expected the mapping to grow.

> > There is no point in adding size * 2 here. We can as well reserve 256 *
> > page_size + size. Then map() a single page at the end, which would be at
> > start + total_size - page_size and finally return start + total_size
> > from this function and pass that to pthread_attr_setstack().
> >
> 
> I guess that will be work, but sounds a bit stingy. Since the modern system
> does not short of such memory for testing:). And if we decide to go with
> this, the code design comments above should be all rewrite.

I do find this layout to be a bit more straighforward.

> >
> > That way it would look like:
> >
> > | 256 pages | unmapped | 1 mapped page |
> >
> >             | - - -  stack_size  - - - |
> >
> >
> > > +     /* Test 2 */
> > > +     child_pid = SAFE_FORK();
> > > +     if (!child_pid) {
> > > +             tst_no_corefile(0);
> >                   ^
> >                  This should go to the test setup.
> >
> 
> Only the child_2 will get SIGSEGV, why should we move it to affect the
> whole test?

It's not like we do expect any part of the test to produce core-dump so
there is no point in disabling them on each iteration only for the
child. But I guess that it's fine either way.
Li Wang Sept. 14, 2020, 3:06 a.m. UTC | #6
On Fri, Sep 11, 2020 at 10:57 PM Cyril Hrubis <chrubis@suse.cz> wrote:

> Hi!
> > > Well it's not wrong per se but as it is we do not use the pre-allocated
> > > part of the stack at all, we directly jump for the guard page as we use
> > >
> >
> > Really? But I think the pthread_attr_setstack(&attr, stack, stack_size)
> will
> > take use of the whole stack memory in function recursive performing.
> > How can we say NOT use the pre-allocated stack? I fell a bit confused
> > about your words here.
>
> I've been confused as well I looked at pthread_attr_setstack() function
> manual and it's expecting to get the lowest pointer of the stack. So I
> suppose that the stack really started at the stack + stack_size address.
>

I guess you probably misread the "lowest pointer of the stack", it is not
mean
the bottom of the stack, it is actually the lowest addressable byte of a
buffer of
stacksize. From what I understand, we should NOT pass the 'stack +
stack_size'
address to pthread_attr_setstack().



> But still the code wasn't exactly right, because the lowest address of
> the stack in the previous code was stack - stack_size, which would be
> start of the unmapped region and the size of the stack would be 2 *
> stack_size, as we expected the mapping to grow.
>

No, I don't think so, the lowest address of the stack in the previous code
is:
    stack = start + total_size - size;
and we pass this stack pointer to ptrehad_attr_setstack() is correct here,
indeed the stack really starts at stack + stack_size, that's internal steps.

If we go with 'stack + stack_size' as you suggested, that will easily get
segmental fault. So I stand by myself understanding unless someone can
give enough explanation/demo :).

PTHREAD_ATTR_SETSTACK(3) manual says:
  "stackaddr should point to the lowest addressable byte of a buffer of
stacksize bytes that was allocated by the caller.  The pages of the
allocated buffer should be both readable and writable."
Li Wang Sept. 14, 2020, 3:40 a.m. UTC | #7
On Mon, Sep 14, 2020 at 11:06 AM Li Wang <liwang@redhat.com> wrote:

>
> ...
>
>
>> But still the code wasn't exactly right, because the lowest address of
>> the stack in the previous code was stack - stack_size, which would be
>> start of the unmapped region and the size of the stack would be 2 *
>> stack_size, as we expected the mapping to grow.
>>
>
> No, I don't think so, the lowest address of the stack in the previous code
> is:
>     stack = start + total_size - size;
> and we pass this stack pointer to ptrehad_attr_setstack() is correct here,
> indeed the stack really starts at stack + stack_size, that's internal
> steps.
>

Here print debug info to verify that stack grows down from 'stack + size'
address:

mmap18.c:99: INFO: start = 0x7f4831293000, stack - size = 0x7f4831393000,
stack = 0x7f48313b3000, stack + size = 0x7f48313d3000
mmap18.c:110: INFO: &limit = 0x7f48313d1ee8, limit = 0x7f4831397000
mmap18.c:110: INFO: &limit = 0x7f48313d1ec8, limit = 0x7f4831397000
mmap18.c:110: INFO: &limit = 0x7f48313d1ea8, limit = 0x7f4831397000
mmap18.c:110: INFO: &limit = 0x7f48313d1e88, limit = 0x7f4831397000
mmap18.c:110: INFO: &limit = 0x7f48313d1e68, limit = 0x7f4831397000
mmap18.c:110: INFO: &limit = 0x7f48313d1e48, limit = 0x7f4831397000
mmap18.c:110: INFO: &limit = 0x7f48313d1e28, limit = 0x7f4831397000
mmap18.c:110: INFO: &limit = 0x7f48313d1e08, limit = 0x7f4831397000
mmap18.c:110: INFO: &limit = 0x7f48313d1de8, limit = 0x7f4831397000
...
Cyril Hrubis Sept. 15, 2020, 1:40 p.m. UTC | #8
Hi!
> > But still the code wasn't exactly right, because the lowest address of
> > the stack in the previous code was stack - stack_size, which would be
> > start of the unmapped region and the size of the stack would be 2 *
> > stack_size, as we expected the mapping to grow.
> >
> 
> No, I don't think so, the lowest address of the stack in the previous code
> is:
>     stack = start + total_size - size;
> and we pass this stack pointer to ptrehad_attr_setstack() is correct here,
> indeed the stack really starts at stack + stack_size, that's internal steps.
> 
> If we go with 'stack + stack_size' as you suggested, that will easily get
> segmental fault. So I stand by myself understanding unless someone can
> give enough explanation/demo :).

No I mean to pass stack - stack_size as a stack lowest addres since that
is the real end of the stack, i.e. the thread touches the memory close
to that address and technically we overflow the stack when we pass stack
as an address to pthread_attr_setstack().

Or we can change the memory layout so that only first page of the stack
is mapped as I proposed which is much less confusing that this.

> PTHREAD_ATTR_SETSTACK(3) manual says:
>   "stackaddr should point to the lowest addressable byte of a buffer of
> stacksize bytes that was allocated by the caller.  The pages of the
> allocated buffer should be both readable and writable."

I would say that what we actually do here is in the gray zone, nobody
really expected that growable stack would be used with pthreads and
hence the manual page does not take growable mappings into an account.

It works because libc does really use the stack size only to calculate
start of the stack, i.e. highest address of the stack, but still I do
find the code a bit confusing, since the real test starts after we
overflow the stack we passed to the pthread_attr_setstack().

I.e. regardless if we pass addr=stack size=stack_size or
addr=stack-stack_size and size=2*stack_size the glibc will end up with
the same address which is stack + stack_size. But the second one is a
bit more clear in where the real end of the stack is supposed to be.
Cyril Hrubis Sept. 18, 2020, 11:43 a.m. UTC | #9
Hi!
Li are you working on this, or should I try to finish the test?
Li Wang Sept. 18, 2020, 3 p.m. UTC | #10
Hi Cyril,

On Fri, Sep 18, 2020 at 7:42 PM Cyril Hrubis <chrubis@suse.cz> wrote:

> Hi!
> Li are you working on this, or should I try to finish the test?
>

Sorry for the late reply.

No, I'm working on another urgent issue so far. Feel free work out the
patch V6,
I think you can finish it better than me.
diff mbox series

Patch

diff --git a/runtest/syscalls b/runtest/syscalls
index dc0ca5626..ed86bb593 100644
--- a/runtest/syscalls
+++ b/runtest/syscalls
@@ -747,6 +747,7 @@  mmap14 mmap14
 mmap15 mmap15
 mmap16 mmap16
 mmap17 mmap17
+mmap18 mmap18
 
 modify_ldt01 modify_ldt01
 modify_ldt02 modify_ldt02
diff --git a/testcases/kernel/syscalls/mmap/.gitignore b/testcases/kernel/syscalls/mmap/.gitignore
index c5c083d4b..4fd90ab5f 100644
--- a/testcases/kernel/syscalls/mmap/.gitignore
+++ b/testcases/kernel/syscalls/mmap/.gitignore
@@ -16,3 +16,4 @@ 
 /mmap15
 /mmap16
 /mmap17
+/mmap18
diff --git a/testcases/kernel/syscalls/mmap/mmap18.c b/testcases/kernel/syscalls/mmap/mmap18.c
new file mode 100644
index 000000000..b5008497d
--- /dev/null
+++ b/testcases/kernel/syscalls/mmap/mmap18.c
@@ -0,0 +1,177 @@ 
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (c) Zilogic Systems Pvt. Ltd., 2020
+ * Email: code@zilogic.com
+ */
+
+/*
+ * Test mmap() MAP_GROWSDOWN flag
+ *
+ * # Test1:
+ *   We assign the memory region allocated using MAP_GROWSDOWN to a thread,
+ *   as a stack, to test the effect of MAP_GROWSDOWN. This is because the
+ *   kernel only grows the memory region when the stack pointer, is within
+ *   guard page, when the guard page is touched.
+ *
+ *   1. Map an anyonymous memory region of size X, and unmap it.
+ *   2. Split the unmapped memory region into two.
+ *   3. The lower memory region is left unmapped.
+ *   4. The higher memory region is mapped for use as stack, using
+ *      MAP_FIXED | MAP_GROWSDOWN.
+ *   5. The higher memory region is provided as stack to a thread, where
+ *      a recursive function is invoked.
+ *   6. The stack grows beyond the allocated region, into the lower memory area.
+ *   7. If this results in the memory region being extended, into the
+ *      unmapped region, the test is considered to have passed.
+ *
+ * # Test2:
+ *   Steps mostly like Test1, but mmaping a page in the space the stack is
+ *   supposed to grow into. To verify that the stack grows to within a page
+ *   of the high end of the next lower map‐ping, at which point touching
+ *   the "guard" page will result in a SIGSEGV signal.
+ */
+
+#include <unistd.h>
+#include <pthread.h>
+#include <sys/mman.h>
+#include <sys/wait.h>
+#include <sys/types.h>
+#include <stdlib.h>
+#include <stdbool.h>
+
+#include "tst_test.h"
+#include "tst_safe_pthread.h"
+
+#define UNITS(x) ((x) * PTHREAD_STACK_MIN)
+
+static void *stack;
+static long stack_size = UNITS(8);
+
+static bool __attribute__((noinline)) check_stackgrow_up(void)
+{
+	char local_var;
+	static char *addr;
+
+       if (!addr) {
+               addr = &local_var;
+               return check_stackgrow_up();
+       }
+
+       return (addr < &local_var);
+}
+
+static void setup(void)
+{
+	if (check_stackgrow_up())
+		tst_brk(TCONF, "Test can't be performed with stack grows up architecture");
+}
+
+static void allocate_stack(size_t size)
+{
+	void *start;
+
+	/*
+	 * Since the newer kernel does not allow a MAP_GROWSDOWN mapping to grow
+	 * closer than 'stack_guard_gap' pages away from a preceding mapping.
+	 * The guard then ensures that the next-highest mapped page remains more
+	 * than 'stack_guard_gap' below the lowest stack address, and if not then
+	 * it will trigger a segfault. So, here adding 256 pages memory spacing
+	 * for stack growing safely.
+	 *
+	 * Btw, kernel default 'stack_guard_gap' size is '256 * getpagesize()'.
+	 */
+	long total_size = 256 * getpagesize() + size * 2;
+
+	start = SAFE_MMAP(NULL, total_size, PROT_READ | PROT_WRITE,
+			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	SAFE_MUNMAP(start, total_size);
+
+	/* start                             stack
+	 * +-----------+---------------------+----------------------+
+	 * | 256 pages | unmapped (size)     | mapped (size)        |
+	 * +-----------+---------------------+----------------------+
+	 *
+	 */
+	stack = SAFE_MMAP((start + total_size - size), size, PROT_READ | PROT_WRITE,
+			  MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS | MAP_GROWSDOWN,
+			  -1, 0);
+
+	tst_res(TINFO, "start = %p, stack = %p", start, stack);
+}
+
+static __attribute__((noinline)) void *check_depth_recursive(void *limit)
+{
+	if ((off_t) &limit < (off_t) limit) {
+		tst_res(TINFO, "&limit = %p, limit = %p", &limit, limit);
+		return NULL;
+	}
+
+	return check_depth_recursive(limit);
+}
+
+static void grow_stack(void *stack, size_t size, void *limit)
+{
+	pthread_t test_thread;
+	pthread_attr_t attr;
+	int ret;
+
+	ret = pthread_attr_init(&attr);
+	if (ret)
+		tst_brk(TBROK, "pthread_attr_init failed during setup");
+
+	ret = pthread_attr_setstack(&attr, stack, size);
+	if (ret)
+		tst_brk(TBROK, "pthread_attr_setstack failed during setup");
+
+	SAFE_PTHREAD_CREATE(&test_thread, &attr, check_depth_recursive, limit);
+	SAFE_PTHREAD_JOIN(test_thread, NULL);
+
+	if (stack)
+		SAFE_MUNMAP(stack, stack_size);
+
+	exit(0);
+}
+
+static void run_test(void)
+{
+	pid_t child_pid;
+	int wstatus;
+
+	/* Test 1 */
+	child_pid = SAFE_FORK();
+	if (!child_pid) {
+		allocate_stack(stack_size);
+		grow_stack(stack, stack_size, stack - stack_size + UNITS(1));
+	}
+
+	SAFE_WAIT(&wstatus);
+	if (WIFEXITED(wstatus) && WEXITSTATUS(wstatus) == 0)
+		tst_res(TPASS, "Stack grows in unmapped region");
+	else
+		tst_res(TFAIL, "Child: %s", tst_strstatus(wstatus));
+
+	/* Test 2 */
+	child_pid = SAFE_FORK();
+	if (!child_pid) {
+		tst_no_corefile(0);
+		allocate_stack(stack_size);
+
+		SAFE_MMAP(stack - stack_size, UNITS(1), PROT_READ | PROT_WRITE,
+			  MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+
+		/* This will cause to segment fault (SIGSEGV) */
+		grow_stack(stack, stack_size, stack - stack_size + UNITS(1));
+	}
+
+	SAFE_WAIT(&wstatus);
+        if (WIFSIGNALED(wstatus) && WTERMSIG(wstatus) == SIGSEGV)
+		tst_res(TPASS, "Child ended by %s as expected", tst_strsig(SIGSEGV));
+        else
+                tst_res(TFAIL, "Child: %s", tst_strstatus(wstatus));
+}
+
+static struct tst_test test = {
+	.setup = setup,
+	.test_all = run_test,
+	.forks_child = 1,
+};