diff mbox series

linux-user: Add emulation for MADV_WIPEONFORK and MADV_KEEPONFORK in madvise()

Message ID Y5bRnRaiSOUKRjdW@p100
State New
Headers show
Series linux-user: Add emulation for MADV_WIPEONFORK and MADV_KEEPONFORK in madvise() | expand

Commit Message

Helge Deller Dec. 12, 2022, 7 a.m. UTC
Both parameters have a different value on the parisc platform, so first
translate the target value into a host value for usage in the native
madvise() syscall.

Those parameters are often used by security sensitive applications (e.g.
tor browser, boringssl, ...) which expect the call to return a proper
return code on failure, so return -EINVAL if qemu fails to forward the
syscall to the host OS.

Tested with testcase of tor browser when running hppa-linux guest on
x86-64 host.

Signed-off-by: Helge Deller <deller@gmx.de>

Comments

Ilya Leoshkevich Dec. 12, 2022, 9:16 p.m. UTC | #1
On Mon, Dec 12, 2022 at 08:00:45AM +0100, Helge Deller wrote:
> Both parameters have a different value on the parisc platform, so first
> translate the target value into a host value for usage in the native
> madvise() syscall.
> 
> Those parameters are often used by security sensitive applications (e.g.
> tor browser, boringssl, ...) which expect the call to return a proper
> return code on failure, so return -EINVAL if qemu fails to forward the
> syscall to the host OS.
> 
> Tested with testcase of tor browser when running hppa-linux guest on
> x86-64 host.
> 
> Signed-off-by: Helge Deller <deller@gmx.de>
> 
> diff --git a/linux-user/mmap.c b/linux-user/mmap.c
> index 10f5079331..c75342108c 100644
> --- a/linux-user/mmap.c
> +++ b/linux-user/mmap.c
> @@ -901,11 +901,25 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice)
>          return -TARGET_EINVAL;
>      }
> 
> +    /* Translate for some architectures which have different MADV_xxx values */
> +    switch (advice) {
> +    case TARGET_MADV_DONTNEED:      /* alpha */
> +        advice = MADV_DONTNEED;
> +        break;
> +    case TARGET_MADV_WIPEONFORK:    /* parisc */
> +        advice = MADV_WIPEONFORK;
> +        break;
> +    case TARGET_MADV_KEEPONFORK:    /* parisc */
> +        advice = MADV_KEEPONFORK;
> +        break;
> +    /* we do not care about the other MADV_xxx values yet */
> +    }
> +
>      /*
>       * A straight passthrough may not be safe because qemu sometimes turns
>       * private file-backed mappings into anonymous mappings.
>       *
> -     * This is a hint, so ignoring and returning success is ok.
> +     * For MADV_DONTNEED, which is a hint, ignoring and returning success is ok.

Actually, MADV_DONTNEED is one of the few values, which is not always a
hint - it can be used to e.g. zero out pages.

As the next paragraph states, strictly speaking, MADV_DONTNEED is
currently broken, because it can indeed be ignored without indication
in some cases, but it's still arguably better than not honoring it at
all.

>       *
>       * This breaks MADV_DONTNEED, completely implementing which is quite
>       * complicated. However, there is one low-hanging fruit: mappings that are
> @@ -913,11 +927,17 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice)
>       * passthrough is safe, so do it.
>       */
>      mmap_lock();
> -    if (advice == TARGET_MADV_DONTNEED &&
> -        can_passthrough_madv_dontneed(start, end)) {
> -        ret = get_errno(madvise(g2h_untagged(start), len, MADV_DONTNEED));
> -        if (ret == 0) {
> -            page_reset_target_data(start, start + len);
> +    switch (advice) {
> +    case MADV_WIPEONFORK:
> +    case MADV_KEEPONFORK:
> +        ret = -EINVAL;
> +        /* fall through */
> +    case MADV_DONTNEED:
> +        if (can_passthrough_madv_dontneed(start, end)) {
> +            ret = get_errno(madvise(g2h_untagged(start), len, advice));
> +            if ((advice == MADV_DONTNEED) && (ret == 0)) {
> +                page_reset_target_data(start, start + len);
> +            }
>          }
>      }
>      mmap_unlock();
> 

Nit: maybe rename can_passthrough_madv_dontneed() to can_passthrough(),
since now it's used not only for MADV_DONTNEED?



With the MADV_DONTNEED comment change:

Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Helge Deller Dec. 12, 2022, 9:49 p.m. UTC | #2
On 12/12/22 22:16, Ilya Leoshkevich wrote:
> On Mon, Dec 12, 2022 at 08:00:45AM +0100, Helge Deller wrote:
>> Both parameters have a different value on the parisc platform, so first
>> translate the target value into a host value for usage in the native
>> madvise() syscall.
>>
>> Those parameters are often used by security sensitive applications (e.g.
>> tor browser, boringssl, ...) which expect the call to return a proper
>> return code on failure, so return -EINVAL if qemu fails to forward the
>> syscall to the host OS.
>>
>> Tested with testcase of tor browser when running hppa-linux guest on
>> x86-64 host.
>>
>> Signed-off-by: Helge Deller <deller@gmx.de>
>>
>> diff --git a/linux-user/mmap.c b/linux-user/mmap.c
>> index 10f5079331..c75342108c 100644
>> --- a/linux-user/mmap.c
>> +++ b/linux-user/mmap.c
>> @@ -901,11 +901,25 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice)
>>           return -TARGET_EINVAL;
>>       }
>>
>> +    /* Translate for some architectures which have different MADV_xxx values */
>> +    switch (advice) {
>> +    case TARGET_MADV_DONTNEED:      /* alpha */
>> +        advice = MADV_DONTNEED;
>> +        break;
>> +    case TARGET_MADV_WIPEONFORK:    /* parisc */
>> +        advice = MADV_WIPEONFORK;
>> +        break;
>> +    case TARGET_MADV_KEEPONFORK:    /* parisc */
>> +        advice = MADV_KEEPONFORK;
>> +        break;
>> +    /* we do not care about the other MADV_xxx values yet */
>> +    }
>> +
>>       /*
>>        * A straight passthrough may not be safe because qemu sometimes turns
>>        * private file-backed mappings into anonymous mappings.
>>        *
>> -     * This is a hint, so ignoring and returning success is ok.
>> +     * For MADV_DONTNEED, which is a hint, ignoring and returning success is ok.
>
> Actually, MADV_DONTNEED is one of the few values, which is not always a
> hint - it can be used to e.g. zero out pages.

Right, it _should_ zero out pages and return 0, or otherwise return failure.
I think the problem is that some userspace apps will then sadly break if we
change the current behaviour....

Anyway, in this patch I didn't wanted to touch MAD_DONTNEED.

> As the next paragraph states, strictly speaking, MADV_DONTNEED is
> currently broken, because it can indeed be ignored without indication
> in some cases, but it's still arguably better than not honoring it at
> all.

Yep.

>>        *
>>        * This breaks MADV_DONTNEED, completely implementing which is quite
>>        * complicated. However, there is one low-hanging fruit: mappings that are
>> @@ -913,11 +927,17 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice)
>>        * passthrough is safe, so do it.
>>        */
>>       mmap_lock();
>> -    if (advice == TARGET_MADV_DONTNEED &&
>> -        can_passthrough_madv_dontneed(start, end)) {
>> -        ret = get_errno(madvise(g2h_untagged(start), len, MADV_DONTNEED));
>> -        if (ret == 0) {
>> -            page_reset_target_data(start, start + len);
>> +    switch (advice) {
>> +    case MADV_WIPEONFORK:
>> +    case MADV_KEEPONFORK:
>> +        ret = -EINVAL;
>> +        /* fall through */
>> +    case MADV_DONTNEED:
>> +        if (can_passthrough_madv_dontneed(start, end)) {
>> +            ret = get_errno(madvise(g2h_untagged(start), len, advice));
>> +            if ((advice == MADV_DONTNEED) && (ret == 0)) {
>> +                page_reset_target_data(start, start + len);
>> +            }
>>           }
>>       }
>>       mmap_unlock();
>>
>
> Nit: maybe rename can_passthrough_madv_dontneed() to can_passthrough(),
> since now it's used not only for MADV_DONTNEED?

Maybe can_passthrough_madvise() is better?

> With the MADV_DONTNEED comment change:

Just for me to understand correctly:
You propose that I shouldn't touch that comment in my followup-patch, right?
That's ok.

> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>

Thanks!
Helge
Ilya Leoshkevich Dec. 12, 2022, 10:12 p.m. UTC | #3
On Mon, Dec 12, 2022 at 10:49:24PM +0100, Helge Deller wrote:
> On 12/12/22 22:16, Ilya Leoshkevich wrote:
> > On Mon, Dec 12, 2022 at 08:00:45AM +0100, Helge Deller wrote:
> > > Both parameters have a different value on the parisc platform, so first
> > > translate the target value into a host value for usage in the native
> > > madvise() syscall.
> > > 
> > > Those parameters are often used by security sensitive applications (e.g.
> > > tor browser, boringssl, ...) which expect the call to return a proper
> > > return code on failure, so return -EINVAL if qemu fails to forward the
> > > syscall to the host OS.
> > > 
> > > Tested with testcase of tor browser when running hppa-linux guest on
> > > x86-64 host.
> > > 
> > > Signed-off-by: Helge Deller <deller@gmx.de>
> > > 
> > > diff --git a/linux-user/mmap.c b/linux-user/mmap.c
> > > index 10f5079331..c75342108c 100644
> > > --- a/linux-user/mmap.c
> > > +++ b/linux-user/mmap.c
> > > @@ -901,11 +901,25 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice)
> > >           return -TARGET_EINVAL;
> > >       }
> > > 
> > > +    /* Translate for some architectures which have different MADV_xxx values */
> > > +    switch (advice) {
> > > +    case TARGET_MADV_DONTNEED:      /* alpha */
> > > +        advice = MADV_DONTNEED;
> > > +        break;
> > > +    case TARGET_MADV_WIPEONFORK:    /* parisc */
> > > +        advice = MADV_WIPEONFORK;
> > > +        break;
> > > +    case TARGET_MADV_KEEPONFORK:    /* parisc */
> > > +        advice = MADV_KEEPONFORK;
> > > +        break;
> > > +    /* we do not care about the other MADV_xxx values yet */
> > > +    }
> > > +
> > >       /*
> > >        * A straight passthrough may not be safe because qemu sometimes turns
> > >        * private file-backed mappings into anonymous mappings.
> > >        *
> > > -     * This is a hint, so ignoring and returning success is ok.
> > > +     * For MADV_DONTNEED, which is a hint, ignoring and returning success is ok.
> > 
> > Actually, MADV_DONTNEED is one of the few values, which is not always a
> > hint - it can be used to e.g. zero out pages.
> 
> Right, it _should_ zero out pages and return 0, or otherwise return failure.
> I think the problem is that some userspace apps will then sadly break if we
> change the current behaviour....
> 
> Anyway, in this patch I didn't wanted to touch MAD_DONTNEED.
> 
> > As the next paragraph states, strictly speaking, MADV_DONTNEED is
> > currently broken, because it can indeed be ignored without indication
> > in some cases, but it's still arguably better than not honoring it at
> > all.
> 
> Yep.
> 
> > >        *
> > >        * This breaks MADV_DONTNEED, completely implementing which is quite
> > >        * complicated. However, there is one low-hanging fruit: mappings that are
> > > @@ -913,11 +927,17 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice)
> > >        * passthrough is safe, so do it.
> > >        */
> > >       mmap_lock();
> > > -    if (advice == TARGET_MADV_DONTNEED &&
> > > -        can_passthrough_madv_dontneed(start, end)) {
> > > -        ret = get_errno(madvise(g2h_untagged(start), len, MADV_DONTNEED));
> > > -        if (ret == 0) {
> > > -            page_reset_target_data(start, start + len);
> > > +    switch (advice) {
> > > +    case MADV_WIPEONFORK:
> > > +    case MADV_KEEPONFORK:
> > > +        ret = -EINVAL;
> > > +        /* fall through */
> > > +    case MADV_DONTNEED:
> > > +        if (can_passthrough_madv_dontneed(start, end)) {
> > > +            ret = get_errno(madvise(g2h_untagged(start), len, advice));
> > > +            if ((advice == MADV_DONTNEED) && (ret == 0)) {
> > > +                page_reset_target_data(start, start + len);
> > > +            }
> > >           }
> > >       }
> > >       mmap_unlock();
> > > 
> > 
> > Nit: maybe rename can_passthrough_madv_dontneed() to can_passthrough(),
> > since now it's used not only for MADV_DONTNEED?
> 
> Maybe can_passthrough_madvise() is better?

Sounds good to me as well. The idea with PAGE_PASSTHROUGH was that we
should be able to passthrough anything; but with this patch we still
use it only for madvise(), and indicating it in the name makes sense.

> > > With the MADV_DONTNEED comment change:
> 
> Just for me to understand correctly:
> You propose that I shouldn't touch that comment in my followup-patch, right?
> That's ok.

Either that, or maybe make it more precise - strictly speaking, it's
not correct at the moment anyway. Maybe something like this?

    Most advice values are hints, so ignoring and returning success is
    ok.

    However, this would break MADV_DONTNEED, MADV_WIPEONFORK and
    MADV_KEEPONFORK ...

> > Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
> 
> Thanks!
> Helge
diff mbox series

Patch

diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 10f5079331..c75342108c 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -901,11 +901,25 @@  abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice)
         return -TARGET_EINVAL;
     }

+    /* Translate for some architectures which have different MADV_xxx values */
+    switch (advice) {
+    case TARGET_MADV_DONTNEED:      /* alpha */
+        advice = MADV_DONTNEED;
+        break;
+    case TARGET_MADV_WIPEONFORK:    /* parisc */
+        advice = MADV_WIPEONFORK;
+        break;
+    case TARGET_MADV_KEEPONFORK:    /* parisc */
+        advice = MADV_KEEPONFORK;
+        break;
+    /* we do not care about the other MADV_xxx values yet */
+    }
+
     /*
      * A straight passthrough may not be safe because qemu sometimes turns
      * private file-backed mappings into anonymous mappings.
      *
-     * This is a hint, so ignoring and returning success is ok.
+     * For MADV_DONTNEED, which is a hint, ignoring and returning success is ok.
      *
      * This breaks MADV_DONTNEED, completely implementing which is quite
      * complicated. However, there is one low-hanging fruit: mappings that are
@@ -913,11 +927,17 @@  abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice)
      * passthrough is safe, so do it.
      */
     mmap_lock();
-    if (advice == TARGET_MADV_DONTNEED &&
-        can_passthrough_madv_dontneed(start, end)) {
-        ret = get_errno(madvise(g2h_untagged(start), len, MADV_DONTNEED));
-        if (ret == 0) {
-            page_reset_target_data(start, start + len);
+    switch (advice) {
+    case MADV_WIPEONFORK:
+    case MADV_KEEPONFORK:
+        ret = -EINVAL;
+        /* fall through */
+    case MADV_DONTNEED:
+        if (can_passthrough_madv_dontneed(start, end)) {
+            ret = get_errno(madvise(g2h_untagged(start), len, advice));
+            if ((advice == MADV_DONTNEED) && (ret == 0)) {
+                page_reset_target_data(start, start + len);
+            }
         }
     }
     mmap_unlock();