mbox series

[0/5] Guarded Userspace Access Prevention on Radix

Message ID 20181026063513.30806-1-ruscur@russell.cc (mailing list archive)
Headers show
Series Guarded Userspace Access Prevention on Radix | expand

Message

Russell Currey Oct. 26, 2018, 6:35 a.m. UTC
Guarded Userspace Access Prevention is a security mechanism that prevents
the kernel from being able to read and write userspace addresses outside of
the allowed paths, most commonly copy_{to/from}_user().

At present, the only CPU that supports this is POWER9, and only while using
the Radix MMU.  Privileged reads and writes cannot access user data when
key 0 of the AMR is set.  This is described in the "Radix Tree Translation
Storage Protection" section of the POWER ISA as of version 3.0.

GUAP code sets key 0 of the AMR (thus disabling accesses of user data)
early during boot, and only ever "unlocks" access prior to certain
operations, like copy_{to/from}_user(), futex ops, etc.  Setting this does
not prevent unprivileged access, so userspace can operate fine while access
is locked.

There is a performance impact, although I don't consider it heavy.  Running
a worst-case benchmark of a 1GB copy 1 byte at a time (and thus constant
read(1) write(1) syscalls), I found enabling GUAP to be 3.5% slower than
when disabled.  In most cases, the difference is negligible.  The main
performance impact is the mtspr instruction, which is quite slow.

There are a few caveats with this series that could be improved upon in
future.  Right now there is no saving and restoring of the AMR value -
there is no userspace exploitation of the AMR on Radix in POWER9, but if
this were to change in future, saving and restoring the value would be
necessary.

No attempt to optimise cases of repeated calls - for example, if some
code was repeatedly calling copy_to_user() for small sizes very frequently,
it would be slower than the equivalent of wrapping that code in an unlock
and lock and only having to modify the AMR once.

There are some interesting cases that I've attempted to handle, such as if
the AMR is unlocked (i.e. because a copy_{to_from}_user is in progress)...

    - and an exception is taken, the kernel would then be running with the
    AMR unlocked and freely able to access userspace again.  I am working
    around this by storing a flag in the PACA to indicate if the AMR is
    unlocked (to save a costly SPR read), and if so, locking the AMR in
    the exception entry path and unlocking it on the way out.

    - and gets context switched out, goes into a path that locks the AMR,
    then context switches back, access will be disabled and will fault.
    As a result, I context switch the AMR between tasks as if it was used
    by userspace like hash (which already implements this).

Another consideration is use of the isync instruction.  Without an isync
following the mtspr instruction, there is no guarantee that the change
takes effect.  The issue is that isync is very slow, and so I tried to
avoid them wherever necessary.  In this series, the only place an isync
gets used is after *unlocking* the AMR, because if an access takes place
and access is still prevented, the kernel will fault.

On the flipside, a slight delay in unlocking caused by skipping an isync
potentially allows a small window of vulnerability.  It is my opinion
that this window is practically impossible to exploit, but if someone
thinks otherwise, please do share.

This series is my first attempt at POWER assembly so all feedback is very
welcome.

The official theme song of this series can be found here:
    https://www.youtube.com/watch?v=QjTrnKAcYjE

Russell Currey (5):
  powerpc/64s: Guarded Userspace Access Prevention
  powerpc/futex: GUAP support for futex ops
  powerpc/lib: checksum GUAP support
  powerpc/64s: Disable GUAP with nosmap option
  powerpc/64s: Document that PPC supports nosmap

 .../admin-guide/kernel-parameters.txt         |  2 +-
 arch/powerpc/include/asm/exception-64e.h      |  3 +
 arch/powerpc/include/asm/exception-64s.h      | 19 ++++++-
 arch/powerpc/include/asm/futex.h              |  6 ++
 arch/powerpc/include/asm/mmu.h                |  7 +++
 arch/powerpc/include/asm/paca.h               |  3 +
 arch/powerpc/include/asm/reg.h                |  1 +
 arch/powerpc/include/asm/uaccess.h            | 57 ++++++++++++++++---
 arch/powerpc/kernel/asm-offsets.c             |  1 +
 arch/powerpc/kernel/dt_cpu_ftrs.c             |  4 ++
 arch/powerpc/kernel/entry_64.S                | 17 +++++-
 arch/powerpc/lib/checksum_wrappers.c          |  6 +-
 arch/powerpc/mm/fault.c                       |  9 +++
 arch/powerpc/mm/init_64.c                     | 15 +++++
 arch/powerpc/mm/pgtable-radix.c               |  2 +
 arch/powerpc/mm/pkeys.c                       |  7 ++-
 arch/powerpc/platforms/Kconfig.cputype        | 15 +++++
 17 files changed, 158 insertions(+), 16 deletions(-)

Comments

Christophe Leroy Oct. 26, 2018, 4:29 p.m. UTC | #1
Russell Currey <ruscur@russell.cc> a écrit :

> Guarded Userspace Access Prevention is a security mechanism that prevents
> the kernel from being able to read and write userspace addresses outside of
> the allowed paths, most commonly copy_{to/from}_user().
>
> At present, the only CPU that supports this is POWER9, and only while using
> the Radix MMU.  Privileged reads and writes cannot access user data when
> key 0 of the AMR is set.  This is described in the "Radix Tree Translation
> Storage Protection" section of the POWER ISA as of version 3.0.

It is not right that only power9 can support that.

The 8xx has mmu access protection registers which can serve the same  
purpose. Today on the 8xx kernel space is group 0 and user space is  
group 1. Group 0 is set to "page defined access permission" in MD_AP  
and MI_AP registers, and group 1 is set to "all accesses done with  
supervisor rights". By setting group 1 to "user and supervisor  
interpretation swapped" we can forbid kernel access to user space  
while still allowing user access to it. Then by simply changing group  
1 mode at dedicated places we can lock/unlock kernel access to user  
space.

Could you implement something as generic as possible having that in  
mind for a future patch ?

Christophe

>
> GUAP code sets key 0 of the AMR (thus disabling accesses of user data)
> early during boot, and only ever "unlocks" access prior to certain
> operations, like copy_{to/from}_user(), futex ops, etc.  Setting this does
> not prevent unprivileged access, so userspace can operate fine while access
> is locked.
>
> There is a performance impact, although I don't consider it heavy.  Running
> a worst-case benchmark of a 1GB copy 1 byte at a time (and thus constant
> read(1) write(1) syscalls), I found enabling GUAP to be 3.5% slower than
> when disabled.  In most cases, the difference is negligible.  The main
> performance impact is the mtspr instruction, which is quite slow.
>
> There are a few caveats with this series that could be improved upon in
> future.  Right now there is no saving and restoring of the AMR value -
> there is no userspace exploitation of the AMR on Radix in POWER9, but if
> this were to change in future, saving and restoring the value would be
> necessary.
>
> No attempt to optimise cases of repeated calls - for example, if some
> code was repeatedly calling copy_to_user() for small sizes very frequently,
> it would be slower than the equivalent of wrapping that code in an unlock
> and lock and only having to modify the AMR once.
>
> There are some interesting cases that I've attempted to handle, such as if
> the AMR is unlocked (i.e. because a copy_{to_from}_user is in progress)...
>
>     - and an exception is taken, the kernel would then be running with the
>     AMR unlocked and freely able to access userspace again.  I am working
>     around this by storing a flag in the PACA to indicate if the AMR is
>     unlocked (to save a costly SPR read), and if so, locking the AMR in
>     the exception entry path and unlocking it on the way out.
>
>     - and gets context switched out, goes into a path that locks the AMR,
>     then context switches back, access will be disabled and will fault.
>     As a result, I context switch the AMR between tasks as if it was used
>     by userspace like hash (which already implements this).
>
> Another consideration is use of the isync instruction.  Without an isync
> following the mtspr instruction, there is no guarantee that the change
> takes effect.  The issue is that isync is very slow, and so I tried to
> avoid them wherever necessary.  In this series, the only place an isync
> gets used is after *unlocking* the AMR, because if an access takes place
> and access is still prevented, the kernel will fault.
>
> On the flipside, a slight delay in unlocking caused by skipping an isync
> potentially allows a small window of vulnerability.  It is my opinion
> that this window is practically impossible to exploit, but if someone
> thinks otherwise, please do share.
>
> This series is my first attempt at POWER assembly so all feedback is very
> welcome.
>
> The official theme song of this series can be found here:
>     https://www.youtube.com/watch?v=QjTrnKAcYjE
>
> Russell Currey (5):
>   powerpc/64s: Guarded Userspace Access Prevention
>   powerpc/futex: GUAP support for futex ops
>   powerpc/lib: checksum GUAP support
>   powerpc/64s: Disable GUAP with nosmap option
>   powerpc/64s: Document that PPC supports nosmap
>
>  .../admin-guide/kernel-parameters.txt         |  2 +-
>  arch/powerpc/include/asm/exception-64e.h      |  3 +
>  arch/powerpc/include/asm/exception-64s.h      | 19 ++++++-
>  arch/powerpc/include/asm/futex.h              |  6 ++
>  arch/powerpc/include/asm/mmu.h                |  7 +++
>  arch/powerpc/include/asm/paca.h               |  3 +
>  arch/powerpc/include/asm/reg.h                |  1 +
>  arch/powerpc/include/asm/uaccess.h            | 57 ++++++++++++++++---
>  arch/powerpc/kernel/asm-offsets.c             |  1 +
>  arch/powerpc/kernel/dt_cpu_ftrs.c             |  4 ++
>  arch/powerpc/kernel/entry_64.S                | 17 +++++-
>  arch/powerpc/lib/checksum_wrappers.c          |  6 +-
>  arch/powerpc/mm/fault.c                       |  9 +++
>  arch/powerpc/mm/init_64.c                     | 15 +++++
>  arch/powerpc/mm/pgtable-radix.c               |  2 +
>  arch/powerpc/mm/pkeys.c                       |  7 ++-
>  arch/powerpc/platforms/Kconfig.cputype        | 15 +++++
>  17 files changed, 158 insertions(+), 16 deletions(-)
>
> --
> 2.19.1
Russell Currey Oct. 31, 2018, 3:53 a.m. UTC | #2
On Fri, 2018-10-26 at 18:29 +0200, LEROY Christophe wrote:
> Russell Currey <ruscur@russell.cc> a écrit :
> 
> > Guarded Userspace Access Prevention is a security mechanism that
> > prevents
> > the kernel from being able to read and write userspace addresses
> > outside of
> > the allowed paths, most commonly copy_{to/from}_user().
> > 
> > At present, the only CPU that supports this is POWER9, and only
> > while using
> > the Radix MMU.  Privileged reads and writes cannot access user data
> > when
> > key 0 of the AMR is set.  This is described in the "Radix Tree
> > Translation
> > Storage Protection" section of the POWER ISA as of version 3.0.
> 
> It is not right that only power9 can support that.

It's true that not only P9 can support it, but there are more
considerations under hash than radix, implementing this for radix is a
first step.

> 
> The 8xx has mmu access protection registers which can serve the
> same  
> purpose. Today on the 8xx kernel space is group 0 and user space is  
> group 1. Group 0 is set to "page defined access permission" in
> MD_AP  
> and MI_AP registers, and group 1 is set to "all accesses done with  
> supervisor rights". By setting group 1 to "user and supervisor  
> interpretation swapped" we can forbid kernel access to user space  
> while still allowing user access to it. Then by simply changing
> group  
> 1 mode at dedicated places we can lock/unlock kernel access to user  
> space.
> 
> Could you implement something as generic as possible having that in  
> mind for a future patch ?

I don't think anything in this series is particularly problematic in
relation to future work for hash.  I am interested in doing a hash
implementation in future too.

- Russell

> 
> Christophe
> 
> > GUAP code sets key 0 of the AMR (thus disabling accesses of user
> > data)
> > early during boot, and only ever "unlocks" access prior to certain
> > operations, like copy_{to/from}_user(), futex ops, etc.  Setting
> > this does
> > not prevent unprivileged access, so userspace can operate fine
> > while access
> > is locked.
> > 
> > There is a performance impact, although I don't consider it
> > heavy.  Running
> > a worst-case benchmark of a 1GB copy 1 byte at a time (and thus
> > constant
> > read(1) write(1) syscalls), I found enabling GUAP to be 3.5% slower
> > than
> > when disabled.  In most cases, the difference is negligible.  The
> > main
> > performance impact is the mtspr instruction, which is quite slow.
> > 
> > There are a few caveats with this series that could be improved
> > upon in
> > future.  Right now there is no saving and restoring of the AMR
> > value -
> > there is no userspace exploitation of the AMR on Radix in POWER9,
> > but if
> > this were to change in future, saving and restoring the value would
> > be
> > necessary.
> > 
> > No attempt to optimise cases of repeated calls - for example, if
> > some
> > code was repeatedly calling copy_to_user() for small sizes very
> > frequently,
> > it would be slower than the equivalent of wrapping that code in an
> > unlock
> > and lock and only having to modify the AMR once.
> > 
> > There are some interesting cases that I've attempted to handle,
> > such as if
> > the AMR is unlocked (i.e. because a copy_{to_from}_user is in
> > progress)...
> > 
> >     - and an exception is taken, the kernel would then be running
> > with the
> >     AMR unlocked and freely able to access userspace again.  I am
> > working
> >     around this by storing a flag in the PACA to indicate if the
> > AMR is
> >     unlocked (to save a costly SPR read), and if so, locking the
> > AMR in
> >     the exception entry path and unlocking it on the way out.
> > 
> >     - and gets context switched out, goes into a path that locks
> > the AMR,
> >     then context switches back, access will be disabled and will
> > fault.
> >     As a result, I context switch the AMR between tasks as if it
> > was used
> >     by userspace like hash (which already implements this).
> > 
> > Another consideration is use of the isync instruction.  Without an
> > isync
> > following the mtspr instruction, there is no guarantee that the
> > change
> > takes effect.  The issue is that isync is very slow, and so I tried
> > to
> > avoid them wherever necessary.  In this series, the only place an
> > isync
> > gets used is after *unlocking* the AMR, because if an access takes
> > place
> > and access is still prevented, the kernel will fault.
> > 
> > On the flipside, a slight delay in unlocking caused by skipping an
> > isync
> > potentially allows a small window of vulnerability.  It is my
> > opinion
> > that this window is practically impossible to exploit, but if
> > someone
> > thinks otherwise, please do share.
> > 
> > This series is my first attempt at POWER assembly so all feedback
> > is very
> > welcome.
> > 
> > The official theme song of this series can be found here:
> >     https://www.youtube.com/watch?v=QjTrnKAcYjE
> > 
> > Russell Currey (5):
> >   powerpc/64s: Guarded Userspace Access Prevention
> >   powerpc/futex: GUAP support for futex ops
> >   powerpc/lib: checksum GUAP support
> >   powerpc/64s: Disable GUAP with nosmap option
> >   powerpc/64s: Document that PPC supports nosmap
> > 
> >  .../admin-guide/kernel-parameters.txt         |  2 +-
> >  arch/powerpc/include/asm/exception-64e.h      |  3 +
> >  arch/powerpc/include/asm/exception-64s.h      | 19 ++++++-
> >  arch/powerpc/include/asm/futex.h              |  6 ++
> >  arch/powerpc/include/asm/mmu.h                |  7 +++
> >  arch/powerpc/include/asm/paca.h               |  3 +
> >  arch/powerpc/include/asm/reg.h                |  1 +
> >  arch/powerpc/include/asm/uaccess.h            | 57
> > ++++++++++++++++---
> >  arch/powerpc/kernel/asm-offsets.c             |  1 +
> >  arch/powerpc/kernel/dt_cpu_ftrs.c             |  4 ++
> >  arch/powerpc/kernel/entry_64.S                | 17 +++++-
> >  arch/powerpc/lib/checksum_wrappers.c          |  6 +-
> >  arch/powerpc/mm/fault.c                       |  9 +++
> >  arch/powerpc/mm/init_64.c                     | 15 +++++
> >  arch/powerpc/mm/pgtable-radix.c               |  2 +
> >  arch/powerpc/mm/pkeys.c                       |  7 ++-
> >  arch/powerpc/platforms/Kconfig.cputype        | 15 +++++
> >  17 files changed, 158 insertions(+), 16 deletions(-)
> > 
> > --
> > 2.19.1
> 
>
Christophe Leroy Oct. 31, 2018, 4:58 p.m. UTC | #3
Russell Currey <ruscur@russell.cc> a écrit :

> On Fri, 2018-10-26 at 18:29 +0200, LEROY Christophe wrote:
>> Russell Currey <ruscur@russell.cc> a écrit :
>>
>> > Guarded Userspace Access Prevention is a security mechanism that
>> > prevents
>> > the kernel from being able to read and write userspace addresses
>> > outside of
>> > the allowed paths, most commonly copy_{to/from}_user().
>> >
>> > At present, the only CPU that supports this is POWER9, and only
>> > while using
>> > the Radix MMU.  Privileged reads and writes cannot access user data
>> > when
>> > key 0 of the AMR is set.  This is described in the "Radix Tree
>> > Translation
>> > Storage Protection" section of the POWER ISA as of version 3.0.
>>
>> It is not right that only power9 can support that.
>
> It's true that not only P9 can support it, but there are more
> considerations under hash than radix, implementing this for radix is a
> first step.

I don't know much about hash, but I was talking about the 8xx which is  
a nohash ppc32. I'll see next week if I can do something with it on  
top of your serie.

>
>>
>> The 8xx has mmu access protection registers which can serve the
>> same
>> purpose. Today on the 8xx kernel space is group 0 and user space is
>> group 1. Group 0 is set to "page defined access permission" in
>> MD_AP
>> and MI_AP registers, and group 1 is set to "all accesses done with
>> supervisor rights". By setting group 1 to "user and supervisor
>> interpretation swapped" we can forbid kernel access to user space
>> while still allowing user access to it. Then by simply changing
>> group
>> 1 mode at dedicated places we can lock/unlock kernel access to user
>> space.
>>
>> Could you implement something as generic as possible having that in
>> mind for a future patch ?
>
> I don't think anything in this series is particularly problematic in
> relation to future work for hash.  I am interested in doing a hash
> implementation in future too.

I think we have to look at that carrefuly to avoid uggly ifdefs

Christophe

>
> - Russell
>
>>
>> Christophe
>>
>> > GUAP code sets key 0 of the AMR (thus disabling accesses of user
>> > data)
>> > early during boot, and only ever "unlocks" access prior to certain
>> > operations, like copy_{to/from}_user(), futex ops, etc.  Setting
>> > this does
>> > not prevent unprivileged access, so userspace can operate fine
>> > while access
>> > is locked.
>> >
>> > There is a performance impact, although I don't consider it
>> > heavy.  Running
>> > a worst-case benchmark of a 1GB copy 1 byte at a time (and thus
>> > constant
>> > read(1) write(1) syscalls), I found enabling GUAP to be 3.5% slower
>> > than
>> > when disabled.  In most cases, the difference is negligible.  The
>> > main
>> > performance impact is the mtspr instruction, which is quite slow.
>> >
>> > There are a few caveats with this series that could be improved
>> > upon in
>> > future.  Right now there is no saving and restoring of the AMR
>> > value -
>> > there is no userspace exploitation of the AMR on Radix in POWER9,
>> > but if
>> > this were to change in future, saving and restoring the value would
>> > be
>> > necessary.
>> >
>> > No attempt to optimise cases of repeated calls - for example, if
>> > some
>> > code was repeatedly calling copy_to_user() for small sizes very
>> > frequently,
>> > it would be slower than the equivalent of wrapping that code in an
>> > unlock
>> > and lock and only having to modify the AMR once.
>> >
>> > There are some interesting cases that I've attempted to handle,
>> > such as if
>> > the AMR is unlocked (i.e. because a copy_{to_from}_user is in
>> > progress)...
>> >
>> >     - and an exception is taken, the kernel would then be running
>> > with the
>> >     AMR unlocked and freely able to access userspace again.  I am
>> > working
>> >     around this by storing a flag in the PACA to indicate if the
>> > AMR is
>> >     unlocked (to save a costly SPR read), and if so, locking the
>> > AMR in
>> >     the exception entry path and unlocking it on the way out.
>> >
>> >     - and gets context switched out, goes into a path that locks
>> > the AMR,
>> >     then context switches back, access will be disabled and will
>> > fault.
>> >     As a result, I context switch the AMR between tasks as if it
>> > was used
>> >     by userspace like hash (which already implements this).
>> >
>> > Another consideration is use of the isync instruction.  Without an
>> > isync
>> > following the mtspr instruction, there is no guarantee that the
>> > change
>> > takes effect.  The issue is that isync is very slow, and so I tried
>> > to
>> > avoid them wherever necessary.  In this series, the only place an
>> > isync
>> > gets used is after *unlocking* the AMR, because if an access takes
>> > place
>> > and access is still prevented, the kernel will fault.
>> >
>> > On the flipside, a slight delay in unlocking caused by skipping an
>> > isync
>> > potentially allows a small window of vulnerability.  It is my
>> > opinion
>> > that this window is practically impossible to exploit, but if
>> > someone
>> > thinks otherwise, please do share.
>> >
>> > This series is my first attempt at POWER assembly so all feedback
>> > is very
>> > welcome.
>> >
>> > The official theme song of this series can be found here:
>> >     https://www.youtube.com/watch?v=QjTrnKAcYjE
>> >
>> > Russell Currey (5):
>> >   powerpc/64s: Guarded Userspace Access Prevention
>> >   powerpc/futex: GUAP support for futex ops
>> >   powerpc/lib: checksum GUAP support
>> >   powerpc/64s: Disable GUAP with nosmap option
>> >   powerpc/64s: Document that PPC supports nosmap
>> >
>> >  .../admin-guide/kernel-parameters.txt         |  2 +-
>> >  arch/powerpc/include/asm/exception-64e.h      |  3 +
>> >  arch/powerpc/include/asm/exception-64s.h      | 19 ++++++-
>> >  arch/powerpc/include/asm/futex.h              |  6 ++
>> >  arch/powerpc/include/asm/mmu.h                |  7 +++
>> >  arch/powerpc/include/asm/paca.h               |  3 +
>> >  arch/powerpc/include/asm/reg.h                |  1 +
>> >  arch/powerpc/include/asm/uaccess.h            | 57
>> > ++++++++++++++++---
>> >  arch/powerpc/kernel/asm-offsets.c             |  1 +
>> >  arch/powerpc/kernel/dt_cpu_ftrs.c             |  4 ++
>> >  arch/powerpc/kernel/entry_64.S                | 17 +++++-
>> >  arch/powerpc/lib/checksum_wrappers.c          |  6 +-
>> >  arch/powerpc/mm/fault.c                       |  9 +++
>> >  arch/powerpc/mm/init_64.c                     | 15 +++++
>> >  arch/powerpc/mm/pgtable-radix.c               |  2 +
>> >  arch/powerpc/mm/pkeys.c                       |  7 ++-
>> >  arch/powerpc/platforms/Kconfig.cputype        | 15 +++++
>> >  17 files changed, 158 insertions(+), 16 deletions(-)
>> >
>> > --
>> > 2.19.1
>>
>>
Russell Currey Nov. 1, 2018, 3:54 a.m. UTC | #4
On Wed, 2018-10-31 at 17:58 +0100, LEROY Christophe wrote:
> Russell Currey <ruscur@russell.cc> a écrit :
> 
> > On Fri, 2018-10-26 at 18:29 +0200, LEROY Christophe wrote:
> > > Russell Currey <ruscur@russell.cc> a écrit :
> > > 
> > > > Guarded Userspace Access Prevention is a security mechanism
> > > > that
> > > > prevents
> > > > the kernel from being able to read and write userspace
> > > > addresses
> > > > outside of
> > > > the allowed paths, most commonly copy_{to/from}_user().
> > > > 
> > > > At present, the only CPU that supports this is POWER9, and only
> > > > while using
> > > > the Radix MMU.  Privileged reads and writes cannot access user
> > > > data
> > > > when
> > > > key 0 of the AMR is set.  This is described in the "Radix Tree
> > > > Translation
> > > > Storage Protection" section of the POWER ISA as of version 3.0.
> > > 
> > > It is not right that only power9 can support that.
> > 
> > It's true that not only P9 can support it, but there are more
> > considerations under hash than radix, implementing this for radix
> > is a
> > first step.
> 
> I don't know much about hash, but I was talking about the 8xx which
> is  
> a nohash ppc32. I'll see next week if I can do something with it on  
> top of your serie.

My small brain saw the number 8 and assumed you were talking about
POWER8, I didn't know what 8xx was until now.

Working on a refactor to make things a bit more generic, and removing
the radix name and dependency from the config option.

> 
> > > The 8xx has mmu access protection registers which can serve the
> > > same
> > > purpose. Today on the 8xx kernel space is group 0 and user space
> > > is
> > > group 1. Group 0 is set to "page defined access permission" in
> > > MD_AP
> > > and MI_AP registers, and group 1 is set to "all accesses done
> > > with
> > > supervisor rights". By setting group 1 to "user and supervisor
> > > interpretation swapped" we can forbid kernel access to user space
> > > while still allowing user access to it. Then by simply changing
> > > group
> > > 1 mode at dedicated places we can lock/unlock kernel access to
> > > user
> > > space.
> > > 
> > > Could you implement something as generic as possible having that
> > > in
> > > mind for a future patch ?
> > 
> > I don't think anything in this series is particularly problematic
> > in
> > relation to future work for hash.  I am interested in doing a hash
> > implementation in future too.
> 
> I think we have to look at that carrefuly to avoid uggly ifdefs
> 
> Christophe
> 
> > - Russell
> > 
> > > Christophe
> > > 
> > > > GUAP code sets key 0 of the AMR (thus disabling accesses of
> > > > user
> > > > data)
> > > > early during boot, and only ever "unlocks" access prior to
> > > > certain
> > > > operations, like copy_{to/from}_user(), futex ops,
> > > > etc.  Setting
> > > > this does
> > > > not prevent unprivileged access, so userspace can operate fine
> > > > while access
> > > > is locked.
> > > > 
> > > > There is a performance impact, although I don't consider it
> > > > heavy.  Running
> > > > a worst-case benchmark of a 1GB copy 1 byte at a time (and thus
> > > > constant
> > > > read(1) write(1) syscalls), I found enabling GUAP to be 3.5%
> > > > slower
> > > > than
> > > > when disabled.  In most cases, the difference is
> > > > negligible.  The
> > > > main
> > > > performance impact is the mtspr instruction, which is quite
> > > > slow.
> > > > 
> > > > There are a few caveats with this series that could be improved
> > > > upon in
> > > > future.  Right now there is no saving and restoring of the AMR
> > > > value -
> > > > there is no userspace exploitation of the AMR on Radix in
> > > > POWER9,
> > > > but if
> > > > this were to change in future, saving and restoring the value
> > > > would
> > > > be
> > > > necessary.
> > > > 
> > > > No attempt to optimise cases of repeated calls - for example,
> > > > if
> > > > some
> > > > code was repeatedly calling copy_to_user() for small sizes very
> > > > frequently,
> > > > it would be slower than the equivalent of wrapping that code in
> > > > an
> > > > unlock
> > > > and lock and only having to modify the AMR once.
> > > > 
> > > > There are some interesting cases that I've attempted to handle,
> > > > such as if
> > > > the AMR is unlocked (i.e. because a copy_{to_from}_user is in
> > > > progress)...
> > > > 
> > > >     - and an exception is taken, the kernel would then be
> > > > running
> > > > with the
> > > >     AMR unlocked and freely able to access userspace again.  I
> > > > am
> > > > working
> > > >     around this by storing a flag in the PACA to indicate if
> > > > the
> > > > AMR is
> > > >     unlocked (to save a costly SPR read), and if so, locking
> > > > the
> > > > AMR in
> > > >     the exception entry path and unlocking it on the way out.
> > > > 
> > > >     - and gets context switched out, goes into a path that
> > > > locks
> > > > the AMR,
> > > >     then context switches back, access will be disabled and
> > > > will
> > > > fault.
> > > >     As a result, I context switch the AMR between tasks as if
> > > > it
> > > > was used
> > > >     by userspace like hash (which already implements this).
> > > > 
> > > > Another consideration is use of the isync instruction.  Without
> > > > an
> > > > isync
> > > > following the mtspr instruction, there is no guarantee that the
> > > > change
> > > > takes effect.  The issue is that isync is very slow, and so I
> > > > tried
> > > > to
> > > > avoid them wherever necessary.  In this series, the only place
> > > > an
> > > > isync
> > > > gets used is after *unlocking* the AMR, because if an access
> > > > takes
> > > > place
> > > > and access is still prevented, the kernel will fault.
> > > > 
> > > > On the flipside, a slight delay in unlocking caused by skipping
> > > > an
> > > > isync
> > > > potentially allows a small window of vulnerability.  It is my
> > > > opinion
> > > > that this window is practically impossible to exploit, but if
> > > > someone
> > > > thinks otherwise, please do share.
> > > > 
> > > > This series is my first attempt at POWER assembly so all
> > > > feedback
> > > > is very
> > > > welcome.
> > > > 
> > > > The official theme song of this series can be found here:
> > > >     https://www.youtube.com/watch?v=QjTrnKAcYjE
> > > > 
> > > > Russell Currey (5):
> > > >   powerpc/64s: Guarded Userspace Access Prevention
> > > >   powerpc/futex: GUAP support for futex ops
> > > >   powerpc/lib: checksum GUAP support
> > > >   powerpc/64s: Disable GUAP with nosmap option
> > > >   powerpc/64s: Document that PPC supports nosmap
> > > > 
> > > >  .../admin-guide/kernel-parameters.txt         |  2 +-
> > > >  arch/powerpc/include/asm/exception-64e.h      |  3 +
> > > >  arch/powerpc/include/asm/exception-64s.h      | 19 ++++++-
> > > >  arch/powerpc/include/asm/futex.h              |  6 ++
> > > >  arch/powerpc/include/asm/mmu.h                |  7 +++
> > > >  arch/powerpc/include/asm/paca.h               |  3 +
> > > >  arch/powerpc/include/asm/reg.h                |  1 +
> > > >  arch/powerpc/include/asm/uaccess.h            | 57
> > > > ++++++++++++++++---
> > > >  arch/powerpc/kernel/asm-offsets.c             |  1 +
> > > >  arch/powerpc/kernel/dt_cpu_ftrs.c             |  4 ++
> > > >  arch/powerpc/kernel/entry_64.S                | 17 +++++-
> > > >  arch/powerpc/lib/checksum_wrappers.c          |  6 +-
> > > >  arch/powerpc/mm/fault.c                       |  9 +++
> > > >  arch/powerpc/mm/init_64.c                     | 15 +++++
> > > >  arch/powerpc/mm/pgtable-radix.c               |  2 +
> > > >  arch/powerpc/mm/pkeys.c                       |  7 ++-
> > > >  arch/powerpc/platforms/Kconfig.cputype        | 15 +++++
> > > >  17 files changed, 158 insertions(+), 16 deletions(-)
> > > > 
> > > > --
> > > > 2.19.1
> 
>
Christophe Leroy Nov. 8, 2018, 5:52 p.m. UTC | #5
Le 01/11/2018 à 04:54, Russell Currey a écrit :
> On Wed, 2018-10-31 at 17:58 +0100, LEROY Christophe wrote:
>> Russell Currey <ruscur@russell.cc> a écrit :
>>
>>> On Fri, 2018-10-26 at 18:29 +0200, LEROY Christophe wrote:
>>>> Russell Currey <ruscur@russell.cc> a écrit :
>>>>
>>>>> Guarded Userspace Access Prevention is a security mechanism
>>>>> that
>>>>> prevents
>>>>> the kernel from being able to read and write userspace
>>>>> addresses
>>>>> outside of
>>>>> the allowed paths, most commonly copy_{to/from}_user().
>>>>>
>>>>> At present, the only CPU that supports this is POWER9, and only
>>>>> while using
>>>>> the Radix MMU.  Privileged reads and writes cannot access user
>>>>> data
>>>>> when
>>>>> key 0 of the AMR is set.  This is described in the "Radix Tree
>>>>> Translation
>>>>> Storage Protection" section of the POWER ISA as of version 3.0.
>>>>
>>>> It is not right that only power9 can support that.
>>>
>>> It's true that not only P9 can support it, but there are more
>>> considerations under hash than radix, implementing this for radix
>>> is a
>>> first step.
>>
>> I don't know much about hash, but I was talking about the 8xx which
>> is
>> a nohash ppc32. I'll see next week if I can do something with it on
>> top of your serie.
> 
> My small brain saw the number 8 and assumed you were talking about
> POWER8, I didn't know what 8xx was until now.
> 
> Working on a refactor to make things a bit more generic, and removing
> the radix name and dependency from the config option.

In signal_32.c and signal_64.c, save_user_regs() calls __put_user() to 
modify code, then calls flush_icache_range() on user addresses.

Shouldn't flush_icache_range() be performed with userspace access 
protection unlocked ?

Christophe
Benjamin Herrenschmidt Nov. 8, 2018, 8:09 p.m. UTC | #6
On Thu, 2018-11-08 at 18:52 +0100, Christophe LEROY wrote:
> 
> In signal_32.c and signal_64.c, save_user_regs() calls __put_user() to 
> modify code, then calls flush_icache_range() on user addresses.
> 
> Shouldn't flush_icache_range() be performed with userspace access 
> protection unlocked ?

Thankfully this code is pretty much never used these days...

Russell: To trigger that, you need to disable the VDSO.

This brings back the idea however of having a way to "bulk" open the
gate during the whole signal sequence...

Cheers,
Ben.