[v2,0/3] arm64: Enable BTI for the executable as well as the interpreter

Message ID 20210604112450.13344-1-broonie@kernel.org

Message

Mark Brown June 4, 2021, 11:24 a.m. UTC
Deployments of BTI on arm64 have run into issues interacting with
systemd's MemoryDenyWriteExecute feature.  Currently, for dynamically
linked executables, the kernel only handles architecture-specific
properties like BTI for the interpreter; the expectation is that the
interpreter will then handle any properties on the main executable.
For BTI this means remapping the executable segments PROT_EXEC |
PROT_BTI.

This interacts poorly with MemoryDenyWriteExecute since that is
implemented using a seccomp filter which prevents setting PROT_EXEC on
already mapped memory and lacks the context to be able to detect that
memory is already mapped with PROT_EXEC.  This series resolves this by
handling the BTI property for both the interpreter and the main
executable.
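
As a rough illustration of why the filter cannot distinguish a harmless
re-protection from a dangerous one, here is a simplified model of the
decision it makes. The function name mdwe_allows and the two-rule
structure are assumptions made for this sketch, not systemd's actual BPF
program; the point is only that a seccomp filter sees the syscall
arguments and nothing about the current state of the mapping.

```python
# Protection bits; PROT_BTI is the arm64-specific value from the
# kernel's uapi <asm/mman.h>.
PROT_READ = 0x1
PROT_WRITE = 0x2
PROT_EXEC = 0x4
PROT_BTI = 0x10


def mdwe_allows(syscall: str, prot: int) -> bool:
    """Simplified, hypothetical model of a MemoryDenyWriteExecute-style filter.

    The filter only receives the syscall name and its arguments. It has
    no way to ask whether the target range is already executable.
    """
    if syscall == "mprotect":
        # Any request that includes PROT_EXEC is refused, even if the
        # range was mapped executable in the first place.
        return not (prot & PROT_EXEC)
    if syscall == "mmap":
        # Assumed rule: new mappings may not be writable and executable
        # at the same time.
        return not ((prot & PROT_WRITE) and (prot & PROT_EXEC))
    return True
```

Under this model the dynamic linker's
mprotect(..., PROT_READ | PROT_EXEC | PROT_BTI) on the main executable is
refused, whereas a mapping created with those bits from the start is
accepted, which is what the kernel-side handling relies on.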

This does mean that we may get more code with BTI enabled if running on
a system without BTI support in the dynamic linker.  This is expected to
be a safe configuration, and testing seems to confirm that.  It also
reduces the flexibility userspace has to disable BTI, but in cases where
problems require BTI to be disabled, it is more likely that it will need
to be disabled at the system level.

v2:
 - Add a patch dropping has_interp from arch_adjust_elf_prot()
 - Fix bisection issue with static executables on arm64 in the first
   patch.

Mark Brown (3):
  elf: Allow architectures to parse properties on the main executable
  arm64: Enable BTI for main executable as well as the interpreter
  elf: Remove has_interp property from arch_adjust_elf_prot()

 arch/arm64/include/asm/elf.h | 13 ++++++++++---
 arch/arm64/kernel/process.c  | 20 +++++++-------------
 fs/binfmt_elf.c              | 29 ++++++++++++++++++++---------
 include/linux/elf.h          |  8 +++++---
 4 files changed, 42 insertions(+), 28 deletions(-)


base-commit: c4681547bcce777daf576925a966ffa824edd09d

Comments

Jeremy Linton June 10, 2021, 4:28 p.m. UTC | #1
Hi,

On 6/4/21 6:24 AM, Mark Brown wrote:
> Deployments of BTI on arm64 have run into issues interacting with
> systemd's MemoryDenyWriteExecute feature.  Currently for dynamically
> linked executables the kernel will only handle architecture specific
> properties like BTI for the interpreter, the expectation is that the
> interpreter will then handle any properties on the main executable.
> For BTI this means remapping the executable segments PROT_EXEC |
> PROT_BTI.
> 
> This interacts poorly with MemoryDenyWriteExecute since that is
> implemented using a seccomp filter which prevents setting PROT_EXEC on
> already mapped memory and lacks the context to be able to detect that
> memory is already mapped with PROT_EXEC.  This series resolves this by
> handling the BTI property for both the interpreter and the main
> executable.

I've got a Fedora 34 system booting in QEMU or a model with BTI enabled.
On that system I took the systemd-resolved executable, which is one of
the services with MDWE enabled, and replaced a number of the BTIs with
NOPs. I expect the service to continue to work with the Fedora or
mainline 5.13 kernel, and it does. If instead I boot with MDWE=no for the
service, it should fail to start with either of those kernels, and it does.

Thus, I expect that with this patch applied to 5.13 the service will fail
to start regardless of the state of MDWE, but it seems to continue
starting when I set MDWE=yes. Same behavior with v1, FWIW.

Of course, there is a good chance I've messed something up or I'm
missing something. I should really validate the /lib/ld-linux behavior
itself too. I guess this could just as well be a glibc issue (f34 has
glibc 2.33-5, which appears to have the re-mmap on failure patch). Either
way, systemd-resolved is an LSB PIE, with /lib/ld-linux as its
interpreter. I've not dug too deep into debugging this, because I've got a
couple of other things I need to deal with in the next couple of days, and I
strongly dislike booting a full debug+system on the model. chuckle, sorry...


Thanks,


Mark Brown June 14, 2021, 4 p.m. UTC | #2
On Thu, Jun 10, 2021 at 11:28:12AM -0500, Jeremy Linton wrote:

> Of course, there is a good chance I've messed something up or I'm missing
> something. I should really validate the /lib/ld-linux behavior itself too. I
> guess this could just as well be a glibc issue (f34 has glibc 2.33-5 which

If it were a glibc issue, that would mean that glibc somehow managed to
disable PROT_BTI after the kernel set it.  I think I've found the issue:
we just weren't actually parsing the properties on the main executable
properly.  A new version should appear shortly.
Dave Martin June 15, 2021, 3:22 p.m. UTC | #3
On Thu, Jun 10, 2021 at 11:28:12AM -0500, Jeremy Linton via Libc-alpha wrote:
> Hi,
> 
> On 6/4/21 6:24 AM, Mark Brown wrote:
> >Deployments of BTI on arm64 have run into issues interacting with
> >systemd's MemoryDenyWriteExecute feature.  Currently for dynamically
> >linked executables the kernel will only handle architecture specific
> >properties like BTI for the interpreter, the expectation is that the
> >interpreter will then handle any properties on the main executable.
> >For BTI this means remapping the executable segments PROT_EXEC |
> >PROT_BTI.
> >
> >This interacts poorly with MemoryDenyWriteExecute since that is
> >implemented using a seccomp filter which prevents setting PROT_EXEC on
> >already mapped memory and lacks the context to be able to detect that
> >memory is already mapped with PROT_EXEC.  This series resolves this by
> >handling the BTI property for both the interpreter and the main
> >executable.
> 
> I've got a Fedora34 system booting in qemu or a model with BTI enabled. On
> that system I took the systemd-resolved executable, which is one of the
> services with MDWE enabled, and replaced a number of the bti's with nops. I
> expect the service to continue to work with the fedora or mainline 5.13
> kernel and it does. If instead I boot with MDWE=no for the service, it
> should fail to start given either of those kernels, and it does.
> 
> Thus, I expect that with this patch applied to 5.13 the service will fail to
> start regardless of the state of MDWE, but it seems to continue starting
> when I set MDWE=yes. Same behavior with v1, FWIW.
> 
> Of course, there is a good chance I've messed something up or I'm missing
> something. I should really validate the /lib/ld-linux behavior itself too. I
> guess this could just as well be a glibc issue (f34 has glibc 2.33-5, which
> appears to have the re-mmap on failure patch). Either way, systemd-resolved
> is an LSB PIE, with /lib/ld-linux as its interpreter. I've not dug too deep
> into debugging this, because I've got a couple of other things I need to deal
> with in the next couple of days, and I strongly dislike booting a full
> debug+system on the model. chuckle, sorry...

[...]

If the failure we're trying to detect is that BTI is undesirably left
off for the main executable, surely replacing BTIs with NOPs will make
no difference?  The behaviour with PROT_BTI clear is strictly more
permissive than with PROT_BTI set, so I'm not sure we can test the
behaviour this way.

Maybe I'm missing something / confused myself somewhere.

Looking at /proc/<pid>/maps after the process starts up may be a more
reliable approach, to see what the actual prot value is on the main
executable's text pages.

Cheers
---Dave
Mark Brown June 15, 2021, 3:33 p.m. UTC | #4
On Tue, Jun 15, 2021 at 04:22:06PM +0100, Dave Martin wrote:
> On Thu, Jun 10, 2021 at 11:28:12AM -0500, Jeremy Linton via Libc-alpha wrote:

> > Thus, I expect that with this patch applied to 5.13 the service will fail to
> > start regardless of the state of MDWE, but it seems to continue starting
> > when I set MDWE=yes. Same behavior with v1, FWIW.

> If the failure we're trying to detect is that BTI is undesirably left
> off for the main executable, surely replacing BTIs with NOPs will make
> no difference?  The behaviour with PROT_BTI clear is strictly more
> permissive than with PROT_BTI set, so I'm not sure we can test the
> behaviour this way.

> Maybe I'm missing something / confused myself somewhere.

The issue this patch series is intended to address is that BTI gets
left off since the dynamic linker is unable to enable PROT_BTI on the
main executable.  We're looking to see that we end up with the stricter
permissions checking of BTI: with the issue present, landing pads
replaced by NOPs will not fault, but once the issue is addressed they
should start faulting.
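
The expected outcomes of the NOP-patched test can be written down as a
small truth table. This is only a sketch of the reasoning in this thread,
assuming BTI-capable hardware (or emulation) and a BTI-aware dynamic
linker; the function names are made up for illustration.

```python
def main_executable_has_bti(kernel_sets_bti: bool, mdwe: bool) -> bool:
    """Whether the main executable's text ends up mapped with PROT_BTI."""
    if kernel_sets_bti:
        # With the series applied, the kernel maps the main executable
        # with PROT_BTI itself, so MDWE is irrelevant.
        return True
    # Otherwise the dynamic linker has to mprotect() the segments, and
    # the MDWE seccomp filter refuses that call.
    return not mdwe


def nop_patched_binary_starts(kernel_sets_bti: bool, mdwe: bool) -> bool:
    """A binary whose landing pads were replaced by NOPs only keeps
    running when BTI is not being enforced on it."""
    return not main_executable_has_bti(kernel_sets_bti, mdwe)
```

So on an unpatched kernel the modified service should start with MDWE=yes
and fail with MDWE=no, and with the series applied it should fail in both
cases.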

> Looking at /proc/<pid>/maps after the process starts up may be a more
> reliable approach, to see what the actual prot value is on the main
> executable's text pages.

smaps rather than maps but yes, executable pages show up as "ex" and BTI
adds a "bt" tag in VmFlags.
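
For anyone wanting to script that check, a minimal sketch follows. It
takes the text of /proc/<pid>/smaps and reports, for each executable
mapping, whether "bt" is present in VmFlags. The function name
bti_status and the way mapping header lines are recognised are
assumptions made for this example.

```python
from typing import List, Tuple


def bti_status(smaps_text: str) -> List[Tuple[str, bool]]:
    """Return (mapping header line, has_bt) for every executable mapping."""
    results = []
    header = None
    for line in smaps_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        first = fields[0]
        # Mapping headers start with "start-end"; per-mapping attribute
        # lines start with a "Name:" key instead.
        if "-" in first and not first.endswith(":"):
            header = line
        elif first == "VmFlags:" and header is not None:
            flags = fields[1:]
            if "ex" in flags:
                results.append((header, "bt" in flags))
    return results
```

Feeding it the contents of /proc/<pid>/smaps for the running service and
looking at the entry for the main executable's text mapping should show
whether PROT_BTI was applied.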
Dave Martin June 15, 2021, 3:41 p.m. UTC | #5
On Tue, Jun 15, 2021 at 04:33:41PM +0100, Mark Brown via Libc-alpha wrote:
> On Tue, Jun 15, 2021 at 04:22:06PM +0100, Dave Martin wrote:
> > On Thu, Jun 10, 2021 at 11:28:12AM -0500, Jeremy Linton via Libc-alpha wrote:
> 
> > > Thus, I expect that with this patch applied to 5.13 the service will fail to
> > > start regardless of the state of MDWE, but it seems to continue starting
> > > when I set MDWE=yes. Same behavior with v1, FWIW.
> 
> > If the failure we're trying to detect is that BTI is undesirably left
> > off for the main executable, surely replacing BTIs with NOPs will make
> > no difference?  The behaviour with PROT_BTI clear is strictly more
> > permissive than with PROT_BTI set, so I'm not sure we can test the
> > behaviour this way.
> 
> > Maybe I'm missing something / confused myself somewhere.
> 
> The issue this patch series is intended to address is that BTI gets
> left off since the dynamic linker is unable to enable PROT_BTI on the
> main executable.  We're looking to see that we end up with the stricter
> permissions checking of BTI, with the issue present landing pads
> replaced by NOPs will not fault but once the issue is addressed they
> should start faulting.

Ah, right -- I got the test backwards in my head.  Yes, that sounds
reasonable.

> > Looking at /proc/<pid>/maps after the process starts up may be a more
> > reliable approach, to see what the actual prot value is on the main
> > executable's text pages.
> 
> smaps rather than maps but yes, executable pages show up as "ex" and BTI
> adds a "bt" tag in VmFlags.

Fumbled that -- yes, I meant smaps!

Ignore me...

Cheers
---Dave
Jeremy Linton June 16, 2021, 5:12 a.m. UTC | #6
Hi,

On 6/15/21 10:41 AM, Dave Martin wrote:
> On Tue, Jun 15, 2021 at 04:33:41PM +0100, Mark Brown via Libc-alpha wrote:
>> On Tue, Jun 15, 2021 at 04:22:06PM +0100, Dave Martin wrote:
>>> On Thu, Jun 10, 2021 at 11:28:12AM -0500, Jeremy Linton via Libc-alpha wrote:
>>
>>>> Thus, I expect that with this patch applied to 5.13 the service will fail to
>>>> start regardless of the state of MDWE, but it seems to continue starting
>>>> when I set MDWE=yes. Same behavior with v1, FWIW.
>>
>>> If the failure we're trying to detect is that BTI is undesirably left
>>> off for the main executable, surely replacing BTIs with NOPs will make
>>> no difference?  The behaviour with PROT_BTI clear is strictly more
>>> permissive than with PROT_BTI set, so I'm not sure we can test the
>>> behaviour this way.
>>
>>> Maybe I'm missing something / confused myself somewhere.
>>
>> The issue this patch series is intended to address is that BTI gets
>> left off since the dynamic linker is unable to enable PROT_BTI on the
>> main executable.  We're looking to see that we end up with the stricter
>> permissions checking of BTI, with the issue present landing pads
>> replaced by NOPs will not fault but once the issue is addressed they
>> should start faulting.
> 
> Ah, right -- I got the test backwards in my head.  Yes, that sounds
> reasonable.

Yes, the good thing about doing both the success and failure cases
rather than just checking smaps is that one can be assured the emulation
environment and all the pieces are working correctly, not just the mappings.


Anyway, it looks like v3 is behaving as expected; I'm going to let it
run a few more tests and presumably post a tested-by on the set tomorrow.


Thanks,
