diff mbox series

[AArch64] PR92424: Fix -fpatchable-function-entry=N,M with BTI

Message ID 00194eb0-f727-4d59-9bcd-60b05248d031@arm.com
State New
Headers show
Series [AArch64] PR92424: Fix -fpatchable-function-entry=N,M with BTI | expand

Commit Message

Szabolcs Nagy Jan. 16, 2020, 11:15 a.m. UTC
this affects the linux kernel and technically a wrong code bug
so this fix tries to be backportable (fixing all issues with
-fpatchable-function-entry=N,M will likely require new option).

gcc/ChangeLog:

2020-01-16  Szabolcs Nagy  <szabolcs.nagy@arm.com>

	PR target/92424
	* config/aarch64/aarch64.c (aarch64_declare_function_name): Emit BTI c
	if the function label is followed by a patch area.

Comments

Richard Sandiford Jan. 16, 2020, 3:12 p.m. UTC | #1
Szabolcs Nagy <Szabolcs.Nagy@arm.com> writes:
> this affects the linux kernel and technically a wrong code bug
> so this fix tries to be backportable (fixing all issues with
> -fpatchable-function-entry=N,M will likely require new option).

Even for the backportable version, I think it would be better
not to duplicate so much varasm stuff.

Perhaps instead we could:

(a) set a cfun->machine flag in aarch64_declare_function_name
    to say that we've assembled the label

(b) override print_patchable_function_entry so that it prints a BTI if
    this flag is set and the usual other conditions are true

As discussed off-list, it'd be good to avoid a second BTI after the
nops if possible.  print_patchable_function_entry should be able
to find the first instruction using get_insns and next_real_insn,
then remove it if it turns out to be a BTI.

Thanks,
Richard

>
> gcc/ChangeLog:
>
> 2020-01-16  Szabolcs Nagy  <szabolcs.nagy@arm.com>
>
> 	PR target/92424
> 	* config/aarch64/aarch64.c (aarch64_declare_function_name): Emit BTI c
> 	if the function label is followed by a patch area.
>
> From ac2a46bab60ecc80a453328b4749a951908c02c5 Mon Sep 17 00:00:00 2001
> From: Szabolcs Nagy <szabolcs.nagy@arm.com>
> Date: Wed, 15 Jan 2020 12:23:40 +0000
> Subject: [PATCH] [AArch64] PR92424: Fix -fpatchable-function-entry=N,M with
>  BTI
>
> This is a minimal workaround that emits another BTI after the function
> label if that is followed by a patch area.
>
> So before this commit -fpatchable-function-entry=3,1 with bti generates
>
>     .section __patchable_function_entries
>     .8byte .LPFE
>     .text
>   .LPFE:
>     nop
>   foo:
>     nop
>     nop
>     bti c // or paciasp
>     ...
>
> and after this commit
>
>     .section __patchable_function_entries
>     .8byte .LPFE
>     .text
>   .LPFE:
>     nop
>   foo:
>     bti c
>     nop
>     nop
>     bti c // or paciasp
>     ...
>
> There is a new bti insn in the middle of the patchable area unless M=0
> (patch area is after the new bti) or M=N (patch area is before the
> label, no new bti).
>
> Note that .cfi_startproc and the asynchronous unwind table entry label
> comes after the patch area, but whatever effect that has on the newly
> inserted bti c, it already affected the insns in the patch area.
>
> Tested on aarch64-none-linux-gnu.
>
> gcc/ChangeLog:
>
> 	PR target/92424
> 	* config/aarch64/aarch64.c (aarch64_declare_function_name): Emit BTI c
> 	if the function label is followed by a patch area.
> ---
>  gcc/config/aarch64/aarch64.c | 33 +++++++++++++++++++++++++++++++++
>  1 file changed, 33 insertions(+)
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 66e20becaf2..0394c274330 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -18079,6 +18079,39 @@ aarch64_declare_function_name (FILE *stream, const char* name,
>    /* Don't forget the type directive for ELF.  */
>    ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "function");
>    ASM_OUTPUT_LABEL (stream, name);
> +
> +  if (!aarch64_bti_enabled ()
> +      || cgraph_node::get (fndecl)->only_called_directly_p ())
> +    return;
> +
> +  /* Copy logic from varasm.c assemble_start_function
> +     to handle -fpatchable-function-entry=N,M with BTI.  */
> +  unsigned HOST_WIDE_INT patch_area_size = function_entry_patch_area_size;
> +  unsigned HOST_WIDE_INT patch_area_entry = function_entry_patch_area_start;
> +
> +  tree patchable_function_entry_attr
> +    = lookup_attribute ("patchable_function_entry", DECL_ATTRIBUTES (fndecl));
> +  if (patchable_function_entry_attr)
> +    {
> +      tree pp_val = TREE_VALUE (patchable_function_entry_attr);
> +      tree patchable_function_entry_value1 = TREE_VALUE (pp_val);
> +
> +      patch_area_size = tree_to_uhwi (patchable_function_entry_value1);
> +      patch_area_entry = 0;
> +      if (TREE_CHAIN (pp_val) != NULL_TREE)
> +	{
> +	  tree patchable_function_entry_value2
> +	    = TREE_VALUE (TREE_CHAIN (pp_val));
> +	  patch_area_entry = tree_to_uhwi (patchable_function_entry_value2);
> +	}
> +    }
> +
> +  if (patch_area_entry > patch_area_size)
> +    patch_area_entry = 0;
> +
> +  /* Emit a BTI c if a patch area comes after the function label.  */
> +  if (patch_area_size > patch_area_entry)
> +    asm_fprintf (stream, "\thint\t34 // bti c\n");
>  }
>  
>  /* Implement ASM_OUTPUT_DEF_FROM_DECLS.  Output .variant_pcs for aliases.  */
Li, Pan2 via Gcc-patches Jan. 19, 2020, 8:53 a.m. UTC | #2
On 2020-01-16, Richard Sandiford wrote:
>Szabolcs Nagy <Szabolcs.Nagy@arm.com> writes:
>> this affects the linux kernel and technically a wrong code bug
>> so this fix tries to be backportable (fixing all issues with
>> -fpatchable-function-entry=N,M will likely require new option).
>
>Even for the backportable version, I think it would be better
>not to duplicate so much varasm stuff.
>
>Perhaps instead we could:
>
>(a) set a cfun->machine flag in aarch64_declare_function_name
>    to say that we've assembled the label
>
>(b) override print_patchable_function_entry so that it prints a BTI if
>    this flag is set and the usual other conditions are true
>
>As discussed off-list, it'd be good to avoid a second BTI after the
>nops if possible.  print_patchable_function_entry should be able
>to find the first instruction using get_insns and next_real_insn,
>then remove it if it turns out to be a BTI.
>
>Thanks,
>Richard

It'd be great to have some tests, e.g.

1. -fpatchable-function-entry=0 -mbranch-protection=bti
2. -fpatchable-function-entry=2 -mbranch-protection=bti

I have updated clang to emit   `.Lfunc_begin0: bti c; nop; nop` for case 2.
The __patchable_function_entries entry points to .Lfunc_begin0 (bti c).

(The change is not included in the llvm 10.0 branch.)

I think we can address all except the SHF_WRITE issue in a backward
compatible manner. For some items listed in
https://gcc.gnu.org/ml/gcc/2020-01/msg00067.html , the fixes require GNU as
(https://sourceware.org/bugzilla/show_bug.cgi?id=25380 https://sourceware.org/bugzilla/show_bug.cgi?id=25381)
and ld features (https://sourceware.org/ml/binutils/2019-11/msg00266.html).

>>
>> gcc/ChangeLog:
>>
>> 2020-01-16  Szabolcs Nagy  <szabolcs.nagy@arm.com>
>>
>> 	PR target/92424
>> 	* config/aarch64/aarch64.c (aarch64_declare_function_name): Emit BTI c
>> 	if the function label is followed by a patch area.
>>
>> From ac2a46bab60ecc80a453328b4749a951908c02c5 Mon Sep 17 00:00:00 2001
>> From: Szabolcs Nagy <szabolcs.nagy@arm.com>
>> Date: Wed, 15 Jan 2020 12:23:40 +0000
>> Subject: [PATCH] [AArch64] PR92424: Fix -fpatchable-function-entry=N,M with
>>  BTI
>>
>> This is a minimal workaround that emits another BTI after the function
>> label if that is followed by a patch area.
>>
>> So before this commit -fpatchable-function-entry=3,1 with bti generates
>>
>>     .section __patchable_function_entries
>>     .8byte .LPFE
>>     .text
>>   .LPFE:
>>     nop
>>   foo:
>>     nop
>>     nop
>>     bti c // or paciasp
>>     ...
>>
>> and after this commit
>>
>>     .section __patchable_function_entries
>>     .8byte .LPFE
>>     .text
>>   .LPFE:
>>     nop
>>   foo:
>>     bti c
>>     nop
>>     nop
>>     bti c // or paciasp
>>     ...
>>
>> There is a new bti insn in the middle of the patchable area unless M=0
>> (patch area is after the new bti) or M=N (patch area is before the
>> label, no new bti).
>>
>> Note that .cfi_startproc and the asynchronous unwind table entry label
>> comes after the patch area, but whatever effect that has on the newly
>> inserted bti c, it already affected the insns in the patch area.
>>
>> Tested on aarch64-none-linux-gnu.
>>
>> gcc/ChangeLog:
>>
>> 	PR target/92424
>> 	* config/aarch64/aarch64.c (aarch64_declare_function_name): Emit BTI c
>> 	if the function label is followed by a patch area.
>> ---
>>  gcc/config/aarch64/aarch64.c | 33 +++++++++++++++++++++++++++++++++
>>  1 file changed, 33 insertions(+)
>>
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 66e20becaf2..0394c274330 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -18079,6 +18079,39 @@ aarch64_declare_function_name (FILE *stream, const char* name,
>>    /* Don't forget the type directive for ELF.  */
>>    ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "function");
>>    ASM_OUTPUT_LABEL (stream, name);
>> +
>> +  if (!aarch64_bti_enabled ()
>> +      || cgraph_node::get (fndecl)->only_called_directly_p ())
>> +    return;
>> +
>> +  /* Copy logic from varasm.c assemble_start_function
>> +     to handle -fpatchable-function-entry=N,M with BTI.  */
>> +  unsigned HOST_WIDE_INT patch_area_size = function_entry_patch_area_size;
>> +  unsigned HOST_WIDE_INT patch_area_entry = function_entry_patch_area_start;
>> +
>> +  tree patchable_function_entry_attr
>> +    = lookup_attribute ("patchable_function_entry", DECL_ATTRIBUTES (fndecl));
>> +  if (patchable_function_entry_attr)
>> +    {
>> +      tree pp_val = TREE_VALUE (patchable_function_entry_attr);
>> +      tree patchable_function_entry_value1 = TREE_VALUE (pp_val);
>> +
>> +      patch_area_size = tree_to_uhwi (patchable_function_entry_value1);
>> +      patch_area_entry = 0;
>> +      if (TREE_CHAIN (pp_val) != NULL_TREE)
>> +	{
>> +	  tree patchable_function_entry_value2
>> +	    = TREE_VALUE (TREE_CHAIN (pp_val));
>> +	  patch_area_entry = tree_to_uhwi (patchable_function_entry_value2);
>> +	}
>> +    }
>> +
>> +  if (patch_area_entry > patch_area_size)
>> +    patch_area_entry = 0;
>> +
>> +  /* Emit a BTI c if a patch area comes after the function label.  */
>> +  if (patch_area_size > patch_area_entry)
>> +    asm_fprintf (stream, "\thint\t34 // bti c\n");
>>  }
>>
>>  /* Implement ASM_OUTPUT_DEF_FROM_DECLS.  Output .variant_pcs for aliases.  */
Li, Pan2 via Gcc-patches Jan. 19, 2020, 6:26 p.m. UTC | #3
On 2020-01-19, Fāng-ruì Sòng wrote:
>On 2020-01-16, Richard Sandiford wrote:
>>Szabolcs Nagy <Szabolcs.Nagy@arm.com> writes:
>>>this affects the linux kernel and technically a wrong code bug
>>>so this fix tries to be backportable (fixing all issues with
>>>-fpatchable-function-entry=N,M will likely require new option).
>>
>>Even for the backportable version, I think it would be better
>>not to duplicate so much varasm stuff.
>>
>>Perhaps instead we could:
>>
>>(a) set a cfun->machine flag in aarch64_declare_function_name
>>   to say that we've assembled the label
>>
>>(b) override print_patchable_function_entry so that it prints a BTI if
>>   this flag is set and the usual other conditions are true
>>
>>As discussed off-list, it'd be good to avoid a second BTI after the
>>nops if possible.  print_patchable_function_entry should be able
>>to find the first instruction using get_insns and next_real_insn,
>>then remove it if it turns out to be a BTI.
>>
>>Thanks,
>>Richard
>
>It'd be great to have some tests, e.g.
>
>1. -fpatchable-function-entry=0 -mbranch-protection=bti
>2. -fpatchable-function-entry=2 -mbranch-protection=bti
>
>I have updated clang to emit   `.Lfunc_begin0: bti c; nop; nop` for case 2.
>The __patchable_function_entries entry points to .Lfunc_begin0 (bti c).
>
>(The change is not included in the llvm 10.0 branch.)
>
>I think we can address all except the SHF_WRITE issue in a backward
>compatible manner. For some items listed in
>https://gcc.gnu.org/ml/gcc/2020-01/msg00067.html , the fixes require GNU as
>(https://sourceware.org/bugzilla/show_bug.cgi?id=25380 https://sourceware.org/bugzilla/show_bug.cgi?id=25381)
>and ld features (https://sourceware.org/ml/binutils/2019-11/msg00266.html).

Hope someone can take a look at
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00271.html
Szabolcs Nagy Jan. 20, 2020, 10:53 a.m. UTC | #4
On 19/01/2020 08:53, Fāng-ruì Sòng via gcc-patches wrote:
> It'd be great to have some tests, e.g.
> 
> 1. -fpatchable-function-entry=0 -mbranch-protection=bti
> 2. -fpatchable-function-entry=2 -mbranch-protection=bti
> 
> I have updated clang to emit   `.Lfunc_begin0: bti c; nop; nop` for case 2.
> The __patchable_function_entries entry points to .Lfunc_begin0 (bti c).
> 
> (The change is not included in the llvm 10.0 branch.)

i have to ask some linux developers which way they prefer:

e.g. -fpatchable-function-entry=3,1 is

 .section __patchable_function_entries
 .8byte .Lpatch
 .text
.Lpatch:
  nop
func:
  nop
  nop
  ...

with bti the code will be emitted as:

.Lpatch:
  nop
func:
  bti c
  nop
  nop
  ...

but e.g. -fpatchable-function-entry=2,0 has two reasonable
approaches with bti:

(a)

func:
.Lpatch:
  bti c
  nop
  nop
  ...

(b)

func:
  bti c
.Lpatch:
  nop
  nop
  ...

i think (a) is more consistent across fancy N,M settings
(bti is always included into the patch area, user needs
to know to skip it), but (b) is more compatible with
existing usage (M=0 is i believe the common setting and
with that or with M=N the patching code does not need to
know about bti, existing patching code works unmodified).

current llvm fix does (a), proposed gcc fix does (b),
i guess we have to pick one.

(solution (a) is a bit messier in gcc, because currently
there is no target hook between the emission of .Lpatch
and the nops, i avoided refactoring that code to get a
backend only fix that is easy to backport, but (a) is
possible to do with a bit more changes.)
Mark Rutland Jan. 21, 2020, 11:34 a.m. UTC | #5
Hi Szabolcs,

Answers from a linux dev perspective below.

On Mon, Jan 20, 2020 at 10:53:33AM +0000, Szabolcs Nagy wrote:
> On 19/01/2020 08:53, Fāng-ruì Sòng via gcc-patches wrote:
> > It'd be great to have some tests, e.g.
> > 
> > 1. -fpatchable-function-entry=0 -mbranch-protection=bti
> > 2. -fpatchable-function-entry=2 -mbranch-protection=bti
> > 
> > I have updated clang to emit   `.Lfunc_begin0: bti c; nop; nop` for case 2.
> > The __patchable_function_entries entry points to .Lfunc_begin0 (bti c).
> > 
> > (The change is not included in the llvm 10.0 branch.)
> 
> i have to ask some linux developers which way they prefer:
> 
> e.g. -fpatchable-function-entry=3,1 is
> 
>  .section __patchable_function_entries
>  .8byte .Lpatch
>  .text
> .Lpatch:
>   nop
> func:
>   nop
>   nop
>   ...
> 
> with bti the code will be emitted as:
> 
> .Lpatch:
>   nop
> func:
>   bti c
>   nop
>   nop
>   ...

That looks good to me.

> but e.g. -fpatchable-function-entry=2,0 has two reasonable
> approaches with bti:
> 
> (a)
> 
> func:
> .Lpatch:
>   bti c
>   nop
>   nop
>   ...
> 
> (b)
> 
> func:
>   bti c
> .Lpatch:
>   nop
>   nop
>   ...

I had assumed (b); that means that .Lpatch consistently points to the
first NOP. To my mental model, that seems more consistent than (a).

However, I can work with either so long as it's consistent.

> i think (a) is more consistent across fancy N,M settings
> (bti is always included into the patch area, user needs
> to know to skip it), but (b) is more compatible with
> existing usage (M=0 is i believe the common setting and
> with that or with M=N the patching code does not need to
> know about bti, existing patching code works unmodified).
> 
> current llvm fix does (a), proposed gcc fix does (b),
> i guess we have to pick one.
>
> (solution (a) is a bit messier in gcc, because currently
> there is no target hook between the emission of .Lpatch
> and the nops, i avoided refactoring that code to get a
> backend only fix that is easy to backport, but (a) is
> possible to do with a bit more changes.)

As above, my weak preference is (b), but I can work with either. I just
need the behaviour to be consistent.

Was there a rationale for LLVM choosing (a) rather than (b), e.g. was
that also ease of implementation? If there isn't a rationale otherwise,
and if LLVM could also do (b), that would be nice from my PoV.

How big is "a bit more changes" for GCC?

Thanks,
Mark.
Szabolcs Nagy Jan. 21, 2020, 12:50 p.m. UTC | #6
On 21/01/2020 11:34, Mark Rutland wrote:
> Hi Szabolcs,
> 
> Answers from a linux dev perspective below.

thanks.

> On Mon, Jan 20, 2020 at 10:53:33AM +0000, Szabolcs Nagy wrote:
>> i have to ask some linux developers which way they prefer:
>>
>> e.g. -fpatchable-function-entry=3,1 is
>>
>>  .section __patchable_function_entries
>>  .8byte .Lpatch
>>  .text
>> .Lpatch:
>>   nop
>> func:
>>   nop
>>   nop
>>   ...
>>
>> with bti the code will be emitted as:
>>
>> .Lpatch:
>>   nop
>> func:
>>   bti c
>>   nop
>>   nop
>>   ...
> 
> That looks good to me.
> 
>> but e.g. -fpatchable-function-entry=2,0 has two reasonable
>> approaches with bti:
>>
>> (a)
>>
>> func:
>> .Lpatch:
>>   bti c
>>   nop
>>   nop
>>   ...
>>
>> (b)
>>
>> func:
>>   bti c
>> .Lpatch:
>>   nop
>>   nop
>>   ...
> 
> I had assumed (b); that means that .Lpatch consistently points to the
> first NOP. To my mental model, that seems more consistent than (a).
> 
> However, I can work with either so long as it's consistent.
...
> As above, my weak preference is (b), but I can work with either. I just
> need the behaviour to be consistent.
> 
> Was there a rationale for LLVM choosing (a) rather than (b), e.g. was
> that also ease of implementation? If there isn't a rationale otherwise,
> and if LLVM could also do (b), that would be nice from my PoV.
> 
> How big is "a bit more changes" for GCC?

".Lpatch: nop; nop" is generated by generic code now which
a backend can override, but then i have to copy that logic
(~30 lines) to add something after the .Lpatch, or i have
to change generic code to have a new hook after the .Lpatch:.

to provide more argument for (b) and expand on consistency:

right now bti is not emitted if a function is not called
indirectly (compiler knows this for some local functions)
or bti can be turned off by a target attribute per function.

so a mix of bti and non-bti functions are possible, so
the user patching code has to deal with this (e.g. read
the instructions before patching etc)

in the M>0, M!=N case bti is emitted in the middle of a
patch area (e.g. 3,1 case above), but some functions will
not have bti. i can make it so that if patchable function
entry is requested then bti is always added, irrespective
of any other option or attribute, but that seems to be a
big hammer, without such big hammer solution users will
have a difficult time to deal with the additional bti.

if the patch area is entirely before the function label,
then no changes are needed (M=N), if patch area is after
the function label (M=0) then using fix (b) would allow
simple user code. in other cases user code will be
complicated no matter what. so if linux uses M=0 then i
think (b) is preferable.

hope this helps.
Li, Pan2 via Gcc-patches Jan. 21, 2020, 5:51 p.m. UTC | #7
On 2020-01-21, Szabolcs Nagy wrote:
>On 21/01/2020 11:34, Mark Rutland wrote:
>> Hi Szabolcs,
>>
>> Answers from a linux dev perspective below.
>
>thanks.
>
>> On Mon, Jan 20, 2020 at 10:53:33AM +0000, Szabolcs Nagy wrote:
>>> i have to ask some linux developers which way they prefer:
>>>
>>> e.g. -fpatchable-function-entry=3,1 is
>>>
>>>  .section __patchable_function_entries
>>>  .8byte .Lpatch
>>>  .text
>>> .Lpatch:
>>>   nop
>>> func:
>>>   nop
>>>   nop
>>>   ...
>>>
>>> with bti the code will be emitted as:
>>>
>>> .Lpatch:
>>>   nop
>>> func:
>>>   bti c
>>>   nop
>>>   nop
>>>   ...
>>
>> That looks good to me.
>>
>>> but e.g. -fpatchable-function-entry=2,0 has two reasonable
>>> approaches with bti:
>>>
>>> (a)
>>>
>>> func:
>>> .Lpatch:
>>>   bti c
>>>   nop
>>>   nop
>>>   ...
>>>
>>> (b)
>>>
>>> func:
>>>   bti c
>>> .Lpatch:
>>>   nop
>>>   nop
>>>   ...
>>
>> I had assumed (b); that means that .Lpatch consistently points to the
>> first NOP. To my mental model, that seems more consistent than (a).
>>
>> However, I can work with either so long as it's consistent.
>...
>> As above, my weak preference is (b), but I can work with either. I just
>> need the behaviour to be consistent.
>>
>> Was there a rationale for LLVM choosing (a) rather than (b), e.g. was
>> that also ease of implementation? If there isn't a rationale otherwise,
>> and if LLVM could also do (b), that would be nice from my PoV.
>>
>> How big is "a bit more changes" for GCC?
>
>".Lpatch: nop; nop" is generated by generic code now which
>a backend can override, but then i have to copy that logic
>(~30 lines) to add something after the .Lpatch, or i have
>to change generic code to have a new hook after the .Lpatch:.
>
>to provide more argument for (b) and expand on consistency:
>
>right now bti is not emitted if a function is not called
>indirectly (compiler knows this for some local functions)
>or bti can be turned off by a target attribute per function.
>
>so a mix of bti and non-bti functions are possible, so
>the user patching code has to deal with this (e.g. read
>the instructions before patching etc)
>
>in the M>0, M!=N case bti is emitted in the middle of a
>patch area (e.g. 3,1 case above), but some functions will
>not have bti. i can make it so that if patchable function
>entry is requested then bti is always added, irrespective
>of any other option or attribute, but that seems to be a
>big hammer, without such big hammer solution users will
>have a difficult time to deal with the additional bti.
>
>if the patch area is entirely before the function label,
>then no changes are needed (M=N), if patch area is after
>the function label (M=0) then using fix (b) would allow
>simple user code. in other cases user code will be
>complicated no matter what. so if linux uses M=0 then i
>think (b) is preferable.
>
>hope this helps.

The Clang inconvenience is in the other way...

(My previous Clang patch series do not implement M>0.
  https://reviews.llvm.org/D73070 will add support for M>0.)

AsmPrinter (assembly printer/object file emitter) does the following:

1. Emit data before the function entry
2. Emit the function entry label and the label for __patchable_function_entries
3. Emit the function body

The function body has already been constructed before AsmPrinter. Among
late-stage machine code passes, -fpatchable-function-entry=,
-mbranch-protection= (AArch64 BTI) and -fcf-protection= (Intel Indirect
Branch Tracking) are implemented as passes which can prepend
instructions to the function body.

To do (b), step 2 needs to be split. The code generator should somehow
know the label for __patchable_function_entries is after bti
c/endbr32/endbr64.


Before forming a decision, I think we should also consider whether M>0
will be possible in the future. Taking M>0 into consideration, is (a) or
(b) more consistent?

Linux/arch/arm64/Kconfig currently uses -fpatchable-function-entry=2,0
but M>0 could be useful for other architectures (arch/parisc already uses 1,1).
Szabolcs Nagy Jan. 21, 2020, 7:05 p.m. UTC | #8
On 21/01/2020 17:51, Fāng-ruì Sòng via gcc-patches wrote:
> The Clang inconvenience is in the other way...
> 
> (My previous Clang patch series do not implement M>0.
>  https://reviews.llvm.org/D73070 will add support for M>0.)
> 
> AsmPrinter (assembly printer/object file emitter) does the following:
> 
> 1. Emit data before the function entry
> 2. Emit the function entry label and the label for __patchable_function_entries
> 3. Emit the function body
> 
> The function body has already been constructed before AsmPrinter. Among
> late-stage machine code passes, -fpatchable-function-entry=,
> -mbranch-protection= (AArch64 BTI) and -fcf-protection= (Intel Indirect
> Branch Tracking) are implemented as passes which can prepend
> instructions to the function body.
> 
> To do (b), step 2 needs to be split. The code generator should somehow
> know the label for __patchable_function_entries is after bti
> c/endbr32/endbr64.

or you can do a hack in 'emit function entry label'
(assuming there is a hook for that) so instead of
printing 'foo:' you print 'foo: bti c'

> Before forming a decision, I think we should also consider whether M>0
> will be possible in the future. Taking M>0 into consideration, is (a) or
> (b) more consistent?
> 
> Linux/arch/arm64/Kconfig currently uses -fpatchable-function-entry=2,0
> but M>0 could be useful for other architectures (arch/parisc already uses 1,1).

i think then checking for the bti is unavoidable:
some functions may miss the bti c, so simple logic
to jump over the bti at fixed offset is not enough.

but this complication is only needed if the function
label is in the middle of the patch area.
diff mbox series

Patch

From ac2a46bab60ecc80a453328b4749a951908c02c5 Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy <szabolcs.nagy@arm.com>
Date: Wed, 15 Jan 2020 12:23:40 +0000
Subject: [PATCH] [AArch64] PR92424: Fix -fpatchable-function-entry=N,M with
 BTI

This is a minimal workaround that emits another BTI after the function
label if that is followed by a patch area.

So before this commit -fpatchable-function-entry=3,1 with bti generates

    .section __patchable_function_entries
    .8byte .LPFE
    .text
  .LPFE:
    nop
  foo:
    nop
    nop
    bti c // or paciasp
    ...

and after this commit

    .section __patchable_function_entries
    .8byte .LPFE
    .text
  .LPFE:
    nop
  foo:
    bti c
    nop
    nop
    bti c // or paciasp
    ...

There is a new bti insn in the middle of the patchable area unless M=0
(patch area is after the new bti) or M=N (patch area is before the
label, no new bti).

Note that .cfi_startproc and the asynchronous unwind table entry label
comes after the patch area, but whatever effect that has on the newly
inserted bti c, it already affected the insns in the patch area.

Tested on aarch64-none-linux-gnu.

gcc/ChangeLog:

	PR target/92424
	* config/aarch64/aarch64.c (aarch64_declare_function_name): Emit BTI c
	if the function label is followed by a patch area.
---
 gcc/config/aarch64/aarch64.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 66e20becaf2..0394c274330 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -18079,6 +18079,39 @@  aarch64_declare_function_name (FILE *stream, const char* name,
   /* Don't forget the type directive for ELF.  */
   ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "function");
   ASM_OUTPUT_LABEL (stream, name);
+
+  if (!aarch64_bti_enabled ()
+      || cgraph_node::get (fndecl)->only_called_directly_p ())
+    return;
+
+  /* Copy logic from varasm.c assemble_start_function
+     to handle -fpatchable-function-entry=N,M with BTI.  */
+  unsigned HOST_WIDE_INT patch_area_size = function_entry_patch_area_size;
+  unsigned HOST_WIDE_INT patch_area_entry = function_entry_patch_area_start;
+
+  tree patchable_function_entry_attr
+    = lookup_attribute ("patchable_function_entry", DECL_ATTRIBUTES (fndecl));
+  if (patchable_function_entry_attr)
+    {
+      tree pp_val = TREE_VALUE (patchable_function_entry_attr);
+      tree patchable_function_entry_value1 = TREE_VALUE (pp_val);
+
+      patch_area_size = tree_to_uhwi (patchable_function_entry_value1);
+      patch_area_entry = 0;
+      if (TREE_CHAIN (pp_val) != NULL_TREE)
+	{
+	  tree patchable_function_entry_value2
+	    = TREE_VALUE (TREE_CHAIN (pp_val));
+	  patch_area_entry = tree_to_uhwi (patchable_function_entry_value2);
+	}
+    }
+
+  if (patch_area_entry > patch_area_size)
+    patch_area_entry = 0;
+
+  /* Emit a BTI c if a patch area comes after the function label.  */
+  if (patch_area_size > patch_area_entry)
+    asm_fprintf (stream, "\thint\t34 // bti c\n");
 }
 
 /* Implement ASM_OUTPUT_DEF_FROM_DECLS.  Output .variant_pcs for aliases.  */
-- 
2.17.1