diff mbox series

elf: Fix rtld-audit trampoline for aarch64

Message ID 20221117084729.2794073-1-och95@yandex.ru
State New
Headers show
Series elf: Fix rtld-audit trampoline for aarch64 | expand

Commit Message

Vladislav Khmelevsky Nov. 17, 2022, 8:47 a.m. UTC
This patch fixes two problems with audit:
1. The DL_OFFSET_RV_VPCS offset was mixed up with DL_OFFSET_RG_VPCS,
resulting in x2 register value nulling in RG structure.
2. We need to preserve the x8 register before function call, but don't have
to save it's new value and restore it before return. Anyway the final
restore was using OFFSET_RV instead of OFFSET_RG value which is wrong (althoug doesn't affect anything).
---
 sysdeps/aarch64/dl-trampoline.S | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Comments

Adhemerval Zanella Netto Nov. 17, 2022, 6:15 p.m. UTC | #1
On 17/11/22 05:47, Vladislav Khmelevsky wrote:
> This patch fixes two problems with audit:
> 1. The DL_OFFSET_RV_VPCS offset was mixed up with DL_OFFSET_RG_VPCS,
> resulting in x2 register value nulling in RG structure.
> 2. We need to preserve the x8 register before function call, but don't have
> to save it's new value and restore it before return. Anyway the final
> restore was using OFFSET_RV instead of OFFSET_RG value which is wrong (althoug doesn't affect anything).

Patch looks ok, although I think currently the ABI only uses x0 and/or x1
to return value (for __int128_t for instance). So I think it should not
be a user-visible issue (at least tst-audit26 does check that lr_vpcs
and lrv_vpcs are zeroed). Are you seeing any issue with current code?
If so could you open a bug please?

> ---
>  sysdeps/aarch64/dl-trampoline.S | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/sysdeps/aarch64/dl-trampoline.S b/sysdeps/aarch64/dl-trampoline.S
> index 909b208578..d66f0b9c45 100644
> --- a/sysdeps/aarch64/dl-trampoline.S
> +++ b/sysdeps/aarch64/dl-trampoline.S
> @@ -298,12 +298,11 @@ _dl_runtime_profile:
>  	stp	x2, x3, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*1]
>  	stp	x4, x5, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*2]
>  	stp	x6, x7, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*3]
> -	str	x8,     [x29, #OFFSET_RG + DL_OFFSET_RG_X0 + 16*4]
>  	stp	q0, q1, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*0]
>  	stp	q2, q3, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*1]
>  	stp	q4, q5, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*2]
>  	stp	q6, q7, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*3]
> -	str	xzr,    [X29, #OFFSET_RV + DL_OFFSET_RG_VPCS]
> +	str	xzr,    [X29, #OFFSET_RV + DL_OFFSET_RV_VPCS]
>  
>  	/* Setup call to pltexit  */
>  	ldp	x0, x1, [x29, #OFFSET_SAVED_CALL_X0]
> @@ -315,7 +314,6 @@ _dl_runtime_profile:
>  	ldp	x2, x3, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*1]
>  	ldp	x4, x5, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*2]
>  	ldp	x6, x7, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*3]
> -	ldr	x8,     [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*4]
>  	ldp	q0, q1, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*0]
>  	ldp	q2, q3, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*1]
>  	ldp	q4, q5, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*2]
Vladislav Khmelevsky Nov. 17, 2022, 6:23 p.m. UTC | #2
Thank you for reviewing!
Yes, it is true tat inly x0/x1 are used as return register. But I have a specific audit library code that was storing some metadata in a free registers during plt entrer and reading it during plt exit :) As for a normal use cases both problems doesn't really affect anything.

> 17 нояб. 2022 г., в 22:15, Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> написал(а):
> 
> 
> 
> On 17/11/22 05:47, Vladislav Khmelevsky wrote:
>> This patch fixes two problems with audit:
>> 1. The DL_OFFSET_RV_VPCS offset was mixed up with DL_OFFSET_RG_VPCS,
>> resulting in x2 register value nulling in RG structure.
>> 2. We need to preserve the x8 register before function call, but don't have
>> to save it's new value and restore it before return. Anyway the final
>> restore was using OFFSET_RV instead of OFFSET_RG value which is wrong (althoug doesn't affect anything).
> 
> Patch looks ok, although I think currently the ABI only uses x0 and/or x1
> to return value (for __int128_t for instance). So I think it should not
> be a user-visible issue (at least tst-audit26 does check that lr_vpcs
> and lrv_vpcs are zeroed). Are you seeing any issue with current code?
> If so could you open a bug please?
> 
>> ---
>> sysdeps/aarch64/dl-trampoline.S | 4 +---
>> 1 file changed, 1 insertion(+), 3 deletions(-)
>> 
>> diff --git a/sysdeps/aarch64/dl-trampoline.S b/sysdeps/aarch64/dl-trampoline.S
>> index 909b208578..d66f0b9c45 100644
>> --- a/sysdeps/aarch64/dl-trampoline.S
>> +++ b/sysdeps/aarch64/dl-trampoline.S
>> @@ -298,12 +298,11 @@ _dl_runtime_profile:
>> 	stp	x2, x3, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*1]
>> 	stp	x4, x5, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*2]
>> 	stp	x6, x7, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*3]
>> -	str	x8,     [x29, #OFFSET_RG + DL_OFFSET_RG_X0 + 16*4]
>> 	stp	q0, q1, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*0]
>> 	stp	q2, q3, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*1]
>> 	stp	q4, q5, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*2]
>> 	stp	q6, q7, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*3]
>> -	str	xzr,    [X29, #OFFSET_RV + DL_OFFSET_RG_VPCS]
>> +	str	xzr,    [X29, #OFFSET_RV + DL_OFFSET_RV_VPCS]
>> 
>> 	/* Setup call to pltexit  */
>> 	ldp	x0, x1, [x29, #OFFSET_SAVED_CALL_X0]
>> @@ -315,7 +314,6 @@ _dl_runtime_profile:
>> 	ldp	x2, x3, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*1]
>> 	ldp	x4, x5, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*2]
>> 	ldp	x6, x7, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*3]
>> -	ldr	x8,     [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*4]
>> 	ldp	q0, q1, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*0]
>> 	ldp	q2, q3, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*1]
>> 	ldp	q4, q5, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*2]
Adhemerval Zanella Netto Nov. 17, 2022, 6:36 p.m. UTC | #3
On 17/11/22 15:23, Vladislav Khmelevsky wrote:
> Thank you for reviewing!
> Yes, it is true tat inly x0/x1 are used as return register. But I have a specific audit library code that was storing some metadata in a free registers during plt entrer and reading it during plt exit :) As for a normal use cases both problems doesn't really affect anything.
> 

Fair enough, although we might want to backport this.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>

>> 17 нояб. 2022 г., в 22:15, Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> написал(а):
>>
>>
>>
>> On 17/11/22 05:47, Vladislav Khmelevsky wrote:
>>> This patch fixes two problems with audit:
>>> 1. The DL_OFFSET_RV_VPCS offset was mixed up with DL_OFFSET_RG_VPCS,
>>> resulting in x2 register value nulling in RG structure.
>>> 2. We need to preserve the x8 register before function call, but don't have
>>> to save it's new value and restore it before return. Anyway the final
>>> restore was using OFFSET_RV instead of OFFSET_RG value which is wrong (althoug doesn't affect anything).
>>
>> Patch looks ok, although I think currently the ABI only uses x0 and/or x1
>> to return value (for __int128_t for instance). So I think it should not
>> be a user-visible issue (at least tst-audit26 does check that lr_vpcs
>> and lrv_vpcs are zeroed). Are you seeing any issue with current code?
>> If so could you open a bug please?
>>
>>> ---
>>> sysdeps/aarch64/dl-trampoline.S | 4 +---
>>> 1 file changed, 1 insertion(+), 3 deletions(-)
>>>
>>> diff --git a/sysdeps/aarch64/dl-trampoline.S b/sysdeps/aarch64/dl-trampoline.S
>>> index 909b208578..d66f0b9c45 100644
>>> --- a/sysdeps/aarch64/dl-trampoline.S
>>> +++ b/sysdeps/aarch64/dl-trampoline.S
>>> @@ -298,12 +298,11 @@ _dl_runtime_profile:
>>> 	stp	x2, x3, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*1]
>>> 	stp	x4, x5, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*2]
>>> 	stp	x6, x7, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*3]
>>> -	str	x8,     [x29, #OFFSET_RG + DL_OFFSET_RG_X0 + 16*4]
>>> 	stp	q0, q1, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*0]
>>> 	stp	q2, q3, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*1]
>>> 	stp	q4, q5, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*2]
>>> 	stp	q6, q7, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*3]
>>> -	str	xzr,    [X29, #OFFSET_RV + DL_OFFSET_RG_VPCS]
>>> +	str	xzr,    [X29, #OFFSET_RV + DL_OFFSET_RV_VPCS]
>>>
>>> 	/* Setup call to pltexit  */
>>> 	ldp	x0, x1, [x29, #OFFSET_SAVED_CALL_X0]
>>> @@ -315,7 +314,6 @@ _dl_runtime_profile:
>>> 	ldp	x2, x3, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*1]
>>> 	ldp	x4, x5, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*2]
>>> 	ldp	x6, x7, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*3]
>>> -	ldr	x8,     [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*4]
>>> 	ldp	q0, q1, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*0]
>>> 	ldp	q2, q3, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*1]
>>> 	ldp	q4, q5, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*2]
>
Vladislav Khmelevsky Nov. 17, 2022, 6:51 p.m. UTC | #4
Thanks! But I would need some guidance from you if you don't mind. First of all I don't have write access to the repo, I would very appreciate if you or somebody would help me to merge this :)

> 17 нояб. 2022 г., в 22:36, Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> написал(а):
> 
> 
> 
> On 17/11/22 15:23, Vladislav Khmelevsky wrote:
>> Thank you for reviewing!
>> Yes, it is true tat inly x0/x1 are used as return register. But I have a specific audit library code that was storing some metadata in a free registers during plt entrer and reading it during plt exit :) As for a normal use cases both problems doesn't really affect anything.
>> 
> 
> Fair enough, although we might want to backport this.
> 
> Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
> 
>>> 17 нояб. 2022 г., в 22:15, Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> написал(а):
>>> 
>>> 
>>> 
>>> On 17/11/22 05:47, Vladislav Khmelevsky wrote:
>>>> This patch fixes two problems with audit:
>>>> 1. The DL_OFFSET_RV_VPCS offset was mixed up with DL_OFFSET_RG_VPCS,
>>>> resulting in x2 register value nulling in RG structure.
>>>> 2. We need to preserve the x8 register before function call, but don't have
>>>> to save it's new value and restore it before return. Anyway the final
>>>> restore was using OFFSET_RV instead of OFFSET_RG value which is wrong (althoug doesn't affect anything).
>>> 
>>> Patch looks ok, although I think currently the ABI only uses x0 and/or x1
>>> to return value (for __int128_t for instance). So I think it should not
>>> be a user-visible issue (at least tst-audit26 does check that lr_vpcs
>>> and lrv_vpcs are zeroed). Are you seeing any issue with current code?
>>> If so could you open a bug please?
>>> 
>>>> ---
>>>> sysdeps/aarch64/dl-trampoline.S | 4 +---
>>>> 1 file changed, 1 insertion(+), 3 deletions(-)
>>>> 
>>>> diff --git a/sysdeps/aarch64/dl-trampoline.S b/sysdeps/aarch64/dl-trampoline.S
>>>> index 909b208578..d66f0b9c45 100644
>>>> --- a/sysdeps/aarch64/dl-trampoline.S
>>>> +++ b/sysdeps/aarch64/dl-trampoline.S
>>>> @@ -298,12 +298,11 @@ _dl_runtime_profile:
>>>> 	stp	x2, x3, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*1]
>>>> 	stp	x4, x5, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*2]
>>>> 	stp	x6, x7, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*3]
>>>> -	str	x8,     [x29, #OFFSET_RG + DL_OFFSET_RG_X0 + 16*4]
>>>> 	stp	q0, q1, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*0]
>>>> 	stp	q2, q3, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*1]
>>>> 	stp	q4, q5, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*2]
>>>> 	stp	q6, q7, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*3]
>>>> -	str	xzr,    [X29, #OFFSET_RV + DL_OFFSET_RG_VPCS]
>>>> +	str	xzr,    [X29, #OFFSET_RV + DL_OFFSET_RV_VPCS]
>>>> 
>>>> 	/* Setup call to pltexit  */
>>>> 	ldp	x0, x1, [x29, #OFFSET_SAVED_CALL_X0]
>>>> @@ -315,7 +314,6 @@ _dl_runtime_profile:
>>>> 	ldp	x2, x3, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*1]
>>>> 	ldp	x4, x5, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*2]
>>>> 	ldp	x6, x7, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*3]
>>>> -	ldr	x8,     [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*4]
>>>> 	ldp	q0, q1, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*0]
>>>> 	ldp	q2, q3, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*1]
>>>> 	ldp	q4, q5, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*2]
>>
Carlos O'Donell Dec. 2, 2022, 6:20 p.m. UTC | #5
On 11/17/22 13:51, Vladislav Khmelevsky via Libc-alpha wrote:
> Thanks! But I would need some guidance from you if you don't mind.
> First of all I don't have write access to the repo, I would very
> appreciate if you or somebody would help me to merge this :)

I was doing some review of rtld-audit changes we might want to backport, and
I just wanted to close the loop here that this is committed to the glibc
development branch as of November 21st and will be in glibc 2.37, and has
already been backported all the way to glibc 2.34 (which introduced the key
fixes for aarch64 in rtld-audit).

We could have written a test case for this, but it's an odd testcase
because it's about ensuring we touch minimal register state as we return
through the pltexit case.

Sending this email mostly for my own records so I remember what we did.
diff mbox series

Patch

diff --git a/sysdeps/aarch64/dl-trampoline.S b/sysdeps/aarch64/dl-trampoline.S
index 909b208578..d66f0b9c45 100644
--- a/sysdeps/aarch64/dl-trampoline.S
+++ b/sysdeps/aarch64/dl-trampoline.S
@@ -298,12 +298,11 @@  _dl_runtime_profile:
 	stp	x2, x3, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*1]
 	stp	x4, x5, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*2]
 	stp	x6, x7, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*3]
-	str	x8,     [x29, #OFFSET_RG + DL_OFFSET_RG_X0 + 16*4]
 	stp	q0, q1, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*0]
 	stp	q2, q3, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*1]
 	stp	q4, q5, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*2]
 	stp	q6, q7, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*3]
-	str	xzr,    [X29, #OFFSET_RV + DL_OFFSET_RG_VPCS]
+	str	xzr,    [X29, #OFFSET_RV + DL_OFFSET_RV_VPCS]
 
 	/* Setup call to pltexit  */
 	ldp	x0, x1, [x29, #OFFSET_SAVED_CALL_X0]
@@ -315,7 +314,6 @@  _dl_runtime_profile:
 	ldp	x2, x3, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*1]
 	ldp	x4, x5, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*2]
 	ldp	x6, x7, [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*3]
-	ldr	x8,     [x29, #OFFSET_RV + DL_OFFSET_RV_X0 + 16*4]
 	ldp	q0, q1, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*0]
 	ldp	q2, q3, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*1]
 	ldp	q4, q5, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*2]