diff mbox series

powerpc/ftrace: fix syscall tracing on PPC64_ELF_ABI_V1

Message ID 20221201161442.2127231-1-mjeanson@efficios.com (mailing list archive)
State Accepted
Commit ad050d2390fccb22aa3e6f65e11757ce7a5a7ca5
Headers show
Series powerpc/ftrace: fix syscall tracing on PPC64_ELF_ABI_V1 | expand

Checks

Context Check Description
snowpatch_ozlabs/github-powerpc_ppctests success Successfully ran 8 jobs.
snowpatch_ozlabs/github-powerpc_selftests success Successfully ran 8 jobs.
snowpatch_ozlabs/github-powerpc_sparse success Successfully ran 4 jobs.
snowpatch_ozlabs/github-powerpc_clang success Successfully ran 6 jobs.

Commit Message

Michael Jeanson Dec. 1, 2022, 4:14 p.m. UTC
In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on
PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table
changing from their dot prefixed variant to the non-prefixed ones.

Since ftrace prefixes a dot to the syscall names when matching them to
build its syscall event list, this resulted in no syscall events being
available.

Remove the PPC64_ELF_ABI_V1 specific version of
arch_syscall_match_sym_name to have the same behavior across all powerpc
variants.

Fixes: 68b34588e202 ("powerpc/64/sycall: Implement syscall entry/exit logic in C")
Cc: stable@vger.kernel.org # v5.7+
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 arch/powerpc/include/asm/ftrace.h | 12 ------------
 1 file changed, 12 deletions(-)

Comments

Michael Ellerman Dec. 5, 2022, 5:34 a.m. UTC | #1
Michael Jeanson <mjeanson@efficios.com> writes:
> In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on
> PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table
> changing from their dot prefixed variant to the non-prefixed ones.
>
> Since ftrace prefixes a dot to the syscall names when matching them to
> build its syscall event list, this resulted in no syscall events being
> available.
>
> Remove the PPC64_ELF_ABI_V1 specific version of
> arch_syscall_match_sym_name to have the same behavior across all powerpc
> variants.

This doesn't seem to work for me.

Event with it applied I still don't see anything in /sys/kernel/debug/tracing/events/syscalls

Did we break it in some other way recently?

cheers


> Fixes: 68b34588e202 ("powerpc/64/sycall: Implement syscall entry/exit logic in C")
> Cc: stable@vger.kernel.org # v5.7+
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
> Cc: Michal Suchanek <msuchanek@suse.de>
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> ---
>  arch/powerpc/include/asm/ftrace.h | 12 ------------
>  1 file changed, 12 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
> index 3cee7115441b..e3d1f377bc5b 100644
> --- a/arch/powerpc/include/asm/ftrace.h
> +++ b/arch/powerpc/include/asm/ftrace.h
> @@ -64,17 +64,6 @@ void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
>   * those.
>   */
>  #define ARCH_HAS_SYSCALL_MATCH_SYM_NAME
> -#ifdef CONFIG_PPC64_ELF_ABI_V1
> -static inline bool arch_syscall_match_sym_name(const char *sym, const char *name)
> -{
> -	/* We need to skip past the initial dot, and the __se_sys alias */
> -	return !strcmp(sym + 1, name) ||
> -		(!strncmp(sym, ".__se_sys", 9) && !strcmp(sym + 6, name)) ||
> -		(!strncmp(sym, ".ppc_", 5) && !strcmp(sym + 5, name + 4)) ||
> -		(!strncmp(sym, ".ppc32_", 7) && !strcmp(sym + 7, name + 4)) ||
> -		(!strncmp(sym, ".ppc64_", 7) && !strcmp(sym + 7, name + 4));
> -}
> -#else
>  static inline bool arch_syscall_match_sym_name(const char *sym, const char *name)
>  {
>  	return !strcmp(sym, name) ||
> @@ -83,7 +72,6 @@ static inline bool arch_syscall_match_sym_name(const char *sym, const char *name
>  		(!strncmp(sym, "ppc32_", 6) && !strcmp(sym + 6, name + 4)) ||
>  		(!strncmp(sym, "ppc64_", 6) && !strcmp(sym + 6, name + 4));
>  }
> -#endif /* CONFIG_PPC64_ELF_ABI_V1 */
>  #endif /* CONFIG_FTRACE_SYSCALLS */
>  
>  #if defined(CONFIG_PPC64) && defined(CONFIG_FUNCTION_TRACER)
> -- 
> 2.34.1
Michael Jeanson Dec. 5, 2022, 6:19 p.m. UTC | #2
On 2022-12-05 00:34, Michael Ellerman wrote:
> Michael Jeanson <mjeanson@efficios.com> writes:
>> In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on
>> PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table
>> changing from their dot prefixed variant to the non-prefixed ones.
>>
>> Since ftrace prefixes a dot to the syscall names when matching them to
>> build its syscall event list, this resulted in no syscall events being
>> available.
>>
>> Remove the PPC64_ELF_ABI_V1 specific version of
>> arch_syscall_match_sym_name to have the same behavior across all powerpc
>> variants.
> 
> This doesn't seem to work for me.
> 
> Event with it applied I still don't see anything in /sys/kernel/debug/tracing/events/syscalls
> 
> Did we break it in some other way recently?
> 
> cheers

I've just tried this change on top of v6.1-rc8 in qemu with a base config of 
'corenet32_smp_defconfig' and these options on top:

CONFIG_FTRACE=y
CONFIG_FTRACE_SYSCALLS=y

And I can trace syscalls with ftrace.

What kernel tree and config are you using?

Thanks for looking into this.

> 
> 
>> Fixes: 68b34588e202 ("powerpc/64/sycall: Implement syscall entry/exit logic in C")
>> Cc: stable@vger.kernel.org # v5.7+
>> Cc: Steven Rostedt <rostedt@goodmis.org>
>> Cc: Masami Hiramatsu <mhiramat@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Nicholas Piggin <npiggin@gmail.com>
>> Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
>> Cc: Michal Suchanek <msuchanek@suse.de>
>> Cc: linuxppc-dev@lists.ozlabs.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
>> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> ---
>>   arch/powerpc/include/asm/ftrace.h | 12 ------------
>>   1 file changed, 12 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
>> index 3cee7115441b..e3d1f377bc5b 100644
>> --- a/arch/powerpc/include/asm/ftrace.h
>> +++ b/arch/powerpc/include/asm/ftrace.h
>> @@ -64,17 +64,6 @@ void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
>>    * those.
>>    */
>>   #define ARCH_HAS_SYSCALL_MATCH_SYM_NAME
>> -#ifdef CONFIG_PPC64_ELF_ABI_V1
>> -static inline bool arch_syscall_match_sym_name(const char *sym, const char *name)
>> -{
>> -	/* We need to skip past the initial dot, and the __se_sys alias */
>> -	return !strcmp(sym + 1, name) ||
>> -		(!strncmp(sym, ".__se_sys", 9) && !strcmp(sym + 6, name)) ||
>> -		(!strncmp(sym, ".ppc_", 5) && !strcmp(sym + 5, name + 4)) ||
>> -		(!strncmp(sym, ".ppc32_", 7) && !strcmp(sym + 7, name + 4)) ||
>> -		(!strncmp(sym, ".ppc64_", 7) && !strcmp(sym + 7, name + 4));
>> -}
>> -#else
>>   static inline bool arch_syscall_match_sym_name(const char *sym, const char *name)
>>   {
>>   	return !strcmp(sym, name) ||
>> @@ -83,7 +72,6 @@ static inline bool arch_syscall_match_sym_name(const char *sym, const char *name
>>   		(!strncmp(sym, "ppc32_", 6) && !strcmp(sym + 6, name + 4)) ||
>>   		(!strncmp(sym, "ppc64_", 6) && !strcmp(sym + 6, name + 4));
>>   }
>> -#endif /* CONFIG_PPC64_ELF_ABI_V1 */
>>   #endif /* CONFIG_FTRACE_SYSCALLS */
>>   
>>   #if defined(CONFIG_PPC64) && defined(CONFIG_FUNCTION_TRACER)
>> -- 
>> 2.34.1
Christophe Leroy Dec. 5, 2022, 6:56 p.m. UTC | #3
Le 05/12/2022 à 19:19, Michael Jeanson a écrit :
> [Vous ne recevez pas souvent de courriers de mjeanson@efficios.com. 
> Découvrez pourquoi ceci est important à 
> https://aka.ms/LearnAboutSenderIdentification ]
> 
> On 2022-12-05 00:34, Michael Ellerman wrote:
>> Michael Jeanson <mjeanson@efficios.com> writes:
>>> In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on
>>> PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table
>>> changing from their dot prefixed variant to the non-prefixed ones.
>>>
>>> Since ftrace prefixes a dot to the syscall names when matching them to
>>> build its syscall event list, this resulted in no syscall events being
>>> available.
>>>
>>> Remove the PPC64_ELF_ABI_V1 specific version of
>>> arch_syscall_match_sym_name to have the same behavior across all powerpc
>>> variants.
>>
>> This doesn't seem to work for me.
>>
>> Event with it applied I still don't see anything in 
>> /sys/kernel/debug/tracing/events/syscalls
>>
>> Did we break it in some other way recently?
>>
>> cheers
> 
> I've just tried this change on top of v6.1-rc8 in qemu with a base 
> config of
> 'corenet32_smp_defconfig' and these options on top:
> 
> CONFIG_FTRACE=y
> CONFIG_FTRACE_SYSCALLS=y
> 
> And I can trace syscalls with ftrace.
> 
> What kernel tree and config are you using?

If you are using a ppc32 config, CONFIG_PPC64_ELF_ABI_V1 won't be set, 
so it doesn't matter whether this change is there or not.

You should try corenet64_smp_defconfig if you want 
CONFIG_PPC64_ELF_ABI_V1 to be set.

You can also use ppc64_defconfig, that's a different platform but it 
also has CONFIG_PPC64_ELF_ABI_V1.

Christophe

> 
> Thanks for looking into this.
> 
>>
>>
>>> Fixes: 68b34588e202 ("powerpc/64/sycall: Implement syscall entry/exit 
>>> logic in C")
>>> Cc: stable@vger.kernel.org # v5.7+
>>> Cc: Steven Rostedt <rostedt@goodmis.org>
>>> Cc: Masami Hiramatsu <mhiramat@kernel.org>
>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>>> Cc: Nicholas Piggin <npiggin@gmail.com>
>>> Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
>>> Cc: Michal Suchanek <msuchanek@suse.de>
>>> Cc: linuxppc-dev@lists.ozlabs.org
>>> Cc: linux-kernel@vger.kernel.org
>>> Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
>>> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>>> ---
>>>   arch/powerpc/include/asm/ftrace.h | 12 ------------
>>>   1 file changed, 12 deletions(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/ftrace.h 
>>> b/arch/powerpc/include/asm/ftrace.h
>>> index 3cee7115441b..e3d1f377bc5b 100644
>>> --- a/arch/powerpc/include/asm/ftrace.h
>>> +++ b/arch/powerpc/include/asm/ftrace.h
>>> @@ -64,17 +64,6 @@ void ftrace_graph_func(unsigned long ip, unsigned 
>>> long parent_ip,
>>>    * those.
>>>    */
>>>   #define ARCH_HAS_SYSCALL_MATCH_SYM_NAME
>>> -#ifdef CONFIG_PPC64_ELF_ABI_V1
>>> -static inline bool arch_syscall_match_sym_name(const char *sym, 
>>> const char *name)
>>> -{
>>> -    /* We need to skip past the initial dot, and the __se_sys alias */
>>> -    return !strcmp(sym + 1, name) ||
>>> -            (!strncmp(sym, ".__se_sys", 9) && !strcmp(sym + 6, 
>>> name)) ||
>>> -            (!strncmp(sym, ".ppc_", 5) && !strcmp(sym + 5, name + 
>>> 4)) ||
>>> -            (!strncmp(sym, ".ppc32_", 7) && !strcmp(sym + 7, name + 
>>> 4)) ||
>>> -            (!strncmp(sym, ".ppc64_", 7) && !strcmp(sym + 7, name + 
>>> 4));
>>> -}
>>> -#else
>>>   static inline bool arch_syscall_match_sym_name(const char *sym, 
>>> const char *name)
>>>   {
>>>      return !strcmp(sym, name) ||
>>> @@ -83,7 +72,6 @@ static inline bool 
>>> arch_syscall_match_sym_name(const char *sym, const char *name
>>>              (!strncmp(sym, "ppc32_", 6) && !strcmp(sym + 6, name + 
>>> 4)) ||
>>>              (!strncmp(sym, "ppc64_", 6) && !strcmp(sym + 6, name + 4));
>>>   }
>>> -#endif /* CONFIG_PPC64_ELF_ABI_V1 */
>>>   #endif /* CONFIG_FTRACE_SYSCALLS */
>>>
>>>   #if defined(CONFIG_PPC64) && defined(CONFIG_FUNCTION_TRACER)
>>> -- 
>>> 2.34.1
>
Michael Jeanson Dec. 5, 2022, 8:11 p.m. UTC | #4
On 2022-12-05 13:56, Christophe Leroy wrote:
> 
> 
> Le 05/12/2022 à 19:19, Michael Jeanson a écrit :
>> [Vous ne recevez pas souvent de courriers de mjeanson@efficios.com.
>> Découvrez pourquoi ceci est important à
>> https://aka.ms/LearnAboutSenderIdentification ]
>>
>> On 2022-12-05 00:34, Michael Ellerman wrote:
>>> Michael Jeanson <mjeanson@efficios.com> writes:
>>>> In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on
>>>> PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table
>>>> changing from their dot prefixed variant to the non-prefixed ones.
>>>>
>>>> Since ftrace prefixes a dot to the syscall names when matching them to
>>>> build its syscall event list, this resulted in no syscall events being
>>>> available.
>>>>
>>>> Remove the PPC64_ELF_ABI_V1 specific version of
>>>> arch_syscall_match_sym_name to have the same behavior across all powerpc
>>>> variants.
>>>
>>> This doesn't seem to work for me.
>>>
>>> Event with it applied I still don't see anything in
>>> /sys/kernel/debug/tracing/events/syscalls
>>>
>>> Did we break it in some other way recently?
>>>
>>> cheers
>>
>> I've just tried this change on top of v6.1-rc8 in qemu with a base
>> config of
>> 'corenet32_smp_defconfig' and these options on top:
>>
>> CONFIG_FTRACE=y
>> CONFIG_FTRACE_SYSCALLS=y
>>
>> And I can trace syscalls with ftrace.
>>
>> What kernel tree and config are you using?
> 
> If you are using a ppc32 config, CONFIG_PPC64_ELF_ABI_V1 won't be set,
> so it doesn't matter whether this change is there or not.
> 
> You should try corenet64_smp_defconfig if you want
> CONFIG_PPC64_ELF_ABI_V1 to be set.
> 
> You can also use ppc64_defconfig, that's a different platform but it
> also has CONFIG_PPC64_ELF_ABI_V1.

You are absolutely right, I used the wrong environment, I blame Monday 
morning. I tested again this time using 'corenet64_smp_defconfig' with the 
same options and syscall tracing with ftrace also works.

I double checked that /proc/config.gz contained CONFIG_PPC64_ELF_ABI_V1 to be 
sure.

> 
> Christophe
> 
>>
>> Thanks for looking into this.
>>
Michael Jeanson Dec. 5, 2022, 9:46 p.m. UTC | #5
On 2022-12-05 15:11, Michael Jeanson wrote:
>>>> Michael Jeanson <mjeanson@efficios.com> writes:
>>>>> In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on
>>>>> PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table
>>>>> changing from their dot prefixed variant to the non-prefixed ones.
>>>>>
>>>>> Since ftrace prefixes a dot to the syscall names when matching them to
>>>>> build its syscall event list, this resulted in no syscall events being
>>>>> available.
>>>>>
>>>>> Remove the PPC64_ELF_ABI_V1 specific version of
>>>>> arch_syscall_match_sym_name to have the same behavior across all powerpc
>>>>> variants.
>>>>
>>>> This doesn't seem to work for me.
>>>>
>>>> Event with it applied I still don't see anything in
>>>> /sys/kernel/debug/tracing/events/syscalls
>>>>
>>>> Did we break it in some other way recently?
>>>>
>>>> cheers

I did some further testing, my config also enabled KALLSYMS_ALL, when I remove 
it there is indeed no syscall events.
Michael Ellerman Dec. 5, 2022, 10:50 p.m. UTC | #6
Michael Jeanson <mjeanson@efficios.com> writes:
> On 2022-12-05 15:11, Michael Jeanson wrote:
>>>>> Michael Jeanson <mjeanson@efficios.com> writes:
>>>>>> In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on
>>>>>> PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table
>>>>>> changing from their dot prefixed variant to the non-prefixed ones.
>>>>>>
>>>>>> Since ftrace prefixes a dot to the syscall names when matching them to
>>>>>> build its syscall event list, this resulted in no syscall events being
>>>>>> available.
>>>>>>
>>>>>> Remove the PPC64_ELF_ABI_V1 specific version of
>>>>>> arch_syscall_match_sym_name to have the same behavior across all powerpc
>>>>>> variants.
>>>>>
>>>>> This doesn't seem to work for me.
>>>>>
>>>>> Event with it applied I still don't see anything in
>>>>> /sys/kernel/debug/tracing/events/syscalls
>>>>>
>>>>> Did we break it in some other way recently?
>>>>>
>>>>> cheers
>
> I did some further testing, my config also enabled KALLSYMS_ALL, when I remove 
> it there is indeed no syscall events.

Aha, OK that explains it I guess.

I was using ppc64_guest_defconfig which has ABI_V1 and FTRACE_SYSCALLS,
but does not have KALLSYMS_ALL. So I guess there's some other bug
lurking in there.

cheers
Mathieu Desnoyers Dec. 6, 2022, 2:38 p.m. UTC | #7
On 2022-12-05 17:50, Michael Ellerman wrote:
> Michael Jeanson <mjeanson@efficios.com> writes:
>> On 2022-12-05 15:11, Michael Jeanson wrote:
>>>>>> Michael Jeanson <mjeanson@efficios.com> writes:
>>>>>>> In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on
>>>>>>> PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table
>>>>>>> changing from their dot prefixed variant to the non-prefixed ones.
>>>>>>>
>>>>>>> Since ftrace prefixes a dot to the syscall names when matching them to
>>>>>>> build its syscall event list, this resulted in no syscall events being
>>>>>>> available.
>>>>>>>
>>>>>>> Remove the PPC64_ELF_ABI_V1 specific version of
>>>>>>> arch_syscall_match_sym_name to have the same behavior across all powerpc
>>>>>>> variants.
>>>>>>
>>>>>> This doesn't seem to work for me.
>>>>>>
>>>>>> Event with it applied I still don't see anything in
>>>>>> /sys/kernel/debug/tracing/events/syscalls
>>>>>>
>>>>>> Did we break it in some other way recently?
>>>>>>
>>>>>> cheers
>>
>> I did some further testing, my config also enabled KALLSYMS_ALL, when I remove
>> it there is indeed no syscall events.
> 
> Aha, OK that explains it I guess.
> 
> I was using ppc64_guest_defconfig which has ABI_V1 and FTRACE_SYSCALLS,
> but does not have KALLSYMS_ALL. So I guess there's some other bug
> lurking in there.

I don't have the setup handy to validate it, but I suspect it is caused 
by the way scripts/kallsyms.c:symbol_valid() checks whether a symbol 
entry needs to be integrated into the assembler output when 
--all-symbols is not specified. It only keeps symbols which addresses 
are in the text range. On PPC64_ELF_ABI_V1, this means only the 
dot-prefixed symbols will be kept (those point to the function begin), 
leaving out the non-dot-prefixed symbols (those point to the function 
descriptors).

So I see two possible solutions there: either we ensure that 
FTRACE_SYSCALLS selects KALLSYMS_ALL on PPC64_ELF_ABI_V1, or we modify 
scripts/kallsyms.c:symbol_valid() to also include function descriptor 
symbols. This would mean accepting symbols pointing into the .opd ELF 
section.

IMHO the second option would be better because it does not increase the 
kernel image size as much as KALLSYMS_ALL.

Thoughts ?

Thanks,

Mathieu
Christophe Leroy Dec. 6, 2022, 11:08 p.m. UTC | #8
Le 06/12/2022 à 15:38, Mathieu Desnoyers a écrit :
> On 2022-12-05 17:50, Michael Ellerman wrote:
>> Michael Jeanson <mjeanson@efficios.com> writes:
>>> On 2022-12-05 15:11, Michael Jeanson wrote:
>>>>>>> Michael Jeanson <mjeanson@efficios.com> writes:
>>>>>>>> In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on
>>>>>>>> PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table
>>>>>>>> changing from their dot prefixed variant to the non-prefixed ones.
>>>>>>>>
>>>>>>>> Since ftrace prefixes a dot to the syscall names when matching 
>>>>>>>> them to
>>>>>>>> build its syscall event list, this resulted in no syscall events 
>>>>>>>> being
>>>>>>>> available.
>>>>>>>>
>>>>>>>> Remove the PPC64_ELF_ABI_V1 specific version of
>>>>>>>> arch_syscall_match_sym_name to have the same behavior across all 
>>>>>>>> powerpc
>>>>>>>> variants.
>>>>>>>
>>>>>>> This doesn't seem to work for me.
>>>>>>>
>>>>>>> Event with it applied I still don't see anything in
>>>>>>> /sys/kernel/debug/tracing/events/syscalls
>>>>>>>
>>>>>>> Did we break it in some other way recently?
>>>>>>>
>>>>>>> cheers
>>>
>>> I did some further testing, my config also enabled KALLSYMS_ALL, when 
>>> I remove
>>> it there is indeed no syscall events.
>>
>> Aha, OK that explains it I guess.
>>
>> I was using ppc64_guest_defconfig which has ABI_V1 and FTRACE_SYSCALLS,
>> but does not have KALLSYMS_ALL. So I guess there's some other bug
>> lurking in there.
> 
> I don't have the setup handy to validate it, but I suspect it is caused 
> by the way scripts/kallsyms.c:symbol_valid() checks whether a symbol 
> entry needs to be integrated into the assembler output when 
> --all-symbols is not specified. It only keeps symbols which addresses 
> are in the text range. On PPC64_ELF_ABI_V1, this means only the 
> dot-prefixed symbols will be kept (those point to the function begin), 
> leaving out the non-dot-prefixed symbols (those point to the function 
> descriptors).
> 
> So I see two possible solutions there: either we ensure that 
> FTRACE_SYSCALLS selects KALLSYMS_ALL on PPC64_ELF_ABI_V1, or we modify 
> scripts/kallsyms.c:symbol_valid() to also include function descriptor 
> symbols. This would mean accepting symbols pointing into the .opd ELF 
> section.
> 
> IMHO the second option would be better because it does not increase the 
> kernel image size as much as KALLSYMS_ALL.
> 

Yes, seems to be the best solution.

Maybe the only thing to do is to add a new range to text_ranges, 
something like (untested):

diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 03fa07ad45d9..decf31c497f5 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -64,6 +64,7 @@ static unsigned long long relative_base;
  static struct addr_range text_ranges[] = {
  	{ "_stext",     "_etext"     },
  	{ "_sinittext", "_einittext" },
+	{ "__start_opd", "__end_opd" },
  };
  #define text_range_text     (&text_ranges[0])
  #define text_range_inittext (&text_ranges[1])

---
Christophe
Michael Ellerman Dec. 7, 2022, 2:09 a.m. UTC | #9
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:
> On 2022-12-05 17:50, Michael Ellerman wrote:
>> Michael Jeanson <mjeanson@efficios.com> writes:
>>> On 2022-12-05 15:11, Michael Jeanson wrote:
>>>>>>> Michael Jeanson <mjeanson@efficios.com> writes:
>>>>>>>> In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on
>>>>>>>> PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table
>>>>>>>> changing from their dot prefixed variant to the non-prefixed ones.
>>>>>>>>
>>>>>>>> Since ftrace prefixes a dot to the syscall names when matching them to
>>>>>>>> build its syscall event list, this resulted in no syscall events being
>>>>>>>> available.
>>>>>>>>
>>>>>>>> Remove the PPC64_ELF_ABI_V1 specific version of
>>>>>>>> arch_syscall_match_sym_name to have the same behavior across all powerpc
>>>>>>>> variants.
>>>>>>>
>>>>>>> This doesn't seem to work for me.
>>>>>>>
>>>>>>> Event with it applied I still don't see anything in
>>>>>>> /sys/kernel/debug/tracing/events/syscalls
>>>>>>>
>>>>>>> Did we break it in some other way recently?
>>>>>>>
>>>>>>> cheers
>>>
>>> I did some further testing, my config also enabled KALLSYMS_ALL, when I remove
>>> it there is indeed no syscall events.
>> 
>> Aha, OK that explains it I guess.
>> 
>> I was using ppc64_guest_defconfig which has ABI_V1 and FTRACE_SYSCALLS,
>> but does not have KALLSYMS_ALL. So I guess there's some other bug
>> lurking in there.
>
> I don't have the setup handy to validate it, but I suspect it is caused 
> by the way scripts/kallsyms.c:symbol_valid() checks whether a symbol 
> entry needs to be integrated into the assembler output when 
> --all-symbols is not specified. It only keeps symbols which addresses 
> are in the text range. On PPC64_ELF_ABI_V1, this means only the 
> dot-prefixed symbols will be kept (those point to the function begin), 
> leaving out the non-dot-prefixed symbols (those point to the function 
> descriptors).

OK. So I guess it never worked without KALLSYMS_ALL.

It seems like most distros enable KALLSYMS_ALL, so I guess that's why
we've never noticed.

> So I see two possible solutions there: either we ensure that 
> FTRACE_SYSCALLS selects KALLSYMS_ALL on PPC64_ELF_ABI_V1, or we modify 
> scripts/kallsyms.c:symbol_valid() to also include function descriptor 
> symbols. This would mean accepting symbols pointing into the .opd ELF 
> section.

My only worry is that will cause some other breakage, because .opd
symbols are not really "text" in the normal sense, ie. you can't execute
them directly.

On the other hand the help for KALLSYMS_ALL says:

  "Normally kallsyms only contains the symbols of functions"

But without .opd included that's not really true. In practice it
probably doesn't really matter, because eg. backtraces will point to dot
symbols which can be resolved.

> IMHO the second option would be better because it does not increase the 
> kernel image size as much as KALLSYMS_ALL.

Yes I agree.

Even if that did break something, any breakage would be limited to
arches which uses function descriptors, which are now all rare.

Relatedly we have a patch in next to optionally use ABIv2 for 64-bit big
endian builds.

cheers
Mathieu Desnoyers Dec. 7, 2022, 3:18 p.m. UTC | #10
On 2022-12-06 21:09, Michael Ellerman wrote:
> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:
>> On 2022-12-05 17:50, Michael Ellerman wrote:
>>> Michael Jeanson <mjeanson@efficios.com> writes:
>>>> On 2022-12-05 15:11, Michael Jeanson wrote:
>>>>>>>> Michael Jeanson <mjeanson@efficios.com> writes:
>>>>>>>>> In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on
>>>>>>>>> PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table
>>>>>>>>> changing from their dot prefixed variant to the non-prefixed ones.
>>>>>>>>>
>>>>>>>>> Since ftrace prefixes a dot to the syscall names when matching them to
>>>>>>>>> build its syscall event list, this resulted in no syscall events being
>>>>>>>>> available.
>>>>>>>>>
>>>>>>>>> Remove the PPC64_ELF_ABI_V1 specific version of
>>>>>>>>> arch_syscall_match_sym_name to have the same behavior across all powerpc
>>>>>>>>> variants.
>>>>>>>>
>>>>>>>> This doesn't seem to work for me.
>>>>>>>>
>>>>>>>> Event with it applied I still don't see anything in
>>>>>>>> /sys/kernel/debug/tracing/events/syscalls
>>>>>>>>
>>>>>>>> Did we break it in some other way recently?
>>>>>>>>
>>>>>>>> cheers
>>>>
>>>> I did some further testing, my config also enabled KALLSYMS_ALL, when I remove
>>>> it there is indeed no syscall events.
>>>
>>> Aha, OK that explains it I guess.
>>>
>>> I was using ppc64_guest_defconfig which has ABI_V1 and FTRACE_SYSCALLS,
>>> but does not have KALLSYMS_ALL. So I guess there's some other bug
>>> lurking in there.
>>
>> I don't have the setup handy to validate it, but I suspect it is caused
>> by the way scripts/kallsyms.c:symbol_valid() checks whether a symbol
>> entry needs to be integrated into the assembler output when
>> --all-symbols is not specified. It only keeps symbols which addresses
>> are in the text range. On PPC64_ELF_ABI_V1, this means only the
>> dot-prefixed symbols will be kept (those point to the function begin),
>> leaving out the non-dot-prefixed symbols (those point to the function
>> descriptors).
> 
> OK. So I guess it never worked without KALLSYMS_ALL.

I suspect it worked prior to kernel v5.7, because back then the PPC64 
ABIv1 system call table contained pointers to the text section 
(beginning of functions) rather than function descriptors.

So changing this system call table to point to C functions introduced 
the dot-prefix match issue for syscall tracing as well as a dependency 
on KALLSYMS_ALL.

> 
> It seems like most distros enable KALLSYMS_ALL, so I guess that's why
> we've never noticed.

Or very few people run recent PPC64 ABIv1 kernels :)

> 
>> So I see two possible solutions there: either we ensure that
>> FTRACE_SYSCALLS selects KALLSYMS_ALL on PPC64_ELF_ABI_V1, or we modify
>> scripts/kallsyms.c:symbol_valid() to also include function descriptor
>> symbols. This would mean accepting symbols pointing into the .opd ELF
>> section.
> 
> My only worry is that will cause some other breakage, because .opd
> symbols are not really "text" in the normal sense, ie. you can't execute
> them directly.

AFAIU adding the .opd section to scripts/kallsyms.c text_ranges will 
only affect the result of symbol_valid(), which decides which symbols 
get pulled into a KALLSYMS=n/KALLSYMS_ALL=n kernel build. Considering 
that a KALLSYMS_ALL=y build pulls those function descriptor symbols into 
  the image without breaking anything, I don't see why adding the .opd 
section to this script text_ranges would have any ill side-effect.

> 
> On the other hand the help for KALLSYMS_ALL says:
> 
>    "Normally kallsyms only contains the symbols of functions"
> 
> But without .opd included that's not really true. In practice it
> probably doesn't really matter, because eg. backtraces will point to dot
> symbols which can be resolved.

Indeed, I don't see this affecting backtraces, but as soon as a lookup 
depends on comparing the C function pointer to a function descriptor, 
the .opd symbols are needed. Not having those function descriptor 
symbols in KALLSYMS_ALL=n builds seems rather error-prone.

> 
>> IMHO the second option would be better because it does not increase the
>> kernel image size as much as KALLSYMS_ALL.
> 
> Yes I agree.
> 
> Even if that did break something, any breakage would be limited to
> arches which uses function descriptors, which are now all rare.

Yes, it would only impact those arches using function descriptors, which 
are broken today with respect to system call tracing. Are you aware of 
other architectures other than PPC64 ELF ABI v1 supported by the Linux 
kernel that use function descriptors ?

> 
> Relatedly we have a patch in next to optionally use ABIv2 for 64-bit big
> endian builds.

Interesting. Does it require a matching user-space ? (built with PPC64 
ABIv2 ?) Does it handle legacy PPC32 executables ?

Thanks,

Mathieu
Michal Suchánek Dec. 7, 2022, 3:59 p.m. UTC | #11
Hello,

On Wed, Dec 07, 2022 at 10:18:13AM -0500, Mathieu Desnoyers wrote:
> On 2022-12-06 21:09, Michael Ellerman wrote:
> > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:
> > > On 2022-12-05 17:50, Michael Ellerman wrote:

> > 
> > Relatedly we have a patch in next to optionally use ABIv2 for 64-bit big
> > endian builds.
> 
> Interesting. Does it require a matching user-space ? (built with PPC64 ABIv2

No, the kernel and userspace ABI is separate.

> ?) Does it handle legacy PPC32 executables ?

Theoretically it should. No idea if anybody has tested it.

Thanks

Michal
Christophe Leroy Dec. 7, 2022, 4:26 p.m. UTC | #12
Le 07/12/2022 à 16:18, Mathieu Desnoyers a écrit :
> On 2022-12-06 21:09, Michael Ellerman wrote:
>> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:
>>> On 2022-12-05 17:50, Michael Ellerman wrote:
>>> IMHO the second option would be better because it does not increase the
>>> kernel image size as much as KALLSYMS_ALL.
>>
>> Yes I agree.
>>
>> Even if that did break something, any breakage would be limited to
>> arches which uses function descriptors, which are now all rare.
> 
> Yes, it would only impact those arches using function descriptors, which 
> are broken today with respect to system call tracing. Are you aware of 
> other architectures other than PPC64 ELF ABI v1 supported by the Linux 
> kernel that use function descriptors ?

$ git grep "select HAVE_FUNCTION_DESCRIPTORS" arch/
arch/ia64/Kconfig:      select HAVE_FUNCTION_DESCRIPTORS
arch/parisc/Kconfig:    select HAVE_FUNCTION_DESCRIPTORS if 64BIT
arch/powerpc/Kconfig:   select HAVE_FUNCTION_DESCRIPTORS        if 
PPC64_ELF_ABI_V1


Christophe
Michael Ellerman Dec. 8, 2022, 12:40 p.m. UTC | #13
On Thu, 1 Dec 2022 11:14:42 -0500, Michael Jeanson wrote:
> In v5.7 the powerpc syscall entry/exit logic was rewritten in C, on
> PPC64_ELF_ABI_V1 this resulted in the symbols in the syscall table
> changing from their dot prefixed variant to the non-prefixed ones.
> 
> Since ftrace prefixes a dot to the syscall names when matching them to
> build its syscall event list, this resulted in no syscall events being
> available.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/ftrace: fix syscall tracing on PPC64_ELF_ABI_V1
      https://git.kernel.org/powerpc/c/ad050d2390fccb22aa3e6f65e11757ce7a5a7ca5

cheers
diff mbox series

Patch

diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
index 3cee7115441b..e3d1f377bc5b 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -64,17 +64,6 @@  void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
  * those.
  */
 #define ARCH_HAS_SYSCALL_MATCH_SYM_NAME
-#ifdef CONFIG_PPC64_ELF_ABI_V1
-static inline bool arch_syscall_match_sym_name(const char *sym, const char *name)
-{
-	/* We need to skip past the initial dot, and the __se_sys alias */
-	return !strcmp(sym + 1, name) ||
-		(!strncmp(sym, ".__se_sys", 9) && !strcmp(sym + 6, name)) ||
-		(!strncmp(sym, ".ppc_", 5) && !strcmp(sym + 5, name + 4)) ||
-		(!strncmp(sym, ".ppc32_", 7) && !strcmp(sym + 7, name + 4)) ||
-		(!strncmp(sym, ".ppc64_", 7) && !strcmp(sym + 7, name + 4));
-}
-#else
 static inline bool arch_syscall_match_sym_name(const char *sym, const char *name)
 {
 	return !strcmp(sym, name) ||
@@ -83,7 +72,6 @@  static inline bool arch_syscall_match_sym_name(const char *sym, const char *name
 		(!strncmp(sym, "ppc32_", 6) && !strcmp(sym + 6, name + 4)) ||
 		(!strncmp(sym, "ppc64_", 6) && !strcmp(sym + 6, name + 4));
 }
-#endif /* CONFIG_PPC64_ELF_ABI_V1 */
 #endif /* CONFIG_FTRACE_SYSCALLS */
 
 #if defined(CONFIG_PPC64) && defined(CONFIG_FUNCTION_TRACER)