diff mbox series

powerpc/perf: Fix crash with 'is_sier_available' when pmu is not set

Message ID 1606124997-3358-1-git-send-email-atrajeev@linux.vnet.ibm.com (mailing list archive)
State Changes Requested
Headers show
Series powerpc/perf: Fix crash with 'is_sier_available' when pmu is not set | expand

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success Successfully applied on branch powerpc/merge (9d1aa2f025c6cc516125c42c70f6a9ce087c49ea)
snowpatch_ozlabs/build-ppc64le success Build succeeded
snowpatch_ozlabs/build-ppc64be success Build succeeded
snowpatch_ozlabs/build-ppc64e success Build succeeded
snowpatch_ozlabs/build-pmac32 success Build succeeded
snowpatch_ozlabs/checkpatch warning total: 0 errors, 1 warnings, 0 checks, 9 lines checked
snowpatch_ozlabs/needsstable warning Please consider tagging this patch for stable!

Commit Message

Athira Rajeev Nov. 23, 2020, 9:49 a.m. UTC
On systems without any platform specific PMU driver support registered or
Generic Compat PMU support registered, running 'perf record' with
—intr-regs  will crash ( perf record -I <workload> ).

The relevant portion from crash logs and Call Trace:

Unable to handle kernel paging request for data at address 0x00000068
Faulting instruction address: 0xc00000000013eb18
Oops: Kernel access of bad area, sig: 11 [#1]
CPU: 2 PID: 13435 Comm: kill Kdump: loaded Not tainted 4.18.0-193.el8.ppc64le #1
NIP:  c00000000013eb18 LR: c000000000139f2c CTR: c000000000393d80
REGS: c0000004a07ab4f0 TRAP: 0300   Not tainted  (4.18.0-193.el8.ppc64le)
NIP [c00000000013eb18] is_sier_available+0x18/0x30
LR [c000000000139f2c] perf_reg_value+0x6c/0xb0
Call Trace:
[c0000004a07ab770] [c0000004a07ab7c8] 0xc0000004a07ab7c8 (unreliable)
[c0000004a07ab7a0] [c0000000003aa77c] perf_output_sample+0x60c/0xac0
[c0000004a07ab840] [c0000000003ab3f0] perf_event_output_forward+0x70/0xb0
[c0000004a07ab8c0] [c00000000039e208] __perf_event_overflow+0x88/0x1a0
[c0000004a07ab910] [c00000000039e42c] perf_swevent_hrtimer+0x10c/0x1d0
[c0000004a07abc50] [c000000000228b9c] __hrtimer_run_queues+0x17c/0x480
[c0000004a07abcf0] [c00000000022aaf4] hrtimer_interrupt+0x144/0x520
[c0000004a07abdd0] [c00000000002a864] timer_interrupt+0x104/0x2f0
[c0000004a07abe30] [c0000000000091c4] decrementer_common+0x114/0x120

When perf record session started with "-I" option, capture registers
via intr-regs, on each sample ‘is_sier_available()'i is called to check
for the SIER ( Sample Instruction Event Register) availability in the
platform. This function in core-book3s access 'ppmu->flags'. If platform
specific pmu driver is not registered, ppmu is set to null and accessing
its members results in crash. Patch fixes this by returning false in
'is_sier_available()' if 'ppmu' is not set.

Fixes: 333804dc3b7a ("powerpc/perf: Update perf_regs structure to include SIER")
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 arch/powerpc/perf/core-book3s.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Sachin Sant Nov. 23, 2020, 10:55 a.m. UTC | #1
> When perf record session started with "-I" option, capture registers
> via intr-regs, on each sample ‘is_sier_available()'i is called to check
> for the SIER ( Sample Instruction Event Register) availability in the
> platform. This function in core-book3s access 'ppmu->flags'. If platform
> specific pmu driver is not registered, ppmu is set to null and accessing
> its members results in crash. Patch fixes this by returning false in
> 'is_sier_available()' if 'ppmu' is not set.
> 
> Fixes: 333804dc3b7a ("powerpc/perf: Update perf_regs structure to include SIER")
> Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>

Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>

Thanks
-Sachin
Michael Ellerman Nov. 23, 2020, 11:19 a.m. UTC | #2
Hi Athira,

Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
> On systems without any platform specific PMU driver support registered or
> Generic Compat PMU support registered,

The compat PMU is registered just like other PMUs, so I don't see how we
can crash like this if the compat PMU is active?

ie. if we're using the compat PMU then ppmu will be non-NULL and point
to generic_compat_pmu.

> running 'perf record' with
> —intr-regs  will crash ( perf record -I <workload> ).
>
> The relevant portion from crash logs and Call Trace:
>
> Unable to handle kernel paging request for data at address 0x00000068
> Faulting instruction address: 0xc00000000013eb18
> Oops: Kernel access of bad area, sig: 11 [#1]
> CPU: 2 PID: 13435 Comm: kill Kdump: loaded Not tainted 4.18.0-193.el8.ppc64le #1
> NIP:  c00000000013eb18 LR: c000000000139f2c CTR: c000000000393d80
> REGS: c0000004a07ab4f0 TRAP: 0300   Not tainted  (4.18.0-193.el8.ppc64le)
> NIP [c00000000013eb18] is_sier_available+0x18/0x30
> LR [c000000000139f2c] perf_reg_value+0x6c/0xb0
> Call Trace:
> [c0000004a07ab770] [c0000004a07ab7c8] 0xc0000004a07ab7c8 (unreliable)
> [c0000004a07ab7a0] [c0000000003aa77c] perf_output_sample+0x60c/0xac0
> [c0000004a07ab840] [c0000000003ab3f0] perf_event_output_forward+0x70/0xb0
> [c0000004a07ab8c0] [c00000000039e208] __perf_event_overflow+0x88/0x1a0
> [c0000004a07ab910] [c00000000039e42c] perf_swevent_hrtimer+0x10c/0x1d0
> [c0000004a07abc50] [c000000000228b9c] __hrtimer_run_queues+0x17c/0x480
> [c0000004a07abcf0] [c00000000022aaf4] hrtimer_interrupt+0x144/0x520
> [c0000004a07abdd0] [c00000000002a864] timer_interrupt+0x104/0x2f0
> [c0000004a07abe30] [c0000000000091c4] decrementer_common+0x114/0x120
>
> When perf record session started with "-I" option, capture registers
                          ^
                          is

> via intr-regs,

"intr-regs" is just the full name for the -I option, so that kind of
repeats itself.

> on each sample ‘is_sier_available()'i is called to check
                                      ^
                                      extra i

The single quotes around is_sier_available() aren't necessary IMO.

> for the SIER ( Sample Instruction Event Register) availability in the
                ^
                stray space
> platform. This function in core-book3s access 'ppmu->flags'. If platform
                                               ^                 ^
                                               es                a
> specific pmu driver is not registered, ppmu is set to null and accessing
           ^                                            ^
           PMU                                          NULL
> its members results in crash. Patch fixes this by returning false in
                        ^
                        a
> 'is_sier_available()' if 'ppmu' is not set.

Use the imperative mood for the last sentence which says what the patch
does:

  Fix the crash by returning false in is_sier_available() if ppmu is not set.


> Fixes: 333804dc3b7a ("powerpc/perf: Update perf_regs structure to include SIER")
> Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> ---
>  arch/powerpc/perf/core-book3s.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 08643cb..1de4770 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -137,6 +137,9 @@ static void pmao_restore_workaround(bool ebb) { }
>  
>  bool is_sier_available(void)
>  {
> +	if (!ppmu)
> +		return false;
> +
>  	if (ppmu->flags & PPMU_HAS_SIER)
>  		return true;
>  
> -- 
> 1.8.3.1


cheers
Athira Rajeev Nov. 23, 2020, 1:32 p.m. UTC | #3
> On 23-Nov-2020, at 4:49 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> Hi Athira,
> 
> Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
>> On systems without any platform specific PMU driver support registered or
>> Generic Compat PMU support registered,
> 
> The compat PMU is registered just like other PMUs, so I don't see how we
> can crash like this if the compat PMU is active?
> 
> ie. if we're using the compat PMU then ppmu will be non-NULL and point
> to generic_compat_pmu.

Hi Michael,

Thanks for checking the patch.

Crash happens on systems which neither has compat PMU support registered nor 
has Platform specific PMU. This happens when the distro do not have either the PMU 
driver support for that platform or the generic "compat-mode" performance monitoring
driver support. 

So in such cases since compat PMU is in-active, ppmu is not set and
results in crash. Sorry for the confusion with my first line. I will correct it.

> 
>> running 'perf record' with
>> —intr-regs  will crash ( perf record -I <workload> ).
>> 
>> The relevant portion from crash logs and Call Trace:
>> 
>> Unable to handle kernel paging request for data at address 0x00000068
>> Faulting instruction address: 0xc00000000013eb18
>> Oops: Kernel access of bad area, sig: 11 [#1]
>> CPU: 2 PID: 13435 Comm: kill Kdump: loaded Not tainted 4.18.0-193.el8.ppc64le #1
>> NIP:  c00000000013eb18 LR: c000000000139f2c CTR: c000000000393d80
>> REGS: c0000004a07ab4f0 TRAP: 0300   Not tainted  (4.18.0-193.el8.ppc64le)
>> NIP [c00000000013eb18] is_sier_available+0x18/0x30
>> LR [c000000000139f2c] perf_reg_value+0x6c/0xb0
>> Call Trace:
>> [c0000004a07ab770] [c0000004a07ab7c8] 0xc0000004a07ab7c8 (unreliable)
>> [c0000004a07ab7a0] [c0000000003aa77c] perf_output_sample+0x60c/0xac0
>> [c0000004a07ab840] [c0000000003ab3f0] perf_event_output_forward+0x70/0xb0
>> [c0000004a07ab8c0] [c00000000039e208] __perf_event_overflow+0x88/0x1a0
>> [c0000004a07ab910] [c00000000039e42c] perf_swevent_hrtimer+0x10c/0x1d0
>> [c0000004a07abc50] [c000000000228b9c] __hrtimer_run_queues+0x17c/0x480
>> [c0000004a07abcf0] [c00000000022aaf4] hrtimer_interrupt+0x144/0x520
>> [c0000004a07abdd0] [c00000000002a864] timer_interrupt+0x104/0x2f0
>> [c0000004a07abe30] [c0000000000091c4] decrementer_common+0x114/0x120
>> 
>> When perf record session started with "-I" option, capture registers
>                          ^
>                          is
> 
>> via intr-regs,
> 
> "intr-regs" is just the full name for the -I option, so that kind of
> repeats itself.
> 
>> on each sample ‘is_sier_available()'i is called to check
>                                      ^
>                                      extra i
> 
> The single quotes around is_sier_available() aren't necessary IMO.
> 
>> for the SIER ( Sample Instruction Event Register) availability in the
>                ^
>                stray space
>> platform. This function in core-book3s access 'ppmu->flags'. If platform
>                                               ^                 ^
>                                               es                a
>> specific pmu driver is not registered, ppmu is set to null and accessing
>           ^                                            ^
>           PMU                                          NULL
>> its members results in crash. Patch fixes this by returning false in
>                        ^
>                        a
>> 'is_sier_available()' if 'ppmu' is not set.
> 
> Use the imperative mood for the last sentence which says what the patch
> does:
> 
>  Fix the crash by returning false in is_sier_available() if ppmu is not set.

Sure,  I will make all these changes as suggested.

Thanks
Athira
> 
> 
>> Fixes: 333804dc3b7a ("powerpc/perf: Update perf_regs structure to include SIER")
>> Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/perf/core-book3s.c | 3 +++
>> 1 file changed, 3 insertions(+)
>> 
>> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
>> index 08643cb..1de4770 100644
>> --- a/arch/powerpc/perf/core-book3s.c
>> +++ b/arch/powerpc/perf/core-book3s.c
>> @@ -137,6 +137,9 @@ static void pmao_restore_workaround(bool ebb) { }
>> 
>> bool is_sier_available(void)
>> {
>> +	if (!ppmu)
>> +		return false;
>> +
>> 	if (ppmu->flags & PPMU_HAS_SIER)
>> 		return true;
>> 
>> -- 
>> 1.8.3.1
> 
> 
> cheers
diff mbox series

Patch

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 08643cb..1de4770 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -137,6 +137,9 @@  static void pmao_restore_workaround(bool ebb) { }
 
 bool is_sier_available(void)
 {
+	if (!ppmu)
+		return false;
+
 	if (ppmu->flags & PPMU_HAS_SIER)
 		return true;