powerpc/perf: Account for interrupts during PMC overflow for an invalid SIAR check

Message ID 1596717992-7321-1-git-send-email-atrajeev@linux.vnet.ibm.com (mailing list archive)
State Accepted
Commit 17899eaf88d689529b866371344c8f269ba79b5f
Series powerpc/perf: Account for interrupts during PMC overflow for an invalid SIAR check

Checks

Context                          Check    Description
snowpatch_ozlabs/apply_patch     success  Successfully applied on branch powerpc/merge (3cd2184115b85cc8242fec3d42529cd112962984)
snowpatch_ozlabs/build-ppc64le   warning  Upstream build failed, couldn't test patch
snowpatch_ozlabs/build-ppc64be   warning  Upstream build failed, couldn't test patch
snowpatch_ozlabs/build-ppc64e    warning  Upstream build failed, couldn't test patch
snowpatch_ozlabs/build-pmac32    warning  Upstream build failed, couldn't test patch
snowpatch_ozlabs/checkpatch      success  total: 0 errors, 0 warnings, 0 checks, 10 lines checked
snowpatch_ozlabs/needsstable     success  Patch has no Fixes tags

Commit Message

Athira Rajeev Aug. 6, 2020, 12:46 p.m. UTC
The performance monitor interrupt handler checks whether any counter has
overflowed and calls `record_and_restart` in core-book3s, which invokes
`perf_event_overflow` to record the sample information.
Apart from creating the sample, perf_event_overflow also does the interrupt
and period checks via perf_event_account_interrupt.

Currently we record sample information, and hence do the interrupt check,
only if the SIAR valid bit is set (checked using `siar_valid`).
But it is possible to sample events that do not generate a valid SIAR,
in which case there is no chance to disable the event if the number of
interrupts exceeds max_samples_per_tick. This leads to a soft lockup.

Fix this by adding perf_event_account_interrupt in the invalid SIAR
code path for a sampling event, i.e. if SIAR is invalid, just do the
interrupt check and don't record the sample information.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 arch/powerpc/perf/core-book3s.c | 4 ++++
 1 file changed, 4 insertions(+)
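
The behaviour described in the commit message can be illustrated with a minimal user-space sketch. This is not kernel code: every `model_*` name and `MODEL_MAX_SAMPLES_PER_TICK` are hypothetical stand-ins, the per-tick counter reset is left out, and the exact threshold comparison is an assumption. It only models the idea that an event gets stopped once too many interrupts are accounted, and that without the fix the invalid-SIAR path never accounts any.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's max_samples_per_tick limit. */
#define MODEL_MAX_SAMPLES_PER_TICK 4

struct model_event {
	unsigned int interrupts; /* interrupts accounted in the current tick */
	unsigned int samples;    /* samples actually recorded */
	bool stopped;            /* set once the event has been throttled */
};

/* Models perf_event_account_interrupt(): count the interrupt and
 * return nonzero when the event should be throttled. */
static int model_account_interrupt(struct model_event *ev)
{
	assert(ev != NULL);
	ev->interrupts++;
	return ev->interrupts >= MODEL_MAX_SAMPLES_PER_TICK;
}

/* Models perf_event_overflow(): account the interrupt and record a sample. */
static int model_overflow(struct model_event *ev)
{
	int throttle = model_account_interrupt(ev);

	ev->samples++;
	return throttle;
}

/* Models the tail of record_and_restart(). 'with_fix' selects whether the
 * new "else if (period)" branch from the patch is present. */
static void model_record_and_restart(struct model_event *ev, bool siar_is_valid,
				     bool sampling, bool with_fix)
{
	if (ev->stopped)
		return;

	if (sampling && siar_is_valid) {
		if (model_overflow(ev))
			ev->stopped = true;
	} else if (sampling && with_fix) {
		/* Invalid SIAR: account the interrupt, record no sample. */
		if (model_account_interrupt(ev))
			ev->stopped = true;
	}
}

/* Deliver n overflow interrupts that all have an invalid SIAR and report
 * whether the sampling event ended up stopped. */
static bool model_run_invalid_siar(unsigned int n, bool with_fix)
{
	struct model_event ev = { 0 };

	for (unsigned int i = 0; i < n; i++)
		model_record_and_restart(&ev, false, true, with_fix);
	return ev.stopped;
}
```

In this model, calling `model_run_invalid_siar(100, false)` leaves the event running no matter how many interrupts arrive, which corresponds to the soft lockup scenario, while `model_run_invalid_siar(100, true)` stops it once the modelled limit is reached.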

Comments

Alexey Kardashevskiy Aug. 6, 2020, 11:51 p.m. UTC | #1
On 06/08/2020 22:46, Athira Rajeev wrote:
> The performance monitor interrupt handler checks whether any counter has
> overflowed and calls `record_and_restart` in core-book3s, which invokes
> `perf_event_overflow` to record the sample information.
> Apart from creating the sample, perf_event_overflow also does the interrupt
> and period checks via perf_event_account_interrupt.
> 
> Currently we record sample information, and hence do the interrupt check,
> only if the SIAR valid bit is set (checked using `siar_valid`).
> But it is possible to sample events that do not generate a valid SIAR,
> in which case there is no chance to disable the event if the number of
> interrupts exceeds max_samples_per_tick. This leads to a soft lockup.
> 
> Fix this by adding perf_event_account_interrupt in the invalid SIAR
> code path for a sampling event, i.e. if SIAR is invalid, just do the
> interrupt check and don't record the sample information.
> 
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>

> ---
>  arch/powerpc/perf/core-book3s.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 01d7028..626e587 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -2101,6 +2101,10 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
>  
>  		if (perf_event_overflow(event, &data, regs))
>  			power_pmu_stop(event, 0);
> +	} else if (period) {
> +		/* Account for interrupt incase of invalid siar */
> +		if (perf_event_account_interrupt(event))
> +			power_pmu_stop(event, 0);
>  	}
>  }
>  
>

Michael Ellerman Aug. 20, 2020, 1:31 p.m. UTC | #2
On Thu, 6 Aug 2020 08:46:32 -0400, Athira Rajeev wrote:
> The performance monitor interrupt handler checks whether any counter has
> overflowed and calls `record_and_restart` in core-book3s, which invokes
> `perf_event_overflow` to record the sample information.
> Apart from creating the sample, perf_event_overflow also does the interrupt
> and period checks via perf_event_account_interrupt.
> 
> Currently we record sample information, and hence do the interrupt check,
> only if the SIAR valid bit is set (checked using `siar_valid`).
> But it is possible to sample events that do not generate a valid SIAR,
> in which case there is no chance to disable the event if the number of
> interrupts exceeds max_samples_per_tick. This leads to a soft lockup.
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/perf: Fix soft lockups due to missed interrupt accounting
      https://git.kernel.org/powerpc/c/17899eaf88d689529b866371344c8f269ba79b5f

cheers

Patch

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 01d7028..626e587 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2101,6 +2101,10 @@  static void record_and_restart(struct perf_event *event, unsigned long val,
 
 		if (perf_event_overflow(event, &data, regs))
 			power_pmu_stop(event, 0);
+	} else if (period) {
+		/* Account for interrupt incase of invalid siar */
+		if (perf_event_account_interrupt(event))
+			power_pmu_stop(event, 0);
 	}
 }