Message ID | 1596717992-7321-1-git-send-email-atrajeev@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 17899eaf88d689529b866371344c8f269ba79b5f |
Series | powerpc/perf: Account for interrupts during PMC overflow for an invalid SIAR check |

Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch powerpc/merge (3cd2184115b85cc8242fec3d42529cd112962984) |
snowpatch_ozlabs/build-ppc64le | warning | Upstream build failed, couldn't test patch |
snowpatch_ozlabs/build-ppc64be | warning | Upstream build failed, couldn't test patch |
snowpatch_ozlabs/build-ppc64e | warning | Upstream build failed, couldn't test patch |
snowpatch_ozlabs/build-pmac32 | warning | Upstream build failed, couldn't test patch |
snowpatch_ozlabs/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 10 lines checked |
snowpatch_ozlabs/needsstable | success | Patch has no Fixes tags |

On 06/08/2020 22:46, Athira Rajeev wrote:
> Performance monitor interrupt handler checks if any counter has overflown
> and calls `record_and_restart` in core-book3s which invokes
> `perf_event_overflow` to record the sample information.
> Apart from creating sample, perf_event_overflow also does the interrupt
> and period checks via perf_event_account_interrupt.
>
> Currently we record information only if the SIAR valid bit is set
> ( using `siar_valid` check ) and hence the interrupt check.
> But it is possible that we do sampling for some events that are not
> generating valid SIAR and hence there is no chance to disable the event
> if interrupts is more than max_samples_per_tick. This leads to soft lockup.
>
> Fix this by adding perf_event_account_interrupt in the invalid siar
> code path for a sampling event. ie if siar is invalid, just do interrupt
> check and don't record the sample information.
>
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>

> ---
>  arch/powerpc/perf/core-book3s.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 01d7028..626e587 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -2101,6 +2101,10 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
>
>  		if (perf_event_overflow(event, &data, regs))
>  			power_pmu_stop(event, 0);
> +	} else if (period) {
> +		/* Account for interrupt incase of invalid siar */
> +		if (perf_event_account_interrupt(event))
> +			power_pmu_stop(event, 0);
>  	}
>  }
>
>

On Thu, 6 Aug 2020 08:46:32 -0400, Athira Rajeev wrote:
> Performance monitor interrupt handler checks if any counter has overflown
> and calls `record_and_restart` in core-book3s which invokes
> `perf_event_overflow` to record the sample information.
> Apart from creating sample, perf_event_overflow also does the interrupt
> and period checks via perf_event_account_interrupt.
>
> Currently we record information only if the SIAR valid bit is set
> ( using `siar_valid` check ) and hence the interrupt check.
> But it is possible that we do sampling for some events that are not
> generating valid SIAR and hence there is no chance to disable the event
> if interrupts is more than max_samples_per_tick. This leads to soft lockup.
>
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/perf: Fix soft lockups due to missed interrupt accounting
      https://git.kernel.org/powerpc/c/17899eaf88d689529b866371344c8f269ba79b5f

cheers

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 01d7028..626e587 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2101,6 +2101,10 @@ static void record_and_restart(struct perf_event *event, unsigned long val,

 		if (perf_event_overflow(event, &data, regs))
 			power_pmu_stop(event, 0);
+	} else if (period) {
+		/* Account for interrupt incase of invalid siar */
+		if (perf_event_account_interrupt(event))
+			power_pmu_stop(event, 0);
 	}
 }

The performance monitor interrupt handler checks whether any counter has
overflowed and calls `record_and_restart` in core-book3s, which invokes
`perf_event_overflow` to record the sample information.
Apart from creating the sample, `perf_event_overflow` also does the interrupt
and period checks via `perf_event_account_interrupt`.

Currently we record sample information only if the SIAR valid bit is set
(checked with `siar_valid`), so the interrupt check also runs only in that
case. But it is possible to sample events that do not generate a valid SIAR,
and then there is no chance to disable the event when the number of
interrupts exceeds max_samples_per_tick. This leads to a soft lockup.

Fix this by calling `perf_event_account_interrupt` in the invalid-SIAR
code path for a sampling event, i.e. if the SIAR is invalid, just do the
interrupt check and don't record the sample information.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 arch/powerpc/perf/core-book3s.c | 4 ++++
 1 file changed, 4 insertions(+)
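
The effect of the missing accounting can be illustrated with a small userspace model. This is only a sketch under simplifying assumptions, not kernel code: `struct model_event`, `model_account_interrupt`, `model_record_and_restart`, and `model_storm` are hypothetical names, the throttle is reduced to a single per-tick counter compared against a limit, and the exact comparison in the kernel may differ. It shows that when every overflow has an invalid SIAR, the unpatched path never counts interrupts and so never stops the event, while the patched path stops it once the limit is exceeded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-event throttle state; not the kernel's struct. */
struct model_event {
	unsigned int interrupts;	/* overflow interrupts counted in this tick */
	bool stopped;			/* set once the event has been throttled */
};

/* Simplified accounting: count the interrupt, report whether the limit is exceeded. */
static bool model_account_interrupt(struct model_event *ev,
				    unsigned int max_per_tick)
{
	ev->interrupts++;
	return ev->interrupts > max_per_tick;
}

/*
 * Models one counter overflow for a sampling event.
 * account_invalid == true corresponds to the behaviour added by the patch.
 */
static void model_record_and_restart(struct model_event *ev, bool siar_valid,
				     bool account_invalid,
				     unsigned int max_per_tick)
{
	if (siar_valid) {
		/* Valid SIAR: the sample is recorded and the interrupt is counted. */
		if (model_account_interrupt(ev, max_per_tick))
			ev->stopped = true;
	} else if (account_invalid) {
		/* Patched path: no sample, but the interrupt is still counted. */
		if (model_account_interrupt(ev, max_per_tick))
			ev->stopped = true;
	}
	/* Unpatched invalid-SIAR path: nothing is counted, nothing can throttle. */
}

/*
 * Delivers up to `overflows` invalid-SIAR overflows within one tick and
 * returns how many were handled before the event was stopped.
 */
unsigned int model_storm(bool account_invalid, unsigned int overflows,
			 unsigned int max_per_tick)
{
	struct model_event ev = { 0, false };
	unsigned int handled = 0;

	assert(max_per_tick > 0);
	while (handled < overflows && !ev.stopped) {
		model_record_and_restart(&ev, false, account_invalid,
					 max_per_tick);
		handled++;
	}
	return handled;
}
```

In this model, `model_storm(false, 1000, 10)` handles all 1000 overflows because the event is never stopped, whereas `model_storm(true, 1000, 10)` stops after the 11th overflow, the first one that pushes the count past the limit.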