mbox series

[v4,00/13] perf/core: Generalise event exclusion checking

Message ID 1546878450-20341-1-git-send-email-andrew.murray@arm.com
Headers show
Series perf/core: Generalise event exclusion checking | expand

Message

Andrew Murray Jan. 7, 2019, 4:27 p.m. UTC
Many PMU drivers do not have the capability to exclude counting events
that occur in specific contexts such as idle, kernel, guest, etc. These
drivers indicate this by returning an error in their event_init upon
testing the events attribute flags.

However this approach requires that each time a new event modifier is
added to perf, all the perf drivers need to be modified to indicate that
they don't support the attribute. This results in additional boiler-plate
code common to many drivers that needs to be maintained. Furthermore the
drivers are not consistent with regards to the error value they return
when reporting unsupported attributes.

This patchset allow PMU drivers to advertise their inability to exclude
based on context via a new capability: PERF_PMU_CAP_NO_EXCLUDE. This
allows the perf core to reject requests for exclusion events where there
is no support in the PMU.

This is a functional change, in particular:

 - Some drivers will now additionally (but correctly) report unsupported
   exclusion flags. It's typical for existing userspace tools such as
   perf to handle such errors by retrying the system call without the
   unsupported flags.

 - Drivers that do not support any exclusion that previously reported
   -EPERM or -EOPNOTSUPP will now report -EINVAL - this is consistent
   with the majority and results in userspace perf retrying without
   exclusion.

All drivers touched by this patchset have been compile tested.

Changes from v3:

 - Added PERF_PMU_CAP_NO_EXCLUDE to Cavium TX2 PMU driver

Changes from v2:

 - Invert logic from CAP_EXCLUDE to CAP_NO_EXCLUDE

Changes from v1:

 - Changed approach from explicitly rejecting events in unsupporting PMU
   drivers to explicitly advertising a capability in PMU drivers that
   do support exclusion events

 - Added additional information to tools/perf/design.txt

 - Rename event_has_exclude_flags to event_has_any_exclude_flag and
   update commit log to reflect it's a function

Andrew Murray (13):
  perf/doc: update design.txt for exclude_{host|guest} flags
  perf/core: add function to test for event exclusion flags
  perf/core: add PERF_PMU_CAP_NO_EXCLUDE for exclusion incapable PMUs
  alpha: perf/core: use PERF_PMU_CAP_NO_EXCLUDE
  arm: perf: conditionally use PERF_PMU_CAP_NO_EXCLUDE
  arm: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude incapable PMUs
  drivers/perf: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude
    incapable PMUs
  drivers/perf: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude
    incapable PMUs
  powerpc: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude incapable
    PMUs
  x86: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude incapable PMUs
  x86: perf/core: use PERF_PMU_CAP_NO_EXCLUDE for exclude incapable PMUs
  perf/core: remove unused perf_flags
  drivers/perf: use PERF_PMU_CAP_NO_EXCLUDE for Cavium TX2 PMU

 arch/alpha/kernel/perf_event.c                |  7 +------
 arch/arm/mach-imx/mmdc.c                      |  9 ++-------
 arch/arm/mm/cache-l2x0-pmu.c                  |  9 +--------
 arch/powerpc/perf/hv-24x7.c                   | 10 +---------
 arch/powerpc/perf/hv-gpci.c                   | 10 +---------
 arch/powerpc/perf/imc-pmu.c                   | 19 +------------------
 arch/x86/events/amd/ibs.c                     | 13 +------------
 arch/x86/events/amd/iommu.c                   |  6 +-----
 arch/x86/events/amd/power.c                   | 10 ++--------
 arch/x86/events/amd/uncore.c                  |  7 ++-----
 arch/x86/events/intel/cstate.c                | 12 +++---------
 arch/x86/events/intel/rapl.c                  |  9 ++-------
 arch/x86/events/intel/uncore.c                |  9 +--------
 arch/x86/events/intel/uncore_snb.c            |  9 ++-------
 arch/x86/events/msr.c                         | 10 ++--------
 drivers/perf/arm-cci.c                        | 10 +---------
 drivers/perf/arm-ccn.c                        |  6 ++----
 drivers/perf/arm_dsu_pmu.c                    |  9 ++-------
 drivers/perf/arm_pmu.c                        | 15 +++++----------
 drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c |  1 +
 drivers/perf/hisilicon/hisi_uncore_hha_pmu.c  |  1 +
 drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c  |  1 +
 drivers/perf/hisilicon/hisi_uncore_pmu.c      |  9 ---------
 drivers/perf/qcom_l2_pmu.c                    |  9 +--------
 drivers/perf/qcom_l3_pmu.c                    |  8 +-------
 drivers/perf/thunderx2_pmu.c                  | 10 +---------
 drivers/perf/xgene_pmu.c                      |  6 +-----
 include/linux/perf_event.h                    | 10 ++++++++++
 include/uapi/linux/perf_event.h               |  2 --
 kernel/events/core.c                          |  9 +++++++++
 tools/include/uapi/linux/perf_event.h         |  2 --
 tools/perf/design.txt                         |  4 ++++
 32 files changed, 63 insertions(+), 198 deletions(-)

Comments

Peter Zijlstra Jan. 8, 2019, 10:28 a.m. UTC | #1
On Mon, Jan 07, 2019 at 04:27:22PM +0000, Andrew Murray wrote:
> @@ -393,9 +386,8 @@ __hw_perf_event_init(struct perf_event *event)
>  	/*
>  	 * Check whether we need to exclude the counter from certain modes.
>  	 */
> +	if (armpmu->set_event_filter &&
> +	    armpmu->set_event_filter(hwc, &event->attr)) {
>  		pr_debug("ARM performance counters do not support "
>  			 "mode exclusion\n");
>  		return -EOPNOTSUPP;

This then requires all set_event_filter() implementations to check all
the various exclude options; also, set_event_filter() failing then
returns with -EOPNOTSUPP instead of the -EINVAL the CAP_NO_EXCLUDE
generates, which is again inconsitent.

If I look at (the very first git-grep found me)
armv7pmu_set_event_filter(), then I find it returning -EPERM (again
inconsistent but irrelevant because the actual value is not preserved)
for exclude_idle.

But it doesn't seem to check exclude_host at all for example.

> @@ -867,6 +859,9 @@ int armpmu_register(struct arm_pmu *pmu)
>  	if (ret)
>  		return ret;
>  
> +	if (!pmu->set_event_filter)
> +		pmu->pmu.capabilities |= PERF_PMU_CAP_NO_EXCLUDE;
> +
>  	ret = perf_pmu_register(&pmu->pmu, pmu->name, -1);
>  	if (ret)
>  		goto out_destroy;
> -- 
> 2.7.4
>
Peter Zijlstra Jan. 8, 2019, 10:48 a.m. UTC | #2
On Mon, Jan 07, 2019 at 04:27:27PM +0000, Andrew Murray wrote:
> For drivers that do not support context exclusion let's advertise the
> PERF_PMU_CAP_NOEXCLUDE capability. This ensures that perf will
> prevent us from handling events where any exclusion flags are set.
> Let's also remove the now unnecessary check for exclusion flags.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> ---
>  arch/x86/events/amd/ibs.c          | 13 +------------
>  arch/x86/events/amd/power.c        | 10 ++--------
>  arch/x86/events/intel/cstate.c     | 12 +++---------
>  arch/x86/events/intel/rapl.c       |  9 ++-------
>  arch/x86/events/intel/uncore_snb.c |  9 ++-------
>  arch/x86/events/msr.c              | 10 ++--------
>  6 files changed, 12 insertions(+), 51 deletions(-)

You (correctly) don't add CAP_NO_EXCLUDE to the main x86 pmu code, but
then you also don't check if it handles all the various exclude options
correctly/consistently.

Now; I must admit that that is a bit of a maze, but I think we can at
least add exclude_idle and exclude_hv fails in there, nothing uses those
afaict.

On the various exclude options; they are as follows (IIUC):

  - exclude_guest: we're a HV/host-kernel and we don't want the counter
                   to run when we run a guest context.

  - exclude_host: we're a HV/host-kernel and we don't want the counter
                  to run when we run in host context.

  - exclude_hv: we're a guest and don't want the counter to run in HV
                context.

Now, KVM always implies exclude_hv afaict (for guests), I'm not sure
what, if anything Xen does on x86 (IIRC Brendan Gregg once said perf
works on Xen) -- nor quite sure who to ask, Boris, Jeurgen?
Peter Zijlstra Jan. 8, 2019, 10:49 a.m. UTC | #3
On Mon, Jan 07, 2019 at 04:27:28PM +0000, Andrew Murray wrote:

This patch has the exact same subject as the previous one.. that seems
sub-optimal.
Andrew Murray Jan. 8, 2019, 1:07 p.m. UTC | #4
On Tue, Jan 08, 2019 at 11:28:02AM +0100, Peter Zijlstra wrote:
> On Mon, Jan 07, 2019 at 04:27:22PM +0000, Andrew Murray wrote:
> > @@ -393,9 +386,8 @@ __hw_perf_event_init(struct perf_event *event)
> >  	/*
> >  	 * Check whether we need to exclude the counter from certain modes.
> >  	 */
> > +	if (armpmu->set_event_filter &&
> > +	    armpmu->set_event_filter(hwc, &event->attr)) {
> >  		pr_debug("ARM performance counters do not support "
> >  			 "mode exclusion\n");
> >  		return -EOPNOTSUPP;
> 
> This then requires all set_event_filter() implementations to check all
> the various exclude options;

Yes but this isn't a new requirement, this hunk uses the absence of
set_event_filter to blanket indicate that no exclusion flags are supported.


> also, set_event_filter() failing then
> returns with -EOPNOTSUPP instead of the -EINVAL the CAP_NO_EXCLUDE
> generates, which is again inconsitent.

Yes, it's not ideal - but a step in the right direction. I wanted to limit
user visible changes as much as possible, where I've identified them I've
noted it in the commit log.

> 
> If I look at (the very first git-grep found me)
> armv7pmu_set_event_filter(), then I find it returning -EPERM (again
> inconsistent but irrelevant because the actual value is not preserved)
> for exclude_idle.
> 
> But it doesn't seem to check exclude_host at all for example.

Yes I found lots of examples like this across the tree whilst doing this
work. However I decided to initially start with simply removing duplicated
code as a result of adding this flag and attempting to preserve existing
functionality. I thought that if I add missing checks then the patchset
will get much bigger and be harder to merge. I would like to do this though
as another non-cross-arch series.

Can we limit this patch series to the minimal changes required to fully
use PERF_PMU_CAP_NO_EXCLUDE and then attempt to fix these existing problems
in subsequent patch sets?

Thanks,

Andrew Murray

> 
> > @@ -867,6 +859,9 @@ int armpmu_register(struct arm_pmu *pmu)
> >  	if (ret)
> >  		return ret;
> >  
> > +	if (!pmu->set_event_filter)
> > +		pmu->pmu.capabilities |= PERF_PMU_CAP_NO_EXCLUDE;
> > +
> >  	ret = perf_pmu_register(&pmu->pmu, pmu->name, -1);
> >  	if (ret)
> >  		goto out_destroy;
> > -- 
> > 2.7.4
> >
Andrew Murray Jan. 8, 2019, 1:08 p.m. UTC | #5
On Tue, Jan 08, 2019 at 11:49:40AM +0100, Peter Zijlstra wrote:
> On Mon, Jan 07, 2019 at 04:27:28PM +0000, Andrew Murray wrote:
> 
> This patch has the exact same subject as the previous one.. that seems
> sub-optimal.

Ah yes, I'll update that in subsquent revisions. (The reason for two patches
was to separate functional vs non-functional changes).

Andrew Murray
Peter Zijlstra Jan. 8, 2019, 1:10 p.m. UTC | #6
On Tue, Jan 08, 2019 at 01:07:41PM +0000, Andrew Murray wrote:

> Yes I found lots of examples like this across the tree whilst doing this
> work. However I decided to initially start with simply removing duplicated
> code as a result of adding this flag and attempting to preserve existing
> functionality. I thought that if I add missing checks then the patchset
> will get much bigger and be harder to merge. I would like to do this though
> as another non-cross-arch series.
> 
> Can we limit this patch series to the minimal changes required to fully
> use PERF_PMU_CAP_NO_EXCLUDE and then attempt to fix these existing problems
> in subsequent patch sets?

Ok, but it would've been nice to see that mentioned somewhere.
Andrew Murray Jan. 8, 2019, 1:12 p.m. UTC | #7
On Tue, Jan 08, 2019 at 11:48:41AM +0100, Peter Zijlstra wrote:
> On Mon, Jan 07, 2019 at 04:27:27PM +0000, Andrew Murray wrote:
> > For drivers that do not support context exclusion let's advertise the
> > PERF_PMU_CAP_NOEXCLUDE capability. This ensures that perf will
> > prevent us from handling events where any exclusion flags are set.
> > Let's also remove the now unnecessary check for exclusion flags.
> > 
> > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > ---
> >  arch/x86/events/amd/ibs.c          | 13 +------------
> >  arch/x86/events/amd/power.c        | 10 ++--------
> >  arch/x86/events/intel/cstate.c     | 12 +++---------
> >  arch/x86/events/intel/rapl.c       |  9 ++-------
> >  arch/x86/events/intel/uncore_snb.c |  9 ++-------
> >  arch/x86/events/msr.c              | 10 ++--------
> >  6 files changed, 12 insertions(+), 51 deletions(-)
> 
> You (correctly) don't add CAP_NO_EXCLUDE to the main x86 pmu code, but
> then you also don't check if it handles all the various exclude options
> correctly/consistently.
> 
> Now; I must admit that that is a bit of a maze, but I think we can at
> least add exclude_idle and exclude_hv fails in there, nothing uses those
> afaict.

Yes it took me some time to make sense of it.

As per my comments in the other patch, I think you're suggesting that I
add additional checks to x86. I think they are needed but I'd prefer to
make functional changes in a separate series, I'm happy to do this.

> 
> On the various exclude options; they are as follows (IIUC):
> 
>   - exclude_guest: we're a HV/host-kernel and we don't want the counter
>                    to run when we run a guest context.
> 
>   - exclude_host: we're a HV/host-kernel and we don't want the counter
>                   to run when we run in host context.
> 
>   - exclude_hv: we're a guest and don't want the counter to run in HV
>                 context.
> 
> Now, KVM always implies exclude_hv afaict (for guests),

It certaintly does for ARM.

> I'm not sure
> what, if anything Xen does on x86 (IIRC Brendan Gregg once said perf
> works on Xen) -- nor quite sure who to ask, Boris, Jeurgen?

Thanks,

Andrew Murray
>
Andrew Murray Jan. 8, 2019, 1:13 p.m. UTC | #8
On Tue, Jan 08, 2019 at 02:10:31PM +0100, Peter Zijlstra wrote:
> On Tue, Jan 08, 2019 at 01:07:41PM +0000, Andrew Murray wrote:
> 
> > Yes I found lots of examples like this across the tree whilst doing this
> > work. However I decided to initially start with simply removing duplicated
> > code as a result of adding this flag and attempting to preserve existing
> > functionality. I thought that if I add missing checks then the patchset
> > will get much bigger and be harder to merge. I would like to do this though
> > as another non-cross-arch series.
> > 
> > Can we limit this patch series to the minimal changes required to fully
> > use PERF_PMU_CAP_NO_EXCLUDE and then attempt to fix these existing problems
> > in subsequent patch sets?
> 
> Ok, but it would've been nice to see that mentioned somewhere.

I'll update the cover leter on any next revision. I'll try to be clearer next
time with my intentions.

Andrew Murray
Peter Zijlstra Jan. 8, 2019, 2:43 p.m. UTC | #9
On Tue, Jan 08, 2019 at 01:13:57PM +0000, Andrew Murray wrote:
> On Tue, Jan 08, 2019 at 02:10:31PM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 08, 2019 at 01:07:41PM +0000, Andrew Murray wrote:
> > 
> > > Yes I found lots of examples like this across the tree whilst doing this
> > > work. However I decided to initially start with simply removing duplicated
> > > code as a result of adding this flag and attempting to preserve existing
> > > functionality. I thought that if I add missing checks then the patchset
> > > will get much bigger and be harder to merge. I would like to do this though
> > > as another non-cross-arch series.
> > > 
> > > Can we limit this patch series to the minimal changes required to fully
> > > use PERF_PMU_CAP_NO_EXCLUDE and then attempt to fix these existing problems
> > > in subsequent patch sets?
> > 
> > Ok, but it would've been nice to see that mentioned somewhere.
> 
> I'll update the cover leter on any next revision. I'll try to be clearer next
> time with my intentions.

Could you maybe include it in the relevant patches too; like for example
the ARM one where we rely on set_event_filter() to DTRT.

So with the changelogs and subjects fixed I can take these patches and
then you can get on with cleaning up the individual drivers.
Boris Ostrovsky Jan. 8, 2019, 4:36 p.m. UTC | #10
On 1/8/19 5:48 AM, Peter Zijlstra wrote:
> On Mon, Jan 07, 2019 at 04:27:27PM +0000, Andrew Murray wrote:
>> For drivers that do not support context exclusion let's advertise the
>> PERF_PMU_CAP_NOEXCLUDE capability. This ensures that perf will
>> prevent us from handling events where any exclusion flags are set.
>> Let's also remove the now unnecessary check for exclusion flags.
>>
>> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
>> ---
>>  arch/x86/events/amd/ibs.c          | 13 +------------
>>  arch/x86/events/amd/power.c        | 10 ++--------
>>  arch/x86/events/intel/cstate.c     | 12 +++---------
>>  arch/x86/events/intel/rapl.c       |  9 ++-------
>>  arch/x86/events/intel/uncore_snb.c |  9 ++-------
>>  arch/x86/events/msr.c              | 10 ++--------
>>  6 files changed, 12 insertions(+), 51 deletions(-)
> You (correctly) don't add CAP_NO_EXCLUDE to the main x86 pmu code, but
> then you also don't check if it handles all the various exclude options
> correctly/consistently.
>
> Now; I must admit that that is a bit of a maze, but I think we can at
> least add exclude_idle and exclude_hv fails in there, nothing uses those
> afaict.
>
> On the various exclude options; they are as follows (IIUC):
>
>   - exclude_guest: we're a HV/host-kernel and we don't want the counter
>                    to run when we run a guest context.
>
>   - exclude_host: we're a HV/host-kernel and we don't want the counter
>                   to run when we run in host context.
>
>   - exclude_hv: we're a guest and don't want the counter to run in HV
>                 context.
>
> Now, KVM always implies exclude_hv afaict (for guests), I'm not sure
> what, if anything Xen does on x86 (IIRC Brendan Gregg once said perf
> works on Xen) -- nor quite sure who to ask, Boris, Jeurgen?

perf does work inside guests.

VPMU is managed by the Xen and it presents to the guest only samples
that are associated with the guest. So from that perspective exclude_hv
doesn't seem to be needed.

There is a VPMU mode that allows profiling whole system (host and
guests) from dom0, and this where exclude_hv might be useful. But this
mode, ahem, needs some work.


-boris
Peter Zijlstra Jan. 8, 2019, 6:49 p.m. UTC | #11
On Tue, Jan 08, 2019 at 11:36:33AM -0500, Boris Ostrovsky wrote:
> On 1/8/19 5:48 AM, Peter Zijlstra wrote:

> > On the various exclude options; they are as follows (IIUC):
> >
> >   - exclude_guest: we're a HV/host-kernel and we don't want the counter
> >                    to run when we run a guest context.
> >
> >   - exclude_host: we're a HV/host-kernel and we don't want the counter
> >                   to run when we run in host context.
> >
> >   - exclude_hv: we're a guest and don't want the counter to run in HV
> >                 context.
> >
> > Now, KVM always implies exclude_hv afaict (for guests), I'm not sure
> > what, if anything Xen does on x86 (IIRC Brendan Gregg once said perf
> > works on Xen) -- nor quite sure who to ask, Boris, Jeurgen?
> 
> perf does work inside guests.
> 
> VPMU is managed by the Xen and it presents to the guest only samples
> that are associated with the guest. So from that perspective exclude_hv
> doesn't seem to be needed.
> 
> There is a VPMU mode that allows profiling whole system (host and
> guests) from dom0, and this where exclude_hv might be useful. But this
> mode, ahem, needs some work.

Thanks Boris!
Will Deacon Jan. 10, 2019, 11:10 a.m. UTC | #12
On Mon, Jan 07, 2019 at 04:27:30PM +0000, Andrew Murray wrote:
> The Cavium ThunderX2 UNCORE PMU driver doesn't support any event
> filtering. Let's advertise the PERF_PMU_CAP_NO_EXCLUDE capability to
> simplify the code.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> ---
>  drivers/perf/thunderx2_pmu.c | 10 +---------
>  1 file changed, 1 insertion(+), 9 deletions(-)

Acked-by: Will Deacon <will.deacon@arm.com>

Thanks for fixing this up.

Will
Michael Ellerman Jan. 10, 2019, 1:15 p.m. UTC | #13
Peter Zijlstra <peterz@infradead.org> writes:

> On Mon, Jan 07, 2019 at 04:27:27PM +0000, Andrew Murray wrote:
>> For drivers that do not support context exclusion let's advertise the
>> PERF_PMU_CAP_NOEXCLUDE capability. This ensures that perf will
>> prevent us from handling events where any exclusion flags are set.
>> Let's also remove the now unnecessary check for exclusion flags.
>> 
>> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
>> ---
>>  arch/x86/events/amd/ibs.c          | 13 +------------
>>  arch/x86/events/amd/power.c        | 10 ++--------
>>  arch/x86/events/intel/cstate.c     | 12 +++---------
>>  arch/x86/events/intel/rapl.c       |  9 ++-------
>>  arch/x86/events/intel/uncore_snb.c |  9 ++-------
>>  arch/x86/events/msr.c              | 10 ++--------
>>  6 files changed, 12 insertions(+), 51 deletions(-)
>
> You (correctly) don't add CAP_NO_EXCLUDE to the main x86 pmu code, but
> then you also don't check if it handles all the various exclude options
> correctly/consistently.
>
> Now; I must admit that that is a bit of a maze, but I think we can at
> least add exclude_idle and exclude_hv fails in there, nothing uses those
> afaict.
>
> On the various exclude options; they are as follows (IIUC):
>
>   - exclude_guest: we're a HV/host-kernel and we don't want the counter
>                    to run when we run a guest context.
>
>   - exclude_host: we're a HV/host-kernel and we don't want the counter
>                   to run when we run in host context.
>
>   - exclude_hv: we're a guest and don't want the counter to run in HV
>                 context.
>
> Now, KVM always implies exclude_hv afaict (for guests)

On Power it mostly does.

There's some host code that can run in real mode (MMU off) and therefore
doesn't do a full context switch out of the guest (including the PMU),
so that's host code that is running while the guest PMCs are still
counting.

cheers