Message ID | 20121012012839.GA15348@us.ibm.com (mailing list archive) |
---|---|
State | Not Applicable |
On 10/12/2012 06:58 AM, Sukadev Bhattiprolu wrote:
>
> From 89cb6a25b9f714e55a379467a832ee015014ed11 Mon Sep 17 00:00:00 2001
> From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> Date: Tue, 18 Sep 2012 10:59:01 -0700
> Subject: [PATCH] perf: Add a few generic stalled-cycles events
>
> The existing generic event 'stalled-cycles-backend' corresponds to
> PM_CMPLU_STALL event in Power7. While this event is useful, detailed
> performance analysis often requires us to find more specific reasons
> for the stalled cycle. For instance, stalled cycles in Power7 can
> occur due to, among others:
>
>     - instruction fetch unit (IFU),
>     - Load-store-unit (LSU),
>     - Fixed point unit (FXU)
>     - Branch unit (BRU)
>
> While it is possible to use raw codes to monitor these events, it quickly
> becomes cumbersome with performance analysis frequently requiring mapping
> the raw event codes in reports to their symbolic names.
>
> This patch is a proposal to try and generalize such perf events. Since
> the code changes are quite simple, I bunched all the 4 events together.
>
> I am not familiar with how readily these events would map to other
> architectures. Here is some information on the events for Power7:
>
>     stalled-cycles-fixed-point (PM_CMPLU_STALL_FXU)
>
>         Following a completion stall, the last instruction to finish
>         before completion resumes was from the Fixed Point Unit.
>
>         Completion stall is any period when no groups completed and
>         the completion table was not empty for that thread.
>
>     stalled-cycles-load-store (PM_CMPLU_STALL_LSU)
>
>         Following a completion stall, the last instruction to finish
>         before completion resumes was from the Load-Store Unit.
>
>     stalled-cycles-instruction-fetch (PM_CMPLU_STALL_IFU)
>
>         Following a completion stall, the last instruction to finish
>         before completion resumes was from the Instruction Fetch Unit.
>
>     stalled-cycles-branch (PM_CMPLU_STALL_BRU)
>
>         Following a completion stall, the last instruction to finish
>         before completion resumes was from the Branch Unit.
>
> Looking for feedback on this approach and if this can be further extended.
> Power7 has 530 events[2] out of which a "CPI stack analysis"[1] uses about 26
> events.
>
>
> [1] CPI Stack analysis
>     https://www.power.org/documentation/commonly-used-metrics-for-performance-analysis
>
> [2] Power7 events:
>     https://www.power.org/documentation/comprehensive-pmu-event-reference-power7/

Here we should try to come up with a generic list of places in the
processor where cycles can stall.

PERF_COUNT_HW_STALLED_CYCLES_FIXED_POINT
PERF_COUNT_HW_STALLED_CYCLES_LOAD_STORE
PERF_COUNT_HW_STALLED_CYCLES_INSTRUCTION_FETCH
PERF_COUNT_HW_STALLED_CYCLES_BRANCH
PERF_COUNT_HW_STALLED_CYCLES_<ANY_OTHER_PLACE1>
PERF_COUNT_HW_STALLED_CYCLES_<ANY_OTHER_PLACE2>
PERF_COUNT_HW_STALLED_CYCLES_<ANY_OTHER_PLACE3>
-----------------------------------------------

This generic list can be a superset that accommodates all architectures,
giving each the flexibility to implement events selectively thereafter.
Stall locations are very important from a CPI analysis standpoint in
real-world use cases. This will definitely help us in that direction.

Regards
Anshuman

On 11.10.12 18:28:39, Sukadev Bhattiprolu wrote:
> +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_FIXED_POINT },
> +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_LOAD_STORE },
> +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_INSTRUCTION_FETCH },
> +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_BRANCH },

Instead of adding new hardware event types I would prefer to use raw
events in conjunction with sysfs, see e.g. the intel-uncore
implementation. Something like:

 $ find /sys/bus/event_source/devices/cpu/events/
 ...
 /sys/bus/event_source/devices/cpu/events/stalled-cycles-fixed-point
 /sys/bus/event_source/devices/cpu/events/stalled-cycles-load-store
 /sys/bus/event_source/devices/cpu/events/stalled-cycles-instruction-fetch
 /sys/bus/event_source/devices/cpu/events/stalled-cycles-branch
 ...
 $ cat /sys/bus/event_source/devices/cpu/events/stalled-cycles-fixed-point
 event=0xff,umask=0x00

Perf tool works then out-of-the-box with:

 $ perf record -e cpu/stalled-cycles-fixed-point/ ...

The event string can easily be reused by other architectures as a
quasi standard.

-Robert

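To make the alias format above concrete: a string such as "event=0xff,umask=0x00" is a comma-separated list of name=value terms. The following is only a minimal sketch in C of splitting such a string, not the perf tool's actual parser; `struct term` and `parse_alias()` are hypothetical names introduced here for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* One name=value term from an alias string (hypothetical type). */
struct term {
	char name[32];
	unsigned long long value;
};

/*
 * Split a string like "event=0xff,umask=0x00" into terms.
 * Returns the number of terms parsed, or -1 on malformed input.
 */
static int parse_alias(const char *s, struct term *terms, int max_terms)
{
	int n = 0;

	while (*s && *s != '\n') {
		size_t len = strcspn(s, "=,\n");
		char *end;

		/* Every term needs a non-empty name followed by '='. */
		if (s[len] != '=' || len == 0 || len >= sizeof(terms[n].name))
			return -1;
		if (n == max_terms)
			return -1;
		memcpy(terms[n].name, s, len);
		terms[n].name[len] = '\0';

		/* Base 0 accepts both decimal and 0x-prefixed hex values. */
		terms[n].value = strtoull(s + len + 1, &end, 0);
		if (end == s + len + 1)
			return -1;	/* no digits after '=' */
		n++;

		if (*end == ',')
			end++;		/* step over the separator */
		else if (*end != '\0' && *end != '\n')
			return -1;
		s = end;
	}
	return n;
}
```

Each parsed term still has to be placed into the right bits of the event configuration, which is what the per-PMU format/ files describe.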
On 10/15/12 8:55 AM, Robert Richter wrote:

[..]
> Perf tool works then out-of-the-box with:
>
>  $ perf record -e cpu/stalled-cycles-fixed-point/ ...
>
> The event string can easily be reused by other architectures as a
> quasi standard.

I like Robert's proposal better. It's hard to model all the stall events
(eg: instruction decoder related stalls on x86) in a hardware
independent way.

Another area to think about: software engineers are generally busy and
have a limited amount of time to devote to hardware event based
optimizations. The most common question I hear is: what is the expected
perf gain if I fix this? It's hard to answer that with just the stall
events.

 -Arun

On 10/15/2012 10:53 PM, Arun Sharma wrote:
> On 10/15/12 8:55 AM, Robert Richter wrote:
>
> [..]
>> Perf tool works then out-of-the-box with:
>>
>>  $ perf record -e cpu/stalled-cycles-fixed-point/ ...
>>
>> The event string can easily be reused by other architectures as a
>> quasi standard.
>
> I like Robert's proposal better. It's hard to model all the stall events
> (eg: instruction decoder related stalls on x86) in a hardware
> independent way.
>
> Another area to think about: software engineers are generally busy and
> have a limited amount of time to devote to hardware event based
> optimizations. The most common question I hear is: what is the expected
> perf gain if I fix this? It's hard to answer that with just the stall
> events.
>

Hardware event based optimization is a very important aspect of real
world application tuning. CPI stack analysis is a good reason why perf
should have stall events as generic ones.

But I am not clear on which situations call for adding new generic
events to linux/perf_event.h and which call for the sysfs interface.
Could you please elaborate on this?

Regards
Anshuman

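For context on the CPI stack analysis mentioned above, the arithmetic is simple: each stall counter is divided by the number of completed instructions, which gives that unit's contribution to cycles per instruction. A minimal sketch in C, assuming the counter values have already been read from the four events in the patch; the struct and function names are hypothetical, and any numbers passed in are illustrative rather than measurements.

```c
#include <stdint.h>

/* Hypothetical container for counts read from the four proposed events. */
struct stall_counts {
	uint64_t fixed_point;		/* PM_CMPLU_STALL_FXU */
	uint64_t load_store;		/* PM_CMPLU_STALL_LSU */
	uint64_t instruction_fetch;	/* PM_CMPLU_STALL_IFU */
	uint64_t branch;		/* PM_CMPLU_STALL_BRU */
};

/* Cycles per instruction contributed by one stall counter. */
static double cpi_component(uint64_t stall_cycles, uint64_t instructions)
{
	if (instructions == 0)
		return 0.0;	/* avoid dividing by zero on an empty run */
	return (double)stall_cycles / (double)instructions;
}

/* Sum of the four per-unit components; other stall reasons are not covered. */
static double cpi_stalled_units(const struct stall_counts *s,
				uint64_t instructions)
{
	return cpi_component(s->fixed_point, instructions) +
	       cpi_component(s->load_store, instructions) +
	       cpi_component(s->instruction_fetch, instructions) +
	       cpi_component(s->branch, instructions);
}
```

A breakdown like this shows where the stalled cycles go, but as Arun notes it does not by itself say how much would be gained by removing a given stall.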
Sukadev,

On 15.10.12 17:55:34, Robert Richter wrote:
> On 11.10.12 18:28:39, Sukadev Bhattiprolu wrote:
> > +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_FIXED_POINT },
> > +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_LOAD_STORE },
> > +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_INSTRUCTION_FETCH },
> > +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_BRANCH },
>
> Instead of adding new hardware event types I would prefer to use raw
> events in conjunction with sysfs, see e.g. the intel-uncore
> implementation. Something like:
>
>  $ find /sys/bus/event_source/devices/cpu/events/
>  ...
>  /sys/bus/event_source/devices/cpu/events/stalled-cycles-fixed-point
>  /sys/bus/event_source/devices/cpu/events/stalled-cycles-load-store
>  /sys/bus/event_source/devices/cpu/events/stalled-cycles-instruction-fetch
>  /sys/bus/event_source/devices/cpu/events/stalled-cycles-branch
>  ...
>  $ cat /sys/bus/event_source/devices/cpu/events/stalled-cycles-fixed-point
>  event=0xff,umask=0x00
>
> Perf tool works then out-of-the-box with:
>
>  $ perf record -e cpu/stalled-cycles-fixed-point/ ...

I refer here to arch/x86/kernel/cpu/perf_event_intel_uncore.c (should
be in v3.7-rc1 or tip:perf/core). See the INTEL_UNCORE_EVENT_DESC()
macro and 'if (type->event_descs) ...' in uncore_type_init(). The code
should be reworked to be non-architectural.

PMU registration is implemented for a longer time already for all
architectures and pmu types:

 /sys/bus/event_source/devices/*

But

 /sys/bus/event_source/devices/*/events/

exists only for a small number of pmus. Perf tool support of this was
implemented with:

 a6146d5 perf/tool: Add PMU event alias support

-Robert

On Tue, Oct 16, 2012 at 12:08 PM, Robert Richter <robert.richter@amd.com> wrote:
> Sukadev,
>
> On 15.10.12 17:55:34, Robert Richter wrote:
>> On 11.10.12 18:28:39, Sukadev Bhattiprolu wrote:
>> > +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_FIXED_POINT },
>> > +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_LOAD_STORE },
>> > +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_INSTRUCTION_FETCH },
>> > +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_BRANCH },
>>
>> Instead of adding new hardware event types I would prefer to use raw
>> events in conjunction with sysfs, see e.g. the intel-uncore
>> implementation. Something like:
>>
In general, I don't like generic events, and especially stall events. I
have not seen a clear definition of what they mean. Without one, there
is no way to understand how to map them across architectures. If the
definition is too precise, you may not be able to find an exact mapping.
If the definition is too loose, then it is unclear what you are
measuring.

Also, this opens another can of worms, which is that on some processors
you may need more than one event to encapsulate what the generic event
is supposed to measure. That means developing a lot of code in the
kernel to express and manage that. And of course, you would not be able
to sample on those events (you cannot sample on a difference, for
instance).

So all in all, I think this is not a very good idea. You have to put
this into the tool or a library that auto-detects the host CPU and
programs the right set of events.

We've had that discussion many times. Just reiterating my personal
opinion on this.

>>  $ find /sys/bus/event_source/devices/cpu/events/
>>  ...
>>  /sys/bus/event_source/devices/cpu/events/stalled-cycles-fixed-point
>>  /sys/bus/event_source/devices/cpu/events/stalled-cycles-load-store
>>  /sys/bus/event_source/devices/cpu/events/stalled-cycles-instruction-fetch
>>  /sys/bus/event_source/devices/cpu/events/stalled-cycles-branch
>>  ...
>>  $ cat /sys/bus/event_source/devices/cpu/events/stalled-cycles-fixed-point
>>  event=0xff,umask=0x00
>>
>> Perf tool works then out-of-the-box with:
>>
>>  $ perf record -e cpu/stalled-cycles-fixed-point/ ...
>
> I refer here to arch/x86/kernel/cpu/perf_event_intel_uncore.c (should
> be in v3.7-rc1 or tip:perf/core). See the INTEL_UNCORE_EVENT_DESC()
> macro and 'if (type->event_descs) ...' in uncore_type_init(). The code
> should be reworked to be non-architectural.
>
> PMU registration is implemented for a longer time already for all
> architectures and pmu types:
>
>  /sys/bus/event_source/devices/*
>
> But
>
>  /sys/bus/event_source/devices/*/events/
>
> exists only for a small number of pmus. Perf tool support of this was
> implemented with:
>
>  a6146d5 perf/tool: Add PMU event alias support
>
> -Robert
>
> --
> Advanced Micro Devices, Inc.
> Operating System Research Center
>

Robert Richter [robert.richter@amd.com] wrote:
| Sukadev,
|
| On 15.10.12 17:55:34, Robert Richter wrote:
| > On 11.10.12 18:28:39, Sukadev Bhattiprolu wrote:
| > > +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_FIXED_POINT },
| > > +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_LOAD_STORE },
| > > +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_INSTRUCTION_FETCH },
| > > +  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_BRANCH },
| >
| > Instead of adding new hardware event types I would prefer to use raw
| > events in conjunction with sysfs, see e.g. the intel-uncore
| > implementation. Something like:
| >
| >  $ find /sys/bus/event_source/devices/cpu/events/
| >  ...
| >  /sys/bus/event_source/devices/cpu/events/stalled-cycles-fixed-point
| >  /sys/bus/event_source/devices/cpu/events/stalled-cycles-load-store
| >  /sys/bus/event_source/devices/cpu/events/stalled-cycles-instruction-fetch
| >  /sys/bus/event_source/devices/cpu/events/stalled-cycles-branch
| >  ...
| >  $ cat /sys/bus/event_source/devices/cpu/events/stalled-cycles-fixed-point
| >  event=0xff,umask=0x00
| >
| > Perf tool works then out-of-the-box with:
| >
| >  $ perf record -e cpu/stalled-cycles-fixed-point/ ...
|
| I refer here to arch/x86/kernel/cpu/perf_event_intel_uncore.c (should
| be in v3.7-rc1 or tip:perf/core). See the INTEL_UNCORE_EVENT_DESC()
| macro and 'if (type->event_descs) ...' in uncore_type_init(). The code
| should be reworked to be non-architectural.

Ok. I will look through that code. Does that mean we are trying to avoid
adding any more generic hardware events?

Also a broader question: is the sysfs approach intended for all raw
events or just for the generic events supported in the kernel? If it is
intended for all events a CPU supports, isn't there a chance of bloating
kernel code? Power7 has 530 events and Intel Nehalem (in libpfm) seems
to have 370 events. Would that mean we would need to represent all these
events in the kernel so they are available in sysfs?

On a side note, how does the kernel on x86 use the 'config' information
in, say, /sys/bus/event_source/devices/cpu/format/cccr? On Power7, the
raw code encodes information such as the PMC to use for the event. Is
that how the 'config' info on Intel is used?

Does the 'config' info change from system to system or is it static for
a given event on a given CPU?

I guess I am trying to understand whether this mapping between event
name (event code) and the config info is something the kernel needs and
uses, or something the kernel simply passes through from userspace to
the CPU.

AFAICT, on Power we use the raw codes to determine which PMC to select
and which bits to set in some registers. That selection is static for a
given CPU type such as Power7.

If it is static, is it worth adding all this static mapping (for 530
events) into the kernel?

If we don't add it to the kernel, we don't seem to have a way to specify
the events symbolically.

Thanks for your detailed comments.

|
| PMU registration is implemented for a longer time already for all
| architectures and pmu types:
|
|  /sys/bus/event_source/devices/*

Yes I see this.

|
| But
|
|  /sys/bus/event_source/devices/*/events/

Thanks for clarifying. I was looking to see if this was implemented too :-)

Sukadev

|
| exists only for a small number of pmus. Perf tool support of this was
| implemented with:
|
|  a6146d5 perf/tool: Add PMU event alias support
|
| -Robert
|
| --
| Advanced Micro Devices, Inc.
| Operating System Research Center

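As an illustration of the point that a Power7 raw code carries the PMC selection, here is a small C sketch that pulls two fields out of a raw code. The shift and mask values are an assumption based on my reading of arch/powerpc/perf/power7-pmu.c and should be checked against the kernel source; the two helper functions are hypothetical. The raw codes in the usage note come from the patch in this thread.

```c
#include <stdint.h>

/*
 * Assumed field layout of a Power7 raw event code (verify against
 * arch/powerpc/perf/power7-pmu.c before relying on it).
 */
#define PM_PMC_SH	16	/* PMC number, 1-based */
#define PM_PMC_MSK	0xf
#define PM_PMCSEL_MSK	0xff	/* event select within the PMC */

/* Which PMC the raw code asks for (0 means no specific PMC is named). */
static unsigned int raw_event_pmc(uint64_t raw)
{
	return (unsigned int)((raw >> PM_PMC_SH) & PM_PMC_MSK);
}

/* The low-order event select bits of the raw code. */
static unsigned int raw_event_pmcsel(uint64_t raw)
{
	return (unsigned int)(raw & PM_PMCSEL_MSK);
}
```

Under that assumed layout, 0x20014 (CMPLU_STALL_FXU) selects PMC 2 with select value 0x14, and 0x4004c (CMPLU_STALL_IFU) selects PMC 4 with select value 0x4c.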
Stephane Eranian [eranian@google.com] wrote:
| So all in all, I think this is not a very good idea. You have to put
| this into the tool or a library that auto-detects the
| host CPU and programs the right set of events.
|
| We've had that discussion many times. Just reiterating my personal
| opinion on this.

Yes that would work too. One drawback is that the hardware events will
be in the tool, while the software/tracepoint events in the kernel sysfs
representation. Or is that the reason we want all events in one place
(sysfs) ?

Sukadev

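A rough idea of what keeping the mapping in the tool or a library could look like: a user-space table keyed by CPU type and symbolic name that returns the raw code. This is only a sketch in C. The table layout, the "power7" identifier string and `lookup_raw_event()` are hypothetical, CPU auto-detection is not shown, and the four raw codes are the ones from the patch in this thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One entry of a user-space event table (hypothetical layout). */
struct event_map {
	const char *cpu;	/* CPU type string; "power7" is an assumed identifier */
	const char *name;	/* symbolic event name */
	uint64_t raw;		/* raw event code to pass as PERF_TYPE_RAW config */
};

static const struct event_map event_table[] = {
	{ "power7", "stalled-cycles-fixed-point",	0x20014 }, /* CMPLU_STALL_FXU */
	{ "power7", "stalled-cycles-load-store",	0x20012 }, /* CMPLU_STALL_LSU */
	{ "power7", "stalled-cycles-instruction-fetch",	0x4004c }, /* CMPLU_STALL_IFU */
	{ "power7", "stalled-cycles-branch",		0x4004e }, /* CMPLU_STALL_BRU */
};

/* Look up the raw code for (cpu, name). Returns 0 on success, -1 if unknown. */
static int lookup_raw_event(const char *cpu, const char *name, uint64_t *raw)
{
	size_t i;

	for (i = 0; i < sizeof(event_table) / sizeof(event_table[0]); i++) {
		if (strcmp(event_table[i].cpu, cpu) == 0 &&
		    strcmp(event_table[i].name, name) == 0) {
			*raw = event_table[i].raw;
			return 0;
		}
	}
	return -1;
}
```

The trade-off raised above still applies: a table like this lives outside the kernel, so hardware event names would come from the tool while software and tracepoint events come from sysfs.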
On Tue, 2012-10-16 at 11:31 -0700, Sukadev Bhattiprolu wrote:
> On a side note, how does the kernel on x86 use the 'config' information in
> say /sys/bus/event_source/devices/cpu/format/cccr ? On Power7, the raw
> code encodes the information such as the PMC to use for the event. Is that
> how the 'config' info in Intel is used ?
>
> Does the 'config' info change from system to system or is it static for
> a given event on a given CPU ?

Have a look at commits (tip/master):

  641cc938815dfd09f8fa1ec72deb814f0938ac33
  a47473939db20e3961b200eb00acf5fcf084d755
  43c032febde48aabcf6d59f47cdcb7b5debbdc63

So basically

  /sys/bus/event_source/devices/cpu/format/event

contains something like:

  config:0-7

Which says that for the 'cpu' PMU, field 'event' fills
perf_event_attr::config bits 0 through 7 (for type=PERF_TYPE_RAW).

The perf tool syntax for this is:

  perf stat -e 'cpu/event=0x3c/'

This basically allows you to expose bitfields in the 'raw' event format
for ease of writing raw events. I do not know if the Power PMU has such
or not.

Using this,

  /sys/bus/event_source/devices/cpu/events/cpu-cycles

would contain something like:

  event=0x3c

which one can use as:

  perf stat -e 'cpu/event=cpu-cycles/'
  perf stat -e 'cpu/cpu-cycles/'

The tool will then read the sysfs file, substitute the content to
obtain:

  perf stat -e 'cpu/event=0x3c/'

and run with that.

Within all this, the perf_event_attr::config* field names are hard-coded
special, so 'cpu/config=0xffff/' will always work, even without sysfs
format/ specification and is equivalent to the raw event stuff we had
before.

If the Power PMU lacks any structure to the raw config, you could simply
provide sysfs event/ files with:

  config=0xdeadbeef

like content.

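The format mechanism described above can be sketched in a few lines of C: parse a spec such as "config:0-7" into a bit range, then place the field value into those bits of the config word. This is a simplified illustration, not the perf tool's implementation; it handles only a single range on 'config', and `struct bit_range`, `parse_format()` and `set_field()` are hypothetical names.

```c
#include <stdint.h>
#include <stdio.h>

/* One bit range of perf_event_attr::config (hypothetical type). */
struct bit_range {
	unsigned int lo;
	unsigned int hi;
};

/*
 * Parse "config:LO-HI" or "config:BIT" as found in a sysfs format file.
 * Returns 0 on success, -1 on anything this sketch does not understand.
 */
static int parse_format(const char *spec, struct bit_range *out)
{
	unsigned int lo, hi;
	int n = sscanf(spec, "config:%u-%u", &lo, &hi);

	if (n == 1)
		hi = lo;	/* single-bit field, e.g. "config:18" */
	else if (n != 2)
		return -1;
	if (lo > hi || hi > 63)
		return -1;
	out->lo = lo;
	out->hi = hi;
	return 0;
}

/* Return 'config' with 'value' stored in the bits covered by 'r'. */
static uint64_t set_field(uint64_t config, struct bit_range r, uint64_t value)
{
	unsigned int width = r.hi - r.lo + 1;
	uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);

	config &= ~(mask << r.lo);			/* clear the field */
	return config | ((value & mask) << r.lo);	/* insert new value */
}
```

With the example from the mail, parsing "config:0-7" and setting the field to 0x3c yields a config of 0x3c, which matches what 'cpu/event=0x3c/' resolves to.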
Peter Zijlstra [peterz@infradead.org] wrote:
| On Tue, 2012-10-16 at 11:31 -0700, Sukadev Bhattiprolu wrote:
| > On a side note, how does the kernel on x86 use the 'config' information in
| > say /sys/bus/event_source/devices/cpu/format/cccr ? On Power7, the raw
| > code encodes the information such as the PMC to use for the event. Is that
| > how the 'config' info in Intel is used ?
| >
| > Does the 'config' info change from system to system or is it static for
| > a given event on a given CPU ?
|
| Have a look at commits (tip/master):
|
|   641cc938815dfd09f8fa1ec72deb814f0938ac33
|   a47473939db20e3961b200eb00acf5fcf084d755
|   43c032febde48aabcf6d59f47cdcb7b5debbdc63
|
|
| So basically
|
|   /sys/bus/event_source/devices/cpu/format/event
|
| contains something like:
|
|   config:0-7
|
| Which says that for the 'cpu' PMU, field 'event' fills
| perf_event_attr::config bits 0 through 7 (for type=PERF_TYPE_RAW).
|
| The perf tool syntax for this is:
|
|   perf stat -e 'cpu/event=0x3c/'
|
| This basically allows you to expose bitfields in the 'raw' event format
| for ease of writing raw events. I do not know if the Power PMU has such
| or not.

Thanks for the detailed explanation. Power does not support this yet,
but I have started working on it now.

BTW, does this mean that we can use arch-specific names for the sysfs
entries within:

  /sys/bus/event_source/devices/cpu/events/

So instead of the names I came up with in this patch, stalled-cycles-fixed-point
we could use the name used in the CPU spec - 'cmplu_stall_fxu' in the arch
specific code ?

Sukadev

On Tue, 2012-10-30 at 23:40 -0700, Sukadev Bhattiprolu wrote:
> So instead of the names I came up with in this patch, stalled-cycles-fixed-point
> we could use the name used in the CPU spec - 'cmplu_stall_fxu' in the arch
> specific code ?

You could, but I would advise against it. Human readable names are so
much more accessible.

diff --git a/arch/powerpc/perf/power7-pmu.c b/arch/powerpc/perf/power7-pmu.c
index 1251e4d..813e7c7 100644
--- a/arch/powerpc/perf/power7-pmu.c
+++ b/arch/powerpc/perf/power7-pmu.c
@@ -304,6 +304,10 @@ static int power7_generic_events[] = {
 	[PERF_COUNT_HW_CACHE_MISSES] = 0x400f0,		/* LD_MISS_L1 */
 	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x10068,	/* BRU_FIN */
 	[PERF_COUNT_HW_BRANCH_MISSES] = 0x400f6,	/* BR_MPRED */
+	[PERF_COUNT_HW_STALLED_CYCLES_FIXED_POINT] = 0x20014,/* CMPLU_STALL_FXU */
+	[PERF_COUNT_HW_STALLED_CYCLES_LOAD_STORE] = 0x20012,/* CMPLU_STALL_LSU */
+	[PERF_COUNT_HW_STALLED_CYCLES_INSTRUCTION_FETCH] = 0x4004c,/* CMPLU_STALL_IFU */
+	[PERF_COUNT_HW_STALLED_CYCLES_BRANCH] = 0x4004e,/* CMPLU_STALL_BRU */
 };
 
 #define C(x)	PERF_COUNT_HW_CACHE_##x
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index bdb4161..ff9f0a6 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -55,6 +55,10 @@ enum perf_hw_id {
 	PERF_COUNT_HW_STALLED_CYCLES_FRONTEND	= 7,
 	PERF_COUNT_HW_STALLED_CYCLES_BACKEND	= 8,
 	PERF_COUNT_HW_REF_CPU_CYCLES		= 9,
+	PERF_COUNT_HW_STALLED_CYCLES_FIXED_POINT = 10,
+	PERF_COUNT_HW_STALLED_CYCLES_LOAD_STORE	= 11,
+	PERF_COUNT_HW_STALLED_CYCLES_INSTRUCTION_FETCH = 12,
+	PERF_COUNT_HW_STALLED_CYCLES_BRANCH	= 13,
 
 	PERF_COUNT_HW_MAX,			/* non-ABI */
 };
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 861f0ae..6275dbb 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -77,6 +77,10 @@ static struct perf_event_attr default_attrs[] = {
   { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS },
   { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS },
   { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },
+  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_FIXED_POINT },
+  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_LOAD_STORE },
+  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_INSTRUCTION_FETCH },
+  { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_STALLED_CYCLES_BRANCH },
 
 };
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 2eaae14..17e3190 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -77,6 +77,10 @@ static const char *perf_evsel__hw_names[PERF_COUNT_HW_MAX] = {
 	"stalled-cycles-frontend",
 	"stalled-cycles-backend",
 	"ref-cycles",
+	"stalled-cycles-fixed-point",
+	"stalled-cycles-load-store",
+	"stalled-cycles-instruction-fetch",
+	"stalled-cycles-branch",
 };
 
 static const char *__perf_evsel__hw_name(u64 config)
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 384ca74..0c49c05 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -102,6 +102,10 @@ branch-instructions|branches { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_
 branch-misses				{ return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_BRANCH_MISSES); }
 bus-cycles				{ return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_BUS_CYCLES); }
 ref-cycles				{ return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_REF_CPU_CYCLES); }
+stalled-cycles-fixed-point		{ return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_STALLED_CYCLES_FIXED_POINT); }
+stalled-cycles-load-store		{ return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_STALLED_CYCLES_LOAD_STORE); }
+stalled-cycles-instruction-fetch	{ return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_STALLED_CYCLES_INSTRUCTION_FETCH); }
+stalled-cycles-branch			{ return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_STALLED_CYCLES_BRANCH); }
 cpu-clock				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CPU_CLOCK); }
 task-clock				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_TASK_CLOCK); }
 page-faults|faults			{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS); }
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index 0688bfb..c563b30 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -952,6 +952,10 @@ static struct {
 	{ "COUNT_HW_STALLED_CYCLES_FRONTEND", PERF_COUNT_HW_STALLED_CYCLES_FRONTEND },
 	{ "COUNT_HW_STALLED_CYCLES_BACKEND", PERF_COUNT_HW_STALLED_CYCLES_BACKEND },
+	{ "COUNT_HW_STALLED_CYCLES_FIXED_POINT", PERF_COUNT_HW_STALLED_CYCLES_FIXED_POINT },
+	{ "COUNT_HW_STALLED_CYCLES_LOAD_STORE", PERF_COUNT_HW_STALLED_CYCLES_LOAD_STORE },
+	{ "COUNT_HW_STALLED_CYCLES_INSTRUCTION_FETCH", PERF_COUNT_HW_STALLED_CYCLES_INSTRUCTION_FETCH },
+	{ "COUNT_HW_STALLED_CYCLES_BRANCH", PERF_COUNT_HW_STALLED_CYCLES_BRANCH },
 	{ "COUNT_SW_CPU_CLOCK", PERF_COUNT_SW_CPU_CLOCK },
 	{ "COUNT_SW_TASK_CLOCK", PERF_COUNT_SW_TASK_CLOCK },