Message ID | 1435652431-22024-2-git-send-email-khandual@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Not Applicable |
On Tue, 2015-30-06 at 08:20:27 UTC, Anshuman Khandual wrote:
> BHRB (Branch History Rolling Buffer) is a rolling buffer. Hence we
> might end up in a situation where we have read one target address
> but when we try to read the next entry indicating the from address
> of the target address, the buffer just overflows. In this case, the
> captured from address will be zero which indicates the end of the
> buffer.

Right. But with SMT8 the size of the buffer is very small, so we will actually
hit this case somewhat often. When we originally wrote this we decided it was
better to get some information, ie. the from address, than no information at
all.

> This patch drops the entire branch record which would have
> otherwise confused the user space tools.

Does it confuse the tools? Can you show me before/after output from perf?

I'm not opposed to changing this but we need to be 100% sure it's the best
option.

cheers
On 07/27/2015 09:49 AM, Michael Ellerman wrote:
> On Tue, 2015-30-06 at 08:20:27 UTC, Anshuman Khandual wrote:
>> BHRB (Branch History Rolling Buffer) is a rolling buffer. Hence we
>> might end up in a situation where we have read one target address
>> but when we try to read the next entry indicating the from address
>> of the target address, the buffer just overflows. In this case, the
>> captured from address will be zero which indicates the end of the
>> buffer.
>
> Right. But with SMT8 the size of the buffer is very small, so we will actually
> hit this case somewhat often. When we originally wrote this we decided it was
> better to get some information, ie. the from address, than no information at
> all.

You are right. But in practice we do not currently treat this kind of
(from, 0) branch entry as a special case anywhere. Moreover, for
workloads with a small amount of code and only a few branches, the
chance of getting a (from, 0) entry increases a lot, probably making it
one of the highest-percentage entries in the final perf report. With
this change the workload session might have fewer branch entries
overall, but in my opinion they represent a more accurate branch
profile of the given workload, percentage-wise.

>> This patch drops the entire branch record which would have
>> otherwise confused the user space tools.
>
> Does it confuse the tools? Can you show me before/after output from perf?

The word 'confuse' might be a little misleading. But the point, as
explained above, is that the relative branch percentage profile of
certain workloads might be distorted, and I believe that is true.
Also, branch entries like "from ----> 0" in the perf report might
confuse users who don't expect to see this kind of entry in the
final perf report and will never look at "perf report -D" to
figure out what really happened.

> I'm not opposed to changing this but we need to be 100% sure it's the best
> option.
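To make the percentage argument above concrete, here is a minimal sketch of the arithmetic. It is not taken from perf or from the patch: the helper `share_percent()` and its `skip` flags are hypothetical names introduced only for illustration, and any counts passed in are made-up sample numbers, not measured data. It computes what share of all sampled branch records one entry accounts for, optionally leaving flagged entries (for example the incomplete ones) out of the total.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns the share (in percent) of entry 'idx' among counts[0..n-1].
 * Entries whose skip[] flag is non-zero are left out of the total,
 * which models dropping those records before the report is built.
 * Passing skip == NULL keeps every entry in the total.
 */
static double share_percent(const unsigned int *counts, const int *skip,
			    size_t n, size_t idx)
{
	unsigned long total = 0;
	size_t i;

	assert(counts != NULL && idx < n);

	for (i = 0; i < n; i++) {
		if (!skip || !skip[i])
			total += counts[i];
	}

	return total ? 100.0 * counts[idx] / total : 0.0;
}
```

With three invented counts of 40, 30 and 30, where the last one stands for incomplete records, the first branch accounts for 40% of the profile when everything is kept and roughly 57% once the incomplete records are left out. Whether the second figure is the "more accurate" one is exactly the question being debated in this thread.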
On 07/28/2015 08:38 AM, Anshuman Khandual wrote:
> On 07/27/2015 09:49 AM, Michael Ellerman wrote:
>> Does it confuse the tools? Can you show me before/after output from perf?
>
> The word 'confuse' might be a little misleading. But the point, as
> explained above, is that the relative branch percentage profile of
> certain workloads might be distorted, and I believe that is true.
> Also, branch entries like "from ----> 0" in the perf report might
> confuse users who don't expect to see this kind of entry in the
> final perf report and will never look at "perf report -D" to
> figure out what really happened.

Hey Michael,

As I explained earlier, isn't it a good idea to drop this kind of
branch record from the final output? I would like to request
consideration of this patch along with the others in the series.

I have dropped the following patch, as you pointed out:

[3/5] powerpc/perf: Replace last usage of get_cpu_var with this_cpu_ptr

I also did not receive any comments or thoughts on V10 of the BHRB SW
branch filter patch series posted a couple of months back. Does it
look good?

http://comments.gmane.org/gmane.linux.ports.ppc.embedded/83206

After dropping the above patch and excluding the one which has already
been merged into mainline, I rebased the entire series, and it still
works fine on LE and BE kernels as of today. I will be sending them soon.
On Wed, 2015-09-30 at 14:33 +0530, Anshuman Khandual wrote:
> Hey Michael,
>
> As I explained earlier, isn't it a good idea to drop this kind of
> branch record from the final output? I would like to request
> consideration of this patch along with the others in the series.

I think it's too late to change it, i.e. we've shipped several kernel versions
that did emit them, so we just have to stick with that.

If it was obviously a bug fix we could change it, but it's not obviously
correct either way.

So please drop that patch and repost the series.

cheers
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index d90893b..b0c2d53 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -461,7 +461,6 @@ static void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw)
 			 * In this case we need to read the instruction from
 			 * memory to determine the target/to address.
 			 */
-
 			if (val & BHRB_TARGET) {
 				/* Target branches use two entries
 				 * (ie. computed gotos/XL form)
@@ -472,6 +471,8 @@ static void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw)
 
 				/* Get from address in next entry */
 				val = read_bhrb(r_index++);
+				if (!val)
+					break;
 				addr = val & BHRB_EA;
 				if (val & BHRB_TARGET) {
 					/* Shouldn't have two targets in a
BHRB (Branch History Rolling Buffer) is a rolling buffer. Hence we
might end up in a situation where we have read one target address
but when we try to read the next entry indicating the from address
of the target address, the buffer just overflows. In this case, the
captured from address will be zero which indicates the end of the
buffer.

This patch drops the entire branch record which would have
otherwise confused the user space tools.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/perf/core-book3s.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
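
The behaviour described in the commit message can be sketched in user space as follows. This is a simplified model, not the kernel's `power_pmu_bhrb_read()`: the function `decode_bhrb()`, the `struct branch` type and the `drop_incomplete` switch are hypothetical names invented for this illustration, the raw buffer is a plain array instead of the hardware register read by `read_bhrb()`, and the `BHRB_TARGET` / `BHRB_EA` values are assumed to mirror the kernel's definitions. The "two targets in a row" case and prediction bits are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror the kernel's bit layout; treat as illustrative. */
#define BHRB_TARGET	0x2ULL			/* entry holds a target address */
#define BHRB_EA		0xFFFFFFFFFFFFFFFCULL	/* effective address mask */

struct branch {
	uint64_t from;
	uint64_t to;
};

/*
 * Hypothetical user-space model of the read loop.
 *
 * Decodes up to 'n' raw entries into 'out' (which must have room for
 * 'n' records) and returns the number of records produced. A zero
 * entry marks the end of the valid data. With drop_incomplete set,
 * a target entry whose following "from" entry reads as zero is
 * discarded, as the patch proposes; otherwise the record is kept
 * with from == 0, which is the existing behaviour.
 */
static size_t decode_bhrb(const uint64_t *raw, size_t n,
			  struct branch *out, int drop_incomplete)
{
	size_t r = 0, u = 0;

	assert(raw != NULL && out != NULL);

	while (r < n) {
		uint64_t val = raw[r++];
		uint64_t addr;

		if (!val)
			break;		/* end of valid entries */

		addr = val & BHRB_EA;
		if (val & BHRB_TARGET) {
			/* Target branches use two entries: "to", then "from". */
			out[u].to = addr;
			val = (r < n) ? raw[r++] : 0;
			if (!val && drop_incomplete)
				break;	/* half-read record: drop it */
			out[u].from = val & BHRB_EA;
		} else {
			/* The kernel reads the instruction to find the target;
			 * this model just leaves it as zero. */
			out[u].from = addr;
			out[u].to = 0;
		}
		u++;
	}

	return u;
}
```

Feeding the model a target entry followed by a zero entry yields one extra record with `from == 0` when `drop_incomplete` is clear and no extra record when it is set, which is the difference under discussion.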