Message ID | 1465990973-31483-1-git-send-email-maddy@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Madhavan Srinivasan wrote: > +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) > + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))] > + ^ invert; > +#else > tmp = addr[start / BITS_PER_LONG] ^ invert; > +#endif Than you for diagnosing this problem, but I don't think the fix is correct. 1) It's not clear that all users of _find_next_bit and for_each_set_bit() want this change. 2) Is your code even correct? I'd think you'd want addr[x ^ 1]. Are you sure you shpuld be reversing the whole array, and not just the halves of each 64-bit word? 3) You've now broken the case of 32-bit big-endian kernel. I think the proper solution is uglier than this. :-(
Hi Madhavan, On Wed, Jun 15, 2016 at 05:12:53PM +0530, Madhavan Srinivasan wrote: > When decoding the perf_regs mask in regs_dump__printf(), > we loop through the mask using find_first_bit and find_next_bit functions. > And mask is of type "u64". But "u64" is send as a "unsigned long *" to > lib functions along with sizeof(). > > While the exisitng code works fine in most of the case, when using a 32bit perf > on a 64bit kernel (Big Endian), we end reading the wrong word. In find_first_bit(), > one word at a time (based on BITS_PER_LONG) is loaded and > checked for any bit set. In 32bit BE userspace, > BITS_PER_LONG turns out to be 32, and for a mask value of > "0x00000000000000ff", find_first_bit will return 32, instead of 0. > Reason for this is that, value in the word0 is all zeros and value > in word1 is 0xff. Ideally, second word in the mask should be loaded > and searched. Patch swaps the word to look incase of 32bit BE. I think this is not a problem of find_bit() at all. You have wrong typecast as the source of problem (tools/perf/util/session.c"): 940 static void regs_dump__printf(u64 mask, u64 *regs) 941 { 942 unsigned rid, i = 0; 943 944 for_each_set_bit(rid, (unsigned long *) &mask, sizeof(mask) * 8) { ^^^^ Here ^^^^ 945 u64 val = regs[i++]; 946 947 printf(".... %-5s 0x%" PRIx64 "\n", 948 perf_reg_name(rid), val); 949 } 950 } But for some reason you change correct find_bit()... Though proper fix is like this for me: static void regs_dump__printf(u64 mask, u64 *regs) { unsigned rid, i = 0; unsigned long _mask[sizeof(mask)/sizeof(unsigned long)]; _mask[0] = mask & ULONG_MAX; if (sizeof(mask) > sizeof(unsigned long)) _mask[1] = mask >> BITS_PER_LONG; for_each_set_bit(rid, _mask, sizeof(mask) * BITS_PER_BYTE) { u64 val = regs[i++]; printf(".... %-5s 0x%" PRIx64 "\n", perf_reg_name(rid), val); } } Maybe there already is some macro doing the conversion for you... Yury. > Cc: Arnaldo Carvalho de Melo <acme@kernel.org> > Cc: Adrian Hunter <adrian.hunter@intel.com> > Cc: Borislav Petkov <bp@suse.de> > Cc: David Ahern <dsahern@gmail.com> > Cc: George Spelvin <linux@horizon.com> > Cc: Jiri Olsa <jolsa@redhat.com> > Cc: Namhyung Kim <namhyung@kernel.org> > Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> > Cc: Wang Nan <wangnan0@huawei.com> > Cc: Yury Norov <yury.norov@gmail.com> > Cc: Michael Ellerman <mpe@ellerman.id.au> > Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> > --- > tools/lib/find_bit.c | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c > index 9122a9e80046..996b3e04324f 100644 > --- a/tools/lib/find_bit.c > +++ b/tools/lib/find_bit.c > @@ -37,7 +37,12 @@ static unsigned long _find_next_bit(const unsigned long *addr, > if (!nbits || start >= nbits) > return nbits; > > +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) > + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))] > + ^ invert; > +#else > tmp = addr[start / BITS_PER_LONG] ^ invert; > +#endif > > /* Handle 1st word. */ > tmp &= BITMAP_FIRST_WORD_MASK(start); > @@ -48,7 +53,12 @@ static unsigned long _find_next_bit(const unsigned long *addr, > if (start >= nbits) > return nbits; > > +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) > + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))] > + ^ invert; > +#else > tmp = addr[start / BITS_PER_LONG] ^ invert; > +#endif > } > > return min(start + __ffs(tmp), nbits); > @@ -75,8 +85,15 @@ unsigned long find_first_bit(const unsigned long *addr, unsigned long size) > unsigned long idx; > > for (idx = 0; idx * BITS_PER_LONG < size; idx++) { > +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) > + if (addr[(((size-1)/BITS_PER_LONG) - idx)]) > + return min(idx * BITS_PER_LONG + > + __ffs(addr[(((size-1)/BITS_PER_LONG) - idx)]), > + size); > +#else > if (addr[idx]) > return min(idx * BITS_PER_LONG + __ffs(addr[idx]), size); > +#endif > } > > return size; > -- > 1.9.1
On Wed, Jun 15, 2016 at 10:51:27PM +0300, Yury Norov wrote: > Hi Madhavan, > > On Wed, Jun 15, 2016 at 05:12:53PM +0530, Madhavan Srinivasan wrote: > > When decoding the perf_regs mask in regs_dump__printf(), > > we loop through the mask using find_first_bit and find_next_bit functions. > > And mask is of type "u64". But "u64" is send as a "unsigned long *" to > > lib functions along with sizeof(). > > > > While the exisitng code works fine in most of the case, when using a 32bit perf > > on a 64bit kernel (Big Endian), we end reading the wrong word. In find_first_bit(), > > one word at a time (based on BITS_PER_LONG) is loaded and > > checked for any bit set. In 32bit BE userspace, > > BITS_PER_LONG turns out to be 32, and for a mask value of > > "0x00000000000000ff", find_first_bit will return 32, instead of 0. > > Reason for this is that, value in the word0 is all zeros and value > > in word1 is 0xff. Ideally, second word in the mask should be loaded > > and searched. Patch swaps the word to look incase of 32bit BE. > > I think this is not a problem of find_bit() at all. You have wrong > typecast as the source of problem (tools/perf/util/session.c"): > > 940 static void regs_dump__printf(u64 mask, u64 *regs) > 941 { > 942 unsigned rid, i = 0; > 943 > 944 for_each_set_bit(rid, (unsigned long *) &mask, sizeof(mask) * 8) { > ^^^^ Here ^^^^ > 945 u64 val = regs[i++]; > 946 > 947 printf(".... %-5s 0x%" PRIx64 "\n", > 948 perf_reg_name(rid), val); > 949 } > 950 } > > But for some reason you change correct find_bit()... > > Though proper fix is like this for me: > > static void regs_dump__printf(u64 mask, u64 *regs) > { > unsigned rid, i = 0; > unsigned long _mask[sizeof(mask)/sizeof(unsigned long)]; > > _mask[0] = mask & ULONG_MAX; > if (sizeof(mask) > sizeof(unsigned long)) > _mask[1] = mask >> BITS_PER_LONG; > > for_each_set_bit(rid, _mask, sizeof(mask) * BITS_PER_BYTE) { > u64 val = regs[i++]; > > printf(".... %-5s 0x%" PRIx64 "\n", > perf_reg_name(rid), val); > } > } > > Maybe there already is some macro doing the conversion for you... yes it is, cpu_to_le64() is what you want > > Yury. > > > Cc: Arnaldo Carvalho de Melo <acme@kernel.org> > > Cc: Adrian Hunter <adrian.hunter@intel.com> > > Cc: Borislav Petkov <bp@suse.de> > > Cc: David Ahern <dsahern@gmail.com> > > Cc: George Spelvin <linux@horizon.com> > > Cc: Jiri Olsa <jolsa@redhat.com> > > Cc: Namhyung Kim <namhyung@kernel.org> > > Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> > > Cc: Wang Nan <wangnan0@huawei.com> > > Cc: Yury Norov <yury.norov@gmail.com> > > Cc: Michael Ellerman <mpe@ellerman.id.au> > > Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> > > --- > > tools/lib/find_bit.c | 17 +++++++++++++++++ > > 1 file changed, 17 insertions(+) > > > > diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c > > index 9122a9e80046..996b3e04324f 100644 > > --- a/tools/lib/find_bit.c > > +++ b/tools/lib/find_bit.c > > @@ -37,7 +37,12 @@ static unsigned long _find_next_bit(const unsigned long *addr, > > if (!nbits || start >= nbits) > > return nbits; > > > > +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) > > + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))] > > + ^ invert; > > +#else > > tmp = addr[start / BITS_PER_LONG] ^ invert; > > +#endif > > > > /* Handle 1st word. */ > > tmp &= BITMAP_FIRST_WORD_MASK(start); > > @@ -48,7 +53,12 @@ static unsigned long _find_next_bit(const unsigned long *addr, > > if (start >= nbits) > > return nbits; > > > > +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) > > + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))] > > + ^ invert; > > +#else > > tmp = addr[start / BITS_PER_LONG] ^ invert; > > +#endif > > } > > > > return min(start + __ffs(tmp), nbits); > > @@ -75,8 +85,15 @@ unsigned long find_first_bit(const unsigned long *addr, unsigned long size) > > unsigned long idx; > > > > for (idx = 0; idx * BITS_PER_LONG < size; idx++) { > > +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) > > + if (addr[(((size-1)/BITS_PER_LONG) - idx)]) > > + return min(idx * BITS_PER_LONG + > > + __ffs(addr[(((size-1)/BITS_PER_LONG) - idx)]), > > + size); > > +#else > > if (addr[idx]) > > return min(idx * BITS_PER_LONG + __ffs(addr[idx]), size); > > +#endif > > } > > > > return size; > > -- > > 1.9.1
Em Thu, Jun 16, 2016 at 12:11:04AM +0300, Yury Norov escreveu: > On Wed, Jun 15, 2016 at 10:51:27PM +0300, Yury Norov wrote: > > Maybe there already is some macro doing the conversion for you... > > yes it is, cpu_to_le64() is what you want Beware that the cpu_to_le64() in tools/perf is bogus, we need to grab a copy from the kernel sources. - Arnaldo
在 2016/6/16 5:29, Arnaldo Carvalho de Melo 写道: > Em Thu, Jun 16, 2016 at 12:11:04AM +0300, Yury Norov escreveu: >> On Wed, Jun 15, 2016 at 10:51:27PM +0300, Yury Norov wrote: >>> Maybe there already is some macro doing the conversion for you... >> yes it is, cpu_to_le64() is what you want > Beware that the cpu_to_le64() in tools/perf is bogus, we need to grab a > copy from the kernel sources. > > - Arnaldo > [PATCH 1/2] tools include: Sync byteorder/generic.h [PATCH 2/2] tools include: Fix wrong macro definitions for cpu_to_le* for big endian Here're two patches related to this issue, sorry for wrongly sent two more reduntant mails. Thank you.
On Thursday 16 June 2016 01:21 AM, Yury Norov wrote: > Hi Madhavan, > > On Wed, Jun 15, 2016 at 05:12:53PM +0530, Madhavan Srinivasan wrote: >> When decoding the perf_regs mask in regs_dump__printf(), >> we loop through the mask using find_first_bit and find_next_bit functions. >> And mask is of type "u64". But "u64" is send as a "unsigned long *" to >> lib functions along with sizeof(). >> >> While the exisitng code works fine in most of the case, when using a 32bit perf >> on a 64bit kernel (Big Endian), we end reading the wrong word. In find_first_bit(), >> one word at a time (based on BITS_PER_LONG) is loaded and >> checked for any bit set. In 32bit BE userspace, >> BITS_PER_LONG turns out to be 32, and for a mask value of >> "0x00000000000000ff", find_first_bit will return 32, instead of 0. >> Reason for this is that, value in the word0 is all zeros and value >> in word1 is 0xff. Ideally, second word in the mask should be loaded >> and searched. Patch swaps the word to look incase of 32bit BE. > I think this is not a problem of find_bit() at all. You have wrong > typecast as the source of problem (tools/perf/util/session.c"): > > 940 static void regs_dump__printf(u64 mask, u64 *regs) > 941 { > 942 unsigned rid, i = 0; > 943 > 944 for_each_set_bit(rid, (unsigned long *) &mask, sizeof(mask) * 8) { > ^^^^ Here ^^^^ > 945 u64 val = regs[i++]; > 946 > 947 printf(".... %-5s 0x%" PRIx64 "\n", > 948 perf_reg_name(rid), val); > 949 } > 950 } > > But for some reason you change correct find_bit()... > > Though proper fix is like this for me: > > static void regs_dump__printf(u64 mask, u64 *regs) > { > unsigned rid, i = 0; > unsigned long _mask[sizeof(mask)/sizeof(unsigned long)]; > > _mask[0] = mask & ULONG_MAX; > if (sizeof(mask) > sizeof(unsigned long)) > _mask[1] = mask >> BITS_PER_LONG; > > for_each_set_bit(rid, _mask, sizeof(mask) * BITS_PER_BYTE) { > u64 val = regs[i++]; > > printf(".... %-5s 0x%" PRIx64 "\n", > perf_reg_name(rid), val); > } > } > > Maybe there already is some macro doing the conversion for you... Agreed, but reason for proposing fix in lib side is to avoid conversion on each case (if any in future). I will repost the fix as suggested. Maddy > Yury. > >> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> >> Cc: Adrian Hunter <adrian.hunter@intel.com> >> Cc: Borislav Petkov <bp@suse.de> >> Cc: David Ahern <dsahern@gmail.com> >> Cc: George Spelvin <linux@horizon.com> >> Cc: Jiri Olsa <jolsa@redhat.com> >> Cc: Namhyung Kim <namhyung@kernel.org> >> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> >> Cc: Wang Nan <wangnan0@huawei.com> >> Cc: Yury Norov <yury.norov@gmail.com> >> Cc: Michael Ellerman <mpe@ellerman.id.au> >> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> >> --- >> tools/lib/find_bit.c | 17 +++++++++++++++++ >> 1 file changed, 17 insertions(+) >> >> diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c >> index 9122a9e80046..996b3e04324f 100644 >> --- a/tools/lib/find_bit.c >> +++ b/tools/lib/find_bit.c >> @@ -37,7 +37,12 @@ static unsigned long _find_next_bit(const unsigned long *addr, >> if (!nbits || start >= nbits) >> return nbits; >> >> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) >> + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))] >> + ^ invert; >> +#else >> tmp = addr[start / BITS_PER_LONG] ^ invert; >> +#endif >> >> /* Handle 1st word. */ >> tmp &= BITMAP_FIRST_WORD_MASK(start); >> @@ -48,7 +53,12 @@ static unsigned long _find_next_bit(const unsigned long *addr, >> if (start >= nbits) >> return nbits; >> >> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) >> + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))] >> + ^ invert; >> +#else >> tmp = addr[start / BITS_PER_LONG] ^ invert; >> +#endif >> } >> >> return min(start + __ffs(tmp), nbits); >> @@ -75,8 +85,15 @@ unsigned long find_first_bit(const unsigned long *addr, unsigned long size) >> unsigned long idx; >> >> for (idx = 0; idx * BITS_PER_LONG < size; idx++) { >> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) >> + if (addr[(((size-1)/BITS_PER_LONG) - idx)]) >> + return min(idx * BITS_PER_LONG + >> + __ffs(addr[(((size-1)/BITS_PER_LONG) - idx)]), >> + size); >> +#else >> if (addr[idx]) >> return min(idx * BITS_PER_LONG + __ffs(addr[idx]), size); >> +#endif >> } >> >> return size; >> -- >> 1.9.1
On Wednesday 15 June 2016 06:14 PM, George Spelvin wrote: > Madhavan Srinivasan wrote: >> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) >> + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))] >> + ^ invert; >> +#else >> tmp = addr[start / BITS_PER_LONG] ^ invert; >> +#endif > Than you for diagnosing this problem, but I don't think the fix > is correct. > > 1) It's not clear that all users of _find_next_bit and for_each_set_bit() > want this change. > 2) Is your code even correct? I'd think you'd want addr[x ^ 1]. Are you > sure you shpuld be reversing the whole array, and not just the halves of > each 64-bit word? > 3) You've now broken the case of 32-bit big-endian kernel. Yes. But looks like we havent hit this case yet. Will post a fix. Maddy > > I think the proper solution is uglier than this. :-( >
On Thursday 16 June 2016 02:41 AM, Yury Norov wrote: > On Wed, Jun 15, 2016 at 10:51:27PM +0300, Yury Norov wrote: >> Hi Madhavan, >> >> On Wed, Jun 15, 2016 at 05:12:53PM +0530, Madhavan Srinivasan wrote: >>> When decoding the perf_regs mask in regs_dump__printf(), >>> we loop through the mask using find_first_bit and find_next_bit functions. >>> And mask is of type "u64". But "u64" is send as a "unsigned long *" to >>> lib functions along with sizeof(). >>> >>> While the exisitng code works fine in most of the case, when using a 32bit perf >>> on a 64bit kernel (Big Endian), we end reading the wrong word. In find_first_bit(), >>> one word at a time (based on BITS_PER_LONG) is loaded and >>> checked for any bit set. In 32bit BE userspace, >>> BITS_PER_LONG turns out to be 32, and for a mask value of >>> "0x00000000000000ff", find_first_bit will return 32, instead of 0. >>> Reason for this is that, value in the word0 is all zeros and value >>> in word1 is 0xff. Ideally, second word in the mask should be loaded >>> and searched. Patch swaps the word to look incase of 32bit BE. >> I think this is not a problem of find_bit() at all. You have wrong >> typecast as the source of problem (tools/perf/util/session.c"): >> >> 940 static void regs_dump__printf(u64 mask, u64 *regs) >> 941 { >> 942 unsigned rid, i = 0; >> 943 >> 944 for_each_set_bit(rid, (unsigned long *) &mask, sizeof(mask) * 8) { >> ^^^^ Here ^^^^ >> 945 u64 val = regs[i++]; >> 946 >> 947 printf(".... %-5s 0x%" PRIx64 "\n", >> 948 perf_reg_name(rid), val); >> 949 } >> 950 } >> >> But for some reason you change correct find_bit()... >> >> Though proper fix is like this for me: >> >> static void regs_dump__printf(u64 mask, u64 *regs) >> { >> unsigned rid, i = 0; >> unsigned long _mask[sizeof(mask)/sizeof(unsigned long)]; >> >> _mask[0] = mask & ULONG_MAX; >> if (sizeof(mask) > sizeof(unsigned long)) >> _mask[1] = mask >> BITS_PER_LONG; >> >> for_each_set_bit(rid, _mask, sizeof(mask) * BITS_PER_BYTE) { >> u64 val = regs[i++]; >> >> printf(".... %-5s 0x%" PRIx64 "\n", >> perf_reg_name(rid), val); >> } >> } >> >> Maybe there already is some macro doing the conversion for you... > yes it is, cpu_to_le64() is what you want no wait, on second look, cpu_to_le64() is not right. Because we will end up swapping within 32bit. But what you suggested looks to be fine. I can repost this with one minor tweak, right shift with 32 instead of BITS_PER_LONG (since I see compiler errors in 64bit). Maddy > >> Yury. >> >>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> >>> Cc: Adrian Hunter <adrian.hunter@intel.com> >>> Cc: Borislav Petkov <bp@suse.de> >>> Cc: David Ahern <dsahern@gmail.com> >>> Cc: George Spelvin <linux@horizon.com> >>> Cc: Jiri Olsa <jolsa@redhat.com> >>> Cc: Namhyung Kim <namhyung@kernel.org> >>> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> >>> Cc: Wang Nan <wangnan0@huawei.com> >>> Cc: Yury Norov <yury.norov@gmail.com> >>> Cc: Michael Ellerman <mpe@ellerman.id.au> >>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> >>> --- >>> tools/lib/find_bit.c | 17 +++++++++++++++++ >>> 1 file changed, 17 insertions(+) >>> >>> diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c >>> index 9122a9e80046..996b3e04324f 100644 >>> --- a/tools/lib/find_bit.c >>> +++ b/tools/lib/find_bit.c >>> @@ -37,7 +37,12 @@ static unsigned long _find_next_bit(const unsigned long *addr, >>> if (!nbits || start >= nbits) >>> return nbits; >>> >>> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) >>> + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))] >>> + ^ invert; >>> +#else >>> tmp = addr[start / BITS_PER_LONG] ^ invert; >>> +#endif >>> >>> /* Handle 1st word. */ >>> tmp &= BITMAP_FIRST_WORD_MASK(start); >>> @@ -48,7 +53,12 @@ static unsigned long _find_next_bit(const unsigned long *addr, >>> if (start >= nbits) >>> return nbits; >>> >>> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) >>> + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))] >>> + ^ invert; >>> +#else >>> tmp = addr[start / BITS_PER_LONG] ^ invert; >>> +#endif >>> } >>> >>> return min(start + __ffs(tmp), nbits); >>> @@ -75,8 +85,15 @@ unsigned long find_first_bit(const unsigned long *addr, unsigned long size) >>> unsigned long idx; >>> >>> for (idx = 0; idx * BITS_PER_LONG < size; idx++) { >>> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) >>> + if (addr[(((size-1)/BITS_PER_LONG) - idx)]) >>> + return min(idx * BITS_PER_LONG + >>> + __ffs(addr[(((size-1)/BITS_PER_LONG) - idx)]), >>> + size); >>> +#else >>> if (addr[idx]) >>> return min(idx * BITS_PER_LONG + __ffs(addr[idx]), size); >>> +#endif >>> } >>> >>> return size; >>> -- >>> 1.9.1
diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c index 9122a9e80046..996b3e04324f 100644 --- a/tools/lib/find_bit.c +++ b/tools/lib/find_bit.c @@ -37,7 +37,12 @@ static unsigned long _find_next_bit(const unsigned long *addr, if (!nbits || start >= nbits) return nbits; +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))] + ^ invert; +#else tmp = addr[start / BITS_PER_LONG] ^ invert; +#endif /* Handle 1st word. */ tmp &= BITMAP_FIRST_WORD_MASK(start); @@ -48,7 +53,12 @@ static unsigned long _find_next_bit(const unsigned long *addr, if (start >= nbits) return nbits; +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))] + ^ invert; +#else tmp = addr[start / BITS_PER_LONG] ^ invert; +#endif } return min(start + __ffs(tmp), nbits); @@ -75,8 +85,15 @@ unsigned long find_first_bit(const unsigned long *addr, unsigned long size) unsigned long idx; for (idx = 0; idx * BITS_PER_LONG < size; idx++) { +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) + if (addr[(((size-1)/BITS_PER_LONG) - idx)]) + return min(idx * BITS_PER_LONG + + __ffs(addr[(((size-1)/BITS_PER_LONG) - idx)]), + size); +#else if (addr[idx]) return min(idx * BITS_PER_LONG + __ffs(addr[idx]), size); +#endif } return size;
When decoding the perf_regs mask in regs_dump__printf(), we loop through the mask using find_first_bit and find_next_bit functions. And mask is of type "u64". But "u64" is send as a "unsigned long *" to lib functions along with sizeof(). While the exisitng code works fine in most of the case, when using a 32bit perf on a 64bit kernel (Big Endian), we end reading the wrong word. In find_first_bit(), one word at a time (based on BITS_PER_LONG) is loaded and checked for any bit set. In 32bit BE userspace, BITS_PER_LONG turns out to be 32, and for a mask value of "0x00000000000000ff", find_first_bit will return 32, instead of 0. Reason for this is that, value in the word0 is all zeros and value in word1 is 0xff. Ideally, second word in the mask should be loaded and searched. Patch swaps the word to look incase of 32bit BE. Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: George Spelvin <linux@horizon.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Wang Nan <wangnan0@huawei.com> Cc: Yury Norov <yury.norov@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> --- tools/lib/find_bit.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) -- 1.9.1