Message ID | 767e53c4c27da024ca277e21ffcd0cff131f5c73.1618469454.git.sathvika@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Changes Requested |
Series | powerpc/sstep: Add emulation support and tests for 'setb' instruction |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch powerpc/merge (0702e74703f57173e70cfab2a79a3e682e9e96ec) |
snowpatch_ozlabs/checkpatch | warning | total: 0 errors, 0 warnings, 1 checks, 18 lines checked |
snowpatch_ozlabs/needsstable | success | Patch has no Fixes tags |
Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes: > This adds emulation support for the following instruction: > * Set Boolean (setb) > > Signed-off-by: Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> > --- > arch/powerpc/lib/sstep.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c > index c6aebc149d14..263c613d7490 100644 > --- a/arch/powerpc/lib/sstep.c > +++ b/arch/powerpc/lib/sstep.c > @@ -1964,6 +1964,18 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs, > op->val = ~(regs->gpr[rd] | regs->gpr[rb]); > goto logical_done; > > + case 128: /* setb */ > + if (!cpu_has_feature(CPU_FTR_ARCH_300)) > + goto unknown_opcode; Ok, if I've understood correctly... > + ra = ra & ~0x3; This masks off the bits of RA that are not part of BTF: ra is in [0, 31] which is [0b00000, 0b11111] Then ~0x3 = ~0b00011 ra = ra & 0b11100 This gives us then, ra = btf << 2; or btf = ra >> 2; Let's then check to see if your calculations read the right fields. > + if ((regs->ccr) & (1 << (31 - ra))) > + op->val = -1; > + else if ((regs->ccr) & (1 << (30 - ra))) > + op->val = 1; > + else > + op->val = 0; CR field: 7 6 5 4 3 2 1 0 bit: 0123 0123 0123 0123 0123 0123 0123 0123 normal bit #: 0.....................................31 ibm bit #: 31.....................................0 If btf = 0, ra = 0, check normal bits 31 and 30, which are both in CR0. CR field: 7 6 5 4 3 2 1 0 bit: 0123 0123 0123 0123 0123 0123 0123 0123 ^^ If btf = 7, ra = 0b11100 = 28, so check normal bits 31-28 and 30-28, which are 3 and 2. CR field: 7 6 5 4 3 2 1 0 bit: 0123 0123 0123 0123 0123 0123 0123 0123 ^^ If btf = 3, ra = 0b01100 = 12, for normal bits 19 and 18: CR field: 7 6 5 4 3 2 1 0 bit: 0123 0123 0123 0123 0123 0123 0123 0123 ^^ So yes, your calculations, while I struggle to follow _how_ they work, do in fact seem to work. 
Checkpatch does have one complaint:

CHECK:UNNECESSARY_PARENTHESES: Unnecessary parentheses around 'regs->ccr'
#30: FILE: arch/powerpc/lib/sstep.c:1971:
+			if ((regs->ccr) & (1 << (31 - ra)))

I don't really mind the parentheses: I think you are safe to ignore
checkpatch here unless someone else complains :)

If you do end up respinning the patch, I think it would be good to make
the maths a bit clearer. I think it works because a left shift of 2 is
the same as multiplying by 4, but it would be easier to follow if you
used a temporary variable for btf.

However, I do think this is still worth adding to the kernel either way,
so:

Reviewed-by: Daniel Axtens <dja@axtens.net>

Kind regards,
Daniel

> +			goto compute_done;
> +
>  		case 154:	/* prtyw */
>  			do_prty(regs, op, regs->gpr[rd], 32);
>  			goto logical_done_nocc;
> --
> 2.16.4
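The "temporary variable" suggestion can be sketched in standalone C. This is only an illustration under stated assumptions: `uint32_t ccr` stands in for `regs->ccr`, the helper names are hypothetical, and `bfa` is used for the field the review calls btf. The first function mirrors the patch, the second spells out the multiply-by-4 step, and the third checks that they agree.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the patch: ra is the raw 5-bit field. */
static long setb_masked(uint32_t ccr, unsigned int ra)
{
	ra &= ~0x3u;			/* ra is now 4 * bfa */
	if (ccr & (1u << (31 - ra)))
		return -1;
	if (ccr & (1u << (30 - ra)))
		return 1;
	return 0;
}

/* Same logic with an explicit temporary for the CR field number. */
static long setb_with_bfa(uint32_t ccr, unsigned int ra)
{
	unsigned int bfa = ra >> 2;		/* CR field number, 0..7 */
	unsigned int lt_shift = 31 - 4 * bfa;	/* LSB-0 position of the LT bit */

	if (ccr & (1u << lt_shift))
		return -1;
	if (ccr & (1u << (lt_shift - 1)))
		return 1;
	return 0;
}

/* Compare both versions for every ra and every single-bit CR pattern. */
static void check_equivalent(void)
{
	for (unsigned int ra = 0; ra < 32; ra++)
		for (unsigned int bit = 0; bit < 32; bit++)
			assert(setb_masked(1u << bit, ra) ==
			       setb_with_bfa(1u << bit, ra));
}
```

Both versions read the same two bits; the second only makes the relationship ra = 4 * bfa visible in the source.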
Daniel Axtens wrote: > Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes: > >> This adds emulation support for the following instruction: >> * Set Boolean (setb) >> >> Signed-off-by: Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> >> --- >> arch/powerpc/lib/sstep.c | 12 ++++++++++++ >> 1 file changed, 12 insertions(+) >> >> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c >> index c6aebc149d14..263c613d7490 100644 >> --- a/arch/powerpc/lib/sstep.c >> +++ b/arch/powerpc/lib/sstep.c >> @@ -1964,6 +1964,18 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs, >> op->val = ~(regs->gpr[rd] | regs->gpr[rb]); >> goto logical_done; >> >> + case 128: /* setb */ >> + if (!cpu_has_feature(CPU_FTR_ARCH_300)) >> + goto unknown_opcode; > > Ok, if I've understood correctly... > >> + ra = ra & ~0x3; > > This masks off the bits of RA that are not part of BTF: > > ra is in [0, 31] which is [0b00000, 0b11111] > Then ~0x3 = ~0b00011 > ra = ra & 0b11100 > > This gives us then, > ra = btf << 2; or > btf = ra >> 2; > > Let's then check to see if your calculations read the right fields. > >> + if ((regs->ccr) & (1 << (31 - ra))) >> + op->val = -1; >> + else if ((regs->ccr) & (1 << (30 - ra))) >> + op->val = 1; >> + else >> + op->val = 0; > > > CR field: 7 6 5 4 3 2 1 0 > bit: 0123 0123 0123 0123 0123 0123 0123 0123 > normal bit #: 0.....................................31 > ibm bit #: 31.....................................0 > > If btf = 0, ra = 0, check normal bits 31 and 30, which are both in CR0. > CR field: 7 6 5 4 3 2 1 0 > bit: 0123 0123 0123 0123 0123 0123 0123 0123 > ^^ > > If btf = 7, ra = 0b11100 = 28, so check normal bits 31-28 and 30-28, > which are 3 and 2. 
> > CR field: 7 6 5 4 3 2 1 0 > bit: 0123 0123 0123 0123 0123 0123 0123 0123 > ^^ > > If btf = 3, ra = 0b01100 = 12, for normal bits 19 and 18: > > CR field: 7 6 5 4 3 2 1 0 > bit: 0123 0123 0123 0123 0123 0123 0123 0123 > ^^ > > So yes, your calculations, while I struggle to follow _how_ they work, > do in fact seem to work. > > Checkpatch does have one complaint: > > CHECK:UNNECESSARY_PARENTHESES: Unnecessary parentheses around 'regs->ccr' > #30: FILE: arch/powerpc/lib/sstep.c:1971: > + if ((regs->ccr) & (1 << (31 - ra))) > > I don't really mind the parenteses: I think you are safe to ignore > checkpatch here unless someone else complains :) > > If you do end up respinning the patch, I think it would be good to make > the maths a bit clearer. I think it works because a left shift of 2 is > the same as multiplying by 4, but it would be easier to follow if you > used a temporary variable for btf. Indeed. I wonder if it is better to follow the ISA itself. Per the ISA, the bit we are interested in is: 4 x BFA + 32 So, if we use that along with the PPC_BIT() macro, we get: if (regs->ccr & PPC_BIT(ra + 32)) >> + goto compute_done; >> + I can see why you thought this should be in the section with other logical instructions. However, since this instruction does not modify CR itself, this is probably better placed earlier -- somewhere near 'mfcr' instruction emulation. - Naveen
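To see why the ISA-style expression and the patch select the same bit, here is a small sketch. It assumes MSB-0 (IBM) numbering over a 64-bit value, where bit 0 is the most significant bit. `MSB0_BIT64` is a hypothetical stand-in written for this example, not the kernel's `PPC_BIT()` definition, and the function names are likewise made up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: MSB-0 bit numbering over a 64-bit register, so bit 0 is
 * the most significant bit. Hypothetical macro, for illustration only.
 */
#define MSB0_BIT64(n)	(1ULL << (63 - (n)))

/* LT bit of the selected CR field, written the ISA way: bit 4*BFA + 32. */
static uint64_t lt_mask_isa(unsigned int bfa)
{
	return MSB0_BIT64(4 * bfa + 32);
}

/* The same bit written the way the patch does it, with ra = 4*BFA. */
static uint64_t lt_mask_patch(unsigned int bfa)
{
	unsigned int ra = 4 * bfa;

	return 1ULL << (31 - ra);
}

/* Both expressions should give the same mask for all eight CR fields. */
static void check_lt_masks(void)
{
	for (unsigned int bfa = 0; bfa < 8; bfa++)
		assert(lt_mask_isa(bfa) == lt_mask_patch(bfa));
}
```

The two differ only in where the "63 - n" conversion happens: inside the macro or in the open-coded shift count.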
"Naveen N. Rao" <naveen.n.rao@linux.ibm.com> writes: > Daniel Axtens wrote: >> Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes: >> >>> This adds emulation support for the following instruction: >>> * Set Boolean (setb) >>> >>> Signed-off-by: Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> ... >> >> If you do end up respinning the patch, I think it would be good to make >> the maths a bit clearer. I think it works because a left shift of 2 is >> the same as multiplying by 4, but it would be easier to follow if you >> used a temporary variable for btf. > > Indeed. I wonder if it is better to follow the ISA itself. Per the ISA, > the bit we are interested in is: > 4 x BFA + 32 > > So, if we use that along with the PPC_BIT() macro, we get: > if (regs->ccr & PPC_BIT(ra + 32)) Use of PPC_BIT risks annoying your maintainer :) cheers
Michael Ellerman wrote:
> "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> writes:
>> Daniel Axtens wrote:
>>> Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes:
>>>
>>>> This adds emulation support for the following instruction:
>>>> * Set Boolean (setb)
>>>>
>>>> Signed-off-by: Sathvika Vasireddy <sathvika@linux.vnet.ibm.com>
> ...
>>>
>>> If you do end up respinning the patch, I think it would be good to make
>>> the maths a bit clearer. I think it works because a left shift of 2 is
>>> the same as multiplying by 4, but it would be easier to follow if you
>>> used a temporary variable for btf.
>>
>> Indeed. I wonder if it is better to follow the ISA itself. Per the ISA,
>> the bit we are interested in is:
>> 4 x BFA + 32
>>
>> So, if we use that along with the PPC_BIT() macro, we get:
>> if (regs->ccr & PPC_BIT(ra + 32))
>
> Use of PPC_BIT risks annoying your maintainer :)

Uh oh... that isn't good :)

I looked up previous discussions and I think I now understand why you
don't prefer it.

But, I feel it helps make it easy to follow the code when referring to
the ISA. I'm wondering if it is just the name you dislike and if so,
does it make sense to rename PPC_BIT() to something else? We have
BIT_ULL(), so perhaps BIT_MSB_ULL() or MSB_BIT_ULL()?

- Naveen
Hi!

On Fri, Apr 16, 2021 at 05:44:52PM +1000, Daniel Axtens wrote:
> Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes:
> Ok, if I've understood correctly...
>
> > +			ra = ra & ~0x3;
>
> This masks off the bits of RA that are not part of BTF:
>
> ra is in [0, 31] which is [0b00000, 0b11111]
> Then ~0x3 = ~0b00011
> ra = ra & 0b11100
>
> This gives us then,
> ra = btf << 2; or
> btf = ra >> 2;

Yes.  In effect, you want the offset in bits of the CR field, which is
just fine like this.  But a comment would not hurt.

> Let's then check to see if your calculations read the right fields.
>
> > +			if ((regs->ccr) & (1 << (31 - ra)))
> > +				op->val = -1;
> > +			else if ((regs->ccr) & (1 << (30 - ra)))
> > +				op->val = 1;
> > +			else
> > +				op->val = 0;

It imo is clearer if written

	if ((regs->ccr << ra) & 0x80000000)
		op->val = -1;
	else if ((regs->ccr << ra) & 0x40000000)
		op->val = 1;
	else
		op->val = 0;

but I guess not everyone agrees :-)

> CR field:      7    6    5    4    3    2    1    0
> bit:           0123 0123 0123 0123 0123 0123 0123 0123
> normal bit #:  0.....................................31
> ibm bit #:     31.....................................0

The bit numbers in CR fields are *always* numbered left-to-right.  I
have never seen anyone use LE for it, anyway.

Also, even people who write LE have the bigger end on the left normally
(they just write some things right-to-left, and other things
left-to-right).

> Checkpatch does have one complaint:
>
> CHECK:UNNECESSARY_PARENTHESES: Unnecessary parentheses around 'regs->ccr'
> #30: FILE: arch/powerpc/lib/sstep.c:1971:
> +			if ((regs->ccr) & (1 << (31 - ra)))
>
> I don't really mind the parentheses: I think you are safe to ignore
> checkpatch here unless someone else complains :)

I find them annoying.  If there are too many parentheses, it is hard to
see at a glance what groups where.  Also, a suspicious reader might
think there is something special going on (with macros for example).

This is simple code of course, but :-)

> If you do end up respinning the patch, I think it would be good to make
> the maths a bit clearer. I think it works because a left shift of 2 is
> the same as multiplying by 4, but it would be easier to follow if you
> used a temporary variable for btf.

It is very simple.  The BFA instruction field is closely related to the
BI instruction field, which is 5 bits, and selects one of the 32 bits in
the CR.  If you have "BFA00 BFA01 BFA10 BFA11", that gives the bit
numbers of all four bits in the selected CR field.  So the "& ~3" does
all you need.  It is quite pretty :-)


Segher
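The "BFA00 BFA01 BFA10 BFA11" observation can be written out as a rough sketch. The assumptions: `uint32_t ccr` stands in for `regs->ccr`, CR bits are numbered left-to-right (bit 0 is the most significant), and all helper names are hypothetical. The first variant indexes CR bits by number, the second uses the shift-left form suggested above.

```c
#include <assert.h>
#include <stdint.h>

/* Read CR bit n, numbered left-to-right (bit 0 = most significant). */
static unsigned int cr_bit(uint32_t ccr, unsigned int n)
{
	return (ccr >> (31 - n)) & 1;
}

/* ra & ~3 is BFA || 0b00, the bit number of LT in the selected field. */
static long setb_by_bit_number(uint32_t ccr, unsigned int ra)
{
	unsigned int base = ra & ~0x3u;

	if (cr_bit(ccr, base + 0))	/* BFA || 0b00: LT */
		return -1;
	if (cr_bit(ccr, base + 1))	/* BFA || 0b01: GT */
		return 1;
	return 0;
}

/* Shift the selected field to the top, then test fixed masks. */
static long setb_shifted(uint32_t ccr, unsigned int ra)
{
	uint32_t shifted = ccr << (ra & ~0x3u);

	if (shifted & 0x80000000u)
		return -1;
	if (shifted & 0x40000000u)
		return 1;
	return 0;
}

/* Both forms should agree for every ra and every single-bit CR value. */
static void check_same_result(void)
{
	for (unsigned int ra = 0; ra < 32; ra++)
		for (unsigned int bit = 0; bit < 32; bit++)
			assert(setb_by_bit_number(1u << bit, ra) ==
			       setb_shifted(1u << bit, ra));
}
```

Note that the sketch shifts a 32-bit value, so bits shifted past the top are discarded; the kernel's `regs->ccr` is wider, which is why the suggestion above masks with full constants instead.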
Hi, On Thu, Apr 22, 2021 at 02:13:34PM -0500, Segher Boessenkool wrote: > Hi! > > On Fri, Apr 16, 2021 at 05:44:52PM +1000, Daniel Axtens wrote: > > Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes: > > Ok, if I've understood correctly... > > > > > + ra = ra & ~0x3; > > > > This masks off the bits of RA that are not part of BTF: > > > > ra is in [0, 31] which is [0b00000, 0b11111] > > Then ~0x3 = ~0b00011 > > ra = ra & 0b11100 > > > > This gives us then, > > ra = btf << 2; or > > btf = ra >> 2; > > Yes. In effect, you want the offset in bits of the CR field, which is > just fine like this. But a comment would not hurt. > > > Let's then check to see if your calculations read the right fields. > > > > > + if ((regs->ccr) & (1 << (31 - ra))) > > > + op->val = -1; > > > + else if ((regs->ccr) & (1 << (30 - ra))) > > > + op->val = 1; > > > + else > > > + op->val = 0; > > It imo is clearer if written > > if ((regs->ccr << ra) & 0x80000000) > op->val = -1; > else if ((regs->ccr << ra) & 0x40000000) > op->val = 1; > else > op->val = 0; > > but I guess not everyone agrees :-) > But this can be made jump free :-): int tmp = regs->ccr << ra; op->val = (tmp >> 31) | ((tmp >> 30) & 1); (IIRC the srawi instruction sign-extends its result to 64 bits). > > CR field: 7 6 5 4 3 2 1 0 > > bit: 0123 0123 0123 0123 0123 0123 0123 0123 > > normal bit #: 0.....................................31 > > ibm bit #: 31.....................................0 > > The bit numbers in CR fields are *always* numbered left-to-right. I > have never seen anyone use LE for it, anyway. > > Also, even people who write LE have the bigger end on the left normally > (they just write some things right-to-left, and other things > left-to-right). Around 1985, I had a documentation for the the National's 32032 (little-endian) processor family, and all the instruction encodings were presented with the LSB on the left and MSB on the right. 
BTW on these processors, the immediate operands and the offsets (1, 2 or 4 bytes) for the addressing modes were encoded in big-endian byte order, but I digress. Consistency is overrated ;-) Gabriel > > > Checkpatch does have one complaint: > > > > CHECK:UNNECESSARY_PARENTHESES: Unnecessary parentheses around 'regs->ccr' > > #30: FILE: arch/powerpc/lib/sstep.c:1971: > > + if ((regs->ccr) & (1 << (31 - ra))) > > > > I don't really mind the parenteses: I think you are safe to ignore > > checkpatch here unless someone else complains :) > > I find them annoying. If there are too many parentheses, it is hard to > see at a glance what groups where. Also, a suspicious reader might > think there is something special going on (with macros for example). > > This is simple code of course, but :-) > > > If you do end up respinning the patch, I think it would be good to make > > the maths a bit clearer. I think it works because a left shift of 2 is > > the same as multiplying by 4, but it would be easier to follow if you > > used a temporary variable for btf. > > It is very simple. The BFA instruction field is closely related to the > BI instruction field, which is 5 bits, and selects one of the 32 bits in > the CR. If you have "BFA00 BFA01 BFA10 BFA11", that gives the bit > numbers of all four bits in the selected CR field. So the "& ~3" does > all you need. It is quite pretty :-) > > > Segher
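A standalone sketch of the jump-free idea, compared against the branchy form. Assumptions: `uint32_t ccr` stands in for `regs->ccr`, the function names are hypothetical, and the compiler performs an arithmetic right shift on negative signed values and wraps on the unsigned-to-signed conversion. Both behaviours are implementation-defined in C, and GCC behaves as the code expects.

```c
#include <assert.h>
#include <stdint.h>

/* Reference version with explicit branches. */
static long setb_branchy(uint32_t ccr, unsigned int ra)
{
	uint32_t shifted = ccr << (ra & ~0x3u);

	if (shifted & 0x80000000u)
		return -1;
	if (shifted & 0x40000000u)
		return 1;
	return 0;
}

/*
 * Jump-free version. With LT in the sign bit, tmp >> 31 is -1 when LT
 * is set and 0 otherwise (assuming an arithmetic shift). The second
 * term contributes 1 when GT is set; OR-ing with -1 leaves -1.
 */
static long setb_branch_free(uint32_t ccr, unsigned int ra)
{
	int32_t tmp = (int32_t)(ccr << (ra & ~0x3u));

	return (tmp >> 31) | ((tmp >> 30) & 1);
}

/* Try all 16 values of each of the 8 CR fields. */
static void check_branch_free(void)
{
	for (unsigned int field = 0; field < 8; field++)
		for (uint32_t nibble = 0; nibble < 16; nibble++) {
			uint32_t ccr = nibble << (28 - 4 * field);

			assert(setb_branchy(ccr, 4 * field) ==
			       setb_branch_free(ccr, 4 * field));
		}
}
```

Whether the branch-free form is actually faster depends on the compiler and target; the sketch only shows that the two forms compute the same value.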
Hi! On Fri, Apr 23, 2021 at 12:16:18AM +0200, Gabriel Paubert wrote: > On Thu, Apr 22, 2021 at 02:13:34PM -0500, Segher Boessenkool wrote: > > On Fri, Apr 16, 2021 at 05:44:52PM +1000, Daniel Axtens wrote: > > > Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes: > > > > + if ((regs->ccr) & (1 << (31 - ra))) > > > > + op->val = -1; > > > > + else if ((regs->ccr) & (1 << (30 - ra))) > > > > + op->val = 1; > > > > + else > > > > + op->val = 0; > > > > It imo is clearer if written > > > > if ((regs->ccr << ra) & 0x80000000) > > op->val = -1; > > else if ((regs->ccr << ra) & 0x40000000) > > op->val = 1; > > else > > op->val = 0; > > > > but I guess not everyone agrees :-) > > But this can be made jump free :-): > > int tmp = regs->ccr << ra; > op->val = (tmp >> 31) | ((tmp >> 30) & 1); The compiler will do so automatically (or think of some better way to get the same result); in source code, what matters most is readability, or clarity in general (also clarity to the compiler). (Right shifts of negative numbers are implementation-defined in C, fwiw -- but work like you expect in GCC). > (IIRC the srawi instruction sign-extends its result to 64 bits). If you consider it to work on 32-bit inputs, yeah, that is a simple way to express it. > > > CR field: 7 6 5 4 3 2 1 0 > > > bit: 0123 0123 0123 0123 0123 0123 0123 0123 > > > normal bit #: 0.....................................31 > > > ibm bit #: 31.....................................0 > > > > The bit numbers in CR fields are *always* numbered left-to-right. I > > have never seen anyone use LE for it, anyway. > > > > Also, even people who write LE have the bigger end on the left normally > > (they just write some things right-to-left, and other things > > left-to-right). > > Around 1985, I had a documentation for the the National's 32032 > (little-endian) processor family, and all the instruction encodings were > presented with the LSB on the left and MSB on the right. Ouch! 
Did they write "regular" numbers with the least significant digit on the left as well? > BTW on these processors, the immediate operands and the offsets (1, 2 or > 4 bytes) for the addressing modes were encoded in big-endian byte order, > but I digress. Consistency is overrated ;-) Inconsistency is the spice of life, yeah :-) Segher
On Thu, Apr 22, 2021 at 06:26:16PM -0500, Segher Boessenkool wrote: > Hi! > > On Fri, Apr 23, 2021 at 12:16:18AM +0200, Gabriel Paubert wrote: > > On Thu, Apr 22, 2021 at 02:13:34PM -0500, Segher Boessenkool wrote: > > > On Fri, Apr 16, 2021 at 05:44:52PM +1000, Daniel Axtens wrote: > > > > Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes: > > > > > + if ((regs->ccr) & (1 << (31 - ra))) > > > > > + op->val = -1; > > > > > + else if ((regs->ccr) & (1 << (30 - ra))) > > > > > + op->val = 1; > > > > > + else > > > > > + op->val = 0; > > > > > > It imo is clearer if written > > > > > > if ((regs->ccr << ra) & 0x80000000) > > > op->val = -1; > > > else if ((regs->ccr << ra) & 0x40000000) > > > op->val = 1; > > > else > > > op->val = 0; > > > > > > but I guess not everyone agrees :-) > > > > But this can be made jump free :-): > > > > int tmp = regs->ccr << ra; > > op->val = (tmp >> 31) | ((tmp >> 30) & 1); > > The compiler will do so automatically (or think of some better way to > get the same result); in source code, what matters most is readability, > or clarity in general (also clarity to the compiler). I just did a test (trivial code attached) and the original code always produces one conditional branch at -O2, at least with the cross-compiler I have on Debian (gcc 8.3). I have tested both -m32 and -m64. The 64 bit version produces an unnecessary "extsw", so I wrote the second version splitting the setting of the return value which gets rid of it. The second "if" is fairly simple to optimize and the compiler does it properly. Of course with my suggestion the compiler does not produce any branch. But it needs a really good comment. > > (Right shifts of negative numbers are implementation-defined in C, > fwiw -- but work like you expect in GCC). Well, I'm not worried about it, since I'd expect a compiler that does logical right shifts on signed valued to break so much code that it would be easily noticed (also in the kernel). 
> > > (IIRC the srawi instruction sign-extends its result to 64 bits). > > If you consider it to work on 32-bit inputs, yeah, that is a simple way > to express it. > > > > > CR field: 7 6 5 4 3 2 1 0 > > > > bit: 0123 0123 0123 0123 0123 0123 0123 0123 > > > > normal bit #: 0.....................................31 > > > > ibm bit #: 31.....................................0 > > > > > > The bit numbers in CR fields are *always* numbered left-to-right. I > > > have never seen anyone use LE for it, anyway. > > > > > > Also, even people who write LE have the bigger end on the left normally > > > (they just write some things right-to-left, and other things > > > left-to-right). > > > > Around 1985, I had a documentation for the the National's 32032 > > (little-endian) processor family, and all the instruction encodings were > > presented with the LSB on the left and MSB on the right. > > Ouch! Did they write "regular" numbers with the least significant digit > on the left as well? No, they were not that sadistic! At least instructions were a whole number of bytes, unlike the iAPX432 where jumps needed to encode target addresses down to the bit level. > > > BTW on these processors, the immediate operands and the offsets (1, 2 or > > 4 bytes) for the addressing modes were encoded in big-endian byte order, > > but I digress. Consistency is overrated ;-) > > Inconsistency is the spice of life, yeah :-) ;-) Gabriel
"Naveen N. Rao" <naveen.n.rao@linux.ibm.com> writes: > Michael Ellerman wrote: >> "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> writes: >>> Daniel Axtens wrote: >>>> Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes: >>>> >>>>> This adds emulation support for the following instruction: >>>>> * Set Boolean (setb) >>>>> >>>>> Signed-off-by: Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> >> ... >>>> >>>> If you do end up respinning the patch, I think it would be good to make >>>> the maths a bit clearer. I think it works because a left shift of 2 is >>>> the same as multiplying by 4, but it would be easier to follow if you >>>> used a temporary variable for btf. >>> >>> Indeed. I wonder if it is better to follow the ISA itself. Per the ISA, >>> the bit we are interested in is: >>> 4 x BFA + 32 >>> >>> So, if we use that along with the PPC_BIT() macro, we get: >>> if (regs->ccr & PPC_BIT(ra + 32)) >> >> Use of PPC_BIT risks annoying your maintainer :) > > Uh oh... that isn't good :) > > I looked up previous discussions and I think I now understand why you > don't prefer it. Hah, I'd forgotten I'd written (ranted :D) about this in the past. > But, I feel it helps make it easy to follow the code when referring to > the ISA. That's true. But I think that's much much less common than people reading the code in isolation. And ultimately it doesn't matter if the code (appears to) match the ISA, it matters that the code works. My worry is that too much use of those type of macros obscures what's actually happening. > I'm wondering if it is just the name you dislike and if so, > does it make sense to rename PPC_BIT() to something else? We have > BIT_ULL(), so perhaps BIT_MSB_ULL() or MSB_BIT_ULL()? The name is part of it. But I don't really like BIT_ULL() either, it hides in a macro something that could just be there in front of you ie. (1ull << x). For this case of setb, I think I'd go with something like below. 
It doesn't exactly match the ISA, but I think there's minimal obfuscation
of what's actually going on.

	// ra is now bfa
	ra = (ra >> 2);

	// Extract 4-bit CR field
	val = regs->ccr >> (CR0_SHIFT - 4 * ra);

	if (val & 8)
		op->val = -1;
	else if (val & 4)
		op->val = 1;
	else
		op->val = 0;


If anything could use a macro it would be the 8 and 4, eg. CR_LT, CR_GT.

Of course that's probably got a bug in it, because I just wrote it by
eye and it's 11:28 pm :)

cheers
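One way to check the field-extraction variant "by machine rather than by eye" is a standalone sketch like the following. Assumptions: `uint32_t ccr` stands in for `regs->ccr`, `CR0_SHIFT` is taken to be 28, and `CR_LT` / `CR_GT` are the hypothetical macro names floated above, defined locally for this example. The extra `& 0xf` is not in the suggestion; it is added only to make the 4-bit extraction explicit.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_SHIFT	28	/* assumption: shift that brings CR0 to the low 4 bits */
#define CR_LT		8	/* hypothetical names for the bits within a field */
#define CR_GT		4

/* Field-extraction variant: isolate the 4-bit CR field, then test it. */
static long setb_field_extract(uint32_t ccr, unsigned int ra)
{
	unsigned int bfa = ra >> 2;
	unsigned int val = (ccr >> (CR0_SHIFT - 4 * bfa)) & 0xf;

	if (val & CR_LT)
		return -1;
	if (val & CR_GT)
		return 1;
	return 0;
}

/* Logic of the submitted patch, for comparison. */
static long setb_patch(uint32_t ccr, unsigned int ra)
{
	ra &= ~0x3u;
	if (ccr & (1u << (31 - ra)))
		return -1;
	if (ccr & (1u << (30 - ra)))
		return 1;
	return 0;
}

/* Compare the two for every ra and every single-bit CR value. */
static void check_against_patch(void)
{
	for (unsigned int ra = 0; ra < 32; ra++)
		for (unsigned int bit = 0; bit < 32; bit++)
			assert(setb_field_extract(1u << bit, ra) ==
			       setb_patch(1u << bit, ra));
}
```

Under these assumptions the two versions agree, so the choice between them is about readability rather than behaviour.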
On Fri, Apr 23, 2021 at 12:26:57PM +0200, Gabriel Paubert wrote:
> On Thu, Apr 22, 2021 at 06:26:16PM -0500, Segher Boessenkool wrote:
> > > But this can be made jump free :-):
> > >
> > > int tmp = regs->ccr << ra;
> > > op->val = (tmp >> 31) | ((tmp >> 30) & 1);
> >
> > The compiler will do so automatically (or think of some better way to
> > get the same result); in source code, what matters most is readability,
> > or clarity in general (also clarity to the compiler).
>
> I just did a test (trivial code attached) and the original code always
> produces one conditional branch at -O2, at least with the cross-compiler
> I have on Debian (gcc 8.3). I have tested both -m32 and -m64. The 64 bit
> version produces an unnecessary "extsw", so I wrote the second version
> splitting the setting of the return value which gets rid of it.

That is an older compiler, and it will be out-of-service any day now.

It depends on what compiler flags you use, and what version of the ISA
you are targeting.

> The second "if" is fairly simple to optimize and the compiler does it
> properly.

Yeah.

> Of course with my suggestion the compiler does not produce any branch.
> But it needs a really good comment.

Or you could try and help improve the compiler ;-)  You can do this
without writing compiler code yourself, by writing up some good
enhancement request in bugzilla.

The wider and more OoO the processors become, the more important it
becomes to have branch-free code, in situations where the branches would
not be well-predictable.

> > (Right shifts of negative numbers are implementation-defined in C,
> > fwiw -- but work like you expect in GCC).
>
> Well, I'm not worried about it, since I'd expect a compiler that does
> logical right shifts on signed valued to break so much code that it
> would be easily noticed (also in the kernel).

Yup.  And it *is* defined for signed values, as long as they are
non-negative (the common case).

> > > > Also, even people who write LE have the bigger end on the left normally
> > > > (they just write some things right-to-left, and other things
> > > > left-to-right).
> > >
> > > Around 1985, I had a documentation for the the National's 32032
> > > (little-endian) processor family, and all the instruction encodings were
> > > presented with the LSB on the left and MSB on the right.
> >
> > Ouch!  Did they write "regular" numbers with the least significant digit
> > on the left as well?
>
> No, they were not that sadistic!

But more inconsistent :-)


Segher
Segher Boessenkool <segher@kernel.crashing.org> writes: > Hi! > > On Fri, Apr 16, 2021 at 05:44:52PM +1000, Daniel Axtens wrote: >> Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes: >> Ok, if I've understood correctly... >> >> > + ra = ra & ~0x3; >> >> This masks off the bits of RA that are not part of BTF: >> >> ra is in [0, 31] which is [0b00000, 0b11111] >> Then ~0x3 = ~0b00011 >> ra = ra & 0b11100 >> >> This gives us then, >> ra = btf << 2; or >> btf = ra >> 2; > > Yes. In effect, you want the offset in bits of the CR field, which is > just fine like this. But a comment would not hurt. > >> Let's then check to see if your calculations read the right fields. >> >> > + if ((regs->ccr) & (1 << (31 - ra))) >> > + op->val = -1; >> > + else if ((regs->ccr) & (1 << (30 - ra))) >> > + op->val = 1; >> > + else >> > + op->val = 0; > > It imo is clearer if written > > if ((regs->ccr << ra) & 0x80000000) > op->val = -1; > else if ((regs->ccr << ra) & 0x40000000) > op->val = 1; > else > op->val = 0; > > but I guess not everyone agrees :-) > >> CR field: 7 6 5 4 3 2 1 0 >> bit: 0123 0123 0123 0123 0123 0123 0123 0123 >> normal bit #: 0.....................................31 >> ibm bit #: 31.....................................0 > > The bit numbers in CR fields are *always* numbered left-to-right. I > have never seen anyone use LE for it, anyway. > > Also, even people who write LE have the bigger end on the left normally > (they just write some things right-to-left, and other things > left-to-right). Sorry, the numbers in the CR fields weren't meant to be especially meaningful, I was just trying to convince myself that we referenced the same bits doing the ISA way vs the way this code did it. 
Kind regards, Daniel > >> Checkpatch does have one complaint: >> >> CHECK:UNNECESSARY_PARENTHESES: Unnecessary parentheses around 'regs->ccr' >> #30: FILE: arch/powerpc/lib/sstep.c:1971: >> + if ((regs->ccr) & (1 << (31 - ra))) >> >> I don't really mind the parenteses: I think you are safe to ignore >> checkpatch here unless someone else complains :) > > I find them annoying. If there are too many parentheses, it is hard to > see at a glance what groups where. Also, a suspicious reader might > think there is something special going on (with macros for example). > > This is simple code of course, but :-) > >> If you do end up respinning the patch, I think it would be good to make >> the maths a bit clearer. I think it works because a left shift of 2 is >> the same as multiplying by 4, but it would be easier to follow if you >> used a temporary variable for btf. > > It is very simple. The BFA instruction field is closely related to the > BI instruction field, which is 5 bits, and selects one of the 32 bits in > the CR. If you have "BFA00 BFA01 BFA10 BFA11", that gives the bit > numbers of all four bits in the selected CR field. So the "& ~3" does > all you need. It is quite pretty :-) > > > Segher
Michael Ellerman wrote:
> "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> writes:
>> Michael Ellerman wrote:
>>> "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> writes:
>>>> Daniel Axtens wrote:
>>>>> Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes:
>>>>>
>>>>>> This adds emulation support for the following instruction:
>>>>>> * Set Boolean (setb)
>>>>>>
>>>>>> Signed-off-by: Sathvika Vasireddy <sathvika@linux.vnet.ibm.com>
>>> ...
>>>>>
>>>>> If you do end up respinning the patch, I think it would be good to make
>>>>> the maths a bit clearer. I think it works because a left shift of 2 is
>>>>> the same as multiplying by 4, but it would be easier to follow if you
>>>>> used a temporary variable for btf.
>>>>
>>>> Indeed. I wonder if it is better to follow the ISA itself. Per the ISA,
>>>> the bit we are interested in is:
>>>> 4 x BFA + 32
>>>>
>>>> So, if we use that along with the PPC_BIT() macro, we get:
>>>> if (regs->ccr & PPC_BIT(ra + 32))
>>>
>>> Use of PPC_BIT risks annoying your maintainer :)
>>
>> Uh oh... that isn't good :)
>>
>> I looked up previous discussions and I think I now understand why you
>> don't prefer it.
>
> Hah, I'd forgotten I'd written (ranted :D) about this in the past.
>
>> But, I feel it helps make it easy to follow the code when referring to
>> the ISA.
>
> That's true. But I think that's much much less common than people
> reading the code in isolation.

I thought that isn't so for at least the instruction emulation
infrastructure...

>
> And ultimately it doesn't matter if the code (appears to) match the ISA,
> it matters that the code works. My worry is that too much use of those
> type of macros obscures what's actually happening.

... but, I agree on the above point. I can see why it is better to keep
it simple. I also see precedent for what both you and Segher are
suggesting in the existing code in sstep.c

>
>> I'm wondering if it is just the name you dislike and if so,
>> does it make sense to rename PPC_BIT() to something else? We have
>> BIT_ULL(), so perhaps BIT_MSB_ULL() or MSB_BIT_ULL()?
>
> The name is part of it. But I don't really like BIT_ULL() either, it
> hides in a macro something that could just be there in front of you
> ie. (1ull << x).
>
>
> For this case of setb, I think I'd go with something like below. It
> doesn't exactly match the ISA, but I think there's minimal obfuscation
> of what's actually going on.
>
> 	// ra is now bfa
> 	ra = (ra >> 2);
>
> 	// Extract 4-bit CR field
> 	val = regs->ccr >> (CR0_SHIFT - 4 * ra);
>
> 	if (val & 8)
> 		op->val = -1;
> 	else if (val & 4)
> 		op->val = 1;
> 	else
> 		op->val = 0;
>
>
> If anything could use a macro it would be the 8 and 4, eg. CR_LT, CR_GT.
>
> Of course that's probably got a bug in it, because I just wrote it by
> eye and it's 11:28 pm :)

LGTM, thanks. I'll let Sathvika decide on which variant she wants to go
with for v2 :)

- Naveen
diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index c6aebc149d14..263c613d7490 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -1964,6 +1964,18 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			op->val = ~(regs->gpr[rd] | regs->gpr[rb]);
 			goto logical_done;
 
+		case 128:	/* setb */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				goto unknown_opcode;
+			ra = ra & ~0x3;
+			if ((regs->ccr) & (1 << (31 - ra)))
+				op->val = -1;
+			else if ((regs->ccr) & (1 << (30 - ra)))
+				op->val = 1;
+			else
+				op->val = 0;
+			goto compute_done;
+
 		case 154:	/* prtyw */
 			do_prty(regs, op, regs->gpr[rd], 32);
 			goto logical_done_nocc;
This adds emulation support for the following instruction:
   * Set Boolean (setb)

Signed-off-by: Sathvika Vasireddy <sathvika@linux.vnet.ibm.com>
---
 arch/powerpc/lib/sstep.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)