[1/2] powerpc/sstep: Add emulation support for ‘setb’ instruction

Message ID 767e53c4c27da024ca277e21ffcd0cff131f5c73.1618469454.git.sathvika@linux.vnet.ibm.com (mailing list archive)
State Changes Requested
Series powerpc/sstep: Add emulation support and tests for 'setb' instruction

Checks

Context                         Check     Description
snowpatch_ozlabs/apply_patch    success   Successfully applied on branch powerpc/merge (0702e74703f57173e70cfab2a79a3e682e9e96ec)
snowpatch_ozlabs/checkpatch     warning   total: 0 errors, 0 warnings, 1 checks, 18 lines checked
snowpatch_ozlabs/needsstable    success   Patch has no Fixes tags

Commit Message

Sathvika Vasireddy April 16, 2021, 7:02 a.m. UTC
This adds emulation support for the following instruction:
   * Set Boolean (setb)

Signed-off-by: Sathvika Vasireddy <sathvika@linux.vnet.ibm.com>
---
 arch/powerpc/lib/sstep.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Daniel Axtens April 16, 2021, 7:44 a.m. UTC | #1
Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes:

> This adds emulation support for the following instruction:
>    * Set Boolean (setb)
>
> Signed-off-by: Sathvika Vasireddy <sathvika@linux.vnet.ibm.com>
> ---
>  arch/powerpc/lib/sstep.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index c6aebc149d14..263c613d7490 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -1964,6 +1964,18 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>  			op->val = ~(regs->gpr[rd] | regs->gpr[rb]);
>  			goto logical_done;
>  
> +		case 128:	/* setb */
> +			if (!cpu_has_feature(CPU_FTR_ARCH_300))
> +				goto unknown_opcode;

Ok, if I've understood correctly...

> +			ra = ra & ~0x3;

This masks off the bits of RA that are not part of BTF:

ra is in [0, 31] which is [0b00000, 0b11111]
Then ~0x3 = ~0b00011
ra = ra & 0b11100

This gives us then,
ra = btf << 2; or
btf = ra >> 2;

Let's then check to see if your calculations read the right fields.

> +			if ((regs->ccr) & (1 << (31 - ra)))
> +				op->val = -1;
> +			else if ((regs->ccr) & (1 << (30 - ra)))
> +				op->val = 1;
> +			else
> +				op->val = 0;


CR field:      7    6    5    4    3    2    1    0
bit:          0123 0123 0123 0123 0123 0123 0123 0123
normal bit #: 0.....................................31
ibm bit #:   31.....................................0

If btf = 0, ra = 0, check normal bits 31 and 30, which are both in CR0.
CR field:      7    6    5    4    3    2    1    0
bit:          0123 0123 0123 0123 0123 0123 0123 0123
                                                   ^^

If btf = 7, ra = 0b11100 = 28, so check normal bits 31-28 and 30-28,
which are 3 and 2.

CR field:      7    6    5    4    3    2    1    0
bit:          0123 0123 0123 0123 0123 0123 0123 0123
                ^^

If btf = 3, ra = 0b01100 = 12, for normal bits 19 and 18:

CR field:      7    6    5    4    3    2    1    0
bit:          0123 0123 0123 0123 0123 0123 0123 0123
                                    ^^

So yes, your calculations, while I struggle to follow _how_ they work,
do in fact seem to work.
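
A minimal sketch of the same check done exhaustively rather than by eye. It assumes the CR image is a 32-bit value with CR0 in the most significant nibble; `lt_mask()` and `check_masking()` are names made up for this illustration, and `bfa` is the field called btf above.

```c
#include <assert.h>
#include <stdint.h>

/* Mask the patch uses for the LT test, given the raw 5-bit RA field. */
uint32_t lt_mask(unsigned int ra)
{
	ra &= ~0x3u;			/* round down to 4 * BFA */
	return UINT32_C(1) << (31 - ra);
}

/* Walk all 32 encodings of the 5-bit field; returns how many were checked. */
int check_masking(void)
{
	int checked = 0;

	for (unsigned int ra = 0; ra < 32; ra++) {
		unsigned int bfa = ra >> 2;		/* "btf" in the review */
		unsigned int shift = 28 - 4 * bfa;	/* lowest bit of that field */

		/* Clearing the low two bits is the same as (ra >> 2) << 2. */
		assert((ra & ~0x3u) == (bfa << 2));
		assert((ra & ~0x3u) == 4 * bfa);
		/* 1 << (31 - ra) is the 0x8 (LT) bit of field bfa ... */
		assert(lt_mask(ra) == (UINT32_C(0x8) << shift));
		/* ... and 1 << (30 - ra) is its 0x4 (GT) bit. */
		assert((lt_mask(ra) >> 1) == (UINT32_C(0x4) << shift));
		checked++;
	}
	return checked;
}
```

Calling `check_masking()` from a small test program trips an assertion if any of the 32 encodings picks a bit outside the expected field.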

Checkpatch does have one complaint:

CHECK:UNNECESSARY_PARENTHESES: Unnecessary parentheses around 'regs->ccr'
#30: FILE: arch/powerpc/lib/sstep.c:1971:
+			if ((regs->ccr) & (1 << (31 - ra)))

I don't really mind the parentheses: I think you are safe to ignore
checkpatch here unless someone else complains :)

If you do end up respinning the patch, I think it would be good to make
the maths a bit clearer. I think it works because a left shift of 2 is
the same as multiplying by 4, but it would be easier to follow if you
used a temporary variable for btf.

However, I do think this is still worth adding to the kernel either way,
so:

Reviewed-by: Daniel Axtens <dja@axtens.net>

Kind regards,
Daniel

> +			goto compute_done;
> +
>  		case 154:	/* prtyw */
>  			do_prty(regs, op, regs->gpr[rd], 32);
>  			goto logical_done_nocc;
> -- 
> 2.16.4
Naveen N. Rao April 20, 2021, 6:26 a.m. UTC | #2
Daniel Axtens wrote:
> Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes:
> 
>> This adds emulation support for the following instruction:
>>    * Set Boolean (setb)
>>
>> Signed-off-by: Sathvika Vasireddy <sathvika@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/lib/sstep.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
>> index c6aebc149d14..263c613d7490 100644
>> --- a/arch/powerpc/lib/sstep.c
>> +++ b/arch/powerpc/lib/sstep.c
>> @@ -1964,6 +1964,18 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>>  			op->val = ~(regs->gpr[rd] | regs->gpr[rb]);
>>  			goto logical_done;
>>  
>> +		case 128:	/* setb */
>> +			if (!cpu_has_feature(CPU_FTR_ARCH_300))
>> +				goto unknown_opcode;
> 
> Ok, if I've understood correctly...
> 
>> +			ra = ra & ~0x3;
> 
> This masks off the bits of RA that are not part of BTF:
> 
> ra is in [0, 31] which is [0b00000, 0b11111]
> Then ~0x3 = ~0b00011
> ra = ra & 0b11100
> 
> This gives us then,
> ra = btf << 2; or
> btf = ra >> 2;
> 
> Let's then check to see if your calculations read the right fields.
> 
>> +			if ((regs->ccr) & (1 << (31 - ra)))
>> +				op->val = -1;
>> +			else if ((regs->ccr) & (1 << (30 - ra)))
>> +				op->val = 1;
>> +			else
>> +				op->val = 0;
> 
> 
> CR field:      7    6    5    4    3    2    1    0
> bit:          0123 0123 0123 0123 0123 0123 0123 0123
> normal bit #: 0.....................................31
> ibm bit #:   31.....................................0
> 
> If btf = 0, ra = 0, check normal bits 31 and 30, which are both in CR0.
> CR field:      7    6    5    4    3    2    1    0
> bit:          0123 0123 0123 0123 0123 0123 0123 0123
>                                                    ^^
> 
> If btf = 7, ra = 0b11100 = 28, so check normal bits 31-28 and 30-28,
> which are 3 and 2.
> 
> CR field:      7    6    5    4    3    2    1    0
> bit:          0123 0123 0123 0123 0123 0123 0123 0123
>                 ^^
> 
> If btf = 3, ra = 0b01100 = 12, for normal bits 19 and 18:
> 
> CR field:      7    6    5    4    3    2    1    0
> bit:          0123 0123 0123 0123 0123 0123 0123 0123
>                                     ^^
> 
> So yes, your calculations, while I struggle to follow _how_ they work,
> do in fact seem to work.
> 
> Checkpatch does have one complaint:
> 
> CHECK:UNNECESSARY_PARENTHESES: Unnecessary parentheses around 'regs->ccr'
> #30: FILE: arch/powerpc/lib/sstep.c:1971:
> +			if ((regs->ccr) & (1 << (31 - ra)))
> 
> I don't really mind the parentheses: I think you are safe to ignore
> checkpatch here unless someone else complains :)
> 
> If you do end up respinning the patch, I think it would be good to make
> the maths a bit clearer. I think it works because a left shift of 2 is
> the same as multiplying by 4, but it would be easier to follow if you
> used a temporary variable for btf.

Indeed. I wonder if it is better to follow the ISA itself. Per the ISA, 
the bit we are interested in is:
	4 x BFA + 32

So, if we use that along with the PPC_BIT() macro, we get:
	if (regs->ccr & PPC_BIT(ra + 32))
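
A hedged sketch of why this form tests the same bit as the posted patch. `MSB0_BIT64()` is a local stand-in written for this example; it assumes the kernel macro behaves as `1UL << (63 - bit)` on a 64-bit build, and that `ra` already holds 4 x BFA.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in (assumption): IBM numbering, bit 0 is the MSB of 64 bits. */
#define MSB0_BIT64(bit) (UINT64_C(1) << (63 - (bit)))

/* Compare the ISA-style bit number with the patch's mask for each CR field. */
int check_ppc_bit_form(void)
{
	int checked = 0;

	for (unsigned int bfa = 0; bfa < 8; bfa++) {
		unsigned int ra = bfa << 2;	/* 4 x BFA */

		/* ISA bit 4 x BFA + 32 is the LT bit: same as 1 << (31 - ra). */
		assert(MSB0_BIT64(ra + 32) == (UINT64_C(1) << (31 - ra)));
		/* The next ISA bit is GT: same as 1 << (30 - ra). */
		assert(MSB0_BIT64(ra + 33) == (UINT64_C(1) << (30 - ra)));
		checked++;
	}
	return checked;
}
```

The "+ 32" only accounts for the CR image living in the low half of a 64-bit register, so both spellings select identical bits.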


>> +			goto compute_done;
>> +

I can see why you thought this should be in the section with other 
logical instructions. However, since this instruction does not modify CR 
itself, this is probably better placed earlier -- somewhere near 'mfcr' 
instruction emulation.


- Naveen
Michael Ellerman April 21, 2021, 7:30 a.m. UTC | #3
"Naveen N. Rao" <naveen.n.rao@linux.ibm.com> writes:
> Daniel Axtens wrote:
>> Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes:
>> 
>>> This adds emulation support for the following instruction:
>>>    * Set Boolean (setb)
>>>
>>> Signed-off-by: Sathvika Vasireddy <sathvika@linux.vnet.ibm.com>
...
>> 
>> If you do end up respinning the patch, I think it would be good to make
>> the maths a bit clearer. I think it works because a left shift of 2 is
>> the same as multiplying by 4, but it would be easier to follow if you
>> used a temporary variable for btf.
>
> Indeed. I wonder if it is better to follow the ISA itself. Per the ISA, 
> the bit we are interested in is:
> 	4 x BFA + 32
>
> So, if we use that along with the PPC_BIT() macro, we get:
> 	if (regs->ccr & PPC_BIT(ra + 32))

Use of PPC_BIT risks annoying your maintainer :)

cheers
Naveen N. Rao April 22, 2021, 10:01 a.m. UTC | #4
Michael Ellerman wrote:
> "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> writes:
>> Daniel Axtens wrote:
>>> Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes:
>>> 
>>>> This adds emulation support for the following instruction:
>>>>    * Set Boolean (setb)
>>>>
>>>> Signed-off-by: Sathvika Vasireddy <sathvika@linux.vnet.ibm.com>
> ...
>>> 
>>> If you do end up respinning the patch, I think it would be good to make
>>> the maths a bit clearer. I think it works because a left shift of 2 is
>>> the same as multiplying by 4, but it would be easier to follow if you
>>> used a temporary variable for btf.
>>
>> Indeed. I wonder if it is better to follow the ISA itself. Per the ISA, 
>> the bit we are interested in is:
>> 	4 x BFA + 32
>>
>> So, if we use that along with the PPC_BIT() macro, we get:
>> 	if (regs->ccr & PPC_BIT(ra + 32))
> 
> Use of PPC_BIT risks annoying your maintainer :)

Uh oh... that isn't good :)

I looked up previous discussions and I think I now understand why you 
don't prefer it.

But, I feel it helps make it easy to follow the code when referring to 
the ISA. I'm wondering if it is just the name you dislike and if so, 
does it make sense to rename PPC_BIT() to something else? We have 
BIT_ULL(), so perhaps BIT_MSB_ULL() or MSB_BIT_ULL()?


- Naveen
Segher Boessenkool April 22, 2021, 7:13 p.m. UTC | #5
Hi!

On Fri, Apr 16, 2021 at 05:44:52PM +1000, Daniel Axtens wrote:
> Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes:
> Ok, if I've understood correctly...
> 
> > +			ra = ra & ~0x3;
> 
> This masks off the bits of RA that are not part of BTF:
> 
> ra is in [0, 31] which is [0b00000, 0b11111]
> Then ~0x3 = ~0b00011
> ra = ra & 0b11100
> 
> This gives us then,
> ra = btf << 2; or
> btf = ra >> 2;

Yes.  In effect, you want the offset in bits of the CR field, which is
just fine like this.  But a comment would not hurt.

> Let's then check to see if your calculations read the right fields.
> 
> > +			if ((regs->ccr) & (1 << (31 - ra)))
> > +				op->val = -1;
> > +			else if ((regs->ccr) & (1 << (30 - ra)))
> > +				op->val = 1;
> > +			else
> > +				op->val = 0;

It imo is clearer if written

			if ((regs->ccr << ra) & 0x80000000)
				op->val = -1;
			else if ((regs->ccr << ra) & 0x40000000)
				op->val = 1;
			else
				op->val = 0;

but I guess not everyone agrees :-)
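
A small sketch, under the assumption that `ccr` is a 64-bit value whose low 32 bits hold the CR image, comparing the two spellings. The function names are invented for this example; unsigned constants are used so that no shift reaches the sign bit of `int`.

```c
#include <assert.h>
#include <stdint.h>

/* Shift the mask down to the field, as in the posted patch. */
long setb_shift_mask(uint64_t ccr, unsigned int ra)
{
	ra &= ~0x3u;
	if (ccr & (UINT64_C(1) << (31 - ra)))
		return -1;
	if (ccr & (UINT64_C(1) << (30 - ra)))
		return 1;
	return 0;
}

/* Shift the field up to a fixed mask, as suggested in this reply. */
long setb_shift_reg(uint64_t ccr, unsigned int ra)
{
	ra &= ~0x3u;
	if ((ccr << ra) & 0x80000000u)
		return -1;
	if ((ccr << ra) & 0x40000000u)
		return 1;
	return 0;
}

/* Try every field value with the other fields all-zero and all-one. */
int check_shift_forms(void)
{
	int checked = 0;

	for (unsigned int bfa = 0; bfa < 8; bfa++) {
		unsigned int shift = 28 - 4 * bfa;

		for (uint64_t nib = 0; nib < 16; nib++) {
			for (int fill = 0; fill < 2; fill++) {
				uint64_t bg = fill ? UINT64_C(0xffffffff) : 0;
				uint64_t ccr = (bg & ~(UINT64_C(0xf) << shift)) |
					       (nib << shift);

				assert(setb_shift_mask(ccr, 4 * bfa) ==
				       setb_shift_reg(ccr, 4 * bfa));
				checked++;
			}
		}
	}
	return checked;
}
```

Bits pushed above bit 31 by the left shift are discarded by the fixed masks, which is why the second form needs no extra truncation here.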

> CR field:      7    6    5    4    3    2    1    0
> bit:          0123 0123 0123 0123 0123 0123 0123 0123
> normal bit #: 0.....................................31
> ibm bit #:   31.....................................0

The bit numbers in CR fields are *always* numbered left-to-right.  I
have never seen anyone use LE for it, anyway.

Also, even people who write LE have the bigger end on the left normally
(they just write some things right-to-left, and other things
left-to-right).

> Checkpatch does have one complaint:
> 
> CHECK:UNNECESSARY_PARENTHESES: Unnecessary parentheses around 'regs->ccr'
> #30: FILE: arch/powerpc/lib/sstep.c:1971:
> +			if ((regs->ccr) & (1 << (31 - ra)))
> 
> I don't really mind the parentheses: I think you are safe to ignore
> checkpatch here unless someone else complains :)

I find them annoying.  If there are too many parentheses, it is hard to
see at a glance what groups where.  Also, a suspicious reader might
think there is something special going on (with macros for example).

This is simple code of course, but :-)

> If you do end up respinning the patch, I think it would be good to make
> the maths a bit clearer. I think it works because a left shift of 2 is
> the same as multiplying by 4, but it would be easier to follow if you
> used a temporary variable for btf.

It is very simple.  The BFA instruction field is closely related to the
BI instruction field, which is 5 bits, and selects one of the 32 bits in
the CR.  If you have "BFA00 BFA01 BFA10 BFA11", that gives the bit
numbers of all four bits in the selected CR field.  So the "& ~3" does
all you need.  It is quite pretty :-)
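
To make the numbering concrete, here is a rough sketch assuming a 32-bit CR image with bit 0 as the most significant bit. `cr_bit()` and `check_bi_numbering()` are illustrative names, not kernel helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Value of CR bit `bi`, where bi = 0 is the most significant of 32 bits. */
unsigned int cr_bit(uint32_t ccr, unsigned int bi)
{
	return (ccr >> (31 - bi)) & 1;
}

/* For each field, bits BFA||00 .. BFA||11 are its LT, GT, EQ and SO bits. */
int check_bi_numbering(void)
{
	int checked = 0;

	for (unsigned int bfa = 0; bfa < 8; bfa++) {
		unsigned int base = bfa << 2;	/* BFA followed by 0b00 */

		for (uint32_t nib = 0; nib < 16; nib++) {
			/* Place the 4-bit pattern in field bfa, others zero. */
			uint32_t ccr = nib << (28 - base);

			assert(cr_bit(ccr, base + 0) == ((nib >> 3) & 1)); /* LT */
			assert(cr_bit(ccr, base + 1) == ((nib >> 2) & 1)); /* GT */
			assert(cr_bit(ccr, base + 2) == ((nib >> 1) & 1)); /* EQ */
			assert(cr_bit(ccr, base + 3) == (nib & 1));        /* SO */
			checked++;
		}
	}
	return checked;
}
```

Since `ra & ~3` is exactly `base`, the patch's two tests are `cr_bit(ccr, base)` and `cr_bit(ccr, base + 1)` in this notation.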


Segher
Gabriel Paubert April 22, 2021, 10:16 p.m. UTC | #6
Hi,

On Thu, Apr 22, 2021 at 02:13:34PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Apr 16, 2021 at 05:44:52PM +1000, Daniel Axtens wrote:
> > Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes:
> > Ok, if I've understood correctly...
> > 
> > > +			ra = ra & ~0x3;
> > 
> > This masks off the bits of RA that are not part of BTF:
> > 
> > ra is in [0, 31] which is [0b00000, 0b11111]
> > Then ~0x3 = ~0b00011
> > ra = ra & 0b11100
> > 
> > This gives us then,
> > ra = btf << 2; or
> > btf = ra >> 2;
> 
> Yes.  In effect, you want the offset in bits of the CR field, which is
> just fine like this.  But a comment would not hurt.
> 
> > Let's then check to see if your calculations read the right fields.
> > 
> > > +			if ((regs->ccr) & (1 << (31 - ra)))
> > > +				op->val = -1;
> > > +			else if ((regs->ccr) & (1 << (30 - ra)))
> > > +				op->val = 1;
> > > +			else
> > > +				op->val = 0;
> 
> It imo is clearer if written
> 
> 			if ((regs->ccr << ra) & 0x80000000)
> 				op->val = -1;
> 			else if ((regs->ccr << ra) & 0x40000000)
> 				op->val = 1;
> 			else
> 				op->val = 0;
> 
> but I guess not everyone agrees :-)
> 

But this can be made jump free :-):

	int tmp = regs->ccr << ra;
	op->val = (tmp >> 31) | ((tmp >> 30) & 1);

(IIRC the srawi instruction sign-extends its result to 64 bits).
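
A sketch that checks the branch-free expression against the if/else form. It assumes a 32-bit CR image and relies on GCC-style behaviour: conversion to `int32_t` wraps and `>>` on a negative value is an arithmetic shift. The function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: the if/else form from the posted patch. */
long setb_branchy(uint32_t ccr, unsigned int ra)
{
	ra &= ~0x3u;
	if (ccr & (UINT32_C(1) << (31 - ra)))
		return -1;
	if (ccr & (UINT32_C(1) << (30 - ra)))
		return 1;
	return 0;
}

/* Branch-free form: move the field to the top, then use shifts only. */
long setb_branch_free(uint32_t ccr, unsigned int ra)
{
	int32_t tmp = (int32_t)(ccr << (ra & ~0x3u));

	/* tmp >> 31 is -1 when LT is set; (tmp >> 30) & 1 is the GT bit. */
	return (tmp >> 31) | ((tmp >> 30) & 1);
}

/* Try every field value with the other fields all-zero and all-one. */
int check_branch_free(void)
{
	int checked = 0;

	for (unsigned int bfa = 0; bfa < 8; bfa++) {
		unsigned int shift = 28 - 4 * bfa;

		for (uint32_t nib = 0; nib < 16; nib++) {
			for (int fill = 0; fill < 2; fill++) {
				uint32_t bg = fill ? UINT32_C(0xffffffff) : 0;
				uint32_t ccr = (bg & ~(UINT32_C(0xf) << shift)) |
					       (nib << shift);

				assert(setb_branchy(ccr, 4 * bfa) ==
				       setb_branch_free(ccr, 4 * bfa));
				checked++;
			}
		}
	}
	return checked;
}
```

When LT is set the left operand is all ones, so OR-ing in the GT bit cannot change the result; that is what makes LT take priority without a branch.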



> > CR field:      7    6    5    4    3    2    1    0
> > bit:          0123 0123 0123 0123 0123 0123 0123 0123
> > normal bit #: 0.....................................31
> > ibm bit #:   31.....................................0
> 
> The bit numbers in CR fields are *always* numbered left-to-right.  I
> have never seen anyone use LE for it, anyway.
> 
> Also, even people who write LE have the bigger end on the left normally
> (they just write some things right-to-left, and other things
> left-to-right).

Around 1985, I had documentation for National's 32032
(little-endian) processor family, and all the instruction encodings were
presented with the LSB on the left and MSB on the right.

BTW on these processors, the immediate operands and the offsets (1, 2 or
4 bytes) for the addressing modes were encoded in big-endian byte order,
but I digress. Consistency is overrated ;-)

	Gabriel


> 
> > Checkpatch does have one complaint:
> > 
> > CHECK:UNNECESSARY_PARENTHESES: Unnecessary parentheses around 'regs->ccr'
> > #30: FILE: arch/powerpc/lib/sstep.c:1971:
> > +			if ((regs->ccr) & (1 << (31 - ra)))
> > 
> > I don't really mind the parentheses: I think you are safe to ignore
> > checkpatch here unless someone else complains :)
> 
> I find them annoying.  If there are too many parentheses, it is hard to
> see at a glance what groups where.  Also, a suspicious reader might
> think there is something special going on (with macros for example).
> 
> This is simple code of course, but :-)
> 
> > If you do end up respinning the patch, I think it would be good to make
> > the maths a bit clearer. I think it works because a left shift of 2 is
> > the same as multiplying by 4, but it would be easier to follow if you
> > used a temporary variable for btf.
> 
> It is very simple.  The BFA instruction field is closely related to the
> BI instruction field, which is 5 bits, and selects one of the 32 bits in
> the CR.  If you have "BFA00 BFA01 BFA10 BFA11", that gives the bit
> numbers of all four bits in the selected CR field.  So the "& ~3" does
> all you need.  It is quite pretty :-)
> 
> 
> Segher
Segher Boessenkool April 22, 2021, 11:26 p.m. UTC | #7
Hi!

On Fri, Apr 23, 2021 at 12:16:18AM +0200, Gabriel Paubert wrote:
> On Thu, Apr 22, 2021 at 02:13:34PM -0500, Segher Boessenkool wrote:
> > On Fri, Apr 16, 2021 at 05:44:52PM +1000, Daniel Axtens wrote:
> > > Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes:
> > > > +			if ((regs->ccr) & (1 << (31 - ra)))
> > > > +				op->val = -1;
> > > > +			else if ((regs->ccr) & (1 << (30 - ra)))
> > > > +				op->val = 1;
> > > > +			else
> > > > +				op->val = 0;
> > 
> > It imo is clearer if written
> > 
> > 			if ((regs->ccr << ra) & 0x80000000)
> > 				op->val = -1;
> > 			else if ((regs->ccr << ra) & 0x40000000)
> > 				op->val = 1;
> > 			else
> > 				op->val = 0;
> > 
> > but I guess not everyone agrees :-)
> 
> But this can be made jump free :-):
> 
> 	int tmp = regs->ccr << ra;
> 	op->val = (tmp >> 31) | ((tmp >> 30) & 1);

The compiler will do so automatically (or think of some better way to
get the same result); in source code, what matters most is readability,
or clarity in general (also clarity to the compiler).

(Right shifts of negative numbers are implementation-defined in C,
fwiw -- but work like you expect in GCC).
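
For comparison, a sketch of one way to get the same result with unsigned shifts and plain arithmetic only, so nothing depends on how a negative value is shifted. This is an illustration written for this note, not something proposed in the thread, and the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: the if/else form from the posted patch. */
long setb_ref(uint32_t ccr, unsigned int ra)
{
	ra &= ~0x3u;
	if (ccr & (UINT32_C(1) << (31 - ra)))
		return -1;
	if (ccr & (UINT32_C(1) << (30 - ra)))
		return 1;
	return 0;
}

/* Unsigned shifts only; the sign comes from ordinary integer arithmetic. */
long setb_unsigned_only(uint32_t ccr, unsigned int ra)
{
	uint32_t t = ccr << (ra & ~0x3u);
	long lt = (t >> 31) & 1;	/* 1 if LT is set */
	long gt = (t >> 30) & 1;	/* 1 if GT is set */

	/* lt == 1 gives 0 - 1 = -1; lt == 0 gives gt (0 or 1). */
	return (1 - lt) * gt - lt;
}

/* Try every field value with the other fields all-zero and all-one. */
int check_unsigned_only(void)
{
	int checked = 0;

	for (unsigned int bfa = 0; bfa < 8; bfa++) {
		unsigned int shift = 28 - 4 * bfa;

		for (uint32_t nib = 0; nib < 16; nib++) {
			for (int fill = 0; fill < 2; fill++) {
				uint32_t bg = fill ? UINT32_C(0xffffffff) : 0;
				uint32_t ccr = (bg & ~(UINT32_C(0xf) << shift)) |
					       (nib << shift);

				assert(setb_ref(ccr, 4 * bfa) ==
				       setb_unsigned_only(ccr, 4 * bfa));
				checked++;
			}
		}
	}
	return checked;
}
```

Whether this is any clearer than the if/else chain is a matter of taste; it only shows that the signed right shift is not essential.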

> (IIRC the srawi instruction sign-extends its result to 64 bits).

If you consider it to work on 32-bit inputs, yeah, that is a simple way
to express it.

> > > CR field:      7    6    5    4    3    2    1    0
> > > bit:          0123 0123 0123 0123 0123 0123 0123 0123
> > > normal bit #: 0.....................................31
> > > ibm bit #:   31.....................................0
> > 
> > The bit numbers in CR fields are *always* numbered left-to-right.  I
> > have never seen anyone use LE for it, anyway.
> > 
> > Also, even people who write LE have the bigger end on the left normally
> > (they just write some things right-to-left, and other things
> > left-to-right).
> 
> Around 1985, I had documentation for National's 32032
> (little-endian) processor family, and all the instruction encodings were
> presented with the LSB on the left and MSB on the right.

Ouch!  Did they write "regular" numbers with the least significant digit
on the left as well?

> BTW on these processors, the immediate operands and the offsets (1, 2 or
> 4 bytes) for the addressing modes were encoded in big-endian byte order,
> but I digress. Consistency is overrated ;-)

Inconsistency is the spice of life, yeah :-)


Segher
Gabriel Paubert April 23, 2021, 10:26 a.m. UTC | #8
On Thu, Apr 22, 2021 at 06:26:16PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Apr 23, 2021 at 12:16:18AM +0200, Gabriel Paubert wrote:
> > On Thu, Apr 22, 2021 at 02:13:34PM -0500, Segher Boessenkool wrote:
> > > On Fri, Apr 16, 2021 at 05:44:52PM +1000, Daniel Axtens wrote:
> > > > Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes:
> > > > > +			if ((regs->ccr) & (1 << (31 - ra)))
> > > > > +				op->val = -1;
> > > > > +			else if ((regs->ccr) & (1 << (30 - ra)))
> > > > > +				op->val = 1;
> > > > > +			else
> > > > > +				op->val = 0;
> > > 
> > > It imo is clearer if written
> > > 
> > > 			if ((regs->ccr << ra) & 0x80000000)
> > > 				op->val = -1;
> > > 			else if ((regs->ccr << ra) & 0x40000000)
> > > 				op->val = 1;
> > > 			else
> > > 				op->val = 0;
> > > 
> > > but I guess not everyone agrees :-)
> > 
> > But this can be made jump free :-):
> > 
> > 	int tmp = regs->ccr << ra;
> > 	op->val = (tmp >> 31) | ((tmp >> 30) & 1);
> 
> The compiler will do so automatically (or think of some better way to
> get the same result); in source code, what matters most is readability,
> or clarity in general (also clarity to the compiler).

I just did a test (trivial code attached) and the original code always
produces one conditional branch at -O2, at least with the cross-compiler
I have on Debian (gcc 8.3). I have tested both -m32 and -m64. The 64 bit
version produces an unnecessary "extsw", so I wrote the second version
splitting the setting of the return value which gets rid of it.

The second "if" is fairly simple to optimize and the compiler does it
properly.

Of course with my suggestion the compiler does not produce any branch. 
But it needs a really good comment.


> 
> (Right shifts of negative numbers are implementation-defined in C,
> fwiw -- but work like you expect in GCC).

Well, I'm not worried about it, since I'd expect a compiler that does
logical right shifts on signed values to break so much code that it
would be easily noticed (also in the kernel).


> 
> > (IIRC the srawi instruction sign-extends its result to 64 bits).
> 
> If you consider it to work on 32-bit inputs, yeah, that is a simple way
> to express it.
> 
> > > > CR field:      7    6    5    4    3    2    1    0
> > > > bit:          0123 0123 0123 0123 0123 0123 0123 0123
> > > > normal bit #: 0.....................................31
> > > > ibm bit #:   31.....................................0
> > > 
> > > The bit numbers in CR fields are *always* numbered left-to-right.  I
> > > have never seen anyone use LE for it, anyway.
> > > 
> > > Also, even people who write LE have the bigger end on the left normally
> > > (they just write some things right-to-left, and other things
> > > left-to-right).
> > 
> > Around 1985, I had documentation for National's 32032
> > (little-endian) processor family, and all the instruction encodings were
> > presented with the LSB on the left and MSB on the right.
> 
> Ouch!  Did they write "regular" numbers with the least significant digit
> on the left as well?

No, they were not that sadistic!

At least instructions were a whole number of bytes, unlike the iAPX432
where jumps needed to encode target addresses down to the bit level.

> 
> > BTW on these processors, the immediate operands and the offsets (1, 2 or
> > 4 bytes) for the addressing modes were encoded in big-endian byte order,
> > but I digress. Consistency is overrated ;-)
> 
> Inconsistency is the spice of life, yeah :-)

;-)

	Gabriel
Michael Ellerman April 23, 2021, 1:29 p.m. UTC | #9
"Naveen N. Rao" <naveen.n.rao@linux.ibm.com> writes:
> Michael Ellerman wrote:
>> "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> writes:
>>> Daniel Axtens wrote:
>>>> Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes:
>>>> 
>>>>> This adds emulation support for the following instruction:
>>>>>    * Set Boolean (setb)
>>>>>
>>>>> Signed-off-by: Sathvika Vasireddy <sathvika@linux.vnet.ibm.com>
>> ...
>>>> 
>>>> If you do end up respinning the patch, I think it would be good to make
>>>> the maths a bit clearer. I think it works because a left shift of 2 is
>>>> the same as multiplying by 4, but it would be easier to follow if you
>>>> used a temporary variable for btf.
>>>
>>> Indeed. I wonder if it is better to follow the ISA itself. Per the ISA, 
>>> the bit we are interested in is:
>>> 	4 x BFA + 32
>>>
>>> So, if we use that along with the PPC_BIT() macro, we get:
>>> 	if (regs->ccr & PPC_BIT(ra + 32))
>> 
>> Use of PPC_BIT risks annoying your maintainer :)
>
> Uh oh... that isn't good :)
>
> I looked up previous discussions and I think I now understand why you 
> don't prefer it.

Hah, I'd forgotten I'd written (ranted :D) about this in the past.

> But, I feel it helps make it easy to follow the code when referring to 
> the ISA.

That's true. But I think that's much much less common than people
reading the code in isolation.

And ultimately it doesn't matter if the code (appears to) match the ISA,
it matters that the code works. My worry is that too much use of those
types of macros obscures what's actually happening.

> I'm wondering if it is just the name you dislike and if so, 
> does it make sense to rename PPC_BIT() to something else? We have 
> BIT_ULL(), so perhaps BIT_MSB_ULL() or MSB_BIT_ULL()?

The name is part of it. But I don't really like BIT_ULL() either, it
hides in a macro something that could just be there in front of you
ie. (1ull << x).


For this case of setb, I think I'd go with something like below. It
doesn't exactly match the ISA, but I think there's minimal obfuscation
of what's actually going on.

	// ra is now bfa
	ra = (ra >> 2);

	// Extract 4-bit CR field
	val = regs->ccr >> (CR0_SHIFT - 4 * ra);

	if (val & 8)
		op->val = -1;
	else if (val & 4)
		op->val = 1;
	else
		op->val = 0;


If anything could use a macro it would be the 8 and 4, eg. CR_LT, CR_GT.

Of course that's probably got a bug in it, because I just wrote it by
eye and it's 11:28 pm :)
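
A quick sketch that checks the variant above against the posted patch. It assumes `CR0_SHIFT` is 28 and a 32-bit CR image; `CR_LT` and `CR_GT` are the hypothetical macro names floated above, defined locally for the example, as are the function names.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_SHIFT	28	/* assumed value: CR0 is the top nibble */
#define CR_LT		8	/* hypothetical names for the field bits */
#define CR_GT		4

/* Reference: the if/else form from the posted patch. */
long setb_patch(uint32_t ccr, unsigned int ra)
{
	ra &= ~0x3u;
	if (ccr & (UINT32_C(1) << (31 - ra)))
		return -1;
	if (ccr & (UINT32_C(1) << (30 - ra)))
		return 1;
	return 0;
}

/* Field-extraction variant: bring the selected field down to the low nibble. */
long setb_field(uint32_t ccr, unsigned int ra)
{
	unsigned int bfa = ra >> 2;
	uint32_t val = ccr >> (CR0_SHIFT - 4 * bfa);

	if (val & CR_LT)
		return -1;
	if (val & CR_GT)
		return 1;
	return 0;
}

/* All 32 raw RA encodings, every field value, zero and all-one background. */
int check_field_form(void)
{
	int checked = 0;

	for (unsigned int ra = 0; ra < 32; ra++) {
		unsigned int shift = 28 - 4 * (ra >> 2);

		for (uint32_t nib = 0; nib < 16; nib++) {
			for (int fill = 0; fill < 2; fill++) {
				uint32_t bg = fill ? UINT32_C(0xffffffff) : 0;
				uint32_t ccr = (bg & ~(UINT32_C(0xf) << shift)) |
					       (nib << shift);

				assert(setb_patch(ccr, ra) == setb_field(ccr, ra));
				checked++;
			}
		}
	}
	return checked;
}
```

Under those assumptions the two forms agree for every input tried, including RA encodings whose low two bits are set.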

cheers
Segher Boessenkool April 23, 2021, 4:57 p.m. UTC | #10
On Fri, Apr 23, 2021 at 12:26:57PM +0200, Gabriel Paubert wrote:
> On Thu, Apr 22, 2021 at 06:26:16PM -0500, Segher Boessenkool wrote:
> > > But this can be made jump free :-):
> > > 
> > > 	int tmp = regs->ccr << ra;
> > > 	op->val = (tmp >> 31) | ((tmp >> 30) & 1);
> > 
> > The compiler will do so automatically (or think of some better way to
> > get the same result); in source code, what matters most is readability,
> > or clarity in general (also clarity to the compiler).
> 
> I just did a test (trivial code attached) and the original code always
> produces one conditional branch at -O2, at least with the cross-compiler
> I have on Debian (gcc 8.3). I have tested both -m32 and -m64. The 64 bit
> version produces an unnecessary "extsw", so I wrote the second version
> splitting the setting of the return value which gets rid of it.

That is an older compiler, and it will be out-of-service any day now.

It depends on what compiler flags you use, and what version of the ISA
you are targeting.

> The second "if" is fairly simple to optimize and the compiler does it
> properly.

Yeah.

> Of course with my suggestion the compiler does not produce any branch. 
> But it needs a really good comment.

Or you could try and help improve the compiler ;-)  You can do this
without writing compiler code yourself, by writing up some good
enhancement request in bugzilla.

The wider and more OoO the processors become, the more important it
becomes to have branch-free code, in situations where the branches would
not be well-predictable.

> > (Right shifts of negative numbers are implementation-defined in C,
> > fwiw -- but work like you expect in GCC).
> 
> Well, I'm not worried about it, since I'd expect a compiler that does
> logical right shifts on signed values to break so much code that it
> would be easily noticed (also in the kernel).

Yup.  And it *is* defined for signed values, as long as they are
non-negative (the common case).

> > > > Also, even people who write LE have the bigger end on the left normally
> > > > (they just write some things right-to-left, and other things
> > > > left-to-right).
> > > 
> > > Around 1985, I had documentation for National's 32032
> > > (little-endian) processor family, and all the instruction encodings were
> > > presented with the LSB on the left and MSB on the right.
> > 
> > Ouch!  Did they write "regular" numbers with the least significant digit
> > on the left as well?
> 
> No, they were not that sadistic!

But more inconsistent :-)


Segher
Daniel Axtens April 24, 2021, 4:13 p.m. UTC | #11
Segher Boessenkool <segher@kernel.crashing.org> writes:

> Hi!
>
> On Fri, Apr 16, 2021 at 05:44:52PM +1000, Daniel Axtens wrote:
>> Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes:
>> Ok, if I've understood correctly...
>> 
>> > +			ra = ra & ~0x3;
>> 
>> This masks off the bits of RA that are not part of BTF:
>> 
>> ra is in [0, 31] which is [0b00000, 0b11111]
>> Then ~0x3 = ~0b00011
>> ra = ra & 0b11100
>> 
>> This gives us then,
>> ra = btf << 2; or
>> btf = ra >> 2;
>
> Yes.  In effect, you want the offset in bits of the CR field, which is
> just fine like this.  But a comment would not hurt.
>
>> Let's then check to see if your calculations read the right fields.
>> 
>> > +			if ((regs->ccr) & (1 << (31 - ra)))
>> > +				op->val = -1;
>> > +			else if ((regs->ccr) & (1 << (30 - ra)))
>> > +				op->val = 1;
>> > +			else
>> > +				op->val = 0;
>
> It imo is clearer if written
>
> 			if ((regs->ccr << ra) & 0x80000000)
> 				op->val = -1;
> 			else if ((regs->ccr << ra) & 0x40000000)
> 				op->val = 1;
> 			else
> 				op->val = 0;
>
> but I guess not everyone agrees :-)
>
>> CR field:      7    6    5    4    3    2    1    0
>> bit:          0123 0123 0123 0123 0123 0123 0123 0123
>> normal bit #: 0.....................................31
>> ibm bit #:   31.....................................0
>
> The bit numbers in CR fields are *always* numbered left-to-right.  I
> have never seen anyone use LE for it, anyway.
>
> Also, even people who write LE have the bigger end on the left normally
> (they just write some things right-to-left, and other things
> left-to-right).

Sorry, the numbers in the CR fields weren't meant to be especially
meaningful, I was just trying to convince myself that we referenced the
same bits doing the ISA way vs the way this code did it.

Kind regards,
Daniel
>
>> Checkpatch does have one complaint:
>> 
>> CHECK:UNNECESSARY_PARENTHESES: Unnecessary parentheses around 'regs->ccr'
>> #30: FILE: arch/powerpc/lib/sstep.c:1971:
>> +			if ((regs->ccr) & (1 << (31 - ra)))
>> 
>> I don't really mind the parentheses: I think you are safe to ignore
>> checkpatch here unless someone else complains :)
>
> I find them annoying.  If there are too many parentheses, it is hard to
> see at a glance what groups where.  Also, a suspicious reader might
> think there is something special going on (with macros for example).
>
> This is simple code of course, but :-)
>
>> If you do end up respinning the patch, I think it would be good to make
>> the maths a bit clearer. I think it works because a left shift of 2 is
>> the same as multiplying by 4, but it would be easier to follow if you
>> used a temporary variable for btf.
>
> It is very simple.  The BFA instruction field is closely related to the
> BI instruction field, which is 5 bits, and selects one of the 32 bits in
> the CR.  If you have "BFA00 BFA01 BFA10 BFA11", that gives the bit
> numbers of all four bits in the selected CR field.  So the "& ~3" does
> all you need.  It is quite pretty :-)
>
>
> Segher
Naveen N. Rao April 27, 2021, 4:44 p.m. UTC | #12
Michael Ellerman wrote:
> "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> writes:
>> Michael Ellerman wrote:
>>> "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> writes:
>>>> Daniel Axtens wrote:
>>>>> Sathvika Vasireddy <sathvika@linux.vnet.ibm.com> writes:
>>>>> 
>>>>>> This adds emulation support for the following instruction:
>>>>>>    * Set Boolean (setb)
>>>>>>
>>>>>> Signed-off-by: Sathvika Vasireddy <sathvika@linux.vnet.ibm.com>
>>> ...
>>>>> 
>>>>> If you do end up respinning the patch, I think it would be good to make
>>>>> the maths a bit clearer. I think it works because a left shift of 2 is
>>>>> the same as multiplying by 4, but it would be easier to follow if you
>>>>> used a temporary variable for btf.
>>>>
>>>> Indeed. I wonder if it is better to follow the ISA itself. Per the ISA, 
>>>> the bit we are interested in is:
>>>> 	4 x BFA + 32
>>>>
>>>> So, if we use that along with the PPC_BIT() macro, we get:
>>>> 	if (regs->ccr & PPC_BIT(ra + 32))
>>> 
>>> Use of PPC_BIT risks annoying your maintainer :)
>>
>> Uh oh... that isn't good :)
>>
>> I looked up previous discussions and I think I now understand why you 
>> don't prefer it.
> 
> Hah, I'd forgotten I'd written (ranted :D) about this in the past.
> 
>> But, I feel it helps make it easy to follow the code when referring to 
>> the ISA.
> 
> That's true. But I think that's much much less common than people
> reading the code in isolation.

I thought that isn't so for at least the instruction emulation 
infrastructure...

> 
> And ultimately it doesn't matter if the code (appears to) match the ISA,
> it matters that the code works. My worry is that too much use of those
> types of macros obscures what's actually happening.

... but, I agree on the above point. I can see why it is better to keep 
it simple.

I also see precedent for what both you and Segher are suggesting in the 
existing code in sstep.c

> 
>> I'm wondering if it is just the name you dislike and if so, 
>> does it make sense to rename PPC_BIT() to something else? We have 
>> BIT_ULL(), so perhaps BIT_MSB_ULL() or MSB_BIT_ULL()?
> 
> The name is part of it. But I don't really like BIT_ULL() either, it
> hides in a macro something that could just be there in front of you
> ie. (1ull << x).
> 
> 
> For this case of setb, I think I'd go with something like below. It
> doesn't exactly match the ISA, but I think there's minimal obfuscation
> of what's actually going on.
> 
>     	// ra is now bfa
> 	ra = (ra >> 2);
> 
> 	// Extract 4-bit CR field
> 	val = regs->ccr >> (CR0_SHIFT - 4 * ra);
> 
> 	if (val & 8)
> 		op->val = -1;
> 	else if (val & 4)
> 		op->val = 1;
> 	else
> 		op->val = 0;
> 
> 
> If anything could use a macro it would be the 8 and 4, eg. CR_LT, CR_GT.
> 
> Of course that's probably got a bug in it, because I just wrote it by
> eye and it's 11:28 pm :)

LGTM, thanks. I'll let Sathvika decide on which variant she wants to go 
with for v2 :)


- Naveen

Patch

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index c6aebc149d14..263c613d7490 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -1964,6 +1964,18 @@  int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			op->val = ~(regs->gpr[rd] | regs->gpr[rb]);
 			goto logical_done;
 
+		case 128:	/* setb */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				goto unknown_opcode;
+			ra = ra & ~0x3;
+			if ((regs->ccr) & (1 << (31 - ra)))
+				op->val = -1;
+			else if ((regs->ccr) & (1 << (30 - ra)))
+				op->val = 1;
+			else
+				op->val = 0;
+			goto compute_done;
+
 		case 154:	/* prtyw */
 			do_prty(regs, op, regs->gpr[rd], 32);
 			goto logical_done_nocc;