diff mbox

[v3,4/5] powerpc/lib/sstep: Add prty instruction emulation

Message ID 20170725033320.17893-4-matthew.brown.dev@gmail.com (mailing list archive)
State Changes Requested
Headers show

Commit Message

Matt Brown July 25, 2017, 3:33 a.m. UTC
This adds emulation for the prtyw and prtyd instructions.
Tested for logical correctness against the prtyw and prtyd instructions
on ppc64le.

Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com>
---
v3:
	- optimised using the Giles-Miller method of side-ways addition
v2:
	- fixed opcodes
	- fixed bitshifting and typecast errors
	- merged do_prtyw and do_prtyd into single function
---
 arch/powerpc/lib/sstep.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

Comments

Balbir Singh July 25, 2017, 8:08 a.m. UTC | #1
On Tue, 2017-07-25 at 13:33 +1000, Matt Brown wrote:
> This adds emulation for the prtyw and prtyd instructions.
> Tested for logical correctness against the prtyw and prtyd instructions
> on ppc64le.
> 
> Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com>
> ---
> v3:
> 	- optimised using the Giles-Miller method of side-ways addition
> v2:
> 	- fixed opcodes
> 	- fixed bitshifting and typecast errors
> 	- merged do_prtyw and do_prtyd into single function
> ---
>  arch/powerpc/lib/sstep.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index 6a79618..0bcf631 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -655,6 +655,25 @@ static nokprobe_inline void do_bpermd(struct pt_regs *regs, unsigned long v1,
>  	regs->gpr[ra] = perm;
>  }
>  #endif /* __powerpc64__ */
> +/*
> + * The size parameter adjusts the equivalent prty instruction.
> + * prtyw = 32, prtyd = 64
> + */
> +static nokprobe_inline void do_prty(struct pt_regs *regs, unsigned long v,
> +				int size, int ra)
> +{
> +	unsigned long long res = v;
> +
> +	res = (0x0001000100010001 & res) + (0x0001000100010001 & (res >> 8));
> +	res = (0x0000000700000007 & res) + (0x0000000700000007 & (res >> 16));
> +	if (size == 32) {		/* prtyw */
> +		regs->gpr[ra] = (0x0000000100000001 & res);
> +		return;
> +	}
> +
> +	res = (0x000000000000000f & res) + (0x000000000000000f & (res >> 32));
> +	regs->gpr[ra] = res & 1;	/*prtyd */

Looks good, you can also xor instead of adding, but the masks would need
to change in that case.

Balbir Singh.
Segher Boessenkool July 25, 2017, 3:30 p.m. UTC | #2
On Tue, Jul 25, 2017 at 01:33:19PM +1000, Matt Brown wrote:
> +static nokprobe_inline void do_prty(struct pt_regs *regs, unsigned long v,
> +				int size, int ra)
> +{
> +	unsigned long long res = v;
> +
> +	res = (0x0001000100010001 & res) + (0x0001000100010001 & (res >> 8));
> +	res = (0x0000000700000007 & res) + (0x0000000700000007 & (res >> 16));
> +	if (size == 32) {		/* prtyw */
> +		regs->gpr[ra] = (0x0000000100000001 & res);
> +		return;
> +	}
> +
> +	res = (0x000000000000000f & res) + (0x000000000000000f & (res >> 32));
> +	regs->gpr[ra] = res & 1;	/*prtyd */
> +}

Does 7's and 0xf look strange (since the top bit in the values there
is always 0).  You always "and" the values with 1 later, you could do
that immediately (and change + to ^).

A general question about these patches: some things are inside #ifdef
__powerpc64__, some are not.  It seems it is the wrong macro, and it
should be used (or not used) consistently?


Segher
Gabriel Paubert July 26, 2017, 7:49 a.m. UTC | #3
On Tue, Jul 25, 2017 at 06:08:05PM +1000, Balbir Singh wrote:
> On Tue, 2017-07-25 at 13:33 +1000, Matt Brown wrote:
> > This adds emulation for the prtyw and prtyd instructions.
> > Tested for logical correctness against the prtyw and prtyd instructions
> > on ppc64le.
> > 
> > Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com>
> > ---
> > v3:
> > 	- optimised using the Giles-Miller method of side-ways addition
> > v2:
> > 	- fixed opcodes
> > 	- fixed bitshifting and typecast errors
> > 	- merged do_prtyw and do_prtyd into single function
> > ---
> >  arch/powerpc/lib/sstep.c | 27 +++++++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> > 
> > diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> > index 6a79618..0bcf631 100644
> > --- a/arch/powerpc/lib/sstep.c
> > +++ b/arch/powerpc/lib/sstep.c
> > @@ -655,6 +655,25 @@ static nokprobe_inline void do_bpermd(struct pt_regs *regs, unsigned long v1,
> >  	regs->gpr[ra] = perm;
> >  }
> >  #endif /* __powerpc64__ */
> > +/*
> > + * The size parameter adjusts the equivalent prty instruction.
> > + * prtyw = 32, prtyd = 64
> > + */
> > +static nokprobe_inline void do_prty(struct pt_regs *regs, unsigned long v,
> > +				int size, int ra)
> > +{
> > +	unsigned long long res = v;
> > +
> > +	res = (0x0001000100010001 & res) + (0x0001000100010001 & (res >> 8));
> > +	res = (0x0000000700000007 & res) + (0x0000000700000007 & (res >> 16));
> > +	if (size == 32) {		/* prtyw */
> > +		regs->gpr[ra] = (0x0000000100000001 & res);
> > +		return;
> > +	}
> > +
> > +	res = (0x000000000000000f & res) + (0x000000000000000f & (res >> 32));
> > +	regs->gpr[ra] = res & 1;	/*prtyd */
> 
> Looks good, you can also xor instead of adding, but the masks would need
> to change in that case.

Actually, I'd prefer to use xor for this case, since you hardly ever
need any mask, except at the end. Most masks are only needed because of
carry propagation, which the xor completely avoid.

In other words:

	unsigned long long res = v ^ (v >> 8);
	res ^= res >> 16; 
	if (size == 32) {
		regs->gpr[ra] = res & 0x0000000100000001;
		return;
	}
	res ^= res >> 32;
	regs-> gpr[ra] = res & 1;

should work, but I've not even compiled it.

	Gabriel
Michael Ellerman July 26, 2017, 10:03 a.m. UTC | #4
Segher Boessenkool <segher@kernel.crashing.org> writes:

> On Tue, Jul 25, 2017 at 01:33:19PM +1000, Matt Brown wrote:
>> +static nokprobe_inline void do_prty(struct pt_regs *regs, unsigned long v,
>> +				int size, int ra)
>> +{
>> +	unsigned long long res = v;
>> +
>> +	res = (0x0001000100010001 & res) + (0x0001000100010001 & (res >> 8));
>> +	res = (0x0000000700000007 & res) + (0x0000000700000007 & (res >> 16));
>> +	if (size == 32) {		/* prtyw */
>> +		regs->gpr[ra] = (0x0000000100000001 & res);
>> +		return;
>> +	}
>> +
>> +	res = (0x000000000000000f & res) + (0x000000000000000f & (res >> 32));
>> +	regs->gpr[ra] = res & 1;	/*prtyd */
>> +}
>
> Does 7's and 0xf look strange (since the top bit in the values there
> is always 0).  You always "and" the values with 1 later, you could do
> that immediately (and change + to ^).
>
> A general question about these patches: some things are inside #ifdef
> __powerpc64__, some are not.  It seems it is the wrong macro, and it
> should be used (or not used) consistently?

Why is it the wrong macro? Because we tend to use CONFIG_PPC64 you mean?

I thought the reason some are #ifdef'ed is that some are 64-bit only.
ie. bpermd is 64-bit only ?

cheers
Segher Boessenkool July 26, 2017, 4:02 p.m. UTC | #5
On Wed, Jul 26, 2017 at 08:03:30PM +1000, Michael Ellerman wrote:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
> > A general question about these patches: some things are inside #ifdef
> > __powerpc64__, some are not.  It seems it is the wrong macro, and it
> > should be used (or not used) consistently?
> 
> Why is it the wrong macro? Because we tend to use CONFIG_PPC64 you mean?

Yeah.  But I see sstep.c already mixes those two at will (or if there
is a distinction, I'm not seeing it :-) )

> I thought the reason some are #ifdef'ed is that some are 64-bit only.
> ie. bpermd is 64-bit only ?

64-bit only, in what way?  It's not clear what the rules are.

It's not instructions that can only run in 64-bit mode.
It's not instructions that only give a usable result with 64-bit regs
implemented.
It's not instructions only implemented on 64-bit CPUs.
It's not even "all instructions that would not give a correct result
in the low 32 bits of GPRs if the high 32 bits are not implemented".


Segher
Michael Ellerman July 27, 2017, 1:26 a.m. UTC | #6
Segher Boessenkool <segher@kernel.crashing.org> writes:

> On Wed, Jul 26, 2017 at 08:03:30PM +1000, Michael Ellerman wrote:
>> Segher Boessenkool <segher@kernel.crashing.org> writes:
>> > A general question about these patches: some things are inside #ifdef
>> > __powerpc64__, some are not.  It seems it is the wrong macro, and it
>> > should be used (or not used) consistently?
>> 
>> Why is it the wrong macro? Because we tend to use CONFIG_PPC64 you mean?
>
> Yeah.  But I see sstep.c already mixes those two at will (or if there
> is a distinction, I'm not seeing it :-) )

Yeah OK. In practice they're equivalent, if CONFIG_PPC64=y then the
kernel is built 64-bit and therefore __powerpc64__ is defined.

But I agree it's a mess, we should use CONFIG_PPC64 exclusively unless
there's some reason not to (which I don't think there ever is).

>> I thought the reason some are #ifdef'ed is that some are 64-bit only.
>> ie. bpermd is 64-bit only ?
>
> 64-bit only, in what way?  It's not clear what the rules are.

Instructions that have "d" in the name? :P

> It's not instructions that can only run in 64-bit mode.
> It's not instructions that only give a usable result with 64-bit regs
> implemented.
> It's not instructions only implemented on 64-bit CPUs.

I think it's trying to be that ^

If you build a 32-bit kernel then instructions that are only defined on
64-bit CPUs should be treated as illegal, so the easiest way to achieve
that is to #ifdef off the code for those instructions.

cheers
Matt Brown July 28, 2017, 12:48 a.m. UTC | #7
On Thu, Jul 27, 2017 at 11:26 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
>
>> On Wed, Jul 26, 2017 at 08:03:30PM +1000, Michael Ellerman wrote:
>>> Segher Boessenkool <segher@kernel.crashing.org> writes:
>>> > A general question about these patches: some things are inside #ifdef
>>> > __powerpc64__, some are not.  It seems it is the wrong macro, and it
>>> > should be used (or not used) consistently?
>>>
>>> Why is it the wrong macro? Because we tend to use CONFIG_PPC64 you mean?
>>
>> Yeah.  But I see sstep.c already mixes those two at will (or if there
>> is a distinction, I'm not seeing it :-) )
>
> Yeah OK. In practice they're equivalent, if CONFIG_PPC64=y then the
> kernel is built 64-bit and therefore __powerpc64__ is defined.
>
> But I agree it's a mess, we should use CONFIG_PPC64 exclusively unless
> there's some reason not to (which I don't think there ever is).
>
>>> I thought the reason some are #ifdef'ed is that some are 64-bit only.
>>> ie. bpermd is 64-bit only ?
>>
>> 64-bit only, in what way?  It's not clear what the rules are.
>
> Instructions that have "d" in the name? :P
>
>> It's not instructions that can only run in 64-bit mode.
>> It's not instructions that only give a usable result with 64-bit regs
>> implemented.
>> It's not instructions only implemented on 64-bit CPUs.
>
> I think it's trying to be that ^
>
> If you build a 32-bit kernel then instructions that are only defined on
> 64-bit CPUs should be treated as illegal, so the easiest way to achieve
> that is to #ifdef off the code for those instructions.
>

I'll fixup this up to use the xor implementation, and change the
series to use CONFIG_PPC64 for the ifdef.

Thanks,
Matt
> cheers
Michael Ellerman July 28, 2017, 1:31 a.m. UTC | #8
Matt Brown <matthew.brown.dev@gmail.com> writes:

> On Thu, Jul 27, 2017 at 11:26 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
>> Segher Boessenkool <segher@kernel.crashing.org> writes:
>>
>>> On Wed, Jul 26, 2017 at 08:03:30PM +1000, Michael Ellerman wrote:
>>>> Segher Boessenkool <segher@kernel.crashing.org> writes:
>>>> > A general question about these patches: some things are inside #ifdef
>>>> > __powerpc64__, some are not.  It seems it is the wrong macro, and it
>>>> > should be used (or not used) consistently?
>>>>
>>>> Why is it the wrong macro? Because we tend to use CONFIG_PPC64 you mean?
>>>
>>> Yeah.  But I see sstep.c already mixes those two at will (or if there
>>> is a distinction, I'm not seeing it :-) )
>>
>> Yeah OK. In practice they're equivalent, if CONFIG_PPC64=y then the
>> kernel is built 64-bit and therefore __powerpc64__ is defined.
>>
>> But I agree it's a mess, we should use CONFIG_PPC64 exclusively unless
>> there's some reason not to (which I don't think there ever is).
>>
>>>> I thought the reason some are #ifdef'ed is that some are 64-bit only.
>>>> ie. bpermd is 64-bit only ?
>>>
>>> 64-bit only, in what way?  It's not clear what the rules are.
>>
>> Instructions that have "d" in the name? :P
>>
>>> It's not instructions that can only run in 64-bit mode.
>>> It's not instructions that only give a usable result with 64-bit regs
>>> implemented.
>>> It's not instructions only implemented on 64-bit CPUs.
>>
>> I think it's trying to be that ^
>>
>> If you build a 32-bit kernel then instructions that are only defined on
>> 64-bit CPUs should be treated as illegal, so the easiest way to achieve
>> that is to #ifdef off the code for those instructions.
>
> I'll fixup this up to use the xor implementation, and change the
> series to use CONFIG_PPC64 for the ifdef.

Thanks.

In terms of the actual algorithms, we want to use something reasonably
efficient, but don't sweat too much on the performance. The overhead of
the emulation is in the exception we took to get there. So readable and
obviously correct source is the main goal.

cheers
diff mbox

Patch

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index 6a79618..0bcf631 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -655,6 +655,25 @@  static nokprobe_inline void do_bpermd(struct pt_regs *regs, unsigned long v1,
 	regs->gpr[ra] = perm;
 }
 #endif /* __powerpc64__ */
+/*
+ * The size parameter adjusts the equivalent prty instruction.
+ * prtyw = 32, prtyd = 64
+ */
+static nokprobe_inline void do_prty(struct pt_regs *regs, unsigned long v,
+				int size, int ra)
+{
+	unsigned long long res = v;
+
+	res = (0x0001000100010001 & res) + (0x0001000100010001 & (res >> 8));
+	res = (0x0000000700000007 & res) + (0x0000000700000007 & (res >> 16));
+	if (size == 32) {		/* prtyw */
+		regs->gpr[ra] = (0x0000000100000001 & res);
+		return;
+	}
+
+	res = (0x000000000000000f & res) + (0x000000000000000f & (res >> 32));
+	regs->gpr[ra] = res & 1;	/*prtyd */
+}
 
 static nokprobe_inline int trap_compare(long v1, long v2)
 {
@@ -1245,6 +1264,14 @@  int analyse_instr(struct instruction_op *op, struct pt_regs *regs,
 		case 124:	/* nor */
 			regs->gpr[ra] = ~(regs->gpr[rd] | regs->gpr[rb]);
 			goto logical_done;
+
+		case 154:	/* prtyw */
+			do_prty(regs, regs->gpr[rd], 32, ra);
+			goto logical_done;
+
+		case 186:	/* prtyd */
+			do_prty(regs, regs->gpr[rd], 64, ra);
+			goto logical_done;
 #ifdef __powerpc64__
 		case 252:	/* bpermd */
 			do_bpermd(regs, regs->gpr[rd], regs->gpr[rb], ra);