Patchwork arm/translate.c: Fix adc_CC/sbc_CC implementation

login
register
mail settings
Submitter Peter Crosthwaite
Date Feb. 25, 2013, 8:04 a.m.
Message ID <1361779487-9274-1-git-send-email-peter.crosthwaite@xilinx.com>
Download mbox | patch
Permalink /patch/222858/
State New
Headers show

Comments

Peter Crosthwaite - Feb. 25, 2013, 8:04 a.m.
commits 49b4c31efcce45ab714f286f14fa5d5173f9069d and
2de68a4900ef6eb67380b0c128abfe1976bc66e8 reworked the implementation of adc_CC
and sub_CC. The new implementations (on the TCG_TARGET_HAS_add2_i32 code path)
are incorrect. The new logic is:

CF:NF = 0:A +/- 0:CF
CF:NF = CF:A +/- 0:B

The lower 32 bits of the intermediate result stored in NF needs to be passes
into the second addition in place of A (s/CF:A/CF:NF):

CF:NF = 0:A +/- 0:CF
CF:NF = CF:NF +/- 0:B

This patch fixes the issue.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
---
 target-arm/translate.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)
Richard Henderson - Feb. 25, 2013, 2:43 p.m.
On 2013-02-25 00:04, Peter Crosthwaite wrote:
> commits 49b4c31efcce45ab714f286f14fa5d5173f9069d and
> 2de68a4900ef6eb67380b0c128abfe1976bc66e8 reworked the implementation of adc_CC
> and sub_CC. The new implementations (on the TCG_TARGET_HAS_add2_i32 code path)
> are incorrect. The new logic is:
>
> CF:NF = 0:A +/- 0:CF
> CF:NF = CF:A +/- 0:B
>
> The lower 32 bits of the intermediate result stored in NF needs to be passes
> into the second addition in place of A (s/CF:A/CF:NF):
>
> CF:NF = 0:A +/- 0:CF
> CF:NF = CF:NF +/- 0:B
>
> This patch fixes the issue.
>
> Signed-off-by: Peter Crosthwaite<peter.crosthwaite@xilinx.com>
> ---
>   target-arm/translate.c |    4 ++--
>   1 files changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>

Sorry for the breakage.  Blue, please apply asap.


r~
Laurent Desnogues - Feb. 25, 2013, 5:15 p.m.
On Mon, Feb 25, 2013 at 3:43 PM, Richard Henderson <rth@twiddle.net> wrote:
> On 2013-02-25 00:04, Peter Crosthwaite wrote:
>>
>> commits 49b4c31efcce45ab714f286f14fa5d5173f9069d and
>> 2de68a4900ef6eb67380b0c128abfe1976bc66e8 reworked the implementation of
>> adc_CC
>> and sub_CC. The new implementations (on the TCG_TARGET_HAS_add2_i32 code
>> path)
>> are incorrect. The new logic is:
>>
>> CF:NF = 0:A +/- 0:CF
>> CF:NF = CF:A +/- 0:B
>>
>> The lower 32 bits of the intermediate result stored in NF needs to be
>> passes
>> into the second addition in place of A (s/CF:A/CF:NF):
>>
>> CF:NF = 0:A +/- 0:CF
>> CF:NF = CF:NF +/- 0:B
>>
>> This patch fixes the issue.
>>
>> Signed-off-by: Peter Crosthwaite<peter.crosthwaite@xilinx.com>
>> ---
>>   target-arm/translate.c |    4 ++--
>>   1 files changed, 2 insertions(+), 2 deletions(-)
>
>
> Reviewed-by: Richard Henderson <rth@twiddle.net>
>
> Sorry for the breakage.  Blue, please apply asap.

I'm afraid this fix is not enough as I still can't get my Linux image
to boot after applying it.

Running this, my image boots:
git checkout 49b4c31efcce45ab714f286f14fa5d5173f9069d target-arm

Looking at the new sbc_cc, it seems that if t0=t1 and CF=1,
then CF will be cleared while the old code in the helper did
set it.


Laurent

PS: My image is the vexpress found here:
http://releases.linaro.org/images/linaro-n/alip/11.08
Richard Henderson - Feb. 25, 2013, 5:44 p.m.
On 02/25/2013 09:15 AM, Laurent Desnogues wrote:
> Looking at the new sbc_cc, it seems that if t0=t1 and CF=1,
> then CF will be cleared while the old code in the helper did
> set it.

Sigh, yes I see it.  While the transform

	x + ~y + cf
->
	x - y + cf - 1

works for the low 32-bits, it gets the carry wrong:

	x - x + (1-1) = 0
whereas
	x + ~x + 1 = 0xffffffff + 1 = 0x1_00000000

I need to just do exactly what the spec says, not "optimize" it.


r~
Peter Maydell - Feb. 25, 2013, 6:24 p.m.
On 25 February 2013 17:15, Laurent Desnogues
<laurent.desnogues@gmail.com> wrote:
> I'm afraid this fix is not enough

I've confirmed with my risu test tool that sbcs/adcs are indeed
broken even with this patch applied. Patterns used:

ADC_imm A1 cond:4 0010101 s:1 rn:4 rd:4 imm:12
ADC_reg A1 cond:4 0000101 s:1 rn:4 rd:4 imm:5 type:2 0 rm:4
ADC_rsr A1 cond:4 0000101 s:1 rn:4 rd:4 rs:4 0 type:2 1 rm:4

SBC_imm A1 cond:4 0010110 s:1 rn:4 rd:4 imm:12
SBC_reg A1 cond:4 0000110 s:1 rn:4 rd:4 imm:5 type:2 0 rm:4
SBC_rsr A1 cond:4 0000110 s:1 rn:4 rd:4 rs:4 0 type:2 1 rm:4

I really must get round to implementing the record-n-replay
functionality so you can run this kind of test without ARM
hardware to hand.

-- PMM

Patch

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 9993aea..6d91b70 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -428,7 +428,7 @@  static void gen_adc_CC(TCGv dest, TCGv t0, TCGv t1)
     if (TCG_TARGET_HAS_add2_i32) {
         tcg_gen_movi_i32(tmp, 0);
         tcg_gen_add2_i32(cpu_NF, cpu_CF, t0, tmp, cpu_CF, tmp);
-        tcg_gen_add2_i32(cpu_NF, cpu_CF, t0, cpu_CF, t1, tmp);
+        tcg_gen_add2_i32(cpu_NF, cpu_CF, cpu_NF, cpu_CF, t1, tmp);
     } else {
         TCGv_i64 q0 = tcg_temp_new_i64();
         TCGv_i64 q1 = tcg_temp_new_i64();
@@ -472,7 +472,7 @@  static void gen_sbc_CC(TCGv dest, TCGv t0, TCGv t1)
     if (TCG_TARGET_HAS_add2_i32) {
         tcg_gen_movi_i32(tmp, 0);
         tcg_gen_add2_i32(cpu_NF, cpu_CF, t0, tmp, cpu_CF, tmp);
-        tcg_gen_sub2_i32(cpu_NF, cpu_CF, t0, cpu_CF, t1, tmp);
+        tcg_gen_sub2_i32(cpu_NF, cpu_CF, cpu_NF, cpu_CF, t1, tmp);
     } else {
         TCGv_i64 q0 = tcg_temp_new_i64();
         TCGv_i64 q1 = tcg_temp_new_i64();