sh4: mac.l: implement saturation arithmetic logic

Message ID	20240404151100.24063-1-zack@buhman.org
State	New
From	Zack Buhman <zack@buhman.org>
To	qemu-devel@nongnu.org
Subject	[PATCH] sh4: mac.l: implement saturation arithmetic logic
Date	Thu, 4 Apr 2024 23:10:35 +0800
Series	sh4: mac.l: implement saturation arithmetic logic

Commit Message

Zack Buhman April 4, 2024, 3:10 p.m. UTC

The saturation arithmetic logic in helper_macl is not correct.

I tested and verified this behavior on an SH7091. The general pattern
is a code sequence such as:

	sets

	mov.l _mach,r2
	lds r2,mach
	mov.l _macl,r2
	lds r2,macl

	mova _n,r0
	mov r0,r1
	mova _m,r0
	mac.l @r0+,@r1+

    _mach: .long 0x00007fff
    _macl: .long 0x12345678
    _m:    .long 0x7fffffff
    _n:    .long 0x7fffffff

Test case 0: (no int64_t overflow)
  given; prior to saturation mac.l:
    mach = 0x00007fff macl = 0x12345678
    @r0  = 0x7fffffff @r1  = 0x7fffffff

  expected saturation mac.l result:
    mach = 0x00007fff macl = 0xffffffff

  qemu saturation mac.l result (prior to this commit):
    mach = 0x00007ffe macl = 0x12345678

Test case 1: (no int64_t overflow)
  given; prior to saturation mac.l:
    mach = 0xffff8000 macl = 0x00000000
    @r0  = 0xffffffff @r1  = 0x00000001

  expected saturation mac.l result:
    mach = 0xffff8000 macl = 0x00000000

  qemu saturation mac.l result (prior to this commit):
    mach = 0xffff7fff macl = 0xffffffff

Test case 2: (int64_t addition overflow)
  given; prior to saturation mac.l:
    mach = 0x80000000 macl = 0x00000000
    @r0  = 0xffffffff @r1  = 0x00000001

  expected saturation mac.l result:
    mach = 0xffff8000 macl = 0x00000000

  qemu saturation mac.l result (prior to this commit):
    mach = 0xffff7fff macl = 0xffffffff

Test case 3: (int64_t addition overflow)
  given; prior to saturation mac.l:
    mach = 0x7fffffff macl = 0x00000000
    @r0  = 0x7fffffff @r1  = 0x7fffffff

  expected saturation mac.l result:
    mach = 0x00007fff macl = 0xffffffff

  qemu saturation mac.l result (prior to this commit):
    mach = 0xfffffffe macl = 0x00000001

All of the above also matches the description of MAC.L as documented
in cd00147165-sh-4-32-bit-cpu-core-architecture-stmicroelectronics.pdf
---
 target/sh4/op_helper.c | 45 ++++++++++++++++++++++++++++++++----------
 1 file changed, 35 insertions(+), 10 deletions(-)
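
The four test cases in the commit message can be checked against a small standalone model. The following is a minimal sketch, not the QEMU helper itself: `mac_pair` and `macl_saturating` are hypothetical names introduced here, the S flag is assumed to be set, and the addition is done in unsigned arithmetic so that wraparound is well defined in standard C.

```c
#include <stdint.h>

/* Hypothetical stand-in for the MACH/MACL register pair. */
typedef struct {
    uint32_t mach;
    uint32_t macl;
} mac_pair;

/* Model of mac.l with the S flag set: 48-bit saturating accumulate. */
static mac_pair macl_saturating(mac_pair acc, uint32_t arg0, uint32_t arg1)
{
    const int64_t upper_bound = (INT64_C(1) << 47) - 1; /* 0x00007fffffffffff */
    const int64_t lower_bound = -(INT64_C(1) << 47);    /* 0xffff800000000000 */
    int64_t mul = (int64_t)(int32_t)arg0 * (int64_t)(int32_t)arg1;
    int64_t mac = (int64_t)(((uint64_t)acc.mach << 32) | acc.macl);
    /* Add as unsigned so that 64-bit wraparound is well defined. */
    int64_t result = (int64_t)((uint64_t)mac + (uint64_t)mul);

    /* Overflow iff the result's sign differs from the sign of both operands. */
    if (((result ^ mac) & (result ^ mul)) < 0) {
        result = (mac < 0) ? lower_bound : upper_bound;
    } else if (result > upper_bound) {
        result = upper_bound;
    } else if (result < lower_bound) {
        result = lower_bound;
    }

    mac_pair out = { (uint32_t)((uint64_t)result >> 32), (uint32_t)result };
    return out;
}
```

Under these assumptions, feeding in the "given" values of test cases 0 to 3 yields the "expected saturation mac.l result" register pairs listed above.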

Comments

Peter Maydell April 4, 2024, 3:37 p.m. UTC | #1

On Thu, 4 Apr 2024 at 16:12, Zack Buhman <zack@buhman.org> wrote:
>
> The saturation arithmetic logic in helper_macl is not correct.
>
> I tested and verified this behavior on a SH7091, the general pattern
> is a code sequence such as:
>
>         sets
>
>         mov.l _mach,r2
>         lds r2,mach
>         mov.l _macl,r2
>         lds r2,macl
>
>         mova _n,r0
>         mov r0,r1
>         mova _m,r0
>         mac.l @r0+,@r1+
>
>     _mach: .long 0x00007fff
>     _macl: .long 0x12345678
>     _m:    .long 0x7fffffff
>     _n:    .long 0x7fffffff
>
> Test case 0: (no int64_t overflow)
>   given; prior to saturation mac.l:
>     mach = 0x00007fff macl = 0x12345678
>     @r0  = 0x7fffffff @r1  = 0x7fffffff
>
>   expected saturation mac.l result:
>     mach = 0x00007fff macl = 0xffffffff
>
>   qemu saturation mac.l result (prior to this commit):
>     mach = 0x00007ffe macl = 0x12345678
>
> Test case 1: (no int64_t overflow)
>   given; prior to saturation mac.l:
>     mach = 0xffff8000 macl = 0x00000000
>     @r0  = 0xffffffff @r1  = 0x00000001
>
>   expected saturation mac.l result:
>     mach = 0xffff8000 macl = 0x00000000
>
>   qemu saturation mac.l result (prior to this commit):
>     mach = 0xffff7fff macl = 0xffffffff
>
> Test case 2: (int64_t addition overflow)
>   given; prior to saturation mac.l:
>     mach = 0x80000000 macl = 0x00000000
>     @r0  = 0xffffffff @r1  = 0x00000001
>
>   expected saturation mac.l result:
>     mach = 0xffff8000 macl = 0x00000000
>
>   qemu saturation mac.l result (prior to this commit):
>     mach = 0xffff7fff macl = 0xffffffff
>
> Test case 3: (int64_t addition overflow)
>   given; prior to saturation mac.l:
>     mach = 0x7fffffff macl = 0x00000000
>     @r0 = 0x7fffffff @r1 = 0x7fffffff
>
>   expected saturation mac.l result:
>     mach = 0x00007fff macl = 0xffffffff
>
>   qemu saturation mac.l result (prior to this commit):
>     mach = 0xfffffffe macl = 0x00000001
>
> All of the above also matches the description of MAC.L as documented
> in cd00147165-sh-4-32-bit-cpu-core-architecture-stmicroelectronics.pdf
> ---
>  target/sh4/op_helper.c | 45 ++++++++++++++++++++++++++++++++----------
>  1 file changed, 35 insertions(+), 10 deletions(-)
>
> diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c
> index 4559d0d376..a3eb2f5281 100644
> --- a/target/sh4/op_helper.c
> +++ b/target/sh4/op_helper.c
> @@ -160,18 +160,43 @@ void helper_ocbi(CPUSH4State *env, uint32_t address)
>
>  void helper_macl(CPUSH4State *env, uint32_t arg0, uint32_t arg1)
>  {
> -    int64_t res;
> -
> -    res = ((uint64_t) env->mach << 32) | env->macl;
> -    res += (int64_t) (int32_t) arg0 *(int64_t) (int32_t) arg1;
> -    env->mach = (res >> 32) & 0xffffffff;
> -    env->macl = res & 0xffffffff;
> +    int32_t value0 = (int32_t)arg0;
> +    int32_t value1 = (int32_t)arg1;
> +    int64_t mul = ((int64_t)value0) * ((int64_t)value1);
> +    int64_t mac = (((uint64_t)env->mach) << 32) | env->macl;
> +    int64_t result = mac + mul;
> +    /* Perform 48-bit saturation arithmetic if the S flag is set */
>      if (env->sr & (1u << SR_S)) {
> -        if (res < 0)
> -            env->mach |= 0xffff0000;
> -        else
> -            env->mach &= 0x00007fff;
> +        /*
> +         * The following xor/and expression is necessary to detect an
> +         * overflow in the MSB of result; this logic is necessary because the
> +         * sign bit of `mac + mul` may overflow. The MAC unit on real
> +         * SH-4 hardware has carry/saturation logic that is equivalent
> +         * to the following:
> +         */
> +        const int64_t upper_bound =  ((1ull << 47) - 1);
> +        const int64_t lower_bound = -((1ull << 47) - 0);
> +
> +        if (((((result ^ mac) & (result ^ mul)) >> 63) & 1) == 1) {
> +            /* An overflow occurred during 64-bit addition */

This is testing whether the "int64_t result = mac + mul"
signed 64-bit arithmetic overflowed, right? That's probably
cleaner written by using the sadd64_overflow() function in
host-utils.h, which does the 64-bit add and returns a bool
to tell you whether it overflowed or not:

   if (sadd64_overflow(mac, mul, &result)) {
       /* on overflow the wrapped result has the opposite sign to mac */
       result = (mac < 0) ? lower_bound : upper_bound;
   } else {
       result = MIN(MAX(result, lower_bound), upper_bound);
   }
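
As a rough illustration of the overflow-helper approach, here is a standalone sketch that compiles outside the QEMU tree. `sadd64_overflow_sketch` and `saturate48_add` are hypothetical names; the former is only a local stand-in built on the GCC/Clang `__builtin_add_overflow` builtin, assumed here to behave like the host-utils.h helper. When the 64-bit add overflows, both operands share the sign of `mac` and the wrapped result has the opposite sign, so the saturation direction is taken from `mac`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical local stand-in for an add-with-overflow-flag helper. */
static bool sadd64_overflow_sketch(int64_t x, int64_t y, int64_t *ret)
{
    /* GCC/Clang builtin: stores the wrapped sum, returns true on overflow. */
    return __builtin_add_overflow(x, y, ret);
}

/* Add mul to mac and saturate the sum to the signed 48-bit range. */
static int64_t saturate48_add(int64_t mac, int64_t mul)
{
    const int64_t upper_bound = (INT64_C(1) << 47) - 1;
    const int64_t lower_bound = -(INT64_C(1) << 47);
    int64_t result;

    if (sadd64_overflow_sketch(mac, mul, &result)) {
        /* Both operands share mac's sign; the wrapped sum does not. */
        return (mac < 0) ? lower_bound : upper_bound;
    }
    if (result > upper_bound) {
        return upper_bound;
    }
    if (result < lower_bound) {
        return lower_bound;
    }
    return result;
}
```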



> +            if (((mac >> 63) & 1) == 0) {
> +                result = upper_bound;
> +            } else {
> +                result = lower_bound;
> +            }
> +        } else {
> +            /* An overflow did not occur during 64-bit addition */
> +            if (result > upper_bound) {
> +                result = upper_bound;
> +            } else if (result < lower_bound) {
> +                result = lower_bound;
> +            } else {
> +                /* leave result unchanged */
> +            }
> +        }
>      }
> +    env->macl = result;
> +    env->mach = result >> 32;

thanks
-- PMM

diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c
index 4559d0d376..a3eb2f5281 100644
--- a/target/sh4/op_helper.c
+++ b/target/sh4/op_helper.c
@@ -160,18 +160,43 @@  void helper_ocbi(CPUSH4State *env, uint32_t address)
 
 void helper_macl(CPUSH4State *env, uint32_t arg0, uint32_t arg1)
 {
-    int64_t res;
-
-    res = ((uint64_t) env->mach << 32) | env->macl;
-    res += (int64_t) (int32_t) arg0 *(int64_t) (int32_t) arg1;
-    env->mach = (res >> 32) & 0xffffffff;
-    env->macl = res & 0xffffffff;
+    int32_t value0 = (int32_t)arg0;
+    int32_t value1 = (int32_t)arg1;
+    int64_t mul = ((int64_t)value0) * ((int64_t)value1);
+    int64_t mac = (((uint64_t)env->mach) << 32) | env->macl;
+    int64_t result = mac + mul;
+    /* Perform 48-bit saturation arithmetic if the S flag is set */
     if (env->sr & (1u << SR_S)) {
-        if (res < 0)
-            env->mach |= 0xffff0000;
-        else
-            env->mach &= 0x00007fff;
+        /*
+         * The following xor/and expression is necessary to detect an
+         * overflow in the MSB of result; this logic is necessary because the
+         * sign bit of `mac + mul` may overflow. The MAC unit on real
+         * SH-4 hardware has carry/saturation logic that is equivalent
+         * to the following:
+         */
+        const int64_t upper_bound =  ((1ull << 47) - 1);
+        const int64_t lower_bound = -((1ull << 47) - 0);
+
+        if (((((result ^ mac) & (result ^ mul)) >> 63) & 1) == 1) {
+            /* An overflow occurred during 64-bit addition */
+            if (((mac >> 63) & 1) == 0) {
+                result = upper_bound;
+            } else {
+                result = lower_bound;
+            }
+        } else {
+            /* An overflow did not occur during 64-bit addition */
+            if (result > upper_bound) {
+                result = upper_bound;
+            } else if (result < lower_bound) {
+                result = lower_bound;
+            } else {
+                /* leave result unchanged */
+            }
+        }
     }
+    env->macl = result;
+    env->mach = result >> 32;
 }
 
 void helper_macw(CPUSH4State *env, uint32_t arg0, uint32_t arg1)
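
For comparison, the removed lines can be modelled in isolation to see why masking MACH after the add is not the same as saturating. This is a sketch under assumptions: `old_pair` and `macl_old` are hypothetical names, the S flag is assumed to be set, and the 64-bit add is done in unsigned arithmetic to model two's-complement wraparound without undefined behavior.

```c
#include <stdint.h>

/* Hypothetical stand-in for the MACH/MACL register pair. */
typedef struct {
    uint32_t mach;
    uint32_t macl;
} old_pair;

/* Model of the logic removed by this patch, with the S flag set. */
static old_pair macl_old(old_pair acc, uint32_t arg0, uint32_t arg1)
{
    uint64_t mac = ((uint64_t)acc.mach << 32) | acc.macl;
    int64_t mul = (int64_t)(int32_t)arg0 * (int64_t)(int32_t)arg1;
    /* Wrapping 64-bit add; no check for overflow of the sign bit. */
    int64_t res = (int64_t)(mac + (uint64_t)mul);
    old_pair out;

    out.mach = (uint32_t)((uint64_t)res >> 32);
    out.macl = (uint32_t)res;
    /* Only the upper 16 bits of MACH are forced; MACL is never clamped. */
    if (res < 0) {
        out.mach |= 0xffff0000u;
    } else {
        out.mach &= 0x00007fffu;
    }
    return out;
}
```

With the inputs of test case 1 this model gives mach = 0xffff7fff, macl = 0xffffffff, and with the inputs of test case 3 it gives mach = 0xfffffffe, macl = 0x00000001. Neither is a saturation bound: the low 48 bits keep the unsaturated sum, and in test case 3 the sign comes from a wrapped 64-bit add.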
