
x86: Use unsigned short to compute pextrw result

Message ID CAMe9rOroJQTjzpphp8VNJ9s=kOn2vmjh0ebpUm8WAMgmuT2N_w@mail.gmail.com
State New
Series x86: Use unsigned short to compute pextrw result

Commit Message

H.J. Lu Jan. 5, 2021, 7:03 p.m. UTC
On Mon, Jan 4, 2021 at 7:41 PM Jeff Law <law@redhat.com> wrote:
>
>
>
> On 1/1/21 6:34 AM, H.J. Lu via Gcc-patches wrote:
> > _mm_extract_pi16 is intrinsic for pextrw, which should be zero-extended,
> > not sign-extended.
> >
> > gcc/
> >
> >       PR target/98495
> >       * config/i386/xmmintrin.h (_mm_extract_pi16): Cast to unsigned
> >       short first.
> I'd tend to prefer masking with 0xffff  rather than relying on the size
> of a particular type being what we need.  But this header is limited to
> just x86 and it doesn't look like there's any variance in the size of a
> short, across the x86 platforms.
>
> So, OK.
> jeff
>

I am checking in this patch to use unsigned short to compute the
zero-extended pextrw result.  This fixes:

FAIL: gcc.target/i386/sse2-mmx-pextrw.c execution test
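
For illustration only, here is a minimal standalone sketch of why the
reference computation has to read the lanes as unsigned short.  It is not
part of the patch or the testsuite: the function names are hypothetical,
plain arrays stand in for __m64, and it assumes a 16-bit short, which holds
on x86.  It also shows that masking with 0xffff, as suggested in the review,
yields the same zero-extended value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration; not GCC or testsuite code.
   Each one copies four 16-bit lanes out of SRC and returns lane IMM
   widened to int.  Assumes sizeof (short) == 2, as on x86.  */

/* Old reference behavior: reading through short sign-extends.  */
int
extract_signed (const void *src, unsigned int imm)
{
  short lanes[4];
  memcpy (lanes, src, sizeof lanes);
  return lanes[imm & 3];
}

/* Patched reference behavior: reading through unsigned short
   zero-extends, matching what pextrw produces.  */
int
extract_unsigned (const void *src, unsigned int imm)
{
  unsigned short lanes[4];
  memcpy (lanes, src, sizeof lanes);
  return lanes[imm & 3];
}

/* Alternative from the review: keep the signed read, mask to 16 bits.  */
int
extract_masked (const void *src, unsigned int imm)
{
  short lanes[4];
  memcpy (lanes, src, sizeof lanes);
  return lanes[imm & 3] & 0xffff;
}
```

With the lanes { 0x7fff, 0x0001, 0x1234, 0x8000 }, lane 3 comes back as
-32768 from extract_signed but as 32768 from both extract_unsigned and
extract_masked.  Lanes with the top bit clear give the same result from all
three, so only inputs with bit 15 set expose the difference.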

Patch

From 4b3d73a439caffd82eba0a64ee43bae5d5e07de9 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Tue, 5 Jan 2021 10:57:20 -0800
Subject: [PATCH] x86: Use unsigned short to compute pextrw result

Use unsigned short to compute the zero-extended pextrw result.

	PR target/98495
	* gcc.target/i386/sse2-mmx-pextrw.c (compute_correct_result): Use
	unsigned short to compute pextrw result.
---
 gcc/testsuite/gcc.target/i386/sse2-mmx-pextrw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/i386/sse2-mmx-pextrw.c b/gcc/testsuite/gcc.target/i386/sse2-mmx-pextrw.c
index bb48740a7ca..edbac919fd8 100644
--- a/gcc/testsuite/gcc.target/i386/sse2-mmx-pextrw.c
+++ b/gcc/testsuite/gcc.target/i386/sse2-mmx-pextrw.c
@@ -32,7 +32,7 @@  test_pextrw (__m64 *i, unsigned int imm, int *r)
 static void
 compute_correct_result (__m64 *src_p, unsigned int imm, int *res_p)
 {
-  short *src = (short *) src_p;
+  unsigned short *src = (unsigned short *) src_p;
   if (imm < 4)
     *res_p = src[imm];
 }
-- 
2.29.2