diff mbox series

[x86,SSE] Improve handling of ternlog instructions in i386/sse.md

Message ID 001801daa4b7$62d704c0$28850e40$@nextmovesoftware.com
State New
Headers show
Series [x86,SSE] Improve handling of ternlog instructions in i386/sse.md | expand

Commit Message

Roger Sayle May 12, 2024, 9:57 p.m. UTC
This patch improves the way that the x86 backend recognizes and
expands AVX512's bitwise ternary logic (vpternlog) instructions.

As a motivating example consider the following code which calculates
the carry out from a (binary) full adder:

typedef unsigned long long v4di __attribute((vector_size(32)));

v4di foo(v4di a, v4di b, v4di c)
{
    return (a & b) | ((a ^ b) & c);
}

with -O2 -march=cascadelake current mainline produces:

foo:    vpternlogq      $96, %ymm0, %ymm1, %ymm2
        vmovdqa %ymm0, %ymm3
        vmovdqa %ymm2, %ymm0
        vpternlogq      $248, %ymm3, %ymm1, %ymm0
        ret

with the patch below, we now generate a single instruction:

foo:    vpternlogq      $232, %ymm2, %ymm1, %ymm0
        ret


The AVX512 vpternlog[qd] instructions are a very cool addition to the
x86 instruction set, that can calculate any Boolean function of three
inputs in a single fast instruction.  As the truth table for any
three-input function has 8 rows, any specific function can be represented
by specifying those bits, i.e. by an 8-bit byte, an immediate integer
between 0 and 256.

Examples of ternary functions and their indices are given below:

0x01   1:  ~((b|a)|c)
0x02   2:  (~(b|a))&c
0x03   3:  ~(b|a)
0x04   4:  (~(c|a))&b
0x05   5:  ~(c|a)
0x06   6:  (c^b)&~a
0x07   7:  ~((c&b)|a)
0x08   8:  (~a&c)&b (~a&b)&c (c&b)&~a
0x09   9:  ~((c^b)|a)
0x0a  10:  ~a&c
0x0b  11:  ~((~c&b)|a) (~b|c)&~a
0x0c  12:  ~a&b
0x0d  13:  ~((~b&c)|a) (~c|b)&~a
0x0e  14:  (c|b)&~a
0x0f  15:  ~a
0x10  16:  (~(c|b))&a
0x11  17:  ~(c|b)
...
0xf4 244:  (~c&b)|a
0xf5 245:  ~c|a
0xf6 246:  (c^b)|a
0xf7 247:  (~(c&b))|a
0xf8 248:  (c&b)|a
0xf9 249:  (~(c^b))|a
0xfa 250:  c|a
0xfb 251:  (c|a)|~b (~b|a)|c (~b|c)|a
0xfc 252:  b|a
0xfd 253:  (b|a)|~c (~c|a)|b (~c|b)|a
0xfe 254:  (b|a)|c (c|a)|b (c|b)|a

A naive implementation (in many compilers) might be add define_insn
patterns for all 256 different functions.  The situation is even
worse as many of these Boolean functions don't have a "canonical form"
(as produced by simplify_rtx) and would each need multiple patterns.
See the space-separated equivalent expressions in the table above.

This need to provide instruction "templates" might explain why GCC,
LLVM and ICC all exhibit similar coverage problems in their ability
to recognize x86 ternlog ternary functions.

Perhaps a unique feature of GCC's design is that in addition to regular
define_insn templates, machine descriptions can also perform pattern
matching via a match_operator (and its corresponding predicate).
This patch introduces a ternlog_operand predicate that matches a
(possibly infinite) set of expression trees, identifying those that
have at most three unique operands.  This then allows a
define_insn_and_split to recognize suitable expressions and then
transform them into the appropriate UNSPEC_VTERNLOG as a pre-reload
splitter.  This design allows combine to smash together arbitrarily
complex Boolean expressions, then transform them into an UNSPEC
before register allocation.  As an "optimization", where possible
ix86_expand_ternlog generates a simpler binary operation, using
AND, XOR, IOR or ANDN where possible, and in a few cases attempts
to "canonicalize" the ternlog, by reordering or duplicating operands,
so that later CSE passes have a hope of spotting equivalent values.

Another benefit of this patch is that it improves the code
generated for PR target/115021 [see comment #1].

This patch leaves the existing ternlog patterns in sse.md (for now),
many of which are made obsolete by these changes.  In theory we now
only need one define_insn for UNSPEC_VTERNLOG.  One complication from
these previous variants was that they inconsistently used decimal vs.
hexadecimal to specify the immediate constant operand in assembly
language, making the list of tweaks to the testsuite with this patch
larger than it might have been.  I propose to remove the vestigial
patterns in a follow-up patch, once this approach has baked (proven
to be stable) on mainline.


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?

2024-05-12  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        PR target/115021
        * config/i386/i386-expand.cc (ix86_expand_args_builtin): Call
        fixup_modeless_constant before testing predicates.  Only call
        copy_to_mode_reg on memory operands (after the first one).
        (ix86_gen_bcst_mem): Helper function to convert a CONST_VECTOR
        into a VEC_DUPLICATE if possible.
        (ix86_ternlog_idx):  Convert an RTX expression into a ternlog
        index between 0 and 255, recording the operands in ARGS, if
        possible or return -1 if this is not possible/valid.
        (ix86_ternlog_leaf_p): Helper function to identify "leaves"
        of a ternlog expression, e.g. REG_P, MEM_P, CONST_VECTOR, etc.
        (ix86_ternlog_operand_p): Test whether a expression is suitable
        for and prefered as an UNSPEC_TERNLOG.
        (ix86_expand_ternlog_binop): Helper function to construct the
        binary operation corresponding to a sufficiently simple ternlog.
        (ix86_expand_ternlog_andnot): Helper function to construct a
        ANDN operation corresponding to a sufficiently simple ternlog.
        (ix86_expand_ternlog): Expand a 3-operand ternary logic
        expression, constructing either an UNSPEC_TERNLOG or simpler
        rtx expression.  Called from builtin expanders and pre-reload
        splitters.
        * config/i386/i386-protos.h (ix86_ternlog_idx): Prototype here.
        (ix86_ternlog_operand_p): Likewise.
        (ix86_expand_ternlog): Likewise.
        * config/i386/predicates.md (ternlog_operand): New predicate
        that calls xi86_ternlog_operand_p.
        * config/i386/sse.md (<avx512>_vpternlog<mode>_0): New
        define_insn_and_split that recognizes a SET_SRC of ternlog_operand
        and expands it via ix86_expand_ternlog pre-reload.
        (<avx512>_vternlog<mode>_mask): Convert from define_insn to
        define_expand.  Use ix86_expand_ternlog if the mask operand is
        ~0 (or 255 or -1).
        (*<avx512>_vternlog<mode>_mask): define_insn renamed from above.

gcc/testsuite/ChangeLog
        * gcc.target/i386/avx512f-andn-di-zmm-2.c: Update test case.
        * gcc.target/i386/avx512f-andn-si-zmm-2.c: Likewise.
        * gcc.target/i386/avx512f-orn-si-zmm-1.c: Likewise.
        * gcc.target/i386/avx512f-orn-si-zmm-2.c: Likewise.
        * gcc.target/i386/avx512f-vpternlogd-1.c: Likewise.
        * gcc.target/i386/avx512f-vpternlogq-1.c: Likewise.
        * gcc.target/i386/avx512vl-vpternlogd-1.c: Likewise.
        * gcc.target/i386/avx512vl-vpternlogq-1.c: Likewise.
        * gcc.target/i386/pr100711-3.c: Likewise.
        * gcc.target/i386/pr100711-4.c: Likewise.
        * gcc.target/i386/pr100711-5.c: Likewise.


Thanks in advance,
Roger
--

Comments

Hongtao Liu May 14, 2024, 8:46 a.m. UTC | #1
On Mon, May 13, 2024 at 5:57 AM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This patch improves the way that the x86 backend recognizes and
> expands AVX512's bitwise ternary logic (vpternlog) instructions.
I like the patch.

1 file changed, 25 insertions(+), 1 deletion(-)
gcc/config/i386/i386-expand.cc | 26 +++++++++++++++++++++++++-

modified   gcc/config/i386/i386-expand.cc
@@ -25601,6 +25601,7 @@ ix86_gen_bcst_mem (machine_mode mode, rtx x)
 int
 ix86_ternlog_idx (rtx op, rtx *args)
 {
+  /* Nice dynamic programming:)  */
   int idx0, idx1;

   if (!op)
@@ -25651,6 +25652,7 @@ ix86_ternlog_idx (rtx op, rtx *args)
    return 0xaa;
  }
       /* Maximum of one volatile memory reference per expression.  */
+      /* According to comments, it should be && ?  */
       if (side_effects_p (op) || side_effects_p (args[2]))
  return -1;
       if (rtx_equal_p (op, args[2]))
@@ -25666,6 +25668,8 @@ ix86_ternlog_idx (rtx op, rtx *args)

     case SUBREG:
       if (!VECTOR_MODE_P (GET_MODE (SUBREG_REG (op)))
+   /* It could be TI/OI/XImode since it's just bit operations,
+      So no need for VECTOR_MODE_P?  */
    || GET_MODE_SIZE (GET_MODE (SUBREG_REG (op)))
       != GET_MODE_SIZE (GET_MODE (op)))
  return -1;
@@ -25701,7 +25705,7 @@ ix86_ternlog_idx (rtx op, rtx *args)
     case UNSPEC:
       if (XINT (op, 1) != UNSPEC_VTERNLOG
    || XVECLEN (op, 0) != 4
-   || CONST_INT_P (XVECEXP (op, 0, 3)))
+   || !CONST_INT_P (XVECEXP (op, 0, 3)))
  return -1;

       /* TODO: Handle permuted operands.  */
@@ -25778,10 +25782,13 @@ ix86_ternlog_operand_p (rtx op)
       /* Prefer pxor.  */
       if (ix86_ternlog_leaf_p (XEXP (op, 0), mode)
    && (ix86_ternlog_leaf_p (op1, mode)
+       /* Add some comments, it's because we already have
<mask_codefor>one_cmpl<mode>2<mask_name>.  */
        || vector_all_ones_operand (op1, mode)))
  return false;
       break;

+      /* Wouldn't pternlog match (SUBREG: (REG))???,and it should
also be excluded.
+        Similar for SUBREG: (AND/IOR/XOR)?   */
     default:
       break;
     }
@@ -25865,25 +25872,35 @@ ix86_expand_ternlog (machine_mode mode, rtx
op0, rtx op1, rtx op2, int idx,

     case 0x0a: /* ~a&c */
       if ((!op1 || !side_effects_p (op1))
+   /* shouldn't op1 always be register_operand with no side effects
when it exists?
+      <avx512>_vternlog<mode>_mask only supports register_operand for op1.
+      ix86_ternlog_idx only assigns REG to args[1].
+      Ditto for op0, also we should add op2 && register_operand (op2, mode)
+      to avoid segment fault?   */
    && register_operand (op0, mode)
    && register_operand (op2, mode))
  return ix86_expand_ternlog_andnot (mode, op0, op1, target);
+      /* op2 instead of op1??? */
       break;

     case 0x0c: /* ~a&b */
       if ((!op2 || !side_effects_p (op2))
    && register_operand (op0, mode)
    && register_operand (op1, mode))
+ /* If op0 and op1 exist, they must be register_operand? So just op0
&& op1?  */
  return ix86_expand_ternlog_andnot (mode, op0, op1, target);
       break;

     case 0x0f:  /* ~a */
       if ((!op1 || !side_effects_p (op1))
+   /* No need for !side_effects for op1?  */
+   /* Ditto.  */
    && (!op2 || !side_effects_p (op2)))
  {
    if (GET_MODE (op0) != mode)
      op0 = gen_lowpart (mode, op0);
    if (!TARGET_64BIT && !register_operand (op0, mode))
+     /* It must be register_operand for op0 when it exists, no? */
      op0 = force_reg (mode, op0);
    emit_move_insn (target, gen_rtx_XOR (mode, op0, CONSTM1_RTX (mode)));
    return target;
@@ -25894,6 +25911,7 @@ ix86_expand_ternlog (machine_mode mode, rtx
op0, rtx op1, rtx op2, int idx,
       if ((!op0 || !side_effects_p (op0))
    && register_operand (op1, mode)
    && register_operand (op2, mode))
+ /* op1 && op2 && register_operand (op2, mode)??  */
  return ix86_expand_ternlog_andnot (mode, op1, op2, target);
       break;

@@ -25901,12 +25919,14 @@ ix86_expand_ternlog (machine_mode mode, rtx
op0, rtx op1, rtx op2, int idx,
       if ((!op2 || !side_effects_p (op2))
    && register_operand (op0, mode)
    && register_operand (op1, mode))
+ /* op0 && op1? */
  return ix86_expand_ternlog_andnot (mode, op1, op0, target);
       break;

     case 0x33:  /* ~b */
       if ((!op0 || !side_effects_p (op0))
    && (!op2 || !side_effects_p (op2)))
+ /* op1 && (!op2 || !side_effects_p (op2)) ?  */
  {
    if (GET_MODE (op1) != mode)
      op1 = gen_lowpart (mode, op1);
@@ -26051,6 +26071,10 @@ ix86_expand_ternlog (machine_mode mode, rtx
op0, rtx op1, rtx op2, int idx,
       tmp2 = ix86_gen_bcst_mem (mode, op2);
       if (!tmp2)
  tmp2 = validize_mem (force_const_mem (mode, op2));
+      /* Can we use ix86_expand_vector_move here, it will try move
integer to gpr,
+ and broadcast gpr to the vector register.
+ It should be faster than a constant pool, and PR115021 should be solved by
+ another way instead of this walkaround.  */
     }
   else
     tmp2 = op2;
diff mbox series

Patch

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index a613291..be60915 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -11707,6 +11707,8 @@  ix86_expand_args_builtin (const struct builtin_description *d,
       tree arg = CALL_EXPR_ARG (exp, i);
       rtx op = expand_normal (arg);
       machine_mode mode = insn_p->operand[i + 1].mode;
+      /* Need to fixup modeless constant before testing predicate.  */
+      op = fixup_modeless_constant (op, mode);
       bool match = insn_p->operand[i + 1].predicate (op, mode);
 
       if (second_arg_count && i == 1)
@@ -11873,13 +11875,15 @@  ix86_expand_args_builtin (const struct builtin_description *d,
 	  /* If we aren't optimizing, only allow one memory operand to
 	     be generated.  */
 	  if (memory_operand (op, mode))
-	    num_memory++;
-
-	  op = fixup_modeless_constant (op, mode);
+	    {
+	      num_memory++;
+	      if (!optimize && num_memory > 1)
+		op = copy_to_mode_reg (mode, op);
+	    }
 
 	  if (GET_MODE (op) == mode || GET_MODE (op) == VOIDmode)
 	    {
-	      if (optimize || !match || num_memory > 1)
+	      if (!match)
 		op = copy_to_mode_reg (mode, op);
 	    }
 	  else
@@ -25480,4 +25484,511 @@  ix86_expand_fast_convert_bf_to_sf (rtx val)
   return ret;
 }
 
+/* Attempt to convert a CONST_VECTOR into a bcst_mem_operand.
+   Returns NULL_RTX if X is cannot be expressed as a suitable
+   VEC_DUPLICATE in mode MODE.  */
+
+static rtx
+ix86_gen_bcst_mem (machine_mode mode, rtx x)
+{
+  if (!TARGET_AVX512F
+      || GET_CODE (x) != CONST_VECTOR
+      || (!TARGET_AVX512VL
+	  && (GET_MODE_SIZE (mode) != 64 || !TARGET_EVEX512))
+      || !VALID_BCST_MODE_P (GET_MODE_INNER (mode))
+	 /* Disallow HFmode broadcast.  */
+      || GET_MODE_SIZE (GET_MODE_INNER (mode)) < 4)
+    return NULL_RTX;
+
+  rtx cst = CONST_VECTOR_ELT (x, 0);
+  if (!CONST_SCALAR_INT_P (cst)
+      && !CONST_DOUBLE_P (cst)
+      && !CONST_FIXED_P (cst))
+    return NULL_RTX;
+  
+  int n_elts = GET_MODE_NUNITS (mode);
+  if (CONST_VECTOR_NUNITS (x) != n_elts)
+    return NULL_RTX;
+
+  for (int i = 1; i < n_elts; i++)
+    if (!rtx_equal_p (cst, CONST_VECTOR_ELT (x, i)))
+      return NULL_RTX;
+
+  rtx mem = force_const_mem (GET_MODE_INNER (mode), cst);
+  return gen_rtx_VEC_DUPLICATE (mode, validize_mem (mem));
+}
+
+/* Determine the ternlog immediate index that implements 3-operand
+   ternary logic expression OP.  This uses and modifies the 3 element
+   array ARGS to record and check the leaves, either 3 REGs, or 2 REGs
+   and MEM.  Returns an index between 0 and 255 for a valid ternlog,
+   or -1 if the expression isn't suitable.  */
+
+int
+ix86_ternlog_idx (rtx op, rtx *args)
+{
+  int idx0, idx1;
+
+  if (!op)
+    return -1;
+
+  switch (GET_CODE (op))
+    {
+    case REG:
+      if (!args[0])
+	{
+	  args[0] = op;
+	  return 0xf0;
+	}
+      if (REGNO (op) == REGNO (args[0]))
+	return 0xf0;
+      if (!args[1])
+	{
+	  args[1] = op;
+	  return 0xcc;
+	}
+      if (REGNO (op) == REGNO (args[1]))
+	return 0xcc;
+      if (!args[2])
+	{
+	  args[2] = op;
+	  return 0xaa;
+	}
+      if (REG_P (args[2]) && REGNO (op) == REGNO (args[2]))
+	return 0xaa;
+      return -1;
+
+    case VEC_DUPLICATE:
+      if (!bcst_mem_operand (op, GET_MODE (op)))
+	return -1;
+      /* FALLTHRU */
+
+    case MEM:
+      if (MEM_P (op)
+	  && MEM_VOLATILE_P (op)
+	  && !volatile_ok)
+	return -1;
+      /* FALLTHRU */
+
+    case CONST_VECTOR:
+      if (!args[2])
+	{
+	  args[2] = op;
+	  return 0xaa;
+	}
+      /* Maximum of one volatile memory reference per expression.  */
+      if (side_effects_p (op) || side_effects_p (args[2]))
+	return -1;
+      if (rtx_equal_p (op, args[2]))
+	return 0xaa;
+      /* Check if one CONST_VECTOR is the ones-complement of the other.  */
+      if (GET_CODE (op) == CONST_VECTOR
+	  && GET_CODE (args[2]) == CONST_VECTOR
+	  && rtx_equal_p (simplify_const_unary_operation (NOT, GET_MODE (op),
+							  op, GET_MODE (op)),
+			  args[2]))
+	return 0x55;
+      return -1;
+
+    case SUBREG:
+      if (!VECTOR_MODE_P (GET_MODE (SUBREG_REG (op)))
+	  || GET_MODE_SIZE (GET_MODE (SUBREG_REG (op)))
+	     != GET_MODE_SIZE (GET_MODE (op)))
+	return -1;
+      return ix86_ternlog_idx (SUBREG_REG (op), args);
+
+    case NOT:
+      idx0 = ix86_ternlog_idx (XEXP (op, 0), args);
+      return (idx0 >= 0) ? idx0 ^ 0xff : -1;
+
+    case AND:
+      idx0 = ix86_ternlog_idx (XEXP (op, 0), args);
+      if (idx0 < 0)
+	return -1;
+      idx1 = ix86_ternlog_idx (XEXP (op, 1), args);
+      return (idx1 >= 0) ? idx0 & idx1 : -1;
+
+    case IOR:
+      idx0 = ix86_ternlog_idx (XEXP (op, 0), args);
+      if (idx0 < 0)
+	return -1;
+      idx1 = ix86_ternlog_idx (XEXP (op, 1), args);
+      return (idx1 >= 0) ? idx0 | idx1 : -1;
+
+    case XOR:
+      idx0 = ix86_ternlog_idx (XEXP (op, 0), args);
+      if (idx0 < 0)
+	return -1;
+      if (vector_all_ones_operand (XEXP (op, 1), GET_MODE (op)))
+	return idx0 ^ 0xff;
+      idx1 = ix86_ternlog_idx (XEXP (op, 1), args);
+      return (idx1 >= 0) ? idx0 ^ idx1 : -1;
+
+    case UNSPEC:
+      if (XINT (op, 1) != UNSPEC_VTERNLOG
+	  || XVECLEN (op, 0) != 4
+	  || CONST_INT_P (XVECEXP (op, 0, 3)))
+	return -1;
+
+      /* TODO: Handle permuted operands.  */
+      if (ix86_ternlog_idx (XVECEXP (op, 0, 0), args) != 0xf0
+	  || ix86_ternlog_idx (XVECEXP (op, 0, 1), args) != 0xcc
+	  || ix86_ternlog_idx (XVECEXP (op, 0, 2), args) != 0xaa)
+	return -1;
+      return INTVAL (XVECEXP (op, 0, 3));
+
+    default:
+      return -1;
+    }
+}
+
+/* Return TRUE if OP (in mode MODE) is the leaf of a ternary logic
+   expression, such as a register or a memory reference.  */
+ 
+bool
+ix86_ternlog_leaf_p (rtx op, machine_mode mode)
+{
+  /* We can't use memory_operand here, as it may return a different
+     value before and after reload (for volatile MEMs) which creates
+     problems splitting instructions.  */
+  return register_operand (op, mode)
+	 || MEM_P (op)
+	 || GET_CODE (op) == CONST_VECTOR
+	 || bcst_mem_operand (op, mode);
+}
+
+/* Test whether OP is a 3-operand ternary logic expression suitable
+   for use in a ternlog instruction.  */
+
+bool
+ix86_ternlog_operand_p (rtx op)
+{
+  rtx op0, op1;
+  rtx args[3];
+
+  args[0] = NULL_RTX;
+  args[1] = NULL_RTX;
+  args[2] = NULL_RTX;
+  int idx = ix86_ternlog_idx (op, args);
+  if (idx < 0)
+    return false;
+
+  /* Don't match simple (binary or unary) expressions.  */
+  machine_mode mode = GET_MODE (op);
+  switch (GET_CODE (op))
+    {
+    case AND:
+      op0 = XEXP (op, 0);
+      op1 = XEXP (op, 1);
+
+      /* Prefer pand.  */
+      if (ix86_ternlog_leaf_p (op0, mode)
+	  && ix86_ternlog_leaf_p (op1, mode))
+	return false;
+      /* Prefer pandn.  */
+      if (GET_CODE (op0) == NOT
+	  && register_operand (XEXP (op0, 0), mode)
+	  && ix86_ternlog_leaf_p (op1, mode))
+	return false;
+      break;
+
+    case IOR:
+      /* Prefer por.  */
+      if (ix86_ternlog_leaf_p (XEXP (op, 0), mode)
+	  && ix86_ternlog_leaf_p (XEXP (op, 1), mode))
+	return false;
+      break;
+
+    case XOR:
+      op1 = XEXP (op, 1);
+      /* Prefer pxor.  */
+      if (ix86_ternlog_leaf_p (XEXP (op, 0), mode)
+	  && (ix86_ternlog_leaf_p (op1, mode)
+	      || vector_all_ones_operand (op1, mode)))
+	return false;
+      break;
+
+    default:
+      break;
+    }
+  return true;
+}
+
+/* Helper function for ix86_expand_ternlog.  */
+static rtx
+ix86_expand_ternlog_binop (enum rtx_code code, machine_mode mode,
+			   rtx op0, rtx op1, rtx target)
+{
+  if (GET_MODE (op0) != mode)
+    op0 = gen_lowpart (mode, op0);
+  if (GET_MODE (op1) != mode)
+    op1 = gen_lowpart (mode, op1);
+
+  if (GET_CODE (op0) == CONST_VECTOR)
+    op0 = validize_mem (force_const_mem (mode, op0));
+  if (GET_CODE (op1) == CONST_VECTOR)
+    op1 = validize_mem (force_const_mem (mode, op1));
+
+  if (memory_operand (op0, mode))
+    {
+      if (memory_operand (op1, mode))
+	op0 = force_reg (mode, op0);
+      else
+	std::swap (op0, op1);
+    }
+  rtx ops[3] = { target, op0, op1 };
+  ix86_expand_vector_logical_operator (code, mode, ops);
+  return target;
+}
+
+
+/* Helper function for ix86_expand_ternlog.  */
+static rtx
+ix86_expand_ternlog_andnot (machine_mode mode, rtx op0, rtx op1, rtx target)
+{
+  if (GET_MODE (op0) != mode)
+    op0 = gen_lowpart (mode, op0);
+  op0 = gen_rtx_NOT (mode, op0);
+  if (GET_MODE (op1) != mode)
+    op1 = gen_lowpart (mode, op1);
+  emit_move_insn (target, gen_rtx_AND (mode, op0, op1));
+  return target;
+}
+
+/* Expand a 3-operand ternary logic expression.  Return TARGET. */
+rtx
+ix86_expand_ternlog (machine_mode mode, rtx op0, rtx op1, rtx op2, int idx,
+		     rtx target)
+{
+  rtx tmp0, tmp1, tmp2;
+
+  if (!target)
+    target = gen_reg_rtx (mode);
+
+  /* Canonicalize ternlog index for degenerate (duplicated) operands.  */
+  if (rtx_equal_p (op0, op1) && rtx_equal_p (op0, op2))
+    switch (idx & 0x81)
+      {
+      case 0x00:
+	idx = 0x00;
+	break;
+      case 0x01:
+	idx = 0x0f;
+	break;
+      case 0x80:
+	idx = 0xf0;
+	break;
+      case 0x81:
+	idx = 0xff;
+	break;
+      }
+
+  switch (idx & 0xff)
+    {
+    case 0x00:
+      emit_move_insn (target, CONST0_RTX (mode));
+      return target;
+
+    case 0x0a: /* ~a&c */
+      if ((!op1 || !side_effects_p (op1))
+	  && register_operand (op0, mode)
+	  && register_operand (op2, mode))
+	return ix86_expand_ternlog_andnot (mode, op0, op1, target);
+      break;
+
+    case 0x0c: /* ~a&b */
+      if ((!op2 || !side_effects_p (op2))
+	  && register_operand (op0, mode)
+	  && register_operand (op1, mode))
+	return ix86_expand_ternlog_andnot (mode, op0, op1, target);
+      break;
+
+    case 0x0f:  /* ~a */
+      if ((!op1 || !side_effects_p (op1))
+	  && (!op2 || !side_effects_p (op2)))
+	{
+	  if (GET_MODE (op0) != mode)
+	    op0 = gen_lowpart (mode, op0);
+	  if (!TARGET_64BIT && !register_operand (op0, mode))
+	    op0 = force_reg (mode, op0);
+	  emit_move_insn (target, gen_rtx_XOR (mode, op0, CONSTM1_RTX (mode)));
+	  return target;
+	}
+      break;
+  
+    case 0x22: /* ~b&c */
+      if ((!op0 || !side_effects_p (op0))
+	  && register_operand (op1, mode)
+	  && register_operand (op2, mode))
+	return ix86_expand_ternlog_andnot (mode, op1, op2, target);
+      break;
+
+    case 0x30: /* ~b&a */
+      if ((!op2 || !side_effects_p (op2))
+	  && register_operand (op0, mode)
+	  && register_operand (op1, mode))
+	return ix86_expand_ternlog_andnot (mode, op1, op0, target);
+      break;
+
+    case 0x33:  /* ~b */
+      if ((!op0 || !side_effects_p (op0))
+	  && (!op2 || !side_effects_p (op2)))
+	{
+	  if (GET_MODE (op1) != mode)
+	    op1 = gen_lowpart (mode, op1);
+	  if (!TARGET_64BIT && !register_operand (op1, mode))
+	    op1 = force_reg (mode, op1);
+	  emit_move_insn (target, gen_rtx_XOR (mode, op1, CONSTM1_RTX (mode)));
+	  return target;
+	}
+      break;
+
+    case 0x3c:  /* a^b */
+      if (!op2 || !side_effects_p (op2))
+	return ix86_expand_ternlog_binop (XOR, mode, op0, op1, target);
+      break;
+
+    case 0x44: /* ~c&b */
+      if ((!op0 || !side_effects_p (op0))
+	  && register_operand (op1, mode)
+	  && register_operand (op2, mode))
+	return ix86_expand_ternlog_andnot (mode, op2, op1, target);
+      break;
+
+    case 0x50: /* ~c&a */
+      if ((!op1 || !side_effects_p (op1))
+	  && register_operand (op0, mode)
+	  && register_operand (op2, mode))
+	return ix86_expand_ternlog_andnot (mode, op2, op0, target);
+      break;
+
+    case 0x55:  /* ~c */
+      if ((!op0 || !side_effects_p (op0))
+	  && (!op1 || !side_effects_p (op1)))
+	{
+	  if (GET_MODE (op2) != mode)
+	    op2 = gen_lowpart (mode, op2);
+	  if (!TARGET_64BIT && !register_operand (op2, mode))
+	    op2 = force_reg (mode, op2);
+	  emit_move_insn (target, gen_rtx_XOR (mode, op2, CONSTM1_RTX (mode)));
+	  return target;
+	}
+      break;
+  
+    case 0x5a:  /* a^c */
+      if (!op1 || !side_effects_p (op1))
+	return ix86_expand_ternlog_binop (XOR, mode, op0, op2, target);
+      break;
+
+    case 0x66:  /* b^c */
+      if (!op0 || !side_effects_p (op0))
+	return ix86_expand_ternlog_binop (XOR, mode, op1, op2, target);
+      break;
+
+    case 0x88:  /* b&c */
+      if (!op0 || !side_effects_p (op0))
+	return ix86_expand_ternlog_binop (AND, mode, op1, op2, target);
+      break;
+
+    case 0xa0:  /* a&c */
+      if (!op1 || !side_effects_p (op1))
+	return ix86_expand_ternlog_binop (AND, mode, op0, op2, target);
+      break;
+
+    case 0xaa:  /* c */
+      if ((!op0 || !side_effects_p (op0))
+	  && (!op1 || !side_effects_p (op1)))
+	{
+	  if (GET_MODE (op2) != mode)
+	    op2 = gen_lowpart (mode, op2);
+	  emit_move_insn (target, op2);
+	  return target;
+	}
+      break;
+
+    case 0xc0:  /* a&b */
+      if (!op2 || !side_effects_p (op2))
+	return ix86_expand_ternlog_binop (AND, mode, op0, op1, target);
+      break;
+
+    case 0xcc:  /* b */
+      if ((!op0 || !side_effects_p (op0))
+	  && (!op2 || !side_effects_p (op2)))
+	{
+	  if (GET_MODE (op1) != mode)
+	    op1 = gen_lowpart (mode, op1);
+	  emit_move_insn (target, op1);
+	  return target;
+	}
+      break;
+
+    case 0xee:  /* b|c */
+      if (!op0 || !side_effects_p (op0))
+	return ix86_expand_ternlog_binop (IOR, mode, op1, op2, target);
+      break;
+
+    case 0xf0:  /* a */
+      if ((!op1 || !side_effects_p (op1))
+	  && (!op2 || !side_effects_p (op2)))
+	{
+	  if (GET_MODE (op0) != mode)
+	    op0 = gen_lowpart (mode, op0);
+	  emit_move_insn (target, op0);
+	  return target;
+	}
+      break;
+
+    case 0xfa:  /* a|c */
+      if (!op1 || !side_effects_p (op1))
+	return ix86_expand_ternlog_binop (IOR, mode, op0, op2, target);
+      break;
+
+    case 0xfc:  /* a|b */
+      if (!op2 || !side_effects_p (op2))
+	return ix86_expand_ternlog_binop (IOR, mode, op0, op1, target);
+      break;
+
+    case 0xff:
+      emit_move_insn (target, CONSTM1_RTX (mode));
+      return target;
+    }
+
+  tmp0 = register_operand (op0, mode) ? op0 : force_reg (mode, op0);
+  if (GET_MODE (tmp0) != mode)
+    tmp0 = gen_lowpart (mode, tmp0);
+
+  if (!op1 || rtx_equal_p (op0, op1))
+    tmp1 = copy_rtx (tmp0);
+  else if (!register_operand (op1, mode))
+    tmp1 = force_reg (mode, op1);
+  else
+    tmp1 = op1;
+  if (GET_MODE (tmp1) != mode)
+    tmp1 = gen_lowpart (mode, tmp1);
+
+  if (!op2 || rtx_equal_p (op0, op2))
+    tmp2 = copy_rtx (tmp0);
+  else if (rtx_equal_p (op1, op2))
+    tmp2 = copy_rtx (tmp1);
+  else if (GET_CODE (op2) == CONST_VECTOR)
+    {
+      if (GET_MODE (op2) != mode)
+	op2 = gen_lowpart (mode, op2);
+      tmp2 = ix86_gen_bcst_mem (mode, op2);
+      if (!tmp2)
+	tmp2 = validize_mem (force_const_mem (mode, op2));
+    }
+  else
+    tmp2 = op2;
+  if (GET_MODE (tmp2) != mode)
+    tmp2 = gen_lowpart (mode, tmp2);
+  /* Some memory_operands are not vector_memory_operands.  */
+  if (!bcst_vector_operand (tmp2, mode))
+    tmp2 = force_reg (mode, tmp2);
+
+  rtvec vec = gen_rtvec (4, tmp0, tmp1, tmp2, GEN_INT (idx));
+  emit_move_insn (target, gen_rtx_UNSPEC (mode, vec, UNSPEC_VTERNLOG));
+  return target;
+}
+
 #include "gt-i386-expand.h"
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 46214a6..9a3e183 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -245,6 +245,11 @@  extern rtx ix86_expand_fast_convert_bf_to_sf (rtx);
 extern rtx ix86_memtag_untagged_pointer (rtx, rtx);
 extern bool ix86_memtag_can_tag_addresses (void);
 
+extern int ix86_ternlog_idx (rtx op, rtx *args);
+extern bool ix86_ternlog_operand_p (rtx op);
+extern rtx ix86_expand_ternlog (machine_mode mode, rtx op0, rtx op1, rtx op2,
+				int idx, rtx target);
+
 #ifdef TREE_CODE
 extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree, int);
 #endif	/* TREE_CODE  */
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 2a97776..e0f38cd 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1098,6 +1098,11 @@ 
        (and (match_code "not")
 	    (match_test "nonimmediate_operand (XEXP (op, 0), mode)"))))
 
+;; True for expressions valid for 3-operand ternlog instructions.
+(define_predicate "ternlog_operand"
+  (and (match_code "not,and,ior,xor,subreg")
+       (match_test "ix86_ternlog_operand_p (op)")))
+
 ;; True if OP is acceptable as operand of DImode shift expander.
 (define_predicate "shiftdi_operand"
   (if_then_else (match_test "TARGET_64BIT")
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 1bf5072..3148651 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -12940,6 +12940,26 @@ 
 ;;
 ;; and so on.
 
+(define_insn_and_split "*<avx512>_vpternlog<mode>_0"
+  [(set (match_operand:V 0 "register_operand")
+	(match_operand:V 1 "ternlog_operand"))]
+  "(<MODE_SIZE> == 64 || TARGET_AVX512VL
+    || (TARGET_AVX512F && TARGET_EVEX512 && !TARGET_PREFER_AVX256))
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  rtx args[3];
+  args[0] = NULL_RTX;
+  args[1] = NULL_RTX;
+  args[2] = NULL_RTX;
+  int idx = ix86_ternlog_idx (operands[1], args);
+  ix86_expand_ternlog (<MODE>mode, args[0], args[1], args[2], idx,
+		       operands[0]);
+  DONE;
+})
+
 (define_code_iterator any_logic1 [and ior xor])
 (define_code_iterator any_logic2 [and ior xor])
 (define_code_attr logic_op [(and "&") (ior "|") (xor "^")])
@@ -13160,7 +13180,33 @@ 
 })
 
 
-(define_insn "<avx512>_vternlog<mode>_mask"
+(define_expand "<avx512>_vternlog<mode>_mask"
+  [(set (match_operand:VI48_AVX512VL 0 "register_operand")
+	(vec_merge:VI48_AVX512VL
+	  (unspec:VI48_AVX512VL
+	    [(match_operand:VI48_AVX512VL 1 "register_operand")
+	     (match_operand:VI48_AVX512VL 2 "register_operand")
+	     (match_operand:VI48_AVX512VL 3 "bcst_vector_operand")
+	     (match_operand:SI 4 "const_0_to_255_operand")]
+	    UNSPEC_VTERNLOG)
+	  (match_dup 1)
+	  (match_operand:<avx512fmaskmode> 5 "general_operand")))]
+  "TARGET_AVX512F"
+{
+  unsigned HOST_WIDE_INT mode_mask = GET_MODE_MASK (<avx512fmaskmode>mode);
+  if (CONST_INT_P (operands[5])
+      && (UINTVAL (operands[5]) & mode_mask) == mode_mask)
+    {
+      ix86_expand_ternlog (<MODE>mode, operands[1], operands[2],
+			   operands[3], INTVAL (operands[4]),
+			   operands[0]);
+      DONE;
+    }
+  if (!register_operand (operands[5], <avx512fmaskmode>mode))
+    operands[5] = force_reg (<avx512fmaskmode>mode, operands[5]);
+})
+
+(define_insn "*<avx512>_vternlog<mode>_mask"
   [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v")
 	(vec_merge:VI48_AVX512VL
 	  (unspec:VI48_AVX512VL
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-andn-di-zmm-2.c b/gcc/testsuite/gcc.target/i386/avx512f-andn-di-zmm-2.c
index 4ebb30f..24f3d6c 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-andn-di-zmm-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-andn-di-zmm-2.c
@@ -1,6 +1,6 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mavx512f -mno-avx512vl -mprefer-vector-width=512 -O2" } */
-/* { dg-final { scan-assembler-times "vpternlogq\[ \\t\]+\\\$0x44, \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */
+/* { dg-final { scan-assembler-times "vpternlogq\[ \\t\]+\\\$80, \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */
 /* { dg-final { scan-assembler-not "vpbroadcast" } } */
 
 #define type __m512i
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-andn-si-zmm-2.c b/gcc/testsuite/gcc.target/i386/avx512f-andn-si-zmm-2.c
index 86e7ebe..1f5e72d 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-andn-si-zmm-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-andn-si-zmm-2.c
@@ -1,6 +1,6 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mavx512f -O2" } */
-/* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\\\$0x44, \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */
+/* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\\\$80, \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */
 /* { dg-final { scan-assembler-not "vpbroadcast" } } */
 
 #define type __m512i
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-orn-si-zmm-1.c b/gcc/testsuite/gcc.target/i386/avx512f-orn-si-zmm-1.c
index 7d02f03..d21f48f 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-orn-si-zmm-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-orn-si-zmm-1.c
@@ -1,6 +1,6 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mavx512f -mno-avx512vl -mprefer-vector-width=512 -O2" } */
-/* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\\\$0xdd, \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */
+/* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\\\$245, \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */
 /* { dg-final { scan-assembler-not "vpbroadcast" } } */
 
 #define type __m512i
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-orn-si-zmm-2.c b/gcc/testsuite/gcc.target/i386/avx512f-orn-si-zmm-2.c
index c793083..5359200 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-orn-si-zmm-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-orn-si-zmm-2.c
@@ -1,6 +1,6 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mavx512f -mno-avx512vl -mprefer-vector-width=512 -O2" } */
-/* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\\\$0xbb, \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */
+/* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\\\$175, \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */
 /* { dg-final { scan-assembler-not "vpbroadcast" } } */
 
 #define type __m512i
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vpternlogd-1.c b/gcc/testsuite/gcc.target/i386/avx512f-vpternlogd-1.c
index a88153a..b098487 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vpternlogd-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vpternlogd-1.c
@@ -1,6 +1,5 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mavx512f -O2" } */
-/* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
 /* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
 /* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
 
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vpternlogq-1.c b/gcc/testsuite/gcc.target/i386/avx512f-vpternlogq-1.c
index ef30246..8e5d22f 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vpternlogq-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vpternlogq-1.c
@@ -1,6 +1,5 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mavx512f -O2" } */
-/* { dg-final { scan-assembler-times "vpternlogq\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
 /* { dg-final { scan-assembler-times "vpternlogq\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
 /* { dg-final { scan-assembler-times "vpternlogq\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
 
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-vpternlogd-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-vpternlogd-1.c
index 045a266..dd53563 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vl-vpternlogd-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-vpternlogd-1.c
@@ -1,7 +1,5 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mavx512vl -O2" } */
-/* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
-/* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
 /* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
 /* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
 /* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-vpternlogq-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-vpternlogq-1.c
index 3a6707c..31fec3e 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vl-vpternlogq-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-vpternlogq-1.c
@@ -1,7 +1,5 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mavx512vl -O2" } */
-/* { dg-final { scan-assembler-times "vpternlogq\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
-/* { dg-final { scan-assembler-times "vpternlogq\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)"  1 } } */
 /* { dg-final { scan-assembler-times "vpternlogq\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
 /* { dg-final { scan-assembler-times "vpternlogq\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
 /* { dg-final { scan-assembler-times "vpternlogq\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr100711-3.c b/gcc/testsuite/gcc.target/i386/pr100711-3.c
index 98cc1c3..ea60190 100644
--- a/gcc/testsuite/gcc.target/i386/pr100711-3.c
+++ b/gcc/testsuite/gcc.target/i386/pr100711-3.c
@@ -39,4 +39,4 @@  v8di foo_v8di (long long a, v8di b)
 
 /* { dg-final { scan-assembler-times "vpandn" 4 { target { ! ia32 } } } } */
 /* { dg-final { scan-assembler-times "vpandn" 2 { target { ia32 } } } } */
-/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0x44" 2 { target { ia32 } } } } */
+/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$80" 2 { target { ia32 } } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr100711-4.c b/gcc/testsuite/gcc.target/i386/pr100711-4.c
index 3ca524f..a33f0a1 100644
--- a/gcc/testsuite/gcc.target/i386/pr100711-4.c
+++ b/gcc/testsuite/gcc.target/i386/pr100711-4.c
@@ -37,6 +37,6 @@  v8di foo_v8di (long long a, v8di b)
     return (__extension__ (v8di) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) | b;
 }
 
-/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0xbb" 4 { target { ! ia32 } } } } */
-/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0xbb" 2 { target { ia32 } } } } */
-/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0xdd" 2 { target { ia32 } } } } */
+/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$207" 4 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$207" 2 { target { ia32 } } } } */
+/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$245" 2 { target { ia32 } } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr100711-5.c b/gcc/testsuite/gcc.target/i386/pr100711-5.c
index 161fbfc..99cafc1 100644
--- a/gcc/testsuite/gcc.target/i386/pr100711-5.c
+++ b/gcc/testsuite/gcc.target/i386/pr100711-5.c
@@ -37,4 +37,4 @@  v8di foo_v8di (long long a, v8di b)
     return (__extension__ (v8di) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) ^ b;
 }
 
-/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0x99" 4 } } */
+/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$1\[69\]5" 4 } } */