
internal-fn: Reuse SUBREG_PROMOTED_VAR_P handling

Message ID mptr0c11cbb.fsf@arm.com
State New
Series internal-fn: Reuse SUBREG_PROMOTED_VAR_P handling

Commit Message

Richard Sandiford July 10, 2024, 3:44 p.m. UTC
expand_fn_using_insn has code to handle SUBREG_PROMOTED_VAR_P
destinations.  Specifically, for:

  (subreg/v:M1 (reg:M2 R) ...)

it creates a new temporary register T, uses it for the output
operand, then sign- or zero-extends the M1 lowpart of T to M2,
storing the result in R.
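The extension step can be modelled outside GCC.  This is a minimal
sketch, not GCC code: the 8-bit M1 inside 32-bit M2 widths are
illustrative, and `extend_lowpart` is a hypothetical helper that mimics
what the sign-/zero-extend of T's lowpart into R does:

```python
def extend_lowpart(t, m1_bits, m2_bits, signed):
    """Take the M1-bit lowpart of T and sign- or zero-extend it to
    M2 bits, as the promoted-subreg handling does when storing the
    result back into R.  Illustrative model only."""
    low = t & ((1 << m1_bits) - 1)           # M1 lowpart of T
    if signed and (low >> (m1_bits - 1)) & 1:
        low -= 1 << m1_bits                  # sign-extend
    return low & ((1 << m2_bits) - 1)        # result as an M2-bit value

# 0x80 has the sign bit set in an 8-bit M1:
print(hex(extend_lowpart(0x12345680, 8, 32, signed=True)))   # sign case
print(hex(extend_lowpart(0x12345680, 8, 32, signed=False)))  # zero case
```

The `signed` flag plays the role of SUBREG_PROMOTED_SIGN: the same
lowpart 0x80 becomes 0xffffff80 when sign-extended and 0x80 when
zero-extended.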

This patch splits this handling out into helper routines and
uses them for other instances of:

  if (!rtx_equal_p (target, ops[0].value))
    emit_move_insn (target, ops[0].value);

It's quite probable that this doesn't help any of the other cases;
in particular, it shouldn't affect vectors.  But I think it could
be useful for the CRC work.

Bootstrapped & regression-tested on aarch64-linux-gnu.  OK to install?

Richard


gcc/
	* internal-fn.cc (create_call_lhs_operand, assign_call_lhs): New
	functions, split out from...
	(expand_fn_using_insn): ...here.
	(expand_load_lanes_optab_fn): Use them.
	(expand_GOMP_SIMT_ENTER_ALLOC): Likewise.
	(expand_GOMP_SIMT_LAST_LANE): Likewise.
	(expand_GOMP_SIMT_ORDERED_PRED): Likewise.
	(expand_GOMP_SIMT_VOTE_ANY): Likewise.
	(expand_GOMP_SIMT_XCHG_BFLY): Likewise.
	(expand_GOMP_SIMT_XCHG_IDX): Likewise.
	(expand_partial_load_optab_fn): Likewise.
	(expand_vec_cond_optab_fn): Likewise.
	(expand_vec_cond_mask_optab_fn): Likewise.
	(expand_RAWMEMCHR): Likewise.
	(expand_gather_load_optab_fn): Likewise.
	(expand_while_optab_fn): Likewise.
	(expand_SPACESHIP): Likewise.
---
 gcc/internal-fn.cc | 162 +++++++++++++++++++++++----------------------
 1 file changed, 84 insertions(+), 78 deletions(-)

Comments

Jeff Law July 10, 2024, 4:26 p.m. UTC | #1
On 7/10/24 9:44 AM, Richard Sandiford wrote:
> expand_fn_using_insn has code to handle SUBREG_PROMOTED_VAR_P
> destinations.  Specifically, for:
> 
>    (subreg/v:M1 (reg:M2 R) ...)
> 
> it creates a new temporary register T, uses it for the output
> operand, then sign- or zero-extends the M1 lowpart of T to M2,
> storing the result in R.
> 
> This patch splits this handling out into helper routines and
> uses them for other instances of:
> 
>    if (!rtx_equal_p (target, ops[0].value))
>      emit_move_insn (target, ops[0].value);
> 
> It's quite probable that this doesn't help any of the other cases;
> in particular, it shouldn't affect vectors.  But I think it could
> be useful for the CRC work.
> 
> Bootstrapped & regression-tested on aarch64-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> gcc/
> 	* internal-fn.cc (create_call_lhs_operand, assign_call_lhs): New
> 	functions, split out from...
> 	(expand_fn_using_insn): ...here.
> 	(expand_load_lanes_optab_fn): Use them.
> 	(expand_GOMP_SIMT_ENTER_ALLOC): Likewise.
> 	(expand_GOMP_SIMT_LAST_LANE): Likewise.
> 	(expand_GOMP_SIMT_ORDERED_PRED): Likewise.
> 	(expand_GOMP_SIMT_VOTE_ANY): Likewise.
> 	(expand_GOMP_SIMT_XCHG_BFLY): Likewise.
> 	(expand_GOMP_SIMT_XCHG_IDX): Likewise.
> 	(expand_partial_load_optab_fn): Likewise.
> 	(expand_vec_cond_optab_fn): Likewise.
> 	(expand_vec_cond_mask_optab_fn): Likewise.
> 	(expand_RAWMEMCHR): Likewise.
> 	(expand_gather_load_optab_fn): Likewise.
> 	(expand_while_optab_fn): Likewise.
> 	(expand_SPACESHIP): Likewise.
OK.

FWIW, we did some testing for cases where we didn't utilize the 
SUBREG_PROMOTED* bits to eliminate extensions coming out of the 
expansion phase.  For the most part we're doing a good job.  IIRC our 
instrumentation showed they tended to sneak in mostly through expansion 
of builtins (particularly the arithmetic overflow builtins).

Jeff
Richard Sandiford July 10, 2024, 4:48 p.m. UTC | #2
Thanks for the review.

Jeff Law <jeffreyalaw@gmail.com> writes:
> OK.
>
> FWIW, we did some testing for cases where we didn't utilize the 
> SUBREG_PROMOTED* bits to eliminate extensions coming out of the 
> expansion phase.  For the most part we're doing a good job.  IIRC our 
> instrumentation showed they tended to sneak in mostly through expansion 
> of builtins (particularly the arithmetic overflow builtins).

That sounds great!  I can see why SUBREG_PROMOTED* was a nice hack,
but it's also a source of subtle bugs, and can sometimes mean that
we generate extensions that aren't really needed.  It would be good
if the optimisers are getting to the state where we could remove
it and express things in "natural rtl".

(Admittedly I'm using "natural rtl" to mean "rtl that seems
sensible to me".  It's not very objective.)

On a similar theme, have you ever tried getting rid of
WORD_REGISTER_OPERATIONS for riscv?  Kyrill did that in 2016
for aarch64 (56c9ef5f2fa5787ddd7b2c83804a46554fa1ffc9) and I've
never seen it cause a missed optimisation.  There too, we seem
to get good results using natural rtl.

Richard
Jeff Law July 10, 2024, 5 p.m. UTC | #3
On 7/10/24 10:48 AM, Richard Sandiford wrote:
> That sounds great!  I can see why SUBREG_PROMOTED* was a nice hack,
> but it's also a source of subtle bugs, and can sometimes mean that
> we generate extensions that aren't really needed.  It would be good
> if the optimisers are getting to the state where we could remove
> it and express things in "natural rtl".
They're getting better :-)  But we're not there yet.  I sketched out 
exposing the ABI extension requirements in RTL for REE.  I never got it 
working well enough to claim a successful POC.


> 
> (Admittedly I'm using "natural rtl" to mean "rtl that seems
> sensible to me".  It's not very objective.)
> 
> On a similar theme, have you ever tried getting rid of
> WORD_REGISTER_OPERATIONS for riscv?  Kyrill did that in 2016
> for aarch64 (56c9ef5f2fa5787ddd7b2c83804a46554fa1ffc9) and I've
> never seen it cause a missed optimisation.  There too, we seem
> to get good results using natural rtl.
We haven't tried, but it's on the list of things to explore.

jeff

Patch

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 4948b48bde8..95946bfd683 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -199,6 +199,58 @@  const direct_internal_fn_info direct_internal_fn_array[IFN_LAST + 1] = {
   not_direct
 };
 
+/* Like create_output_operand, but for callers that will use
+   assign_call_lhs afterwards.  */
+
+static void
+create_call_lhs_operand (expand_operand *op, rtx lhs_rtx, machine_mode mode)
+{
+  /* Do not assign directly to a promoted subreg, since there is no
+     guarantee that the instruction will leave the upper bits of the
+     register in the state required by SUBREG_PROMOTED_SIGN.  */
+  rtx dest = lhs_rtx;
+  if (dest && GET_CODE (dest) == SUBREG && SUBREG_PROMOTED_VAR_P (dest))
+    dest = NULL_RTX;
+  create_output_operand (op, dest, mode);
+}
+
+/* Move the result of an expanded instruction into the lhs of a gimple call.
+   LHS is the lhs of the call, LHS_RTX is its expanded form, and OP is the
+   result of the expanded instruction.  OP should have been set up by
+   create_call_lhs_operand.  */
+
+static void
+assign_call_lhs (tree lhs, rtx lhs_rtx, expand_operand *op)
+{
+  if (rtx_equal_p (lhs_rtx, op->value))
+    return;
+
+  /* If the return value has an integral type, convert the instruction
+     result to that type.  This is useful for things that return an
+     int regardless of the size of the input.  If the instruction result
+     is smaller than required, assume that it is signed.
+
+     If the return value has a nonintegral type, its mode must match
+     the instruction result.  */
+  if (GET_CODE (lhs_rtx) == SUBREG && SUBREG_PROMOTED_VAR_P (lhs_rtx))
+    {
+      /* If this is a scalar in a register that is stored in a wider
+	 mode than the declared mode, compute the result into its
+	 declared mode and then convert to the wider mode.  */
+      gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
+      rtx tmp = convert_to_mode (GET_MODE (lhs_rtx), op->value, 0);
+      convert_move (SUBREG_REG (lhs_rtx), tmp,
+		    SUBREG_PROMOTED_SIGN (lhs_rtx));
+    }
+  else if (GET_MODE (lhs_rtx) == GET_MODE (op->value))
+    emit_move_insn (lhs_rtx, op->value);
+  else
+    {
+      gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
+      convert_move (lhs_rtx, op->value, 0);
+    }
+}
+
 /* Expand STMT using instruction ICODE.  The instruction has NOUTPUTS
    output operands and NINPUTS input operands, where NOUTPUTS is either
    0 or 1.  The output operand (if any) comes first, followed by the
@@ -220,15 +272,8 @@  expand_fn_using_insn (gcall *stmt, insn_code icode, unsigned int noutputs,
       gcc_assert (noutputs == 1);
       if (lhs)
 	lhs_rtx = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
-
-      /* Do not assign directly to a promoted subreg, since there is no
-	 guarantee that the instruction will leave the upper bits of the
-	 register in the state required by SUBREG_PROMOTED_SIGN.  */
-      rtx dest = lhs_rtx;
-      if (dest && GET_CODE (dest) == SUBREG && SUBREG_PROMOTED_VAR_P (dest))
-	dest = NULL_RTX;
-      create_output_operand (&ops[opno], dest,
-			     insn_data[icode].operand[opno].mode);
+      create_call_lhs_operand (&ops[opno], lhs_rtx,
+			       insn_data[icode].operand[opno].mode);
       opno += 1;
     }
   else
@@ -266,33 +311,8 @@  expand_fn_using_insn (gcall *stmt, insn_code icode, unsigned int noutputs,
 
   gcc_assert (opno == noutputs + ninputs);
   expand_insn (icode, opno, ops);
-  if (lhs_rtx && !rtx_equal_p (lhs_rtx, ops[0].value))
-    {
-      /* If the return value has an integral type, convert the instruction
-	 result to that type.  This is useful for things that return an
-	 int regardless of the size of the input.  If the instruction result
-	 is smaller than required, assume that it is signed.
-
-	 If the return value has a nonintegral type, its mode must match
-	 the instruction result.  */
-      if (GET_CODE (lhs_rtx) == SUBREG && SUBREG_PROMOTED_VAR_P (lhs_rtx))
-	{
-	  /* If this is a scalar in a register that is stored in a wider
-	     mode than the declared mode, compute the result into its
-	     declared mode and then convert to the wider mode.  */
-	  gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
-	  rtx tmp = convert_to_mode (GET_MODE (lhs_rtx), ops[0].value, 0);
-	  convert_move (SUBREG_REG (lhs_rtx), tmp,
-			SUBREG_PROMOTED_SIGN (lhs_rtx));
-	}
-      else if (GET_MODE (lhs_rtx) == GET_MODE (ops[0].value))
-	emit_move_insn (lhs_rtx, ops[0].value);
-      else
-	{
-	  gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
-	  convert_move (lhs_rtx, ops[0].value, 0);
-	}
-    }
+  if (lhs_rtx)
+    assign_call_lhs (lhs, lhs_rtx, &ops[0]);
 }
 
 /* ARRAY_TYPE is an array of vector modes.  Return the associated insn
@@ -376,11 +396,10 @@  expand_load_lanes_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
   gcc_assert (MEM_P (mem));
   PUT_MODE (mem, TYPE_MODE (type));
 
-  create_output_operand (&ops[0], target, TYPE_MODE (type));
+  create_call_lhs_operand (&ops[0], target, TYPE_MODE (type));
   create_fixed_operand (&ops[1], mem);
   expand_insn (get_multi_vector_move (type, optab), 2, ops);
-  if (!rtx_equal_p (target, ops[0].value))
-    emit_move_insn (target, ops[0].value);
+  assign_call_lhs (lhs, target, &ops[0]);
 }
 
 /* Expand STORE_LANES call STMT using optab OPTAB.  */
@@ -443,13 +462,12 @@  expand_GOMP_SIMT_ENTER_ALLOC (internal_fn, gcall *stmt)
   rtx size = expand_normal (gimple_call_arg (stmt, 0));
   rtx align = expand_normal (gimple_call_arg (stmt, 1));
   class expand_operand ops[3];
-  create_output_operand (&ops[0], target, Pmode);
+  create_call_lhs_operand (&ops[0], target, Pmode);
   create_input_operand (&ops[1], size, Pmode);
   create_input_operand (&ops[2], align, Pmode);
   gcc_assert (targetm.have_omp_simt_enter ());
   expand_insn (targetm.code_for_omp_simt_enter, 3, ops);
-  if (!rtx_equal_p (target, ops[0].value))
-    emit_move_insn (target, ops[0].value);
+  assign_call_lhs (lhs, target, &ops[0]);
 }
 
 /* Deallocate per-lane storage and leave non-uniform execution region.  */
@@ -511,12 +529,11 @@  expand_GOMP_SIMT_LAST_LANE (internal_fn, gcall *stmt)
   rtx cond = expand_normal (gimple_call_arg (stmt, 0));
   machine_mode mode = TYPE_MODE (TREE_TYPE (lhs));
   class expand_operand ops[2];
-  create_output_operand (&ops[0], target, mode);
+  create_call_lhs_operand (&ops[0], target, mode);
   create_input_operand (&ops[1], cond, mode);
   gcc_assert (targetm.have_omp_simt_last_lane ());
   expand_insn (targetm.code_for_omp_simt_last_lane, 2, ops);
-  if (!rtx_equal_p (target, ops[0].value))
-    emit_move_insn (target, ops[0].value);
+  assign_call_lhs (lhs, target, &ops[0]);
 }
 
 /* Non-transparent predicate used in SIMT lowering of OpenMP "ordered".  */
@@ -532,12 +549,11 @@  expand_GOMP_SIMT_ORDERED_PRED (internal_fn, gcall *stmt)
   rtx ctr = expand_normal (gimple_call_arg (stmt, 0));
   machine_mode mode = TYPE_MODE (TREE_TYPE (lhs));
   class expand_operand ops[2];
-  create_output_operand (&ops[0], target, mode);
+  create_call_lhs_operand (&ops[0], target, mode);
   create_input_operand (&ops[1], ctr, mode);
   gcc_assert (targetm.have_omp_simt_ordered ());
   expand_insn (targetm.code_for_omp_simt_ordered, 2, ops);
-  if (!rtx_equal_p (target, ops[0].value))
-    emit_move_insn (target, ops[0].value);
+  assign_call_lhs (lhs, target, &ops[0]);
 }
 
 /* "Or" boolean reduction across SIMT lanes: return non-zero in all lanes if
@@ -554,12 +570,11 @@  expand_GOMP_SIMT_VOTE_ANY (internal_fn, gcall *stmt)
   rtx cond = expand_normal (gimple_call_arg (stmt, 0));
   machine_mode mode = TYPE_MODE (TREE_TYPE (lhs));
   class expand_operand ops[2];
-  create_output_operand (&ops[0], target, mode);
+  create_call_lhs_operand (&ops[0], target, mode);
   create_input_operand (&ops[1], cond, mode);
   gcc_assert (targetm.have_omp_simt_vote_any ());
   expand_insn (targetm.code_for_omp_simt_vote_any, 2, ops);
-  if (!rtx_equal_p (target, ops[0].value))
-    emit_move_insn (target, ops[0].value);
+  assign_call_lhs (lhs, target, &ops[0]);
 }
 
 /* Exchange between SIMT lanes with a "butterfly" pattern: source lane index
@@ -577,13 +592,12 @@  expand_GOMP_SIMT_XCHG_BFLY (internal_fn, gcall *stmt)
   rtx idx = expand_normal (gimple_call_arg (stmt, 1));
   machine_mode mode = TYPE_MODE (TREE_TYPE (lhs));
   class expand_operand ops[3];
-  create_output_operand (&ops[0], target, mode);
+  create_call_lhs_operand (&ops[0], target, mode);
   create_input_operand (&ops[1], src, mode);
   create_input_operand (&ops[2], idx, SImode);
   gcc_assert (targetm.have_omp_simt_xchg_bfly ());
   expand_insn (targetm.code_for_omp_simt_xchg_bfly, 3, ops);
-  if (!rtx_equal_p (target, ops[0].value))
-    emit_move_insn (target, ops[0].value);
+  assign_call_lhs (lhs, target, &ops[0]);
 }
 
 /* Exchange between SIMT lanes according to given source lane index.  */
@@ -600,13 +614,12 @@  expand_GOMP_SIMT_XCHG_IDX (internal_fn, gcall *stmt)
   rtx idx = expand_normal (gimple_call_arg (stmt, 1));
   machine_mode mode = TYPE_MODE (TREE_TYPE (lhs));
   class expand_operand ops[3];
-  create_output_operand (&ops[0], target, mode);
+  create_call_lhs_operand (&ops[0], target, mode);
   create_input_operand (&ops[1], src, mode);
   create_input_operand (&ops[2], idx, SImode);
   gcc_assert (targetm.have_omp_simt_xchg_idx ());
   expand_insn (targetm.code_for_omp_simt_xchg_idx, 3, ops);
-  if (!rtx_equal_p (target, ops[0].value))
-    emit_move_insn (target, ops[0].value);
+  assign_call_lhs (lhs, target, &ops[0]);
 }
 
 /* This should get expanded in adjust_simduid_builtins.  */
@@ -3029,13 +3042,12 @@  expand_partial_load_optab_fn (internal_fn ifn, gcall *stmt, convert_optab optab)
   set_mem_expr (mem, NULL_TREE);
   clear_mem_offset (mem);
   target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
-  create_output_operand (&ops[i++], target, TYPE_MODE (type));
+  create_call_lhs_operand (&ops[i++], target, TYPE_MODE (type));
   create_fixed_operand (&ops[i++], mem);
   i = add_mask_and_len_args (ops, i, stmt);
   expand_insn (icode, i, ops);
 
-  if (!rtx_equal_p (target, ops[0].value))
-    emit_move_insn (target, ops[0].value);
+  assign_call_lhs (lhs, target, &ops[0]);
 }
 
 #define expand_mask_load_optab_fn expand_partial_load_optab_fn
@@ -3129,15 +3141,14 @@  expand_vec_cond_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
     rtx_op2 = expand_normal (op2);
 
   rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
-  create_output_operand (&ops[0], target, mode);
+  create_call_lhs_operand (&ops[0], target, mode);
   create_input_operand (&ops[1], rtx_op1, mode);
   create_input_operand (&ops[2], rtx_op2, mode);
   create_fixed_operand (&ops[3], comparison);
   create_fixed_operand (&ops[4], XEXP (comparison, 0));
   create_fixed_operand (&ops[5], XEXP (comparison, 1));
   expand_insn (icode, 6, ops);
-  if (!rtx_equal_p (ops[0].value, target))
-    emit_move_insn (target, ops[0].value);
+  assign_call_lhs (lhs, target, &ops[0]);
 }
 
 /* Expand VCOND_MASK optab internal function.
@@ -3166,13 +3177,12 @@  expand_vec_cond_mask_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
   rtx_op2 = expand_normal (op2);
 
   rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
-  create_output_operand (&ops[0], target, mode);
+  create_call_lhs_operand (&ops[0], target, mode);
   create_input_operand (&ops[1], rtx_op1, mode);
   create_input_operand (&ops[2], rtx_op2, mode);
   create_input_operand (&ops[3], mask, mask_mode);
   expand_insn (icode, 4, ops);
-  if (!rtx_equal_p (ops[0].value, target))
-    emit_move_insn (target, ops[0].value);
+  assign_call_lhs (lhs, target, &ops[0]);
 }
 
 /* Expand VEC_SET internal functions.  */
@@ -3266,7 +3276,7 @@  expand_RAWMEMCHR (internal_fn, gcall *stmt)
     return;
   machine_mode lhs_mode = TYPE_MODE (TREE_TYPE (lhs));
   rtx lhs_rtx = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
-  create_output_operand (&ops[0], lhs_rtx, lhs_mode);
+  create_call_lhs_operand (&ops[0], lhs_rtx, lhs_mode);
 
   tree mem = gimple_call_arg (stmt, 0);
   rtx mem_rtx = get_memory_rtx (mem, NULL);
@@ -3280,8 +3290,7 @@  expand_RAWMEMCHR (internal_fn, gcall *stmt)
   insn_code icode = direct_optab_handler (rawmemchr_optab, mode);
 
   expand_insn (icode, 3, ops);
-  if (!rtx_equal_p (lhs_rtx, ops[0].value))
-    emit_move_insn (lhs_rtx, ops[0].value);
+  assign_call_lhs (lhs, lhs_rtx, &ops[0]);
 }
 
 /* Expand the IFN_UNIQUE function according to its first argument.  */
@@ -3691,7 +3700,7 @@  expand_gather_load_optab_fn (internal_fn, gcall *stmt, direct_optab optab)
 
   int i = 0;
   class expand_operand ops[8];
-  create_output_operand (&ops[i++], lhs_rtx, TYPE_MODE (TREE_TYPE (lhs)));
+  create_call_lhs_operand (&ops[i++], lhs_rtx, TYPE_MODE (TREE_TYPE (lhs)));
   create_address_operand (&ops[i++], base_rtx);
   create_input_operand (&ops[i++], offset_rtx, TYPE_MODE (TREE_TYPE (offset)));
   create_integer_operand (&ops[i++], TYPE_UNSIGNED (TREE_TYPE (offset)));
@@ -3700,8 +3709,7 @@  expand_gather_load_optab_fn (internal_fn, gcall *stmt, direct_optab optab)
   insn_code icode = convert_optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs)),
 					   TYPE_MODE (TREE_TYPE (offset)));
   expand_insn (icode, i, ops);
-  if (!rtx_equal_p (lhs_rtx, ops[0].value))
-    emit_move_insn (lhs_rtx, ops[0].value);
+  assign_call_lhs (lhs, lhs_rtx, &ops[0]);
 }
 
 /* Helper for expand_DIVMOD.  Return true if the sequence starting with
@@ -3909,7 +3917,7 @@  expand_while_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
   tree lhs = gimple_call_lhs (stmt);
   tree lhs_type = TREE_TYPE (lhs);
   rtx lhs_rtx = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
-  create_output_operand (&ops[0], lhs_rtx, TYPE_MODE (lhs_type));
+  create_call_lhs_operand (&ops[0], lhs_rtx, TYPE_MODE (lhs_type));
 
   for (unsigned int i = 0; i < 2; ++i)
     {
@@ -3937,8 +3945,7 @@  expand_while_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 					   TYPE_MODE (lhs_type));
 
   expand_insn (icode, opcnt, ops);
-  if (!rtx_equal_p (lhs_rtx, ops[0].value))
-    emit_move_insn (lhs_rtx, ops[0].value);
+  assign_call_lhs (lhs, lhs_rtx, &ops[0]);
 }
 
 /* Expand a call to a convert-like optab using the operands in STMT.
@@ -5067,13 +5074,12 @@  expand_SPACESHIP (internal_fn, gcall *stmt)
   rtx op2 = expand_normal (rhs2);
 
   class expand_operand ops[3];
-  create_output_operand (&ops[0], target, TYPE_MODE (TREE_TYPE (lhs)));
+  create_call_lhs_operand (&ops[0], target, TYPE_MODE (TREE_TYPE (lhs)));
   create_input_operand (&ops[1], op1, TYPE_MODE (type));
   create_input_operand (&ops[2], op2, TYPE_MODE (type));
   insn_code icode = optab_handler (spaceship_optab, TYPE_MODE (type));
   expand_insn (icode, 3, ops);
-  if (!rtx_equal_p (target, ops[0].value))
-    emit_move_insn (target, ops[0].value);
+  assign_call_lhs (lhs, target, &ops[0]);
 }
 
 void