Message ID | 20171206210447.GJ2353@tucnak |
---|---|
State | New |
Series | Use tail calls to memcpy/memset even for structure assignments (PR target/41455, PR target/82935) |
On Wed, 6 Dec 2017, Jakub Jelinek wrote:

> Hi!
>
> Aggregate assignments and clears aren't in GIMPLE represented as calls,
> and while often they expand inline, sometimes we emit libcalls for them.
> This patch allows us to tail call those libcalls if there is nothing
> after them.  The patch changes the tailcall pass, so that it recognizes
> a = b; and c = {}; statements under certain conditions as potential tail
> calls returning void, and if it finds good tail call candidates, it marks
> them specially.  Because we have only a single bit left for GIMPLE_ASSIGN,
> I've decided to wrap the rhs1 into a new internal call, so
> a = b; will be transformed into a = TAILCALL_ASSIGN (b); and
> c = {}; will be transformed into c = TAILCALL_ASSIGN ();
> The rest of the patch is about propagating the flag (may use tailcall if
> the emit_block_move or clear_storage is the last thing emitted) down
> through expand_assignment and functions it calls.
>
> Those functions use 1-3 other flags, so instead of adding another bool
> to all of them (next to nontemporal, call_param_p, reverse) I've decided
> to pass around a bitmask of flags.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Hum, it doesn't look pretty ;)  Can we defer this to stage1 given
it's a long-standing issue and we have quite big changes going in still?

Thanks,
Richard.

> 2017-12-06  Jakub Jelinek  <jakub@redhat.com>
>
>	PR target/41455
>	PR target/82935
>	* internal-fn.def (TAILCALL_ASSIGN): New internal function.
>	* internal-fn.c (expand_LAUNDER): Pass EXPAND_FLAG_NORMAL to
>	expand_assignment.
>	(expand_TAILCALL_ASSIGN): New function.
>	* tree-tailcall.c (struct tailcall): Adjust comment.
>	(find_tail_calls): Recognize also aggregate assignments and
>	aggregate clearing as possible tail calls.  Use is_gimple_assign
>	instead of gimple_code check.
>	(optimize_tail_call): Rewrite aggregate assignments or aggregate
>	clearing in tail call positions using IFN_TAILCALL_ASSIGN
>	internal function.
>	* tree-outof-ssa.c (insert_value_copy_on_edge): Adjust store_expr
>	caller.
>	* tree-chkp.c (chkp_expand_bounds_reset_for_mem): Adjust
>	expand_assignment caller.
>	* function.c (assign_parm_setup_reg): Likewise.
>	* ubsan.c (ubsan_encode_value): Likewise.
>	* cfgexpand.c (expand_call_stmt, expand_asm_stmt): Likewise.
>	(expand_gimple_stmt_1): Likewise.  Fix up formatting.
>	* calls.c (initialize_argument_information): Adjust store_expr caller.
>	* expr.h (enum expand_flag): New.
>	(expand_assignment): Replace bool argument with enum expand_flag.
>	(store_expr_with_bounds, store_expr): Replace int, bool, bool arguments
>	with enum expand_flag.
>	* expr.c (expand_assignment): Replace nontemporal argument with flags.
>	Assert no bits other than EXPAND_FLAG_NONTEMPORAL and
>	EXPAND_FLAG_TAILCALL are set.  Adjust store_expr, store_field and
>	store_expr_with_bounds callers.
>	(store_expr_with_bounds): Replace call_param_p, nontemporal and
>	reverse args with flags argument.  Adjust recursive calls.  Pass
>	BLOCK_OP_TAILCALL to clear_storage and emit_block_move if
>	EXPAND_FLAG_TAILCALL is set.  Call clear_storage directly for
>	EXPAND_FLAG_TAILCALL assignments from empty CONSTRUCTOR.
>	(store_expr): Replace call_param_p, nontemporal and reverse args
>	with flags argument.  Adjust store_expr_with_bounds caller.
>	(store_constructor_field): Adjust store_field caller.
>	(store_constructor): Adjust store_expr and expand_assignment callers.
>	(store_field): Replace nontemporal and reverse arguments with flags
>	argument.  Adjust store_expr callers.  Pass BLOCK_OP_TAILCALL to
>	emit_block_move if EXPAND_FLAG_TAILCALL is set.
>	(expand_expr_real_2): Adjust store_expr and store_field callers.
>	(expand_expr_real_1): Adjust store_expr and expand_assignment callers.
>
>	* gcc.target/i386/pr41455.c: New test.
> > --- gcc/internal-fn.def.jj 2017-12-06 09:02:30.072952012 +0100 > +++ gcc/internal-fn.def 2017-12-06 16:56:20.958518104 +0100 > @@ -254,6 +254,11 @@ DEF_INTERNAL_FN (LAUNDER, ECF_LEAF | ECF > /* Divmod function. */ > DEF_INTERNAL_FN (DIVMOD, ECF_CONST | ECF_LEAF, NULL) > > +/* Special markup for aggregate copy or clear that can be implemented > + using a tailcall. lhs = rhs1; is represented by > + lhs = TAILCALL_ASSIGN (rhs1); and lhs = {}; by lhs = TAILCALL_ASSIGN (); */ > +DEF_INTERNAL_FN (TAILCALL_ASSIGN, ECF_NOTHROW | ECF_LEAF, NULL) > + > #undef DEF_INTERNAL_INT_FN > #undef DEF_INTERNAL_FLT_FN > #undef DEF_INTERNAL_FLT_FLOATN_FN > --- gcc/internal-fn.c.jj 2017-12-06 09:02:29.968953307 +0100 > +++ gcc/internal-fn.c 2017-12-06 18:00:15.993826828 +0100 > @@ -2672,7 +2672,7 @@ expand_LAUNDER (internal_fn, gcall *call > if (!lhs) > return; > > - expand_assignment (lhs, gimple_call_arg (call, 0), false); > + expand_assignment (lhs, gimple_call_arg (call, 0), EXPAND_FLAG_NORMAL); > } > > /* Expand DIVMOD() using: > @@ -2722,6 +2722,23 @@ expand_DIVMOD (internal_fn, gcall *call_ > target, VOIDmode, EXPAND_NORMAL); > } > > +/* Expand TAILCALL_ASSIGN. */ > + > +static void > +expand_TAILCALL_ASSIGN (internal_fn, gcall *call_stmt) > +{ > + tree lhs = gimple_call_lhs (call_stmt); > + tree rhs; > + if (gimple_call_num_args (call_stmt) == 0) > + { > + rhs = build_constructor (TREE_TYPE (lhs), NULL); > + TREE_STATIC (rhs) = 1; > + } > + else > + rhs = gimple_call_arg (call_stmt, 0); > + expand_assignment (lhs, rhs, EXPAND_FLAG_TAILCALL); > +} > + > /* Expand a call to FN using the operands in STMT. FN has a single > output operand and NARGS input operands. */ > > --- gcc/tree-tailcall.c.jj 2017-12-06 09:02:30.031952522 +0100 > +++ gcc/tree-tailcall.c 2017-12-06 17:59:37.166299929 +0100 > @@ -106,7 +106,8 @@ along with GCC; see the file COPYING3. > > struct tailcall > { > - /* The iterator pointing to the call statement. 
*/ > + /* The iterator pointing to the call (or aggregate copy that might be > + expanded as call) statement. */ > gimple_stmt_iterator call_gsi; > > /* True if it is a call to the current function. */ > @@ -398,8 +399,7 @@ static void > find_tail_calls (basic_block bb, struct tailcall **ret) > { > tree ass_var = NULL_TREE, ret_var, func, param; > - gimple *stmt; > - gcall *call = NULL; > + gimple *stmt, *call = NULL; > gimple_stmt_iterator gsi, agsi; > bool tail_recursion; > struct tailcall *nw; > @@ -428,7 +428,7 @@ find_tail_calls (basic_block bb, struct > /* Check for a call. */ > if (is_gimple_call (stmt)) > { > - call = as_a <gcall *> (stmt); > + call = stmt; > ass_var = gimple_call_lhs (call); > break; > } > @@ -440,6 +440,38 @@ find_tail_calls (basic_block bb, struct > && auto_var_in_fn_p (gimple_assign_rhs1 (stmt), cfun->decl)) > continue; > > + /* In addition to calls, allow aggregate copies that could be expanded > + as memcpy or memset. Pretend it has NULL lhs then. */ > + if (gimple_references_memory_p (stmt) > + && gimple_assign_single_p (stmt) > + && !gimple_has_volatile_ops (stmt) > + && !gimple_assign_nontemporal_move_p (as_a <gassign *> (stmt)) > + && gimple_vdef (stmt) > + && !is_gimple_reg_type (TREE_TYPE (gimple_assign_lhs (stmt)))) > + { > + tree lhs = gimple_assign_lhs (stmt); > + if (TYPE_MODE (TREE_TYPE (lhs)) != BLKmode) > + return; > + tree rhs1 = gimple_assign_rhs1 (stmt); > + if (auto_var_in_fn_p (get_base_address (lhs), cfun->decl)) > + return; > + if (TREE_CODE (rhs1) == CONSTRUCTOR) > + { > + if (CONSTRUCTOR_NELTS (rhs1) != 0 || !TREE_STATIC (rhs1)) > + return; > + } > + else if (auto_var_in_fn_p (get_base_address (rhs1), cfun->decl)) > + return; > + if (reverse_storage_order_for_component_p (lhs) > + || reverse_storage_order_for_component_p (rhs1)) > + return; > + if (operand_equal_p (lhs, rhs1, 0)) > + return; > + call = stmt; > + ass_var = NULL_TREE; > + break; > + } > + > /* If the statement references memory or volatile 
operands, fail. */ > if (gimple_references_memory_p (stmt) > || gimple_has_volatile_ops (stmt)) > @@ -474,7 +506,7 @@ find_tail_calls (basic_block bb, struct > > /* We found the call, check whether it is suitable. */ > tail_recursion = false; > - func = gimple_call_fndecl (call); > + func = is_gimple_call (call) ? gimple_call_fndecl (call) : NULL_TREE; > if (func > && !DECL_BUILT_IN (func) > && recursive_call_p (current_function_decl, func)) > @@ -521,7 +553,9 @@ find_tail_calls (basic_block bb, struct > && auto_var_in_fn_p (var, cfun->decl) > && may_be_aliased (var) > && (ref_maybe_used_by_stmt_p (call, var) > - || call_may_clobber_ref_p (call, var))) > + || (is_gimple_call (call) > + ? call_may_clobber_ref_p (as_a <gcall *> (call), var) > + : refs_output_dependent_p (gimple_assign_lhs (call), var)))) > return; > } > > @@ -560,7 +594,7 @@ find_tail_calls (basic_block bb, struct > || is_gimple_debug (stmt)) > continue; > > - if (gimple_code (stmt) != GIMPLE_ASSIGN) > + if (!is_gimple_assign (stmt)) > return; > > /* This is a gimple assign. 
*/ > @@ -956,9 +990,31 @@ optimize_tail_call (struct tailcall *t, > > if (opt_tailcalls) > { > - gcall *stmt = as_a <gcall *> (gsi_stmt (t->call_gsi)); > - > - gimple_call_set_tail (stmt, true); > + gimple *stmt = gsi_stmt (t->call_gsi); > + if (gcall *call = dyn_cast <gcall *> (stmt)) > + gimple_call_set_tail (call, true); > + else > + { > + tree lhs = gimple_assign_lhs (stmt); > + tree rhs1 = gimple_assign_rhs1 (stmt); > + gcall *g; > + if (TREE_CODE (rhs1) == CONSTRUCTOR) > + g = gimple_build_call_internal (IFN_TAILCALL_ASSIGN, 0); > + else > + g = gimple_build_call_internal (IFN_TAILCALL_ASSIGN, 1, > + rhs1); > + gimple_call_set_lhs (g, lhs); > + gimple_set_location (g, gimple_location (stmt)); > + if (gimple_vdef (stmt) > + && TREE_CODE (gimple_vdef (stmt)) == SSA_NAME) > + { > + gimple_set_vdef (g, gimple_vdef (stmt)); > + SSA_NAME_DEF_STMT (gimple_vdef (g)) = g; > + } > + if (gimple_vuse (stmt)) > + gimple_set_vuse (g, gimple_vuse (stmt)); > + gsi_replace (&t->call_gsi, g, false); > + } > cfun->tail_call_marked = true; > if (dump_file && (dump_flags & TDF_DETAILS)) > { > --- gcc/tree-outof-ssa.c.jj 2017-10-09 09:41:21.000000000 +0200 > +++ gcc/tree-outof-ssa.c 2017-12-06 17:17:16.468209052 +0100 > @@ -311,7 +311,7 @@ insert_value_copy_on_edge (edge e, int d > else if (src_mode == BLKmode) > { > x = dest_rtx; > - store_expr (src, x, 0, false, false); > + store_expr (src, x, EXPAND_FLAG_NORMAL); > } > else > x = expand_expr (src, dest_rtx, dest_mode, EXPAND_NORMAL); > --- gcc/tree-chkp.c.jj 2017-12-06 09:02:30.141951153 +0100 > +++ gcc/tree-chkp.c 2017-12-06 16:56:20.956518128 +0100 > @@ -481,7 +481,7 @@ chkp_expand_bounds_reset_for_mem (tree m > build_pointer_type (TREE_TYPE (mem)), mem); > bndstx = chkp_build_bndstx_call (addr, ptr, bnd); > > - expand_assignment (bnd, zero_bnd, false); > + expand_assignment (bnd, zero_bnd, EXPAND_FLAG_NORMAL); > expand_normal (bndstx); > } > > --- gcc/function.c.jj 2017-12-06 09:02:29.991953021 +0100 > +++ gcc/function.c 
2017-12-06 16:56:20.958518104 +0100 > @@ -3284,7 +3284,8 @@ assign_parm_setup_reg (struct assign_par > /* TREE_USED gets set erroneously during expand_assignment. */ > save_tree_used = TREE_USED (parm); > SET_DECL_RTL (parm, rtl); > - expand_assignment (parm, make_tree (data->nominal_type, tempreg), false); > + expand_assignment (parm, make_tree (data->nominal_type, tempreg), > + EXPAND_FLAG_NORMAL); > SET_DECL_RTL (parm, NULL_RTX); > TREE_USED (parm) = save_tree_used; > all->first_conversion_insn = get_insns (); > --- gcc/ubsan.c.jj 2017-12-06 09:02:29.951953519 +0100 > +++ gcc/ubsan.c 2017-12-06 16:56:20.960518080 +0100 > @@ -165,7 +165,7 @@ ubsan_encode_value (tree t, enum ubsan_e > rtx mem = assign_stack_temp_for_type (mode, GET_MODE_SIZE (mode), > type); > SET_DECL_RTL (var, mem); > - expand_assignment (var, t, false); > + expand_assignment (var, t, EXPAND_FLAG_NORMAL); > return build_fold_addr_expr (var); > } > if (phase != UBSAN_ENCODE_VALUE_GENERIC) > --- gcc/cfgexpand.c.jj 2017-12-06 09:02:30.056952211 +0100 > +++ gcc/cfgexpand.c 2017-12-06 16:56:20.959518092 +0100 > @@ -2668,7 +2668,7 @@ expand_call_stmt (gcall *stmt) > rtx_insn *before_call = get_last_insn (); > lhs = gimple_call_lhs (stmt); > if (lhs) > - expand_assignment (lhs, exp, false); > + expand_assignment (lhs, exp, EXPAND_FLAG_NORMAL); > else > expand_expr (exp, const0_rtx, VOIDmode, EXPAND_NORMAL); > > @@ -3071,7 +3071,7 @@ expand_asm_stmt (gasm *stmt) > generating_concat_p = old_generating_concat_p; > > push_to_sequence2 (after_rtl_seq, after_rtl_end); > - expand_assignment (val, make_tree (type, op), false); > + expand_assignment (val, make_tree (type, op), EXPAND_FLAG_NORMAL); > after_rtl_seq = get_insns (); > after_rtl_end = get_last_insn (); > end_sequence (); > @@ -3672,9 +3672,14 @@ expand_gimple_stmt_1 (gimple *stmt) > this LHS. 
*/ > ; > else > - expand_assignment (lhs, rhs, > - gimple_assign_nontemporal_move_p ( > - assign_stmt)); > + { > + enum expand_flag flag; > + if (gimple_assign_nontemporal_move_p (assign_stmt)) > + flag = EXPAND_FLAG_NONTEMPORAL; > + else > + flag = EXPAND_FLAG_NORMAL; > + expand_assignment (lhs, rhs, flag); > + } > } > else > { > --- gcc/calls.c.jj 2017-11-22 21:37:50.000000000 +0100 > +++ gcc/calls.c 2017-12-06 17:10:12.432363809 +0100 > @@ -1971,7 +1971,7 @@ initialize_argument_information (int num > else > copy = assign_temp (type, 1, 0); > > - store_expr (args[i].tree_value, copy, 0, false, false); > + store_expr (args[i].tree_value, copy, EXPAND_FLAG_NORMAL); > > /* Just change the const function to pure and then let > the next test clear the pure based on > --- gcc/expr.h.jj 2017-12-06 09:02:30.120951414 +0100 > +++ gcc/expr.h 2017-12-06 17:09:34.782823601 +0100 > @@ -35,6 +35,26 @@ enum expand_modifier {EXPAND_NORMAL = 0, > EXPAND_CONST_ADDRESS, EXPAND_INITIALIZER, EXPAND_WRITE, > EXPAND_MEMORY}; > > +/* Flags arguments for expand_assignment/store_expr*. The argument is > + a bitwise or of these flags. */ > +enum expand_flag { > + /* Value if none of the flags are set. */ > + EXPAND_FLAG_NORMAL = 0, > + /* Expand the assignment/store as nontemporal store if possible. */ > + EXPAND_FLAG_NONTEMPORAL = 1, > + /* If the assignment is expanded as a libcall, it can be a tail call. */ > + EXPAND_FLAG_TAILCALL = 2, > + > + /* Flags below this point are only for store_expr*, not for > + expand_assignment. */ > + > + /* Reverse bytes in the store. */ > + EXPAND_FLAG_REVERSE = 4, > + /* True for stores into call params on the stack, where block moves to > + that may need special treatment. */ > + EXPAND_FLAG_CALL_PARAM_P = 8 > +}; > + > /* Prevent the compiler from deferring stack pops. See > inhibit_defer_pop for more information. 
*/ > #define NO_DEFER_POP (inhibit_defer_pop += 1) > @@ -244,14 +264,14 @@ extern void get_bit_range (unsigned HOST > tree, HOST_WIDE_INT *, tree *); > > /* Expand an assignment that stores the value of FROM into TO. */ > -extern void expand_assignment (tree, tree, bool); > +extern void expand_assignment (tree, tree, enum expand_flag); > > /* Generate code for computing expression EXP, > and storing the value into TARGET. > If SUGGEST_REG is nonzero, copy the value through a register > and return that register, if that is possible. */ > -extern rtx store_expr_with_bounds (tree, rtx, int, bool, bool, tree); > -extern rtx store_expr (tree, rtx, int, bool, bool); > +extern rtx store_expr_with_bounds (tree, rtx, enum expand_flag, tree); > +extern rtx store_expr (tree, rtx, enum expand_flag); > > /* Given an rtx that may include add and multiply operations, > generate them as insns and return a pseudo-reg containing the value. > --- gcc/expr.c.jj 2017-12-06 09:02:30.103951626 +0100 > +++ gcc/expr.c 2017-12-06 17:54:46.185845429 +0100 > @@ -86,7 +86,7 @@ static void store_constructor_field (rtx > static void store_constructor (tree, rtx, int, HOST_WIDE_INT, bool); > static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT, > unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT, > - machine_mode, tree, alias_set_type, bool, bool); > + machine_mode, tree, alias_set_type, enum expand_flag); > > static unsigned HOST_WIDE_INT highest_pow2_factor_for_target (const_tree, const_tree); > > @@ -4915,11 +4915,12 @@ mem_ref_refers_to_non_mem_p (tree ref) > return addr_expr_of_non_mem_decl_p_1 (base, false); > } > > -/* Expand an assignment that stores the value of FROM into TO. If NONTEMPORAL > - is true, try generating a nontemporal store. */ > +/* Expand an assignment that stores the value of FROM into TO. If > + flags & EXPAND_FLAG_NONTEMPORAL, try generating a nontemporal store, > + if flags & EXPAND_FLAG_TAILCALL, allow generating a tail call. 
*/ > > void > -expand_assignment (tree to, tree from, bool nontemporal) > +expand_assignment (tree to, tree from, enum expand_flag flags) > { > rtx to_rtx = 0; > rtx result; > @@ -4927,6 +4928,10 @@ expand_assignment (tree to, tree from, b > unsigned int align; > enum insn_code icode; > > + /* Rest of the flags only make sense for store_*. */ > + gcc_checking_assert ((flags & ~(EXPAND_FLAG_NONTEMPORAL > + | EXPAND_FLAG_TAILCALL)) == 0); > + > /* Don't crash if the lhs of the assignment was erroneous. */ > if (TREE_CODE (to) == ERROR_MARK) > { > @@ -4992,6 +4997,7 @@ expand_assignment (tree to, tree from, b > tree offset; > int unsignedp, reversep, volatilep = 0; > tree tem; > + enum expand_flag flags_rev = flags; > > push_temp_slots (); > tem = get_inner_reference (to, &bitsize, &bitpos, &offset, &mode1, > @@ -5004,6 +5010,8 @@ expand_assignment (tree to, tree from, b > offset = size_int (bitpos >> LOG2_BITS_PER_UNIT); > bitpos &= BITS_PER_UNIT - 1; > } > + if (reversep) > + flags_rev = (enum expand_flag) (flags | EXPAND_FLAG_REVERSE); > > if (TREE_CODE (to) == COMPONENT_REF > && DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1))) > @@ -5114,22 +5122,21 @@ expand_assignment (tree to, tree from, b > && COMPLEX_MODE_P (GET_MODE (to_rtx)) > && bitpos == 0 > && bitsize == mode_bitsize) > - result = store_expr (from, to_rtx, false, nontemporal, reversep); > + result = store_expr (from, to_rtx, flags_rev); > else if (bitsize == mode_bitsize / 2 > && (bitpos == 0 || bitpos == mode_bitsize / 2)) > - result = store_expr (from, XEXP (to_rtx, bitpos != 0), false, > - nontemporal, reversep); > + result = store_expr (from, XEXP (to_rtx, bitpos != 0), flags_rev); > else if (bitpos + bitsize <= mode_bitsize / 2) > result = store_field (XEXP (to_rtx, 0), bitsize, bitpos, > bitregion_start, bitregion_end, > mode1, from, get_alias_set (to), > - nontemporal, reversep); > + flags_rev); > else if (bitpos >= mode_bitsize / 2) > result = store_field (XEXP (to_rtx, 1), bitsize, > bitpos - 
mode_bitsize / 2, > bitregion_start, bitregion_end, > mode1, from, get_alias_set (to), > - nontemporal, reversep); > + flags_rev); > else if (bitpos == 0 && bitsize == mode_bitsize) > { > result = expand_normal (from); > @@ -5166,7 +5173,8 @@ expand_assignment (tree to, tree from, b > result = store_field (temp, bitsize, bitpos, > bitregion_start, bitregion_end, > mode1, from, get_alias_set (to), > - nontemporal, reversep); > + (enum expand_flag) > + (flags_rev & ~EXPAND_FLAG_TAILCALL)); > emit_move_insn (XEXP (to_rtx, 0), read_complex_part (temp, false)); > emit_move_insn (XEXP (to_rtx, 1), read_complex_part (temp, true)); > } > @@ -5192,7 +5200,7 @@ expand_assignment (tree to, tree from, b > result = store_field (to_rtx, bitsize, bitpos, > bitregion_start, bitregion_end, > mode1, from, get_alias_set (to), > - nontemporal, reversep); > + flags_rev); > } > > if (result) > @@ -5341,7 +5349,7 @@ expand_assignment (tree to, tree from, b > /* Compute FROM and store the value in the rtx we got. */ > > push_temp_slots (); > - result = store_expr_with_bounds (from, to_rtx, 0, nontemporal, false, to); > + result = store_expr_with_bounds (from, to_rtx, flags, to); > preserve_temp_slots (result); > pop_temp_slots (); > return; > @@ -5375,19 +5383,14 @@ emit_storent_insn (rtx to, rtx from) > with no sequence point. Will other languages need this to > be more thorough? > > - If CALL_PARAM_P is nonzero, this is a store into a call param on the > - stack, and block moves may need to be treated specially. > - > - If NONTEMPORAL is true, try using a nontemporal store instruction. > - > - If REVERSE is true, the store is to be done in reverse order. > + FLAGS is a bitwise or of EXPAND_FLAG_* defined in expr.h. > > If BTARGET is not NULL then computed bounds of EXP are > associated with BTARGET. 
*/ > > rtx > -store_expr_with_bounds (tree exp, rtx target, int call_param_p, > - bool nontemporal, bool reverse, tree btarget) > +store_expr_with_bounds (tree exp, rtx target, enum expand_flag flags, > + tree btarget) > { > rtx temp; > rtx alt_rtl = NULL_RTX; > @@ -5398,7 +5401,7 @@ store_expr_with_bounds (tree exp, rtx ta > /* C++ can generate ?: expressions with a throw expression in one > branch and an rvalue in the other. Here, we resolve attempts to > store the throw expression's nonexistent result. */ > - gcc_assert (!call_param_p); > + gcc_assert ((flags & EXPAND_FLAG_CALL_PARAM_P) == 0); > expand_expr (exp, const0_rtx, VOIDmode, EXPAND_NORMAL); > return NULL_RTX; > } > @@ -5407,9 +5410,9 @@ store_expr_with_bounds (tree exp, rtx ta > /* Perform first part of compound expression, then assign from second > part. */ > expand_expr (TREE_OPERAND (exp, 0), const0_rtx, VOIDmode, > - call_param_p ? EXPAND_STACK_PARM : EXPAND_NORMAL); > - return store_expr_with_bounds (TREE_OPERAND (exp, 1), target, > - call_param_p, nontemporal, reverse, > + (flags & EXPAND_FLAG_CALL_PARAM_P) > + ? 
EXPAND_STACK_PARM : EXPAND_NORMAL); > + return store_expr_with_bounds (TREE_OPERAND (exp, 1), target, flags, > btarget); > } > else if (TREE_CODE (exp) == COND_EXPR && GET_MODE (target) == BLKmode) > @@ -5425,13 +5428,15 @@ store_expr_with_bounds (tree exp, rtx ta > NO_DEFER_POP; > jumpifnot (TREE_OPERAND (exp, 0), lab1, > profile_probability::uninitialized ()); > - store_expr_with_bounds (TREE_OPERAND (exp, 1), target, call_param_p, > - nontemporal, reverse, btarget); > + store_expr_with_bounds (TREE_OPERAND (exp, 1), target, > + (enum expand_flag) > + (flags & ~EXPAND_FLAG_TAILCALL), btarget); > emit_jump_insn (targetm.gen_jump (lab2)); > emit_barrier (); > emit_label (lab1); > - store_expr_with_bounds (TREE_OPERAND (exp, 2), target, call_param_p, > - nontemporal, reverse, btarget); > + store_expr_with_bounds (TREE_OPERAND (exp, 2), target, > + (enum expand_flag) > + (flags & ~EXPAND_FLAG_TAILCALL), btarget); > emit_label (lab2); > OK_DEFER_POP; > > @@ -5482,7 +5487,8 @@ store_expr_with_bounds (tree exp, rtx ta > } > > temp = expand_expr (exp, inner_target, VOIDmode, > - call_param_p ? EXPAND_STACK_PARM : EXPAND_NORMAL); > + (flags & EXPAND_FLAG_CALL_PARAM_P) > + ? EXPAND_STACK_PARM : EXPAND_NORMAL); > > /* Handle bounds returned by call. */ > if (TREE_CODE (exp) == CALL_EXPR) > @@ -5518,7 +5524,8 @@ store_expr_with_bounds (tree exp, rtx ta > && TREE_CODE (TREE_OPERAND (TREE_OPERAND (exp, 0), 0)) > == STRING_CST > && integer_zerop (TREE_OPERAND (exp, 1)))) > - && !nontemporal && !call_param_p > + && (flags & (EXPAND_FLAG_NONTEMPORAL > + | EXPAND_FLAG_CALL_PARAM_P)) == 0 > && MEM_P (target)) > { > /* Optimize initialization of an array with a STRING_CST. */ > @@ -5562,7 +5569,21 @@ store_expr_with_bounds (tree exp, rtx ta > if (exp_len > str_copy_len) > clear_storage (adjust_address (dest_mem, BLKmode, 0), > GEN_INT (exp_len - str_copy_len), > - BLOCK_OP_NORMAL); > + (flags & EXPAND_FLAG_TAILCALL) > + ? 
BLOCK_OP_TAILCALL : BLOCK_OP_NORMAL); > + return NULL_RTX; > + } > + else if (flags == EXPAND_FLAG_TAILCALL > + && TREE_CODE (exp) == CONSTRUCTOR > + && TREE_CODE (TREE_TYPE (exp)) != ERROR_MARK > + && TREE_STATIC (exp) > + && !TREE_ADDRESSABLE (exp) > + && TYPE_MODE (TREE_TYPE (exp)) == BLKmode > + && MEM_P (target) > + && GET_MODE (target) == BLKmode > + && CONSTRUCTOR_NELTS (exp) == 0) > + { > + clear_storage (target, expr_size (exp), BLOCK_OP_TAILCALL); > return NULL_RTX; > } > else > @@ -5572,9 +5593,10 @@ store_expr_with_bounds (tree exp, rtx ta > normal_expr: > /* If we want to use a nontemporal or a reverse order store, force the > value into a register first. */ > - tmp_target = nontemporal || reverse ? NULL_RTX : target; > + tmp_target = (flags & (EXPAND_FLAG_NONTEMPORAL | EXPAND_FLAG_REVERSE)) > + ? NULL_RTX : target; > temp = expand_expr_real (exp, tmp_target, GET_MODE (target), > - (call_param_p > + ((flags & EXPAND_FLAG_CALL_PARAM_P) > ? EXPAND_STACK_PARM : EXPAND_NORMAL), > &alt_rtl, false); > > @@ -5647,7 +5669,8 @@ store_expr_with_bounds (tree exp, rtx ta > else > store_bit_field (target, > INTVAL (expr_size (exp)) * BITS_PER_UNIT, > - 0, 0, 0, GET_MODE (temp), temp, reverse); > + 0, 0, 0, GET_MODE (temp), temp, > + (flags & EXPAND_FLAG_REVERSE) != 0); > } > else > convert_move (target, temp, TYPE_UNSIGNED (TREE_TYPE (exp))); > @@ -5664,8 +5687,10 @@ store_expr_with_bounds (tree exp, rtx ta > if (CONST_INT_P (size) > && INTVAL (size) < TREE_STRING_LENGTH (exp)) > emit_block_move (target, temp, size, > - (call_param_p > - ? BLOCK_OP_CALL_PARM : BLOCK_OP_NORMAL)); > + ((flags & EXPAND_FLAG_CALL_PARAM_P) > + ? BLOCK_OP_CALL_PARM > + : (flags & EXPAND_FLAG_TAILCALL) > + ? 
BLOCK_OP_TAILCALL : BLOCK_OP_NORMAL)); > else > { > machine_mode pointer_mode > @@ -5679,7 +5704,7 @@ store_expr_with_bounds (tree exp, rtx ta > size_int (TREE_STRING_LENGTH (exp))); > rtx copy_size_rtx > = expand_expr (copy_size, NULL_RTX, VOIDmode, > - (call_param_p > + ((flags & EXPAND_FLAG_CALL_PARAM_P) > ? EXPAND_STACK_PARM : EXPAND_NORMAL)); > rtx_code_label *label = 0; > > @@ -5687,7 +5712,7 @@ store_expr_with_bounds (tree exp, rtx ta > copy_size_rtx = convert_to_mode (pointer_mode, copy_size_rtx, > TYPE_UNSIGNED (sizetype)); > emit_block_move (target, temp, copy_size_rtx, > - (call_param_p > + ((flags & EXPAND_FLAG_CALL_PARAM_P) > ? BLOCK_OP_CALL_PARM : BLOCK_OP_NORMAL)); > > /* Figure out how much is left in TARGET that we have to clear. > @@ -5739,14 +5764,17 @@ store_expr_with_bounds (tree exp, rtx ta > int_size_in_bytes (TREE_TYPE (exp))); > else if (GET_MODE (temp) == BLKmode) > emit_block_move (target, temp, expr_size (exp), > - (call_param_p > - ? BLOCK_OP_CALL_PARM : BLOCK_OP_NORMAL)); > + ((flags & EXPAND_FLAG_CALL_PARAM_P) > + ? BLOCK_OP_CALL_PARM > + : (flags & EXPAND_FLAG_TAILCALL) > + ? BLOCK_OP_TAILCALL : BLOCK_OP_NORMAL)); > /* If we emit a nontemporal store, there is nothing else to do. */ > - else if (nontemporal && emit_storent_insn (target, temp)) > + else if ((flags & EXPAND_FLAG_NONTEMPORAL) > + && emit_storent_insn (target, temp)) > ; > else > { > - if (reverse) > + if (flags & EXPAND_FLAG_REVERSE) > temp = flip_storage_order (GET_MODE (target), temp); > temp = force_operand (temp, target); > if (temp != target) > @@ -5759,11 +5787,9 @@ store_expr_with_bounds (tree exp, rtx ta > > /* Same as store_expr_with_bounds but ignoring bounds of EXP. 
*/ > rtx > -store_expr (tree exp, rtx target, int call_param_p, bool nontemporal, > - bool reverse) > +store_expr (tree exp, rtx target, enum expand_flag flags) > { > - return store_expr_with_bounds (exp, target, call_param_p, nontemporal, > - reverse, NULL); > + return store_expr_with_bounds (exp, target, flags, NULL); > } > > /* Return true if field F of structure TYPE is a flexible array. */ > @@ -6141,7 +6167,8 @@ store_constructor_field (rtx target, uns > } > else > store_field (target, bitsize, bitpos, bitregion_start, bitregion_end, mode, > - exp, alias_set, false, reverse); > + exp, alias_set, > + reverse ? EXPAND_FLAG_REVERSE : EXPAND_FLAG_NORMAL); > } > > > @@ -6338,6 +6365,8 @@ store_constructor (tree exp, rtx target, > > /* The storage order is specified for every aggregate type. */ > reverse = TYPE_REVERSE_STORAGE_ORDER (type); > + enum expand_flag flags_rev > + = reverse ? EXPAND_FLAG_REVERSE : EXPAND_FLAG_NORMAL; > > domain = TYPE_DOMAIN (type); > const_bounds_p = (TYPE_MIN_VALUE (domain) > @@ -6495,7 +6524,7 @@ store_constructor (tree exp, rtx target, > VAR_DECL, NULL_TREE, domain); > index_r = gen_reg_rtx (promote_decl_mode (index, NULL)); > SET_DECL_RTL (index, index_r); > - store_expr (lo_index, index_r, 0, false, reverse); > + store_expr (lo_index, index_r, flags_rev); > > /* Build the head of the loop. */ > do_pending_stack_adjust (); > @@ -6522,7 +6551,7 @@ store_constructor (tree exp, rtx target, > store_constructor (value, xtarget, cleared, > bitsize / BITS_PER_UNIT, reverse); > else > - store_expr (value, xtarget, 0, false, reverse); > + store_expr (value, xtarget, flags_rev); > > /* Generate a conditional jump to exit the loop. 
*/ > exit_cond = build2 (LT_EXPR, integer_type_node, > @@ -6535,7 +6564,7 @@ store_constructor (tree exp, rtx target, > expand_assignment (index, > build2 (PLUS_EXPR, TREE_TYPE (index), > index, integer_one_node), > - false); > + EXPAND_FLAG_NORMAL); > > emit_jump (loop_start); > > @@ -6566,7 +6595,7 @@ store_constructor (tree exp, rtx target, > expand_normal (position), > highest_pow2_factor (position)); > xtarget = adjust_address (xtarget, mode, 0); > - store_expr (value, xtarget, 0, false, reverse); > + store_expr (value, xtarget, flags_rev); > } > else > { > @@ -6760,16 +6789,14 @@ store_constructor (tree exp, rtx target, > (in general) be different from that for TARGET, since TARGET is a > reference to the containing structure. > > - If NONTEMPORAL is true, try generating a nontemporal store. > - > - If REVERSE is true, the store is to be done in reverse order. */ > + FLAGS is a bitmask of EXPAND_FLAG_* flags defined in expr.h. */ > > static rtx > store_field (rtx target, HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos, > unsigned HOST_WIDE_INT bitregion_start, > unsigned HOST_WIDE_INT bitregion_end, > machine_mode mode, tree exp, > - alias_set_type alias_set, bool nontemporal, bool reverse) > + alias_set_type alias_set, enum expand_flag flags) > { > if (TREE_CODE (exp) == ERROR_MARK) > return const0_rtx; > @@ -6787,7 +6814,7 @@ store_field (rtx target, HOST_WIDE_INT b > /* We're storing into a struct containing a single __complex. 
*/ > > gcc_assert (!bitpos); > - return store_expr (exp, target, 0, nontemporal, reverse); > + return store_expr (exp, target, flags); > } > > /* If the structure is in a register or if the component > @@ -6903,11 +6930,16 @@ store_field (rtx target, HOST_WIDE_INT b > { > HOST_WIDE_INT size = GET_MODE_BITSIZE (temp_mode); > > - reverse = TYPE_REVERSE_STORAGE_ORDER (TREE_TYPE (exp)); > + bool reverse = TYPE_REVERSE_STORAGE_ORDER (TREE_TYPE (exp)); > > if (reverse) > temp = flip_storage_order (temp_mode, temp); > > + if (reverse) > + flags = (enum expand_flag) (flags | EXPAND_FLAG_REVERSE); > + else > + flags = (enum expand_flag) (flags & ~EXPAND_FLAG_REVERSE); > + > if (bitsize < size > && reverse ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN > && !(mode == BLKmode && bitsize > BITS_PER_WORD)) > @@ -6937,7 +6969,8 @@ store_field (rtx target, HOST_WIDE_INT b > emit_block_move (target, temp, > GEN_INT ((bitsize + BITS_PER_UNIT - 1) > / BITS_PER_UNIT), > - BLOCK_OP_NORMAL); > + (flags & EXPAND_FLAG_TAILCALL) > + ? BLOCK_OP_TAILCALL : BLOCK_OP_NORMAL); > > return const0_rtx; > } > @@ -6954,7 +6987,7 @@ store_field (rtx target, HOST_WIDE_INT b > /* Store the value in the bitfield. */ > store_bit_field (target, bitsize, bitpos, > bitregion_start, bitregion_end, > - mode, temp, reverse); > + mode, temp, (flags & EXPAND_FLAG_REVERSE) != 0); > > return const0_rtx; > } > @@ -6974,11 +7007,12 @@ store_field (rtx target, HOST_WIDE_INT b > if (TREE_CODE (exp) == CONSTRUCTOR && bitsize >= 0) > { > gcc_assert (bitsize % BITS_PER_UNIT == 0); > - store_constructor (exp, to_rtx, 0, bitsize / BITS_PER_UNIT, reverse); > + store_constructor (exp, to_rtx, 0, bitsize / BITS_PER_UNIT, > + (flags & EXPAND_FLAG_REVERSE) != 0); > return to_rtx; > } > > - return store_expr (exp, to_rtx, 0, nontemporal, reverse); > + return store_expr (exp, to_rtx, flags); > } > } > > @@ -8322,8 +8356,11 @@ expand_expr_real_2 (sepops ops, rtx targ > /* Store data into beginning of memory target. 
*/ > store_expr (treeop0, > adjust_address (target, TYPE_MODE (valtype), 0), > - modifier == EXPAND_STACK_PARM, > - false, TYPE_REVERSE_STORAGE_ORDER (type)); > + (enum expand_flag) > + ((modifier == EXPAND_STACK_PARM > + ? EXPAND_FLAG_CALL_PARAM_P : EXPAND_FLAG_NORMAL) > + | (TYPE_REVERSE_STORAGE_ORDER (type) > + ? EXPAND_FLAG_REVERSE : EXPAND_FLAG_NORMAL))); > > else > { > @@ -8337,7 +8374,7 @@ expand_expr_real_2 (sepops ops, rtx targ > * BITS_PER_UNIT), > (HOST_WIDE_INT) GET_MODE_BITSIZE (mode)), > 0, 0, 0, TYPE_MODE (valtype), treeop0, 0, > - false, false); > + EXPAND_FLAG_NORMAL); > } > > /* Return the entire union. */ > @@ -9548,15 +9585,15 @@ expand_expr_real_2 (sepops ops, rtx targ > jumpifnot (treeop0, lab0, > profile_probability::uninitialized ()); > store_expr (treeop1, temp, > - modifier == EXPAND_STACK_PARM, > - false, false); > + modifier == EXPAND_STACK_PARM > + ? EXPAND_FLAG_CALL_PARAM_P : EXPAND_FLAG_NORMAL); > > emit_jump_insn (targetm.gen_jump (lab1)); > emit_barrier (); > emit_label (lab0); > store_expr (treeop2, temp, > - modifier == EXPAND_STACK_PARM, > - false, false); > + modifier == EXPAND_STACK_PARM > + ? EXPAND_FLAG_CALL_PARAM_P : EXPAND_FLAG_NORMAL); > > emit_label (lab1); > OK_DEFER_POP; > @@ -10182,7 +10219,7 @@ expand_expr_real_1 (tree exp, rtx target > { > temp = assign_stack_temp (DECL_MODE (base), > GET_MODE_SIZE (DECL_MODE (base))); > - store_expr (base, temp, 0, false, false); > + store_expr (base, temp, EXPAND_FLAG_NORMAL); > temp = adjust_address (temp, BLKmode, offset); > set_mem_size (temp, int_size_in_bytes (type)); > return temp; > @@ -11075,13 +11112,13 @@ expand_expr_real_1 (tree exp, rtx target > value ? 
0 : label, > profile_probability::uninitialized ()); > expand_assignment (lhs, build_int_cst (TREE_TYPE (rhs), value), > - false); > + EXPAND_FLAG_NORMAL); > do_pending_stack_adjust (); > emit_label (label); > return const0_rtx; > } > > - expand_assignment (lhs, rhs, false); > + expand_assignment (lhs, rhs, EXPAND_FLAG_NORMAL); > return const0_rtx; > } > > --- gcc/testsuite/gcc.target/i386/pr41455.c.jj 2017-12-06 18:06:10.552506649 +0100 > +++ gcc/testsuite/gcc.target/i386/pr41455.c 2017-12-06 18:05:51.000000000 +0100 > @@ -0,0 +1,23 @@ > +/* PR middle-end/41455 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mstringop-strategy=libcall" } */ > +/* Verify we tail call memcpy and memset. */ > +/* { dg-final { scan-assembler "jmp\[ \t]*_*memcpy" } } */ > +/* { dg-final { scan-assembler "jmp\[ \t]*_*memset" } } */ > + > +struct S { char c[111111]; }; > + > +void > +foo (struct S *a, struct S *b, int *c) > +{ > + *c = 0; > + *a = *b; > +} > + > +void > +bar (struct S *a, int *b, int *c) > +{ > + *b = 0; > + *c = 0; > + *a = (struct S) {}; > +} > > Jakub > >
On Fri, Dec 15, 2017 at 10:30:32AM +0100, Richard Biener wrote:
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> Hum, it doesn't look pretty ;) Can we defer this to stage1 given
> it's a long-standing issue and we have quite big changes going in still?

Ok, deferred.

Jakub
On 12/06/2017 02:04 PM, Jakub Jelinek wrote:
> Hi!
>
> Aggregate assignments and clears aren't in GIMPLE represented as calls,
> and while often they expand inline, sometimes we emit libcalls for them.
> This patch allows us to tail call those libcalls if there is nothing
> after them. The patch changes the tailcall pass, so that it recognizes
> a = b; and c = {}; statements under certain conditions as potential tail
> calls returning void, and if it finds good tail call candidates, it marks
> them specially. Because we have only a single bit left for GIMPLE_ASSIGN,
> I've decided to wrap the rhs1 into a new internal call, so
> a = b; will be transformed into a = TAILCALL_ASSIGN (b); and
> c = {}; will be transformed into c = TAILCALL_ASSIGN ();
> The rest of the patch is about propagating the flag (may use tailcall if
> the emit_block_move or clear_storage is the last thing emitted) down
> through expand_assignment and functions it calls.
>
> Those functions use 1-3 other flags, so instead of adding another bool
> to all of them (next to nontemporal, call_param_p, reverse) I've decided
> to pass around a bitmask of flags.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2017-12-06 Jakub Jelinek <jakub@redhat.com>
>
>     PR target/41455
>     PR target/82935
>     * internal-fn.def (TAILCALL_ASSIGN): New internal function.
>     * internal-fn.c (expand_LAUNDER): Pass EXPAND_FLAG_NORMAL to
>     expand_assignment.
>     (expand_TAILCALL_ASSIGN): New function.
>     * tree-tailcall.c (struct tailcall): Adjust comment.
>     (find_tail_calls): Recognize also aggregate assignments and
>     aggregate clearing as possible tail calls. Use is_gimple_assign
>     instead of gimple_code check.
>     (optimize_tail_call): Rewrite aggregate assignments or aggregate
>     clearing in tail call positions using IFN_TAILCALL_ASSIGN
>     internal function.
>     * tree-outof-ssa.c (insert_value_copy_on_edge): Adjust store_expr
>     caller.
>     * tree-chkp.c (chkp_expand_bounds_reset_for_mem): Adjust
>     expand_assignment caller.
>     * function.c (assign_parm_setup_reg): Likewise.
>     * ubsan.c (ubsan_encode_value): Likewise.
>     * cfgexpand.c (expand_call_stmt, expand_asm_stmt): Likewise.
>     (expand_gimple_stmt_1): Likewise. Fix up formatting.
>     * calls.c (initialize_argument_information): Adjust store_expr caller.
>     * expr.h (enum expand_flag): New.
>     (expand_assignment): Replace bool argument with enum expand_flag.
>     (store_expr_with_bounds, store_expr): Replace int, bool, bool arguments
>     with enum expand_flag.
>     * expr.c (expand_assignment): Replace nontemporal argument with flags.
>     Assert no bits other than EXPAND_FLAG_NONTEMPORAL and
>     EXPAND_FLAG_TAILCALL are set. Adjust store_expr, store_field and
>     store_expr_with_bounds callers.
>     (store_expr_with_bounds): Replace call_param_p, nontemporal and
>     reverse args with flags argument. Adjust recursive calls. Pass
>     BLOCK_OP_TAILCALL to clear_storage and emit_block_move if
>     EXPAND_FLAG_TAILCALL is set. Call clear_storage directly for
>     EXPAND_FLAG_TAILCALL assignments from empty CONSTRUCTOR.
>     (store_expr): Replace call_param_p, nontemporal and reverse args
>     with flags argument. Adjust store_expr_with_bounds caller.
>     (store_constructor_field): Adjust store_field caller.
>     (store_constructor): Adjust store_expr and expand_assignment callers.
>     (store_field): Replace nontemporal and reverse arguments with flags
>     argument. Adjust store_expr callers. Pass BLOCK_OP_TAILCALL to
>     emit_block_move if EXPAND_FLAG_TAILCALL is set.
>     (expand_expr_real_2): Adjust store_expr and store_field callers.
>     (expand_expr_real_1): Adjust store_expr and expand_assignment callers.
>
>     * gcc.target/i386/pr41455.c: New test.

This looks pretty reasonable to me. Is it big? Yes, but a fair amount
is changing how we pass flags into the expanders.

I think you just need to merge back to the trunk, retest, and this should
be good to go.

jeff
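Since much of the patch's size comes from changing how flags reach the expanders, a rough standalone sketch of that change may help when reading the diff. It is not GCC code: the enum values are copied from the expr.h hunk, but every `sketch_*` name and `enum sketch_block_op` are hypothetical stand-ins, and the helpers only model how the patch appears to combine, restrict and consume the mask.

```c
#include <assert.h>
#include <stdbool.h>

/* Same names and values as the enum the patch adds to expr.h.  */
enum expand_flag
{
  EXPAND_FLAG_NORMAL = 0,
  EXPAND_FLAG_NONTEMPORAL = 1,
  EXPAND_FLAG_TAILCALL = 2,
  EXPAND_FLAG_REVERSE = 4,
  EXPAND_FLAG_CALL_PARAM_P = 8
};

/* Stand-in for GCC's block operation methods; only the three values the
   patch selects between are modelled (assumption, not the real enum).  */
enum sketch_block_op
{
  SKETCH_BLOCK_OP_NORMAL,
  SKETCH_BLOCK_OP_CALL_PARM,
  SKETCH_BLOCK_OP_TAILCALL
};

/* Hypothetical helper: fold the old (call_param_p, nontemporal, reverse)
   argument triple of store_expr into one mask.  */
static enum expand_flag
sketch_flags_from_bools (bool call_param_p, bool nontemporal, bool reverse)
{
  int flags = EXPAND_FLAG_NORMAL;
  if (call_param_p)
    flags |= EXPAND_FLAG_CALL_PARAM_P;
  if (nontemporal)
    flags |= EXPAND_FLAG_NONTEMPORAL;
  if (reverse)
    flags |= EXPAND_FLAG_REVERSE;
  return (enum expand_flag) flags;
}

/* Hypothetical helper modelled on the patched expand_assignment: only
   NONTEMPORAL and TAILCALL may come in, REVERSE is added for the stores.  */
static enum expand_flag
sketch_store_flags (enum expand_flag flags, bool reversep)
{
  assert ((flags & ~(EXPAND_FLAG_NONTEMPORAL | EXPAND_FLAG_TAILCALL)) == 0);
  return (enum expand_flag) (reversep ? flags | EXPAND_FLAG_REVERSE : flags);
}

/* Hypothetical helper: clear the tail call permission, as the patch does
   when more code is emitted after the store (e.g. both COND_EXPR arms).  */
static enum expand_flag
sketch_drop_tailcall (enum expand_flag flags)
{
  return (enum expand_flag) (flags & ~EXPAND_FLAG_TAILCALL);
}

/* Hypothetical helper: the block-op choice made at the emit_block_move
   call sites in store_expr_with_bounds.  CALL_PARAM_P wins over TAILCALL.  */
static enum sketch_block_op
sketch_block_op_for (enum expand_flag flags)
{
  if (flags & EXPAND_FLAG_CALL_PARAM_P)
    return SKETCH_BLOCK_OP_CALL_PARM;
  if (flags & EXPAND_FLAG_TAILCALL)
    return SKETCH_BLOCK_OP_TAILCALL;
  return SKETCH_BLOCK_OP_NORMAL;
}
```

The point of the mask is that a new property such as "may be a tail call" costs one enumerator instead of one more bool at every call site; a caller that used to write `store_expr (x, y, 0, false, false)` simply passes `EXPAND_FLAG_NORMAL`.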
--- gcc/internal-fn.def.jj 2017-12-06 09:02:30.072952012 +0100 +++ gcc/internal-fn.def 2017-12-06 16:56:20.958518104 +0100 @@ -254,6 +254,11 @@ DEF_INTERNAL_FN (LAUNDER, ECF_LEAF | ECF /* Divmod function. */ DEF_INTERNAL_FN (DIVMOD, ECF_CONST | ECF_LEAF, NULL) +/* Special markup for aggregate copy or clear that can be implemented + using a tailcall. lhs = rhs1; is represented by + lhs = TAILCALL_ASSIGN (rhs1); and lhs = {}; by lhs = TAILCALL_ASSIGN (); */ +DEF_INTERNAL_FN (TAILCALL_ASSIGN, ECF_NOTHROW | ECF_LEAF, NULL) + #undef DEF_INTERNAL_INT_FN #undef DEF_INTERNAL_FLT_FN #undef DEF_INTERNAL_FLT_FLOATN_FN --- gcc/internal-fn.c.jj 2017-12-06 09:02:29.968953307 +0100 +++ gcc/internal-fn.c 2017-12-06 18:00:15.993826828 +0100 @@ -2672,7 +2672,7 @@ expand_LAUNDER (internal_fn, gcall *call if (!lhs) return; - expand_assignment (lhs, gimple_call_arg (call, 0), false); + expand_assignment (lhs, gimple_call_arg (call, 0), EXPAND_FLAG_NORMAL); } /* Expand DIVMOD() using: @@ -2722,6 +2722,23 @@ expand_DIVMOD (internal_fn, gcall *call_ target, VOIDmode, EXPAND_NORMAL); } +/* Expand TAILCALL_ASSIGN. */ + +static void +expand_TAILCALL_ASSIGN (internal_fn, gcall *call_stmt) +{ + tree lhs = gimple_call_lhs (call_stmt); + tree rhs; + if (gimple_call_num_args (call_stmt) == 0) + { + rhs = build_constructor (TREE_TYPE (lhs), NULL); + TREE_STATIC (rhs) = 1; + } + else + rhs = gimple_call_arg (call_stmt, 0); + expand_assignment (lhs, rhs, EXPAND_FLAG_TAILCALL); +} + /* Expand a call to FN using the operands in STMT. FN has a single output operand and NARGS input operands. */ --- gcc/tree-tailcall.c.jj 2017-12-06 09:02:30.031952522 +0100 +++ gcc/tree-tailcall.c 2017-12-06 17:59:37.166299929 +0100 @@ -106,7 +106,8 @@ along with GCC; see the file COPYING3. struct tailcall { - /* The iterator pointing to the call statement. */ + /* The iterator pointing to the call (or aggregate copy that might be + expanded as call) statement. 
*/ gimple_stmt_iterator call_gsi; /* True if it is a call to the current function. */ @@ -398,8 +399,7 @@ static void find_tail_calls (basic_block bb, struct tailcall **ret) { tree ass_var = NULL_TREE, ret_var, func, param; - gimple *stmt; - gcall *call = NULL; + gimple *stmt, *call = NULL; gimple_stmt_iterator gsi, agsi; bool tail_recursion; struct tailcall *nw; @@ -428,7 +428,7 @@ find_tail_calls (basic_block bb, struct /* Check for a call. */ if (is_gimple_call (stmt)) { - call = as_a <gcall *> (stmt); + call = stmt; ass_var = gimple_call_lhs (call); break; } @@ -440,6 +440,38 @@ find_tail_calls (basic_block bb, struct && auto_var_in_fn_p (gimple_assign_rhs1 (stmt), cfun->decl)) continue; + /* In addition to calls, allow aggregate copies that could be expanded + as memcpy or memset. Pretend it has NULL lhs then. */ + if (gimple_references_memory_p (stmt) + && gimple_assign_single_p (stmt) + && !gimple_has_volatile_ops (stmt) + && !gimple_assign_nontemporal_move_p (as_a <gassign *> (stmt)) + && gimple_vdef (stmt) + && !is_gimple_reg_type (TREE_TYPE (gimple_assign_lhs (stmt)))) + { + tree lhs = gimple_assign_lhs (stmt); + if (TYPE_MODE (TREE_TYPE (lhs)) != BLKmode) + return; + tree rhs1 = gimple_assign_rhs1 (stmt); + if (auto_var_in_fn_p (get_base_address (lhs), cfun->decl)) + return; + if (TREE_CODE (rhs1) == CONSTRUCTOR) + { + if (CONSTRUCTOR_NELTS (rhs1) != 0 || !TREE_STATIC (rhs1)) + return; + } + else if (auto_var_in_fn_p (get_base_address (rhs1), cfun->decl)) + return; + if (reverse_storage_order_for_component_p (lhs) + || reverse_storage_order_for_component_p (rhs1)) + return; + if (operand_equal_p (lhs, rhs1, 0)) + return; + call = stmt; + ass_var = NULL_TREE; + break; + } + /* If the statement references memory or volatile operands, fail. */ if (gimple_references_memory_p (stmt) || gimple_has_volatile_ops (stmt)) @@ -474,7 +506,7 @@ find_tail_calls (basic_block bb, struct /* We found the call, check whether it is suitable. 
*/ tail_recursion = false; - func = gimple_call_fndecl (call); + func = is_gimple_call (call) ? gimple_call_fndecl (call) : NULL_TREE; if (func && !DECL_BUILT_IN (func) && recursive_call_p (current_function_decl, func)) @@ -521,7 +553,9 @@ find_tail_calls (basic_block bb, struct && auto_var_in_fn_p (var, cfun->decl) && may_be_aliased (var) && (ref_maybe_used_by_stmt_p (call, var) - || call_may_clobber_ref_p (call, var))) + || (is_gimple_call (call) + ? call_may_clobber_ref_p (as_a <gcall *> (call), var) + : refs_output_dependent_p (gimple_assign_lhs (call), var)))) return; } @@ -560,7 +594,7 @@ find_tail_calls (basic_block bb, struct || is_gimple_debug (stmt)) continue; - if (gimple_code (stmt) != GIMPLE_ASSIGN) + if (!is_gimple_assign (stmt)) return; /* This is a gimple assign. */ @@ -956,9 +990,31 @@ optimize_tail_call (struct tailcall *t, if (opt_tailcalls) { - gcall *stmt = as_a <gcall *> (gsi_stmt (t->call_gsi)); - - gimple_call_set_tail (stmt, true); + gimple *stmt = gsi_stmt (t->call_gsi); + if (gcall *call = dyn_cast <gcall *> (stmt)) + gimple_call_set_tail (call, true); + else + { + tree lhs = gimple_assign_lhs (stmt); + tree rhs1 = gimple_assign_rhs1 (stmt); + gcall *g; + if (TREE_CODE (rhs1) == CONSTRUCTOR) + g = gimple_build_call_internal (IFN_TAILCALL_ASSIGN, 0); + else + g = gimple_build_call_internal (IFN_TAILCALL_ASSIGN, 1, + rhs1); + gimple_call_set_lhs (g, lhs); + gimple_set_location (g, gimple_location (stmt)); + if (gimple_vdef (stmt) + && TREE_CODE (gimple_vdef (stmt)) == SSA_NAME) + { + gimple_set_vdef (g, gimple_vdef (stmt)); + SSA_NAME_DEF_STMT (gimple_vdef (g)) = g; + } + if (gimple_vuse (stmt)) + gimple_set_vuse (g, gimple_vuse (stmt)); + gsi_replace (&t->call_gsi, g, false); + } cfun->tail_call_marked = true; if (dump_file && (dump_flags & TDF_DETAILS)) { --- gcc/tree-outof-ssa.c.jj 2017-10-09 09:41:21.000000000 +0200 +++ gcc/tree-outof-ssa.c 2017-12-06 17:17:16.468209052 +0100 @@ -311,7 +311,7 @@ insert_value_copy_on_edge (edge e, int d 
else if (src_mode == BLKmode) { x = dest_rtx; - store_expr (src, x, 0, false, false); + store_expr (src, x, EXPAND_FLAG_NORMAL); } else x = expand_expr (src, dest_rtx, dest_mode, EXPAND_NORMAL); --- gcc/tree-chkp.c.jj 2017-12-06 09:02:30.141951153 +0100 +++ gcc/tree-chkp.c 2017-12-06 16:56:20.956518128 +0100 @@ -481,7 +481,7 @@ chkp_expand_bounds_reset_for_mem (tree m build_pointer_type (TREE_TYPE (mem)), mem); bndstx = chkp_build_bndstx_call (addr, ptr, bnd); - expand_assignment (bnd, zero_bnd, false); + expand_assignment (bnd, zero_bnd, EXPAND_FLAG_NORMAL); expand_normal (bndstx); } --- gcc/function.c.jj 2017-12-06 09:02:29.991953021 +0100 +++ gcc/function.c 2017-12-06 16:56:20.958518104 +0100 @@ -3284,7 +3284,8 @@ assign_parm_setup_reg (struct assign_par /* TREE_USED gets set erroneously during expand_assignment. */ save_tree_used = TREE_USED (parm); SET_DECL_RTL (parm, rtl); - expand_assignment (parm, make_tree (data->nominal_type, tempreg), false); + expand_assignment (parm, make_tree (data->nominal_type, tempreg), + EXPAND_FLAG_NORMAL); SET_DECL_RTL (parm, NULL_RTX); TREE_USED (parm) = save_tree_used; all->first_conversion_insn = get_insns (); --- gcc/ubsan.c.jj 2017-12-06 09:02:29.951953519 +0100 +++ gcc/ubsan.c 2017-12-06 16:56:20.960518080 +0100 @@ -165,7 +165,7 @@ ubsan_encode_value (tree t, enum ubsan_e rtx mem = assign_stack_temp_for_type (mode, GET_MODE_SIZE (mode), type); SET_DECL_RTL (var, mem); - expand_assignment (var, t, false); + expand_assignment (var, t, EXPAND_FLAG_NORMAL); return build_fold_addr_expr (var); } if (phase != UBSAN_ENCODE_VALUE_GENERIC) --- gcc/cfgexpand.c.jj 2017-12-06 09:02:30.056952211 +0100 +++ gcc/cfgexpand.c 2017-12-06 16:56:20.959518092 +0100 @@ -2668,7 +2668,7 @@ expand_call_stmt (gcall *stmt) rtx_insn *before_call = get_last_insn (); lhs = gimple_call_lhs (stmt); if (lhs) - expand_assignment (lhs, exp, false); + expand_assignment (lhs, exp, EXPAND_FLAG_NORMAL); else expand_expr (exp, const0_rtx, VOIDmode, EXPAND_NORMAL); 
@@ -3071,7 +3071,7 @@ expand_asm_stmt (gasm *stmt) generating_concat_p = old_generating_concat_p; push_to_sequence2 (after_rtl_seq, after_rtl_end); - expand_assignment (val, make_tree (type, op), false); + expand_assignment (val, make_tree (type, op), EXPAND_FLAG_NORMAL); after_rtl_seq = get_insns (); after_rtl_end = get_last_insn (); end_sequence (); @@ -3672,9 +3672,14 @@ expand_gimple_stmt_1 (gimple *stmt) this LHS. */ ; else - expand_assignment (lhs, rhs, - gimple_assign_nontemporal_move_p ( - assign_stmt)); + { + enum expand_flag flag; + if (gimple_assign_nontemporal_move_p (assign_stmt)) + flag = EXPAND_FLAG_NONTEMPORAL; + else + flag = EXPAND_FLAG_NORMAL; + expand_assignment (lhs, rhs, flag); + } } else { --- gcc/calls.c.jj 2017-11-22 21:37:50.000000000 +0100 +++ gcc/calls.c 2017-12-06 17:10:12.432363809 +0100 @@ -1971,7 +1971,7 @@ initialize_argument_information (int num else copy = assign_temp (type, 1, 0); - store_expr (args[i].tree_value, copy, 0, false, false); + store_expr (args[i].tree_value, copy, EXPAND_FLAG_NORMAL); /* Just change the const function to pure and then let the next test clear the pure based on --- gcc/expr.h.jj 2017-12-06 09:02:30.120951414 +0100 +++ gcc/expr.h 2017-12-06 17:09:34.782823601 +0100 @@ -35,6 +35,26 @@ enum expand_modifier {EXPAND_NORMAL = 0, EXPAND_CONST_ADDRESS, EXPAND_INITIALIZER, EXPAND_WRITE, EXPAND_MEMORY}; +/* Flags arguments for expand_assignment/store_expr*. The argument is + a bitwise or of these flags. */ +enum expand_flag { + /* Value if none of the flags are set. */ + EXPAND_FLAG_NORMAL = 0, + /* Expand the assignment/store as nontemporal store if possible. */ + EXPAND_FLAG_NONTEMPORAL = 1, + /* If the assignment is expanded as a libcall, it can be a tail call. */ + EXPAND_FLAG_TAILCALL = 2, + + /* Flags below this point are only for store_expr*, not for + expand_assignment. */ + + /* Reverse bytes in the store. 
*/ + EXPAND_FLAG_REVERSE = 4, + /* True for stores into call params on the stack, where block moves to + that may need special treatment. */ + EXPAND_FLAG_CALL_PARAM_P = 8 +}; + /* Prevent the compiler from deferring stack pops. See inhibit_defer_pop for more information. */ #define NO_DEFER_POP (inhibit_defer_pop += 1) @@ -244,14 +264,14 @@ extern void get_bit_range (unsigned HOST tree, HOST_WIDE_INT *, tree *); /* Expand an assignment that stores the value of FROM into TO. */ -extern void expand_assignment (tree, tree, bool); +extern void expand_assignment (tree, tree, enum expand_flag); /* Generate code for computing expression EXP, and storing the value into TARGET. If SUGGEST_REG is nonzero, copy the value through a register and return that register, if that is possible. */ -extern rtx store_expr_with_bounds (tree, rtx, int, bool, bool, tree); -extern rtx store_expr (tree, rtx, int, bool, bool); +extern rtx store_expr_with_bounds (tree, rtx, enum expand_flag, tree); +extern rtx store_expr (tree, rtx, enum expand_flag); /* Given an rtx that may include add and multiply operations, generate them as insns and return a pseudo-reg containing the value. --- gcc/expr.c.jj 2017-12-06 09:02:30.103951626 +0100 +++ gcc/expr.c 2017-12-06 17:54:46.185845429 +0100 @@ -86,7 +86,7 @@ static void store_constructor_field (rtx static void store_constructor (tree, rtx, int, HOST_WIDE_INT, bool); static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT, unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT, - machine_mode, tree, alias_set_type, bool, bool); + machine_mode, tree, alias_set_type, enum expand_flag); static unsigned HOST_WIDE_INT highest_pow2_factor_for_target (const_tree, const_tree); @@ -4915,11 +4915,12 @@ mem_ref_refers_to_non_mem_p (tree ref) return addr_expr_of_non_mem_decl_p_1 (base, false); } -/* Expand an assignment that stores the value of FROM into TO. If NONTEMPORAL - is true, try generating a nontemporal store. 
*/ +/* Expand an assignment that stores the value of FROM into TO. If + flags & EXPAND_FLAG_NONTEMPORAL, try generating a nontemporal store, + if flags & EXPAND_FLAG_TAILCALL, allow generating a tail call. */ void -expand_assignment (tree to, tree from, bool nontemporal) +expand_assignment (tree to, tree from, enum expand_flag flags) { rtx to_rtx = 0; rtx result; @@ -4927,6 +4928,10 @@ expand_assignment (tree to, tree from, b unsigned int align; enum insn_code icode; + /* Rest of the flags only make sense for store_*. */ + gcc_checking_assert ((flags & ~(EXPAND_FLAG_NONTEMPORAL + | EXPAND_FLAG_TAILCALL)) == 0); + /* Don't crash if the lhs of the assignment was erroneous. */ if (TREE_CODE (to) == ERROR_MARK) { @@ -4992,6 +4997,7 @@ expand_assignment (tree to, tree from, b tree offset; int unsignedp, reversep, volatilep = 0; tree tem; + enum expand_flag flags_rev = flags; push_temp_slots (); tem = get_inner_reference (to, &bitsize, &bitpos, &offset, &mode1, @@ -5004,6 +5010,8 @@ expand_assignment (tree to, tree from, b offset = size_int (bitpos >> LOG2_BITS_PER_UNIT); bitpos &= BITS_PER_UNIT - 1; } + if (reversep) + flags_rev = (enum expand_flag) (flags | EXPAND_FLAG_REVERSE); if (TREE_CODE (to) == COMPONENT_REF && DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1))) @@ -5114,22 +5122,21 @@ expand_assignment (tree to, tree from, b && COMPLEX_MODE_P (GET_MODE (to_rtx)) && bitpos == 0 && bitsize == mode_bitsize) - result = store_expr (from, to_rtx, false, nontemporal, reversep); + result = store_expr (from, to_rtx, flags_rev); else if (bitsize == mode_bitsize / 2 && (bitpos == 0 || bitpos == mode_bitsize / 2)) - result = store_expr (from, XEXP (to_rtx, bitpos != 0), false, - nontemporal, reversep); + result = store_expr (from, XEXP (to_rtx, bitpos != 0), flags_rev); else if (bitpos + bitsize <= mode_bitsize / 2) result = store_field (XEXP (to_rtx, 0), bitsize, bitpos, bitregion_start, bitregion_end, mode1, from, get_alias_set (to), - nontemporal, reversep); + flags_rev); else 
if (bitpos >= mode_bitsize / 2) result = store_field (XEXP (to_rtx, 1), bitsize, bitpos - mode_bitsize / 2, bitregion_start, bitregion_end, mode1, from, get_alias_set (to), - nontemporal, reversep); + flags_rev); else if (bitpos == 0 && bitsize == mode_bitsize) { result = expand_normal (from); @@ -5166,7 +5173,8 @@ expand_assignment (tree to, tree from, b result = store_field (temp, bitsize, bitpos, bitregion_start, bitregion_end, mode1, from, get_alias_set (to), - nontemporal, reversep); + (enum expand_flag) + (flags_rev & ~EXPAND_FLAG_TAILCALL)); emit_move_insn (XEXP (to_rtx, 0), read_complex_part (temp, false)); emit_move_insn (XEXP (to_rtx, 1), read_complex_part (temp, true)); } @@ -5192,7 +5200,7 @@ expand_assignment (tree to, tree from, b result = store_field (to_rtx, bitsize, bitpos, bitregion_start, bitregion_end, mode1, from, get_alias_set (to), - nontemporal, reversep); + flags_rev); } if (result) @@ -5341,7 +5349,7 @@ expand_assignment (tree to, tree from, b /* Compute FROM and store the value in the rtx we got. */ push_temp_slots (); - result = store_expr_with_bounds (from, to_rtx, 0, nontemporal, false, to); + result = store_expr_with_bounds (from, to_rtx, flags, to); preserve_temp_slots (result); pop_temp_slots (); return; @@ -5375,19 +5383,14 @@ emit_storent_insn (rtx to, rtx from) with no sequence point. Will other languages need this to be more thorough? - If CALL_PARAM_P is nonzero, this is a store into a call param on the - stack, and block moves may need to be treated specially. - - If NONTEMPORAL is true, try using a nontemporal store instruction. - - If REVERSE is true, the store is to be done in reverse order. + FLAGS is a bitwise or of EXPAND_FLAG_* defined in expr.h. If BTARGET is not NULL then computed bounds of EXP are associated with BTARGET. 
*/ rtx -store_expr_with_bounds (tree exp, rtx target, int call_param_p, - bool nontemporal, bool reverse, tree btarget) +store_expr_with_bounds (tree exp, rtx target, enum expand_flag flags, + tree btarget) { rtx temp; rtx alt_rtl = NULL_RTX; @@ -5398,7 +5401,7 @@ store_expr_with_bounds (tree exp, rtx ta /* C++ can generate ?: expressions with a throw expression in one branch and an rvalue in the other. Here, we resolve attempts to store the throw expression's nonexistent result. */ - gcc_assert (!call_param_p); + gcc_assert ((flags & EXPAND_FLAG_CALL_PARAM_P) == 0); expand_expr (exp, const0_rtx, VOIDmode, EXPAND_NORMAL); return NULL_RTX; } @@ -5407,9 +5410,9 @@ store_expr_with_bounds (tree exp, rtx ta /* Perform first part of compound expression, then assign from second part. */ expand_expr (TREE_OPERAND (exp, 0), const0_rtx, VOIDmode, - call_param_p ? EXPAND_STACK_PARM : EXPAND_NORMAL); - return store_expr_with_bounds (TREE_OPERAND (exp, 1), target, - call_param_p, nontemporal, reverse, + (flags & EXPAND_FLAG_CALL_PARAM_P) + ? 
EXPAND_STACK_PARM : EXPAND_NORMAL); + return store_expr_with_bounds (TREE_OPERAND (exp, 1), target, flags, btarget); } else if (TREE_CODE (exp) == COND_EXPR && GET_MODE (target) == BLKmode) @@ -5425,13 +5428,15 @@ store_expr_with_bounds (tree exp, rtx ta NO_DEFER_POP; jumpifnot (TREE_OPERAND (exp, 0), lab1, profile_probability::uninitialized ()); - store_expr_with_bounds (TREE_OPERAND (exp, 1), target, call_param_p, - nontemporal, reverse, btarget); + store_expr_with_bounds (TREE_OPERAND (exp, 1), target, + (enum expand_flag) + (flags & ~EXPAND_FLAG_TAILCALL), btarget); emit_jump_insn (targetm.gen_jump (lab2)); emit_barrier (); emit_label (lab1); - store_expr_with_bounds (TREE_OPERAND (exp, 2), target, call_param_p, - nontemporal, reverse, btarget); + store_expr_with_bounds (TREE_OPERAND (exp, 2), target, + (enum expand_flag) + (flags & ~EXPAND_FLAG_TAILCALL), btarget); emit_label (lab2); OK_DEFER_POP; @@ -5482,7 +5487,8 @@ store_expr_with_bounds (tree exp, rtx ta } temp = expand_expr (exp, inner_target, VOIDmode, - call_param_p ? EXPAND_STACK_PARM : EXPAND_NORMAL); + (flags & EXPAND_FLAG_CALL_PARAM_P) + ? EXPAND_STACK_PARM : EXPAND_NORMAL); /* Handle bounds returned by call. */ if (TREE_CODE (exp) == CALL_EXPR) @@ -5518,7 +5524,8 @@ store_expr_with_bounds (tree exp, rtx ta && TREE_CODE (TREE_OPERAND (TREE_OPERAND (exp, 0), 0)) == STRING_CST && integer_zerop (TREE_OPERAND (exp, 1)))) - && !nontemporal && !call_param_p + && (flags & (EXPAND_FLAG_NONTEMPORAL + | EXPAND_FLAG_CALL_PARAM_P)) == 0 && MEM_P (target)) { /* Optimize initialization of an array with a STRING_CST. */ @@ -5562,7 +5569,21 @@ store_expr_with_bounds (tree exp, rtx ta if (exp_len > str_copy_len) clear_storage (adjust_address (dest_mem, BLKmode, 0), GEN_INT (exp_len - str_copy_len), - BLOCK_OP_NORMAL); + (flags & EXPAND_FLAG_TAILCALL) + ? 
BLOCK_OP_TAILCALL : BLOCK_OP_NORMAL); + return NULL_RTX; + } + else if (flags == EXPAND_FLAG_TAILCALL + && TREE_CODE (exp) == CONSTRUCTOR + && TREE_CODE (TREE_TYPE (exp)) != ERROR_MARK + && TREE_STATIC (exp) + && !TREE_ADDRESSABLE (exp) + && TYPE_MODE (TREE_TYPE (exp)) == BLKmode + && MEM_P (target) + && GET_MODE (target) == BLKmode + && CONSTRUCTOR_NELTS (exp) == 0) + { + clear_storage (target, expr_size (exp), BLOCK_OP_TAILCALL); return NULL_RTX; } else @@ -5572,9 +5593,10 @@ store_expr_with_bounds (tree exp, rtx ta normal_expr: /* If we want to use a nontemporal or a reverse order store, force the value into a register first. */ - tmp_target = nontemporal || reverse ? NULL_RTX : target; + tmp_target = (flags & (EXPAND_FLAG_NONTEMPORAL | EXPAND_FLAG_REVERSE)) + ? NULL_RTX : target; temp = expand_expr_real (exp, tmp_target, GET_MODE (target), - (call_param_p + ((flags & EXPAND_FLAG_CALL_PARAM_P) ? EXPAND_STACK_PARM : EXPAND_NORMAL), &alt_rtl, false); @@ -5647,7 +5669,8 @@ store_expr_with_bounds (tree exp, rtx ta else store_bit_field (target, INTVAL (expr_size (exp)) * BITS_PER_UNIT, - 0, 0, 0, GET_MODE (temp), temp, reverse); + 0, 0, 0, GET_MODE (temp), temp, + (flags & EXPAND_FLAG_REVERSE) != 0); } else convert_move (target, temp, TYPE_UNSIGNED (TREE_TYPE (exp))); @@ -5664,8 +5687,10 @@ store_expr_with_bounds (tree exp, rtx ta if (CONST_INT_P (size) && INTVAL (size) < TREE_STRING_LENGTH (exp)) emit_block_move (target, temp, size, - (call_param_p - ? BLOCK_OP_CALL_PARM : BLOCK_OP_NORMAL)); + ((flags & EXPAND_FLAG_CALL_PARAM_P) + ? BLOCK_OP_CALL_PARM + : (flags & EXPAND_FLAG_TAILCALL) + ? BLOCK_OP_TAILCALL : BLOCK_OP_NORMAL)); else { machine_mode pointer_mode @@ -5679,7 +5704,7 @@ store_expr_with_bounds (tree exp, rtx ta size_int (TREE_STRING_LENGTH (exp))); rtx copy_size_rtx = expand_expr (copy_size, NULL_RTX, VOIDmode, - (call_param_p + ((flags & EXPAND_FLAG_CALL_PARAM_P) ? 
EXPAND_STACK_PARM : EXPAND_NORMAL)); rtx_code_label *label = 0; @@ -5687,7 +5712,7 @@ store_expr_with_bounds (tree exp, rtx ta copy_size_rtx = convert_to_mode (pointer_mode, copy_size_rtx, TYPE_UNSIGNED (sizetype)); emit_block_move (target, temp, copy_size_rtx, - (call_param_p + ((flags & EXPAND_FLAG_CALL_PARAM_P) ? BLOCK_OP_CALL_PARM : BLOCK_OP_NORMAL)); /* Figure out how much is left in TARGET that we have to clear. @@ -5739,14 +5764,17 @@ store_expr_with_bounds (tree exp, rtx ta int_size_in_bytes (TREE_TYPE (exp))); else if (GET_MODE (temp) == BLKmode) emit_block_move (target, temp, expr_size (exp), - (call_param_p - ? BLOCK_OP_CALL_PARM : BLOCK_OP_NORMAL)); + ((flags & EXPAND_FLAG_CALL_PARAM_P) + ? BLOCK_OP_CALL_PARM + : (flags & EXPAND_FLAG_TAILCALL) + ? BLOCK_OP_TAILCALL : BLOCK_OP_NORMAL)); /* If we emit a nontemporal store, there is nothing else to do. */ - else if (nontemporal && emit_storent_insn (target, temp)) + else if ((flags & EXPAND_FLAG_NONTEMPORAL) + && emit_storent_insn (target, temp)) ; else { - if (reverse) + if (flags & EXPAND_FLAG_REVERSE) temp = flip_storage_order (GET_MODE (target), temp); temp = force_operand (temp, target); if (temp != target) @@ -5759,11 +5787,9 @@ store_expr_with_bounds (tree exp, rtx ta /* Same as store_expr_with_bounds but ignoring bounds of EXP. */ rtx -store_expr (tree exp, rtx target, int call_param_p, bool nontemporal, - bool reverse) +store_expr (tree exp, rtx target, enum expand_flag flags) { - return store_expr_with_bounds (exp, target, call_param_p, nontemporal, - reverse, NULL); + return store_expr_with_bounds (exp, target, flags, NULL); } /* Return true if field F of structure TYPE is a flexible array. */ @@ -6141,7 +6167,8 @@ store_constructor_field (rtx target, uns } else store_field (target, bitsize, bitpos, bitregion_start, bitregion_end, mode, - exp, alias_set, false, reverse); + exp, alias_set, + reverse ? 
EXPAND_FLAG_REVERSE : EXPAND_FLAG_NORMAL);
 }
@@ -6338,6 +6365,8 @@ store_constructor (tree exp, rtx target,
 
         /* The storage order is specified for every aggregate type.  */
         reverse = TYPE_REVERSE_STORAGE_ORDER (type);
+        enum expand_flag flags_rev
+          = reverse ? EXPAND_FLAG_REVERSE : EXPAND_FLAG_NORMAL;
 
         domain = TYPE_DOMAIN (type);
         const_bounds_p = (TYPE_MIN_VALUE (domain)
@@ -6495,7 +6524,7 @@ store_constructor (tree exp, rtx target,
                                           VAR_DECL, NULL_TREE, domain);
                     index_r = gen_reg_rtx (promote_decl_mode (index, NULL));
                     SET_DECL_RTL (index, index_r);
-                    store_expr (lo_index, index_r, 0, false, reverse);
+                    store_expr (lo_index, index_r, flags_rev);
 
                     /* Build the head of the loop.  */
                     do_pending_stack_adjust ();
@@ -6522,7 +6551,7 @@ store_constructor (tree exp, rtx target,
                       store_constructor (value, xtarget, cleared,
                                          bitsize / BITS_PER_UNIT, reverse);
                     else
-                      store_expr (value, xtarget, 0, false, reverse);
+                      store_expr (value, xtarget, flags_rev);
 
                     /* Generate a conditional jump to exit the loop.  */
                     exit_cond = build2 (LT_EXPR, integer_type_node,
@@ -6535,7 +6564,7 @@ store_constructor (tree exp, rtx target,
                     expand_assignment (index,
                                        build2 (PLUS_EXPR, TREE_TYPE (index),
                                                index, integer_one_node),
-                                       false);
+                                       EXPAND_FLAG_NORMAL);
 
                     emit_jump (loop_start);
 
@@ -6566,7 +6595,7 @@ store_constructor (tree exp, rtx target,
                                             expand_normal (position),
                                             highest_pow2_factor (position));
                     xtarget = adjust_address (xtarget, mode, 0);
-                    store_expr (value, xtarget, 0, false, reverse);
+                    store_expr (value, xtarget, flags_rev);
                   }
                 else
                   {
@@ -6760,16 +6789,14 @@ store_constructor (tree exp, rtx target,
    (in general) be different from that for TARGET, since TARGET is a
    reference to the containing structure.
 
-   If NONTEMPORAL is true, try generating a nontemporal store.
-
-   If REVERSE is true, the store is to be done in reverse order.  */
+   FLAGS is a bitmask of EXPAND_FLAG_* flags defined in expr.h.  */
 
 static rtx
 store_field (rtx target, HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos,
              unsigned HOST_WIDE_INT bitregion_start,
              unsigned HOST_WIDE_INT bitregion_end,
              machine_mode mode, tree exp,
-             alias_set_type alias_set, bool nontemporal, bool reverse)
+             alias_set_type alias_set, enum expand_flag flags)
 {
   if (TREE_CODE (exp) == ERROR_MARK)
     return const0_rtx;
@@ -6787,7 +6814,7 @@ store_field (rtx target, HOST_WIDE_INT b
       /* We're storing into a struct containing a single __complex.  */
 
       gcc_assert (!bitpos);
-      return store_expr (exp, target, 0, nontemporal, reverse);
+      return store_expr (exp, target, flags);
     }
 
   /* If the structure is in a register or if the component
@@ -6903,11 +6930,16 @@ store_field (rtx target, HOST_WIDE_INT b
         {
           HOST_WIDE_INT size = GET_MODE_BITSIZE (temp_mode);
 
-          reverse = TYPE_REVERSE_STORAGE_ORDER (TREE_TYPE (exp));
+          bool reverse = TYPE_REVERSE_STORAGE_ORDER (TREE_TYPE (exp));
 
           if (reverse)
             temp = flip_storage_order (temp_mode, temp);
 
+          if (reverse)
+            flags = (enum expand_flag) (flags | EXPAND_FLAG_REVERSE);
+          else
+            flags = (enum expand_flag) (flags & ~EXPAND_FLAG_REVERSE);
+
           if (bitsize < size
               && reverse ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN
               && !(mode == BLKmode && bitsize > BITS_PER_WORD))
@@ -6937,7 +6969,8 @@ store_field (rtx target, HOST_WIDE_INT b
           emit_block_move (target, temp,
                            GEN_INT ((bitsize + BITS_PER_UNIT - 1)
                                     / BITS_PER_UNIT),
-                           BLOCK_OP_NORMAL);
+                           (flags & EXPAND_FLAG_TAILCALL)
+                           ? BLOCK_OP_TAILCALL : BLOCK_OP_NORMAL);
 
           return const0_rtx;
         }
@@ -6954,7 +6987,7 @@ store_field (rtx target, HOST_WIDE_INT b
       /* Store the value in the bitfield.  */
       store_bit_field (target, bitsize, bitpos,
                        bitregion_start, bitregion_end,
-                       mode, temp, reverse);
+                       mode, temp, (flags & EXPAND_FLAG_REVERSE) != 0);
 
       return const0_rtx;
     }
@@ -6974,11 +7007,12 @@ store_field (rtx target, HOST_WIDE_INT b
       if (TREE_CODE (exp) == CONSTRUCTOR && bitsize >= 0)
         {
           gcc_assert (bitsize % BITS_PER_UNIT == 0);
-          store_constructor (exp, to_rtx, 0, bitsize / BITS_PER_UNIT, reverse);
+          store_constructor (exp, to_rtx, 0, bitsize / BITS_PER_UNIT,
+                             (flags & EXPAND_FLAG_REVERSE) != 0);
           return to_rtx;
         }
 
-      return store_expr (exp, to_rtx, 0, nontemporal, reverse);
+      return store_expr (exp, to_rtx, flags);
     }
 }
 
@@ -8322,8 +8356,11 @@ expand_expr_real_2 (sepops ops, rtx targ
               /* Store data into beginning of memory target.  */
               store_expr (treeop0,
                           adjust_address (target, TYPE_MODE (valtype), 0),
-                          modifier == EXPAND_STACK_PARM,
-                          false, TYPE_REVERSE_STORAGE_ORDER (type));
+                          (enum expand_flag)
+                          ((modifier == EXPAND_STACK_PARM
+                            ? EXPAND_FLAG_CALL_PARAM_P : EXPAND_FLAG_NORMAL)
+                           | (TYPE_REVERSE_STORAGE_ORDER (type)
+                              ? EXPAND_FLAG_REVERSE : EXPAND_FLAG_NORMAL)));
 
           else
             {
@@ -8337,7 +8374,7 @@ expand_expr_real_2 (sepops ops, rtx targ
                                    * BITS_PER_UNIT),
                                   (HOST_WIDE_INT) GET_MODE_BITSIZE (mode)),
                              0, 0, 0, TYPE_MODE (valtype), treeop0, 0,
-                             false, false);
+                             EXPAND_FLAG_NORMAL);
             }
 
           /* Return the entire union.  */
@@ -9548,15 +9585,15 @@ expand_expr_real_2 (sepops ops, rtx targ
         jumpifnot (treeop0, lab0,
                    profile_probability::uninitialized ());
         store_expr (treeop1, temp,
-                    modifier == EXPAND_STACK_PARM,
-                    false, false);
+                    modifier == EXPAND_STACK_PARM
+                    ? EXPAND_FLAG_CALL_PARAM_P : EXPAND_FLAG_NORMAL);
 
         emit_jump_insn (targetm.gen_jump (lab1));
         emit_barrier ();
         emit_label (lab0);
         store_expr (treeop2, temp,
-                    modifier == EXPAND_STACK_PARM,
-                    false, false);
+                    modifier == EXPAND_STACK_PARM
+                    ? EXPAND_FLAG_CALL_PARAM_P : EXPAND_FLAG_NORMAL);
 
         emit_label (lab1);
         OK_DEFER_POP;
@@ -10182,7 +10219,7 @@ expand_expr_real_1 (tree exp, rtx target
                   {
                     temp = assign_stack_temp (DECL_MODE (base),
                                               GET_MODE_SIZE (DECL_MODE (base)));
-                    store_expr (base, temp, 0, false, false);
+                    store_expr (base, temp, EXPAND_FLAG_NORMAL);
                     temp = adjust_address (temp, BLKmode, offset);
                     set_mem_size (temp, int_size_in_bytes (type));
                     return temp;
@@ -11075,13 +11112,13 @@ expand_expr_real_1 (tree exp, rtx target
                      value ? 0 : label,
                      profile_probability::uninitialized ());
             expand_assignment (lhs, build_int_cst (TREE_TYPE (rhs), value),
-                               false);
+                               EXPAND_FLAG_NORMAL);
             do_pending_stack_adjust ();
             emit_label (label);
             return const0_rtx;
           }
 
-        expand_assignment (lhs, rhs, false);
+        expand_assignment (lhs, rhs, EXPAND_FLAG_NORMAL);
         return const0_rtx;
       }
 
--- gcc/testsuite/gcc.target/i386/pr41455.c.jj	2017-12-06 18:06:10.552506649 +0100
+++ gcc/testsuite/gcc.target/i386/pr41455.c	2017-12-06 18:05:51.000000000 +0100
@@ -0,0 +1,23 @@
+/* PR middle-end/41455 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mstringop-strategy=libcall" } */
+/* Verify we tail call memcpy and memset.  */
+/* { dg-final { scan-assembler "jmp\[ \t]*_*memcpy" } } */
+/* { dg-final { scan-assembler "jmp\[ \t]*_*memset" } } */
+
+struct S { char c[111111]; };
+
+void
+foo (struct S *a, struct S *b, int *c)
+{
+  *c = 0;
+  *a = *b;
+}
+
+void
+bar (struct S *a, int *b, int *c)
+{
+  *b = 0;
+  *c = 0;
+  *a = (struct S) {};
+}
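
For readers following the flag plumbing in the store_field hunks above, here is a minimal standalone sketch of the same bitmask pattern. It is not GCC code: every sketch_/SKETCH_ name is hypothetical, and the numeric flag values are assumptions, since the real enum expand_flag lives in expr.h and is not shown in this part of the patch. Only the three operations mirror the diff: forcing the REVERSE bit to match a bool, choosing the block-op kind from the TAILCALL bit, and turning the REVERSE bit back into a bool for callees that still take one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the patch's enum expand_flag.  The names
   echo the diff; the numeric values are assumptions.  */
enum sketch_expand_flag
{
  SKETCH_FLAG_NORMAL = 0,
  SKETCH_FLAG_CALL_PARAM_P = 1 << 0,
  SKETCH_FLAG_REVERSE = 1 << 1,
  SKETCH_FLAG_TAILCALL = 1 << 2,
  SKETCH_FLAG_ALL = (1 << 3) - 1
};

/* Hypothetical stand-in for the block-op kinds used by the diff.  */
enum sketch_block_op
{
  SKETCH_BLOCK_OP_NORMAL,
  SKETCH_BLOCK_OP_TAILCALL
};

/* Make the REVERSE bit agree with REVERSE and leave every other bit
   alone, as the store_field hunk does after recomputing reverse.  */
enum sketch_expand_flag
sketch_set_reverse (enum sketch_expand_flag flags, bool reverse)
{
  if (reverse)
    return (enum sketch_expand_flag) (flags | SKETCH_FLAG_REVERSE);
  return (enum sketch_expand_flag) (flags & ~SKETCH_FLAG_REVERSE);
}

/* Pick the block-op kind from the TAILCALL bit, as in the
   emit_block_move call of the patch.  */
enum sketch_block_op
sketch_block_op_for (enum sketch_expand_flag flags)
{
  /* Reject bits this sketch does not know about.  */
  assert ((flags & ~SKETCH_FLAG_ALL) == 0);
  return (flags & SKETCH_FLAG_TAILCALL)
         ? SKETCH_BLOCK_OP_TAILCALL : SKETCH_BLOCK_OP_NORMAL;
}

/* Convert the REVERSE bit back to a bool for callees that still take
   a bool parameter (store_bit_field and store_constructor in the diff).  */
bool
sketch_reverse_p (enum sketch_expand_flag flags)
{
  return (flags & SKETCH_FLAG_REVERSE) != 0;
}
```

The point of the mask-and-cast form is that updating the REVERSE bit does not disturb a TAILCALL bit set by a caller, which a plain reassignment of flags would lose.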