diff mbox series

[1/2] Add cumulative_args_t variants of TARGET_FUNCTION_ROUND_BOUNDARY and friends

Message ID 20211113094202.96232-1-maxim.blinov@embecosm.com
State New
Headers show
Series [1/2] Add cumulative_args_t variants of TARGET_FUNCTION_ROUND_BOUNDARY and friends | expand

Commit Message

Maxim Blinov Nov. 13, 2021, 9:42 a.m. UTC
The two target hooks responsible for informing GCC about stack
parameter alignment are `TARGET_FUNCTION_ARG_BOUNDARY` and
`TARGET_FUNCTION_ARG_ROUND_BOUNDARY`, which currently only consider
the tree and machine_mode of a specific given argument.

Create two new target hooks suffixed with '_CA', and pass in a third
`cumulative_args_t` parameter. This enables the backend to make
alignment decisions based on the context of the whole function rather
than individual parameters.

The orignal machine_mode/tree type macros are not removed - they are
called by the default implementations of `TARGET_...BOUNDARY_CA` and
`TARGET_...ROUND_BOUNDARY_CA`. This is done with the intetnion of
avoiding large mechanical modifications of nearly every backend in
GCC. There is also a new flag, -fstack-use-cumulative-args, which
provides a way to completely bypass the new `..._CA` macros. This
feature is intended for debugging GCC itself.

gcc/ChangeLog:

	* calls.c (initialize_argument_information): Pass `args_so_far`.
	* common.opt: New flag `-fstack-use-cumulative-args`.
	* config.gcc: No platforms currently use ..._CA-hooks: Set
	-fstack-use-cumulative-args to be off by default.
	* target.h (cumulative_args_t): Move declaration from here, to...
	* cumulative-args.h (cumulative_args_t): ...this new file. This is
	to permit backends to include the declaration of cumulative_args_t
	without dragging in circular dependencies.
	* function.c (assign_parm_find_entry_rtl): Provide
	cumulative_args_t to locate_and_pad_parm.
	(gimplify_parameters): Ditto.
	(locate_and_pad_parm): Conditionally call new hooks if we're
	invoked with -fstack-use-cumulative-args.
	* function.h: Include cumulative-args.h.
	(locate_and_pad_parm): Add cumulative_args_t parameter.
	* target.def (function_arg_boundary_ca): Add.
	(function_arg_round_boundary_ca): Ditto.
	* targhooks.c (default_function_arg_boundary_ca): Implement.
	(default_function_arg_round_boundary_ca): Ditto.
	* targhooks.h (default_function_arg_boundary_ca): Declare.
	(default_function_arg_round_boundary_ca): Ditto.
	* doc/invoke.texi (-fstack-use-cumulative-args): Document.
	* doc/tm.texi: Regenerate.
	* doc/tm.texi.in: Ditto.
---
 gcc/calls.c           |  3 +++
 gcc/common.opt        |  4 ++++
 gcc/config.gcc        |  7 +++++++
 gcc/cumulative-args.h | 20 ++++++++++++++++++++
 gcc/doc/invoke.texi   | 12 ++++++++++++
 gcc/doc/tm.texi       | 20 ++++++++++++++++++++
 gcc/doc/tm.texi.in    |  4 ++++
 gcc/function.c        | 25 +++++++++++++++++++++----
 gcc/function.h        |  2 ++
 gcc/target.def        | 24 ++++++++++++++++++++++++
 gcc/target.h          | 17 +----------------
 gcc/targhooks.c       | 16 ++++++++++++++++
 gcc/targhooks.h       |  6 ++++++
 13 files changed, 140 insertions(+), 20 deletions(-)
 create mode 100644 gcc/cumulative-args.h

Comments

Richard Biener Nov. 15, 2021, 7:11 a.m. UTC | #1
On Sat, Nov 13, 2021 at 10:43 AM Maxim Blinov <maxim.blinov@embecosm.com> wrote:
>
> The two target hooks responsible for informing GCC about stack
> parameter alignment are `TARGET_FUNCTION_ARG_BOUNDARY` and
> `TARGET_FUNCTION_ARG_ROUND_BOUNDARY`, which currently only consider
> the tree and machine_mode of a specific given argument.
>
> Create two new target hooks suffixed with '_CA', and pass in a third
> `cumulative_args_t` parameter. This enables the backend to make
> alignment decisions based on the context of the whole function rather
> than individual parameters.
>
> The orignal machine_mode/tree type macros are not removed - they are
> called by the default implementations of `TARGET_...BOUNDARY_CA` and
> `TARGET_...ROUND_BOUNDARY_CA`. This is done with the intetnion of
> avoiding large mechanical modifications of nearly every backend in
> GCC. There is also a new flag, -fstack-use-cumulative-args, which
> provides a way to completely bypass the new `..._CA` macros. This
> feature is intended for debugging GCC itself.

Just two quick comments without looking at the patch.

Please do not introduce options in the user namespace -f... which are
for debugging only.  I think you should go without this part instead.

Second, you fail to motivate the change.  I cannot make sense of
"This enables the backend to make alignment decisions based on the
context of the whole function rather than individual parameters."

Richard.

>
> gcc/ChangeLog:
>
>         * calls.c (initialize_argument_information): Pass `args_so_far`.
>         * common.opt: New flag `-fstack-use-cumulative-args`.
>         * config.gcc: No platforms currently use ..._CA-hooks: Set
>         -fstack-use-cumulative-args to be off by default.
>         * target.h (cumulative_args_t): Move declaration from here, to...
>         * cumulative-args.h (cumulative_args_t): ...this new file. This is
>         to permit backends to include the declaration of cumulative_args_t
>         without dragging in circular dependencies.
>         * function.c (assign_parm_find_entry_rtl): Provide
>         cumulative_args_t to locate_and_pad_parm.
>         (gimplify_parameters): Ditto.
>         (locate_and_pad_parm): Conditionally call new hooks if we're
>         invoked with -fstack-use-cumulative-args.
>         * function.h: Include cumulative-args.h.
>         (locate_and_pad_parm): Add cumulative_args_t parameter.
>         * target.def (function_arg_boundary_ca): Add.
>         (function_arg_round_boundary_ca): Ditto.
>         * targhooks.c (default_function_arg_boundary_ca): Implement.
>         (default_function_arg_round_boundary_ca): Ditto.
>         * targhooks.h (default_function_arg_boundary_ca): Declare.
>         (default_function_arg_round_boundary_ca): Ditto.
>         * doc/invoke.texi (-fstack-use-cumulative-args): Document.
>         * doc/tm.texi: Regenerate.
>         * doc/tm.texi.in: Ditto.
> ---
>  gcc/calls.c           |  3 +++
>  gcc/common.opt        |  4 ++++
>  gcc/config.gcc        |  7 +++++++
>  gcc/cumulative-args.h | 20 ++++++++++++++++++++
>  gcc/doc/invoke.texi   | 12 ++++++++++++
>  gcc/doc/tm.texi       | 20 ++++++++++++++++++++
>  gcc/doc/tm.texi.in    |  4 ++++
>  gcc/function.c        | 25 +++++++++++++++++++++----
>  gcc/function.h        |  2 ++
>  gcc/target.def        | 24 ++++++++++++++++++++++++
>  gcc/target.h          | 17 +----------------
>  gcc/targhooks.c       | 16 ++++++++++++++++
>  gcc/targhooks.h       |  6 ++++++
>  13 files changed, 140 insertions(+), 20 deletions(-)
>  create mode 100644 gcc/cumulative-args.h
>
> diff --git a/gcc/calls.c b/gcc/calls.c
> index 27b59f26ad3..cef612a6ef4 100644
> --- a/gcc/calls.c
> +++ b/gcc/calls.c
> @@ -1527,6 +1527,7 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
>  #endif
>                              reg_parm_stack_space,
>                              args[i].pass_on_stack ? 0 : args[i].partial,
> +                            args_so_far,
>                              fndecl, args_size, &args[i].locate);
>  #ifdef BLOCK_REG_PADDING
>        else
> @@ -4205,6 +4206,7 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
>                            argvec[count].reg != 0,
>  #endif
>                            reg_parm_stack_space, 0,
> +                          args_so_far,
>                            NULL_TREE, &args_size, &argvec[count].locate);
>
>        if (argvec[count].reg == 0 || argvec[count].partial != 0
> @@ -4296,6 +4298,7 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
>                                argvec[count].reg != 0,
>  #endif
>                                reg_parm_stack_space, argvec[count].partial,
> +                              args_so_far,
>                                NULL_TREE, &args_size, &argvec[count].locate);
>           args_size.constant += argvec[count].locate.size.constant;
>           gcc_assert (!argvec[count].locate.size.var);
> diff --git a/gcc/common.opt b/gcc/common.opt
> index de9b848eda5..982417c1e39 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2705,6 +2705,10 @@ fstack-usage
>  Common RejectNegative Var(flag_stack_usage)
>  Output stack usage information on a per-function basis.
>
> +fstack-use-cumulative-args
> +Common RejectNegative Var(flag_stack_use_cumulative_args) Init(STACK_USE_CUMULATIVE_ARGS_INIT)
> +Use cumulative args-based stack layout hooks.
> +
>  fstrength-reduce
>  Common Ignore
>  Does nothing.  Preserved for backward compatibility.
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index edd12655c4a..046d691af56 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -1070,6 +1070,13 @@ case ${target} in
>    ;;
>  esac
>
> +# Figure out if we need to enable -foff-stack-trampolines by default.
> +case ${target} in
> +*)
> +  tm_defines="$tm_defines STACK_USE_CUMULATIVE_ARGS_INIT=0"
> +  ;;
> +esac
> +
>  case ${target} in
>  aarch64*-*-elf | aarch64*-*-fuchsia* | aarch64*-*-rtems*)
>         tm_file="${tm_file} dbxelf.h elfos.h newlib-stdint.h"
> diff --git a/gcc/cumulative-args.h b/gcc/cumulative-args.h
> new file mode 100644
> index 00000000000..b60928e37f9
> --- /dev/null
> +++ b/gcc/cumulative-args.h
> @@ -0,0 +1,20 @@
> +#ifndef GCC_CUMULATIVE_ARGS_H
> +#define GCC_CUMULATIVE_ARGS_H
> +
> +#if CHECKING_P
> +
> +struct cumulative_args_t { void *magic; void *p; };
> +
> +#else /* !CHECKING_P */
> +
> +/* When using a GCC build compiler, we could use
> +   __attribute__((transparent_union)) to get cumulative_args_t function
> +   arguments passed like scalars where the ABI would mandate a less
> +   efficient way of argument passing otherwise.  However, that would come
> +   at the cost of less type-safe !CHECKING_P compilation.  */
> +
> +union cumulative_args_t { void *p; };
> +
> +#endif /* !CHECKING_P */
> +
> +#endif /* GCC_CUMULATIVE_ARGS_H */
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 2aba4c70b44..7ffb2997c69 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -670,6 +670,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fverbose-asm  -fpack-struct[=@var{n}]  @gol
>  -fleading-underscore  -ftls-model=@var{model} @gol
>  -fstack-reuse=@var{reuse_level} @gol
> +-fstack-use-cumulative-args @gol
>  -ftrampolines  -ftrapv  -fwrapv @gol
>  -fvisibility=@r{[}default@r{|}internal@r{|}hidden@r{|}protected@r{]} @gol
>  -fstrict-volatile-bitfields  -fsync-libcalls}
> @@ -16625,6 +16626,17 @@ the behavior of older compilers in which temporaries' stack space is
>  not reused, the aggressive stack reuse can lead to runtime errors. This
>  option is used to control the temporary stack reuse optimization.
>
> +@item -fstack-use-cumulative-args
> +@opindex fstack_use_cumulative_args
> +This option instructs the compiler to use the
> +@code{cumulative_args_t}-based stack layout target hooks,
> +@code{TARGET_FUNCTION_ARG_BOUNDARY_CA} and
> +@code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA}. If a given target does
> +not define these hooks, the default behaviour is to fallback to using
> +the standard non-@code{_CA} variants instead. Certain targets (such as
> +AArch64 Darwin) require using the more advanced @code{_CA}-based
> +hooks: For these targets this option should be enabled by default.
> +
>  @item -ftrapv
>  @opindex ftrapv
>  This option generates traps for signed overflow on addition, subtraction,
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 78a1af1ad4d..261eaf46ef5 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -4322,6 +4322,16 @@ with the specified mode and type.  The default hook returns
>  @code{PARM_BOUNDARY} for all arguments.
>  @end deftypefn
>
> +@deftypefn {Target Hook} {unsigned int} TARGET_FUNCTION_ARG_BOUNDARY_CA (machine_mode @var{mode}, const_tree @var{type}, cumulative_args_t @var{ca})
> +This is the @code{cumulative_args_t}-based version of
> +@code{TARGET_FUNCTION_ARG_BOUNDARY}. Define this hook if you need more
> +fine-grained control over argument alignment, e.g. depending on whether
> +it is a named argument or not, or any other criteria that you choose to
> +place in the @var{ca} structure.
> +
> +The default hook will call @code{TARGET_FUNCTION_ARG_BOUNDARY}.
> +@end deftypefn
> +
>  @deftypefn {Target Hook} {unsigned int} TARGET_FUNCTION_ARG_ROUND_BOUNDARY (machine_mode @var{mode}, const_tree @var{type})
>  Normally, the size of an argument is rounded up to @code{PARM_BOUNDARY},
>  which is the default value for this hook.  You can define this hook to
> @@ -4329,6 +4339,16 @@ return a different value if an argument size must be rounded to a larger
>  value.
>  @end deftypefn
>
> +@deftypefn {Target Hook} {unsigned int} TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA (machine_mode @var{mode}, const_tree @var{type}, cumulative_args_t @var{ca})
> +This is the @code{cumulative_args_t}-based version of
> +@code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}. Define this hook if you need more
> +fine-grained control over argument size rounding, e.g. depending on whether
> +it is a named argument or not, or any other criteria that you choose to
> +place in the @var{ca} structure.
> +
> +The default hook will call @code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}.
> +@end deftypefn
> +
>  @defmac FUNCTION_ARG_REGNO_P (@var{regno})
>  A C expression that is nonzero if @var{regno} is the number of a hard
>  register in which function arguments are sometimes passed.  This does
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index 4401550989e..933644dfa86 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -3330,8 +3330,12 @@ required.
>
>  @hook TARGET_FUNCTION_ARG_BOUNDARY
>
> +@hook TARGET_FUNCTION_ARG_BOUNDARY_CA
> +
>  @hook TARGET_FUNCTION_ARG_ROUND_BOUNDARY
>
> +@hook TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA
> +
>  @defmac FUNCTION_ARG_REGNO_P (@var{regno})
>  A C expression that is nonzero if @var{regno} is the number of a hard
>  register in which function arguments are sometimes passed.  This does
> diff --git a/gcc/function.c b/gcc/function.c
> index 61b3bd036b8..1ebf3e0ffe5 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -2610,7 +2610,9 @@ assign_parm_find_entry_rtl (struct assign_parm_data_all *all,
>
>    locate_and_pad_parm (data->arg.mode, data->arg.type, in_regs,
>                        all->reg_parm_stack_space,
> -                      entry_parm ? data->partial : 0, current_function_decl,
> +                      entry_parm ? data->partial : 0,
> +                      all->args_so_far,
> +                      current_function_decl,
>                        &all->stack_args_size, &data->locate);
>
>    /* Update parm_stack_boundary if this parameter is passed in the
> @@ -4027,6 +4029,7 @@ gimplify_parameters (gimple_seq *cleanup)
>  void
>  locate_and_pad_parm (machine_mode passed_mode, tree type, int in_regs,
>                      int reg_parm_stack_space, int partial,
> +                    cumulative_args_t ca,
>                      tree fndecl ATTRIBUTE_UNUSED,
>                      struct args_size *initial_offset_ptr,
>                      struct locate_and_pad_arg_data *locate)
> @@ -4064,9 +4067,23 @@ locate_and_pad_parm (machine_mode passed_mode, tree type, int in_regs,
>               ? arg_size_in_bytes (type)
>               : size_int (GET_MODE_SIZE (passed_mode)));
>    where_pad = targetm.calls.function_arg_padding (passed_mode, type);
> -  boundary = targetm.calls.function_arg_boundary (passed_mode, type);
> -  round_boundary = targetm.calls.function_arg_round_boundary (passed_mode,
> -                                                             type);
> +
> +  if (flag_stack_use_cumulative_args)
> +    {
> +      boundary = targetm.calls.function_arg_boundary_ca (passed_mode,
> +                                                        type,
> +                                                        ca);
> +      round_boundary = targetm.calls.function_arg_round_boundary_ca
> +       (passed_mode, type, ca);
> +    }
> +  else
> +    {
> +      boundary = targetm.calls.function_arg_boundary (passed_mode,
> +                                                     type);
> +      round_boundary = targetm.calls.function_arg_round_boundary
> +       (passed_mode, type);
> +    }
> +
>    locate->where_pad = where_pad;
>
>    /* Alignment can't exceed MAX_SUPPORTED_STACK_ALIGNMENT.  */
> diff --git a/gcc/function.h b/gcc/function.h
> index 899430833ce..26f342caba3 100644
> --- a/gcc/function.h
> +++ b/gcc/function.h
> @@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
>  #ifndef GCC_FUNCTION_H
>  #define GCC_FUNCTION_H
>
> +#include "cumulative-args.h"
>
>  /* Stack of pending (incomplete) sequences saved by `start_sequence'.
>     Each element describes one pending sequence.
> @@ -661,6 +662,7 @@ extern int aggregate_value_p (const_tree, const_tree);
>  extern bool use_register_for_decl (const_tree);
>  extern gimple_seq gimplify_parameters (gimple_seq *);
>  extern void locate_and_pad_parm (machine_mode, tree, int, int, int,
> +                                cumulative_args_t,
>                                  tree, struct args_size *,
>                                  struct locate_and_pad_arg_data *);
>  extern void generate_setjmp_warnings (void);
> diff --git a/gcc/target.def b/gcc/target.def
> index 51ea167172b..bb56afab539 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -4959,6 +4959,18 @@ with the specified mode and type.  The default hook returns\n\
>   unsigned int, (machine_mode mode, const_tree type),
>   default_function_arg_boundary)
>
> +DEFHOOK
> +(function_arg_boundary_ca,
> + "This is the @code{cumulative_args_t}-based version of\n\
> +@code{TARGET_FUNCTION_ARG_BOUNDARY}. Define this hook if you need more\n\
> +fine-grained control over argument alignment, e.g. depending on whether\n\
> +it is a named argument or not, or any other criteria that you choose to\n\
> +place in the @var{ca} structure.\n\
> +\n\
> +The default hook will call @code{TARGET_FUNCTION_ARG_BOUNDARY}.",
> + unsigned int, (machine_mode mode, const_tree type, cumulative_args_t ca),
> + default_function_arg_boundary_ca)
> +
>  DEFHOOK
>  (function_arg_round_boundary,
>   "Normally, the size of an argument is rounded up to @code{PARM_BOUNDARY},\n\
> @@ -4968,6 +4980,18 @@ value.",
>   unsigned int, (machine_mode mode, const_tree type),
>   default_function_arg_round_boundary)
>
> +DEFHOOK
> +(function_arg_round_boundary_ca,
> + "This is the @code{cumulative_args_t}-based version of\n\
> +@code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}. Define this hook if you need more\n\
> +fine-grained control over argument size rounding, e.g. depending on whether\n\
> +it is a named argument or not, or any other criteria that you choose to\n\
> +place in the @var{ca} structure.\n\
> +\n\
> +The default hook will call @code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}.",
> + unsigned int, (machine_mode mode, const_tree type, cumulative_args_t ca),
> + default_function_arg_round_boundary_ca)
> +
>  /* Return the diagnostic message string if function without a prototype
>     is not allowed for this 'val' argument; NULL otherwise. */
>  DEFHOOK
> diff --git a/gcc/target.h b/gcc/target.h
> index d8f45fb99c3..5b28ece7e1c 100644
> --- a/gcc/target.h
> +++ b/gcc/target.h
> @@ -51,22 +51,7 @@
>  #include "insn-codes.h"
>  #include "tm.h"
>  #include "hard-reg-set.h"
> -
> -#if CHECKING_P
> -
> -struct cumulative_args_t { void *magic; void *p; };
> -
> -#else /* !CHECKING_P */
> -
> -/* When using a GCC build compiler, we could use
> -   __attribute__((transparent_union)) to get cumulative_args_t function
> -   arguments passed like scalars where the ABI would mandate a less
> -   efficient way of argument passing otherwise.  However, that would come
> -   at the cost of less type-safe !CHECKING_P compilation.  */
> -
> -union cumulative_args_t { void *p; };
> -
> -#endif /* !CHECKING_P */
> +#include "cumulative-args.h"
>
>  /* Types of memory operation understood by the "by_pieces" infrastructure.
>     Used by the TARGET_USE_BY_PIECES_INFRASTRUCTURE_P target hook and
> diff --git a/gcc/targhooks.c b/gcc/targhooks.c
> index 6f071f80231..d83f362a5ae 100644
> --- a/gcc/targhooks.c
> +++ b/gcc/targhooks.c
> @@ -850,6 +850,14 @@ default_function_arg_boundary (machine_mode mode ATTRIBUTE_UNUSED,
>    return PARM_BOUNDARY;
>  }
>
> +unsigned int
> +default_function_arg_boundary_ca (machine_mode mode ATTRIBUTE_UNUSED,
> +                                 const_tree type ATTRIBUTE_UNUSED,
> +                                 cumulative_args_t ca ATTRIBUTE_UNUSED)
> +{
> +  return default_function_arg_boundary (mode, type);
> +}
> +
>  unsigned int
>  default_function_arg_round_boundary (machine_mode mode ATTRIBUTE_UNUSED,
>                                      const_tree type ATTRIBUTE_UNUSED)
> @@ -857,6 +865,14 @@ default_function_arg_round_boundary (machine_mode mode ATTRIBUTE_UNUSED,
>    return PARM_BOUNDARY;
>  }
>
> +unsigned int
> +default_function_arg_round_boundary_ca (machine_mode mode ATTRIBUTE_UNUSED,
> +                                       const_tree type ATTRIBUTE_UNUSED,
> +                                       cumulative_args_t ca ATTRIBUTE_UNUSED)
> +{
> +  return default_function_arg_round_boundary (mode, type);
> +}
> +
>  void
>  hook_void_bitmap (bitmap regs ATTRIBUTE_UNUSED)
>  {
> diff --git a/gcc/targhooks.h b/gcc/targhooks.h
> index 11e9d7dd1a8..f33374ef073 100644
> --- a/gcc/targhooks.h
> +++ b/gcc/targhooks.h
> @@ -154,6 +154,12 @@ extern unsigned int default_function_arg_boundary (machine_mode,
>                                                    const_tree);
>  extern unsigned int default_function_arg_round_boundary (machine_mode,
>                                                          const_tree);
> +extern unsigned int default_function_arg_boundary_ca (machine_mode,
> +                                                     const_tree,
> +                                                     cumulative_args_t ca);
> +extern unsigned int default_function_arg_round_boundary_ca (machine_mode,
> +                                                           const_tree,
> +                                                           cumulative_args_t ca);
>  extern bool hook_bool_const_rtx_commutative_p (const_rtx, int);
>  extern rtx default_function_value (const_tree, const_tree, bool);
>  extern HARD_REG_SET default_zero_call_used_regs (HARD_REG_SET);
> --
> 2.30.1 (Apple Git-130)
>
Maxim Blinov Nov. 22, 2021, 2:40 p.m. UTC | #2
Hi Richard,

The purpose of this patch is to give more of the target code access to
the cumulative_args_t structure in order to enable certain (otherwise
currently impossible) stack layout constraints, specifically for
parameters that are passed over the stack. For example, there are some
targets (not yet in GCC trunk) which explicitly require the
distinction between named and variadic parameters in order to enforce
different alignment rules (when passing over the stack.) Such a
constraint would be accommodated by this patch.

The patch itself is very straightforward and simply adds the parameter
to the two functions which we'd like to extend. The motivation of
defining new target hooks was to minimize the patch size.

A concrete user of this change that we have in mind is the AArch64
Darwin parameter passing abi, which mandates that when passing on the
stack, named parameters are to be naturally-aligned, however variadic
ones are to be word-aligned. However this patch is completely generic
in nature and should enable all targets to have more control over
their parameter layout process.

Best Regards,
Maxim

On Mon, 15 Nov 2021 at 07:11, Richard Biener <richard.guenther@gmail.com> wrote:
>
> On Sat, Nov 13, 2021 at 10:43 AM Maxim Blinov <maxim.blinov@embecosm.com> wrote:
> >
> > The two target hooks responsible for informing GCC about stack
> > parameter alignment are `TARGET_FUNCTION_ARG_BOUNDARY` and
> > `TARGET_FUNCTION_ARG_ROUND_BOUNDARY`, which currently only consider
> > the tree and machine_mode of a specific given argument.
> >
> > Create two new target hooks suffixed with '_CA', and pass in a third
> > `cumulative_args_t` parameter. This enables the backend to make
> > alignment decisions based on the context of the whole function rather
> > than individual parameters.
> >
> > The orignal machine_mode/tree type macros are not removed - they are
> > called by the default implementations of `TARGET_...BOUNDARY_CA` and
> > `TARGET_...ROUND_BOUNDARY_CA`. This is done with the intetnion of
> > avoiding large mechanical modifications of nearly every backend in
> > GCC. There is also a new flag, -fstack-use-cumulative-args, which
> > provides a way to completely bypass the new `..._CA` macros. This
> > feature is intended for debugging GCC itself.
>
> Just two quick comments without looking at the patch.
>
> Please do not introduce options in the user namespace -f... which are
> for debugging only.  I think you should go without this part instead.
>
> Second, you fail to motivate the change.  I cannot make sense of
> "This enables the backend to make alignment decisions based on the
> context of the whole function rather than individual parameters."
>
> Richard.
>
> >
> > gcc/ChangeLog:
> >
> >         * calls.c (initialize_argument_information): Pass `args_so_far`.
> >         * common.opt: New flag `-fstack-use-cumulative-args`.
> >         * config.gcc: No platforms currently use ..._CA-hooks: Set
> >         -fstack-use-cumulative-args to be off by default.
> >         * target.h (cumulative_args_t): Move declaration from here, to...
> >         * cumulative-args.h (cumulative_args_t): ...this new file. This is
> >         to permit backends to include the declaration of cumulative_args_t
> >         without dragging in circular dependencies.
> >         * function.c (assign_parm_find_entry_rtl): Provide
> >         cumulative_args_t to locate_and_pad_parm.
> >         (gimplify_parameters): Ditto.
> >         (locate_and_pad_parm): Conditionally call new hooks if we're
> >         invoked with -fstack-use-cumulative-args.
> >         * function.h: Include cumulative-args.h.
> >         (locate_and_pad_parm): Add cumulative_args_t parameter.
> >         * target.def (function_arg_boundary_ca): Add.
> >         (function_arg_round_boundary_ca): Ditto.
> >         * targhooks.c (default_function_arg_boundary_ca): Implement.
> >         (default_function_arg_round_boundary_ca): Ditto.
> >         * targhooks.h (default_function_arg_boundary_ca): Declare.
> >         (default_function_arg_round_boundary_ca): Ditto.
> >         * doc/invoke.texi (-fstack-use-cumulative-args): Document.
> >         * doc/tm.texi: Regenerate.
> >         * doc/tm.texi.in: Ditto.
> > ---
> >  gcc/calls.c           |  3 +++
> >  gcc/common.opt        |  4 ++++
> >  gcc/config.gcc        |  7 +++++++
> >  gcc/cumulative-args.h | 20 ++++++++++++++++++++
> >  gcc/doc/invoke.texi   | 12 ++++++++++++
> >  gcc/doc/tm.texi       | 20 ++++++++++++++++++++
> >  gcc/doc/tm.texi.in    |  4 ++++
> >  gcc/function.c        | 25 +++++++++++++++++++++----
> >  gcc/function.h        |  2 ++
> >  gcc/target.def        | 24 ++++++++++++++++++++++++
> >  gcc/target.h          | 17 +----------------
> >  gcc/targhooks.c       | 16 ++++++++++++++++
> >  gcc/targhooks.h       |  6 ++++++
> >  13 files changed, 140 insertions(+), 20 deletions(-)
> >  create mode 100644 gcc/cumulative-args.h
> >
> > diff --git a/gcc/calls.c b/gcc/calls.c
> > index 27b59f26ad3..cef612a6ef4 100644
> > --- a/gcc/calls.c
> > +++ b/gcc/calls.c
> > @@ -1527,6 +1527,7 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
> >  #endif
> >                              reg_parm_stack_space,
> >                              args[i].pass_on_stack ? 0 : args[i].partial,
> > +                            args_so_far,
> >                              fndecl, args_size, &args[i].locate);
> >  #ifdef BLOCK_REG_PADDING
> >        else
> > @@ -4205,6 +4206,7 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
> >                            argvec[count].reg != 0,
> >  #endif
> >                            reg_parm_stack_space, 0,
> > +                          args_so_far,
> >                            NULL_TREE, &args_size, &argvec[count].locate);
> >
> >        if (argvec[count].reg == 0 || argvec[count].partial != 0
> > @@ -4296,6 +4298,7 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
> >                                argvec[count].reg != 0,
> >  #endif
> >                                reg_parm_stack_space, argvec[count].partial,
> > +                              args_so_far,
> >                                NULL_TREE, &args_size, &argvec[count].locate);
> >           args_size.constant += argvec[count].locate.size.constant;
> >           gcc_assert (!argvec[count].locate.size.var);
> > diff --git a/gcc/common.opt b/gcc/common.opt
> > index de9b848eda5..982417c1e39 100644
> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -2705,6 +2705,10 @@ fstack-usage
> >  Common RejectNegative Var(flag_stack_usage)
> >  Output stack usage information on a per-function basis.
> >
> > +fstack-use-cumulative-args
> > +Common RejectNegative Var(flag_stack_use_cumulative_args) Init(STACK_USE_CUMULATIVE_ARGS_INIT)
> > +Use cumulative args-based stack layout hooks.
> > +
> >  fstrength-reduce
> >  Common Ignore
> >  Does nothing.  Preserved for backward compatibility.
> > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > index edd12655c4a..046d691af56 100644
> > --- a/gcc/config.gcc
> > +++ b/gcc/config.gcc
> > @@ -1070,6 +1070,13 @@ case ${target} in
> >    ;;
> >  esac
> >
> > +# Figure out if we need to enable -foff-stack-trampolines by default.
> > +case ${target} in
> > +*)
> > +  tm_defines="$tm_defines STACK_USE_CUMULATIVE_ARGS_INIT=0"
> > +  ;;
> > +esac
> > +
> >  case ${target} in
> >  aarch64*-*-elf | aarch64*-*-fuchsia* | aarch64*-*-rtems*)
> >         tm_file="${tm_file} dbxelf.h elfos.h newlib-stdint.h"
> > diff --git a/gcc/cumulative-args.h b/gcc/cumulative-args.h
> > new file mode 100644
> > index 00000000000..b60928e37f9
> > --- /dev/null
> > +++ b/gcc/cumulative-args.h
> > @@ -0,0 +1,20 @@
> > +#ifndef GCC_CUMULATIVE_ARGS_H
> > +#define GCC_CUMULATIVE_ARGS_H
> > +
> > +#if CHECKING_P
> > +
> > +struct cumulative_args_t { void *magic; void *p; };
> > +
> > +#else /* !CHECKING_P */
> > +
> > +/* When using a GCC build compiler, we could use
> > +   __attribute__((transparent_union)) to get cumulative_args_t function
> > +   arguments passed like scalars where the ABI would mandate a less
> > +   efficient way of argument passing otherwise.  However, that would come
> > +   at the cost of less type-safe !CHECKING_P compilation.  */
> > +
> > +union cumulative_args_t { void *p; };
> > +
> > +#endif /* !CHECKING_P */
> > +
> > +#endif /* GCC_CUMULATIVE_ARGS_H */
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index 2aba4c70b44..7ffb2997c69 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -670,6 +670,7 @@ Objective-C and Objective-C++ Dialects}.
> >  -fverbose-asm  -fpack-struct[=@var{n}]  @gol
> >  -fleading-underscore  -ftls-model=@var{model} @gol
> >  -fstack-reuse=@var{reuse_level} @gol
> > +-fstack-use-cumulative-args @gol
> >  -ftrampolines  -ftrapv  -fwrapv @gol
> >  -fvisibility=@r{[}default@r{|}internal@r{|}hidden@r{|}protected@r{]} @gol
> >  -fstrict-volatile-bitfields  -fsync-libcalls}
> > @@ -16625,6 +16626,17 @@ the behavior of older compilers in which temporaries' stack space is
> >  not reused, the aggressive stack reuse can lead to runtime errors. This
> >  option is used to control the temporary stack reuse optimization.
> >
> > +@item -fstack-use-cumulative-args
> > +@opindex fstack_use_cumulative_args
> > +This option instructs the compiler to use the
> > +@code{cumulative_args_t}-based stack layout target hooks,
> > +@code{TARGET_FUNCTION_ARG_BOUNDARY_CA} and
> > +@code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA}. If a given target does
> > +not define these hooks, the default behaviour is to fallback to using
> > +the standard non-@code{_CA} variants instead. Certain targets (such as
> > +AArch64 Darwin) require using the more advanced @code{_CA}-based
> > +hooks: For these targets this option should be enabled by default.
> > +
> >  @item -ftrapv
> >  @opindex ftrapv
> >  This option generates traps for signed overflow on addition, subtraction,
> > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> > index 78a1af1ad4d..261eaf46ef5 100644
> > --- a/gcc/doc/tm.texi
> > +++ b/gcc/doc/tm.texi
> > @@ -4322,6 +4322,16 @@ with the specified mode and type.  The default hook returns
> >  @code{PARM_BOUNDARY} for all arguments.
> >  @end deftypefn
> >
> > +@deftypefn {Target Hook} {unsigned int} TARGET_FUNCTION_ARG_BOUNDARY_CA (machine_mode @var{mode}, const_tree @var{type}, cumulative_args_t @var{ca})
> > +This is the @code{cumulative_args_t}-based version of
> > +@code{TARGET_FUNCTION_ARG_BOUNDARY}. Define this hook if you need more
> > +fine-grained control over argument alignment, e.g. depending on whether
> > +it is a named argument or not, or any other criteria that you choose to
> > +place in the @var{ca} structure.
> > +
> > +The default hook will call @code{TARGET_FUNCTION_ARG_BOUNDARY}.
> > +@end deftypefn
> > +
> >  @deftypefn {Target Hook} {unsigned int} TARGET_FUNCTION_ARG_ROUND_BOUNDARY (machine_mode @var{mode}, const_tree @var{type})
> >  Normally, the size of an argument is rounded up to @code{PARM_BOUNDARY},
> >  which is the default value for this hook.  You can define this hook to
> > @@ -4329,6 +4339,16 @@ return a different value if an argument size must be rounded to a larger
> >  value.
> >  @end deftypefn
> >
> > +@deftypefn {Target Hook} {unsigned int} TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA (machine_mode @var{mode}, const_tree @var{type}, cumulative_args_t @var{ca})
> > +This is the @code{cumulative_args_t}-based version of
> > +@code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}. Define this hook if you need more
> > +fine-grained control over argument size rounding, e.g. depending on whether
> > +it is a named argument or not, or any other criteria that you choose to
> > +place in the @var{ca} structure.
> > +
> > +The default hook will call @code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}.
> > +@end deftypefn
> > +
> >  @defmac FUNCTION_ARG_REGNO_P (@var{regno})
> >  A C expression that is nonzero if @var{regno} is the number of a hard
> >  register in which function arguments are sometimes passed.  This does
> > diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> > index 4401550989e..933644dfa86 100644
> > --- a/gcc/doc/tm.texi.in
> > +++ b/gcc/doc/tm.texi.in
> > @@ -3330,8 +3330,12 @@ required.
> >
> >  @hook TARGET_FUNCTION_ARG_BOUNDARY
> >
> > +@hook TARGET_FUNCTION_ARG_BOUNDARY_CA
> > +
> >  @hook TARGET_FUNCTION_ARG_ROUND_BOUNDARY
> >
> > +@hook TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA
> > +
> >  @defmac FUNCTION_ARG_REGNO_P (@var{regno})
> >  A C expression that is nonzero if @var{regno} is the number of a hard
> >  register in which function arguments are sometimes passed.  This does
> > diff --git a/gcc/function.c b/gcc/function.c
> > index 61b3bd036b8..1ebf3e0ffe5 100644
> > --- a/gcc/function.c
> > +++ b/gcc/function.c
> > @@ -2610,7 +2610,9 @@ assign_parm_find_entry_rtl (struct assign_parm_data_all *all,
> >
> >    locate_and_pad_parm (data->arg.mode, data->arg.type, in_regs,
> >                        all->reg_parm_stack_space,
> > -                      entry_parm ? data->partial : 0, current_function_decl,
> > +                      entry_parm ? data->partial : 0,
> > +                      all->args_so_far,
> > +                      current_function_decl,
> >                        &all->stack_args_size, &data->locate);
> >
> >    /* Update parm_stack_boundary if this parameter is passed in the
> > @@ -4027,6 +4029,7 @@ gimplify_parameters (gimple_seq *cleanup)
> >  void
> >  locate_and_pad_parm (machine_mode passed_mode, tree type, int in_regs,
> >                      int reg_parm_stack_space, int partial,
> > +                    cumulative_args_t ca,
> >                      tree fndecl ATTRIBUTE_UNUSED,
> >                      struct args_size *initial_offset_ptr,
> >                      struct locate_and_pad_arg_data *locate)
> > @@ -4064,9 +4067,23 @@ locate_and_pad_parm (machine_mode passed_mode, tree type, int in_regs,
> >               ? arg_size_in_bytes (type)
> >               : size_int (GET_MODE_SIZE (passed_mode)));
> >    where_pad = targetm.calls.function_arg_padding (passed_mode, type);
> > -  boundary = targetm.calls.function_arg_boundary (passed_mode, type);
> > -  round_boundary = targetm.calls.function_arg_round_boundary (passed_mode,
> > -                                                             type);
> > +
> > +  if (flag_stack_use_cumulative_args)
> > +    {
> > +      boundary = targetm.calls.function_arg_boundary_ca (passed_mode,
> > +                                                        type,
> > +                                                        ca);
> > +      round_boundary = targetm.calls.function_arg_round_boundary_ca
> > +       (passed_mode, type, ca);
> > +    }
> > +  else
> > +    {
> > +      boundary = targetm.calls.function_arg_boundary (passed_mode,
> > +                                                     type);
> > +      round_boundary = targetm.calls.function_arg_round_boundary
> > +       (passed_mode, type);
> > +    }
> > +
> >    locate->where_pad = where_pad;
> >
> >    /* Alignment can't exceed MAX_SUPPORTED_STACK_ALIGNMENT.  */
> > diff --git a/gcc/function.h b/gcc/function.h
> > index 899430833ce..26f342caba3 100644
> > --- a/gcc/function.h
> > +++ b/gcc/function.h
> > @@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #ifndef GCC_FUNCTION_H
> >  #define GCC_FUNCTION_H
> >
> > +#include "cumulative-args.h"
> >
> >  /* Stack of pending (incomplete) sequences saved by `start_sequence'.
> >     Each element describes one pending sequence.
> > @@ -661,6 +662,7 @@ extern int aggregate_value_p (const_tree, const_tree);
> >  extern bool use_register_for_decl (const_tree);
> >  extern gimple_seq gimplify_parameters (gimple_seq *);
> >  extern void locate_and_pad_parm (machine_mode, tree, int, int, int,
> > +                                cumulative_args_t,
> >                                  tree, struct args_size *,
> >                                  struct locate_and_pad_arg_data *);
> >  extern void generate_setjmp_warnings (void);
> > diff --git a/gcc/target.def b/gcc/target.def
> > index 51ea167172b..bb56afab539 100644
> > --- a/gcc/target.def
> > +++ b/gcc/target.def
> > @@ -4959,6 +4959,18 @@ with the specified mode and type.  The default hook returns\n\
> >   unsigned int, (machine_mode mode, const_tree type),
> >   default_function_arg_boundary)
> >
> > +DEFHOOK
> > +(function_arg_boundary_ca,
> > + "This is the @code{cumulative_args_t}-based version of\n\
> > +@code{TARGET_FUNCTION_ARG_BOUNDARY}. Define this hook if you need more\n\
> > +fine-grained control over argument alignment, e.g. depending on whether\n\
> > +it is a named argument or not, or any other criteria that you choose to\n\
> > +place in the @var{ca} structure.\n\
> > +\n\
> > +The default hook will call @code{TARGET_FUNCTION_ARG_BOUNDARY}.",
> > + unsigned int, (machine_mode mode, const_tree type, cumulative_args_t ca),
> > + default_function_arg_boundary_ca)
> > +
> >  DEFHOOK
> >  (function_arg_round_boundary,
> >   "Normally, the size of an argument is rounded up to @code{PARM_BOUNDARY},\n\
> > @@ -4968,6 +4980,18 @@ value.",
> >   unsigned int, (machine_mode mode, const_tree type),
> >   default_function_arg_round_boundary)
> >
> > +DEFHOOK
> > +(function_arg_round_boundary_ca,
> > + "This is the @code{cumulative_args_t}-based version of\n\
> > +@code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}. Define this hook if you need more\n\
> > +fine-grained control over argument size rounding, e.g. depending on whether\n\
> > +it is a named argument or not, or any other criteria that you choose to\n\
> > +place in the @var{ca} structure.\n\
> > +\n\
> > +The default hook will call @code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}.",
> > + unsigned int, (machine_mode mode, const_tree type, cumulative_args_t ca),
> > + default_function_arg_round_boundary_ca)
> > +
> >  /* Return the diagnostic message string if function without a prototype
> >     is not allowed for this 'val' argument; NULL otherwise. */
> >  DEFHOOK
> > diff --git a/gcc/target.h b/gcc/target.h
> > index d8f45fb99c3..5b28ece7e1c 100644
> > --- a/gcc/target.h
> > +++ b/gcc/target.h
> > @@ -51,22 +51,7 @@
> >  #include "insn-codes.h"
> >  #include "tm.h"
> >  #include "hard-reg-set.h"
> > -
> > -#if CHECKING_P
> > -
> > -struct cumulative_args_t { void *magic; void *p; };
> > -
> > -#else /* !CHECKING_P */
> > -
> > -/* When using a GCC build compiler, we could use
> > -   __attribute__((transparent_union)) to get cumulative_args_t function
> > -   arguments passed like scalars where the ABI would mandate a less
> > -   efficient way of argument passing otherwise.  However, that would come
> > -   at the cost of less type-safe !CHECKING_P compilation.  */
> > -
> > -union cumulative_args_t { void *p; };
> > -
> > -#endif /* !CHECKING_P */
> > +#include "cumulative-args.h"
> >
> >  /* Types of memory operation understood by the "by_pieces" infrastructure.
> >     Used by the TARGET_USE_BY_PIECES_INFRASTRUCTURE_P target hook and
> > diff --git a/gcc/targhooks.c b/gcc/targhooks.c
> > index 6f071f80231..d83f362a5ae 100644
> > --- a/gcc/targhooks.c
> > +++ b/gcc/targhooks.c
> > @@ -850,6 +850,14 @@ default_function_arg_boundary (machine_mode mode ATTRIBUTE_UNUSED,
> >    return PARM_BOUNDARY;
> >  }
> >
> > +unsigned int
> > +default_function_arg_boundary_ca (machine_mode mode ATTRIBUTE_UNUSED,
> > +                                 const_tree type ATTRIBUTE_UNUSED,
> > +                                 cumulative_args_t ca ATTRIBUTE_UNUSED)
> > +{
> > +  return default_function_arg_boundary (mode, type);
> > +}
> > +
> >  unsigned int
> >  default_function_arg_round_boundary (machine_mode mode ATTRIBUTE_UNUSED,
> >                                      const_tree type ATTRIBUTE_UNUSED)
> > @@ -857,6 +865,14 @@ default_function_arg_round_boundary (machine_mode mode ATTRIBUTE_UNUSED,
> >    return PARM_BOUNDARY;
> >  }
> >
> > +unsigned int
> > +default_function_arg_round_boundary_ca (machine_mode mode ATTRIBUTE_UNUSED,
> > +                                       const_tree type ATTRIBUTE_UNUSED,
> > +                                       cumulative_args_t ca ATTRIBUTE_UNUSED)
> > +{
> > +  return default_function_arg_round_boundary (mode, type);
> > +}
> > +
> >  void
> >  hook_void_bitmap (bitmap regs ATTRIBUTE_UNUSED)
> >  {
> > diff --git a/gcc/targhooks.h b/gcc/targhooks.h
> > index 11e9d7dd1a8..f33374ef073 100644
> > --- a/gcc/targhooks.h
> > +++ b/gcc/targhooks.h
> > @@ -154,6 +154,12 @@ extern unsigned int default_function_arg_boundary (machine_mode,
> >                                                    const_tree);
> >  extern unsigned int default_function_arg_round_boundary (machine_mode,
> >                                                          const_tree);
> > +extern unsigned int default_function_arg_boundary_ca (machine_mode,
> > +                                                     const_tree,
> > +                                                     cumulative_args_t ca);
> > +extern unsigned int default_function_arg_round_boundary_ca (machine_mode,
> > +                                                           const_tree,
> > +                                                           cumulative_args_t ca);
> >  extern bool hook_bool_const_rtx_commutative_p (const_rtx, int);
> >  extern rtx default_function_value (const_tree, const_tree, bool);
> >  extern HARD_REG_SET default_zero_call_used_regs (HARD_REG_SET);
> > --
> > 2.30.1 (Apple Git-130)
> >
Richard Biener Nov. 22, 2021, 3:15 p.m. UTC | #3
On Mon, Nov 22, 2021 at 3:40 PM Maxim Blinov <maxim.blinov@embecosm.com> wrote:
>
> Hi Richard,
>
> The purpose of this patch is to give more of the target code access to
> the cumulative_args_t structure in order to enable certain (otherwise
> currently impossible) stack layout constraints, specifically for
> parameters that are passed over the stack. For example, there are some
> targets (not yet in GCC trunk) which explicitly require the
> distinction between named and variadic parameters in order to enforce
> different alignment rules (when passing over the stack.) Such a
> constraint would be accommodated by this patch.
>
> The patch itself is very straightforward and simply adds the parameter
> to the two functions which we'd like to extend. The motivation of
> defining new target hooks was to minimize the patch size.

I would prefer to adjust the existing hooks if there's currently no way to
implement the aarch64 darwin ABI instead of optimizing for patch size.

I have no opinion on whether passing down cummulative args is the
proper thing to do design-wise.  It might be that TARGET_FUNCTION_ARG_BOUNDARY
is only one part that cummulative args using code should look at
(given you don't show the other users of function_arg_boundary that do not
conveniently have a CA struct around).

Richard.

> A concrete user of this change that we have in mind is the AArch64
> Darwin parameter passing abi, which mandates that when passing on the
> stack, named parameters are to be naturally-aligned, however variadic
> ones are to be word-aligned. However this patch is completely generic
> in nature and should enable all targets to have more control over
> their parameter layout process.
>
> Best Regards,
> Maxim
>
> On Mon, 15 Nov 2021 at 07:11, Richard Biener <richard.guenther@gmail.com> wrote:
> >
> > On Sat, Nov 13, 2021 at 10:43 AM Maxim Blinov <maxim.blinov@embecosm.com> wrote:
> > >
> > > The two target hooks responsible for informing GCC about stack
> > > parameter alignment are `TARGET_FUNCTION_ARG_BOUNDARY` and
> > > `TARGET_FUNCTION_ARG_ROUND_BOUNDARY`, which currently only consider
> > > the tree and machine_mode of a specific given argument.
> > >
> > > Create two new target hooks suffixed with '_CA', and pass in a third
> > > `cumulative_args_t` parameter. This enables the backend to make
> > > alignment decisions based on the context of the whole function rather
> > > than individual parameters.
> > >
> > > The orignal machine_mode/tree type macros are not removed - they are
> > > called by the default implementations of `TARGET_...BOUNDARY_CA` and
> > > `TARGET_...ROUND_BOUNDARY_CA`. This is done with the intetnion of
> > > avoiding large mechanical modifications of nearly every backend in
> > > GCC. There is also a new flag, -fstack-use-cumulative-args, which
> > > provides a way to completely bypass the new `..._CA` macros. This
> > > feature is intended for debugging GCC itself.
> >
> > Just two quick comments without looking at the patch.
> >
> > Please do not introduce options in the user namespace -f... which are
> > for debugging only.  I think you should go without this part instead.
> >
> > Second, you fail to motivate the change.  I cannot make sense of
> > "This enables the backend to make alignment decisions based on the
> > context of the whole function rather than individual parameters."
> >
> > Richard.
> >
> > >
> > > gcc/ChangeLog:
> > >
> > >         * calls.c (initialize_argument_information): Pass `args_so_far`.
> > >         * common.opt: New flag `-fstack-use-cumulative-args`.
> > >         * config.gcc: No platforms currently use ..._CA-hooks: Set
> > >         -fstack-use-cumulative-args to be off by default.
> > >         * target.h (cumulative_args_t): Move declaration from here, to...
> > >         * cumulative-args.h (cumulative_args_t): ...this new file. This is
> > >         to permit backends to include the declaration of cumulative_args_t
> > >         without dragging in circular dependencies.
> > >         * function.c (assign_parm_find_entry_rtl): Provide
> > >         cumulative_args_t to locate_and_pad_parm.
> > >         (gimplify_parameters): Ditto.
> > >         (locate_and_pad_parm): Conditionally call new hooks if we're
> > >         invoked with -fstack-use-cumulative-args.
> > >         * function.h: Include cumulative-args.h.
> > >         (locate_and_pad_parm): Add cumulative_args_t parameter.
> > >         * target.def (function_arg_boundary_ca): Add.
> > >         (function_arg_round_boundary_ca): Ditto.
> > >         * targhooks.c (default_function_arg_boundary_ca): Implement.
> > >         (default_function_arg_round_boundary_ca): Ditto.
> > >         * targhooks.h (default_function_arg_boundary_ca): Declare.
> > >         (default_function_arg_round_boundary_ca): Ditto.
> > >         * doc/invoke.texi (-fstack-use-cumulative-args): Document.
> > >         * doc/tm.texi: Regenerate.
> > >         * doc/tm.texi.in: Ditto.
> > > ---
> > >  gcc/calls.c           |  3 +++
> > >  gcc/common.opt        |  4 ++++
> > >  gcc/config.gcc        |  7 +++++++
> > >  gcc/cumulative-args.h | 20 ++++++++++++++++++++
> > >  gcc/doc/invoke.texi   | 12 ++++++++++++
> > >  gcc/doc/tm.texi       | 20 ++++++++++++++++++++
> > >  gcc/doc/tm.texi.in    |  4 ++++
> > >  gcc/function.c        | 25 +++++++++++++++++++++----
> > >  gcc/function.h        |  2 ++
> > >  gcc/target.def        | 24 ++++++++++++++++++++++++
> > >  gcc/target.h          | 17 +----------------
> > >  gcc/targhooks.c       | 16 ++++++++++++++++
> > >  gcc/targhooks.h       |  6 ++++++
> > >  13 files changed, 140 insertions(+), 20 deletions(-)
> > >  create mode 100644 gcc/cumulative-args.h
> > >
> > > diff --git a/gcc/calls.c b/gcc/calls.c
> > > index 27b59f26ad3..cef612a6ef4 100644
> > > --- a/gcc/calls.c
> > > +++ b/gcc/calls.c
> > > @@ -1527,6 +1527,7 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
> > >  #endif
> > >                              reg_parm_stack_space,
> > >                              args[i].pass_on_stack ? 0 : args[i].partial,
> > > +                            args_so_far,
> > >                              fndecl, args_size, &args[i].locate);
> > >  #ifdef BLOCK_REG_PADDING
> > >        else
> > > @@ -4205,6 +4206,7 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
> > >                            argvec[count].reg != 0,
> > >  #endif
> > >                            reg_parm_stack_space, 0,
> > > +                          args_so_far,
> > >                            NULL_TREE, &args_size, &argvec[count].locate);
> > >
> > >        if (argvec[count].reg == 0 || argvec[count].partial != 0
> > > @@ -4296,6 +4298,7 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
> > >                                argvec[count].reg != 0,
> > >  #endif
> > >                                reg_parm_stack_space, argvec[count].partial,
> > > +                              args_so_far,
> > >                                NULL_TREE, &args_size, &argvec[count].locate);
> > >           args_size.constant += argvec[count].locate.size.constant;
> > >           gcc_assert (!argvec[count].locate.size.var);
> > > diff --git a/gcc/common.opt b/gcc/common.opt
> > > index de9b848eda5..982417c1e39 100644
> > > --- a/gcc/common.opt
> > > +++ b/gcc/common.opt
> > > @@ -2705,6 +2705,10 @@ fstack-usage
> > >  Common RejectNegative Var(flag_stack_usage)
> > >  Output stack usage information on a per-function basis.
> > >
> > > +fstack-use-cumulative-args
> > > +Common RejectNegative Var(flag_stack_use_cumulative_args) Init(STACK_USE_CUMULATIVE_ARGS_INIT)
> > > +Use cumulative args-based stack layout hooks.
> > > +
> > >  fstrength-reduce
> > >  Common Ignore
> > >  Does nothing.  Preserved for backward compatibility.
> > > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > > index edd12655c4a..046d691af56 100644
> > > --- a/gcc/config.gcc
> > > +++ b/gcc/config.gcc
> > > @@ -1070,6 +1070,13 @@ case ${target} in
> > >    ;;
> > >  esac
> > >
> > > +# Figure out if we need to enable -foff-stack-trampolines by default.
> > > +case ${target} in
> > > +*)
> > > +  tm_defines="$tm_defines STACK_USE_CUMULATIVE_ARGS_INIT=0"
> > > +  ;;
> > > +esac
> > > +
> > >  case ${target} in
> > >  aarch64*-*-elf | aarch64*-*-fuchsia* | aarch64*-*-rtems*)
> > >         tm_file="${tm_file} dbxelf.h elfos.h newlib-stdint.h"
> > > diff --git a/gcc/cumulative-args.h b/gcc/cumulative-args.h
> > > new file mode 100644
> > > index 00000000000..b60928e37f9
> > > --- /dev/null
> > > +++ b/gcc/cumulative-args.h
> > > @@ -0,0 +1,20 @@
> > > +#ifndef GCC_CUMULATIVE_ARGS_H
> > > +#define GCC_CUMULATIVE_ARGS_H
> > > +
> > > +#if CHECKING_P
> > > +
> > > +struct cumulative_args_t { void *magic; void *p; };
> > > +
> > > +#else /* !CHECKING_P */
> > > +
> > > +/* When using a GCC build compiler, we could use
> > > +   __attribute__((transparent_union)) to get cumulative_args_t function
> > > +   arguments passed like scalars where the ABI would mandate a less
> > > +   efficient way of argument passing otherwise.  However, that would come
> > > +   at the cost of less type-safe !CHECKING_P compilation.  */
> > > +
> > > +union cumulative_args_t { void *p; };
> > > +
> > > +#endif /* !CHECKING_P */
> > > +
> > > +#endif /* GCC_CUMULATIVE_ARGS_H */
> > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > > index 2aba4c70b44..7ffb2997c69 100644
> > > --- a/gcc/doc/invoke.texi
> > > +++ b/gcc/doc/invoke.texi
> > > @@ -670,6 +670,7 @@ Objective-C and Objective-C++ Dialects}.
> > >  -fverbose-asm  -fpack-struct[=@var{n}]  @gol
> > >  -fleading-underscore  -ftls-model=@var{model} @gol
> > >  -fstack-reuse=@var{reuse_level} @gol
> > > +-fstack-use-cumulative-args @gol
> > >  -ftrampolines  -ftrapv  -fwrapv @gol
> > >  -fvisibility=@r{[}default@r{|}internal@r{|}hidden@r{|}protected@r{]} @gol
> > >  -fstrict-volatile-bitfields  -fsync-libcalls}
> > > @@ -16625,6 +16626,17 @@ the behavior of older compilers in which temporaries' stack space is
> > >  not reused, the aggressive stack reuse can lead to runtime errors. This
> > >  option is used to control the temporary stack reuse optimization.
> > >
> > > +@item -fstack-use-cumulative-args
> > > +@opindex fstack_use_cumulative_args
> > > +This option instructs the compiler to use the
> > > +@code{cumulative_args_t}-based stack layout target hooks,
> > > +@code{TARGET_FUNCTION_ARG_BOUNDARY_CA} and
> > > +@code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA}. If a given target does
> > > +not define these hooks, the default behaviour is to fallback to using
> > > +the standard non-@code{_CA} variants instead. Certain targets (such as
> > > +AArch64 Darwin) require using the more advanced @code{_CA}-based
> > > +hooks: For these targets this option should be enabled by default.
> > > +
> > >  @item -ftrapv
> > >  @opindex ftrapv
> > >  This option generates traps for signed overflow on addition, subtraction,
> > > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> > > index 78a1af1ad4d..261eaf46ef5 100644
> > > --- a/gcc/doc/tm.texi
> > > +++ b/gcc/doc/tm.texi
> > > @@ -4322,6 +4322,16 @@ with the specified mode and type.  The default hook returns
> > >  @code{PARM_BOUNDARY} for all arguments.
> > >  @end deftypefn
> > >
> > > +@deftypefn {Target Hook} {unsigned int} TARGET_FUNCTION_ARG_BOUNDARY_CA (machine_mode @var{mode}, const_tree @var{type}, cumulative_args_t @var{ca})
> > > +This is the @code{cumulative_args_t}-based version of
> > > +@code{TARGET_FUNCTION_ARG_BOUNDARY}. Define this hook if you need more
> > > +fine-grained control over argument alignment, e.g. depending on whether
> > > +it is a named argument or not, or any other criteria that you choose to
> > > +place in the @var{ca} structure.
> > > +
> > > +The default hook will call @code{TARGET_FUNCTION_ARG_BOUNDARY}.
> > > +@end deftypefn
> > > +
> > >  @deftypefn {Target Hook} {unsigned int} TARGET_FUNCTION_ARG_ROUND_BOUNDARY (machine_mode @var{mode}, const_tree @var{type})
> > >  Normally, the size of an argument is rounded up to @code{PARM_BOUNDARY},
> > >  which is the default value for this hook.  You can define this hook to
> > > @@ -4329,6 +4339,16 @@ return a different value if an argument size must be rounded to a larger
> > >  value.
> > >  @end deftypefn
> > >
> > > +@deftypefn {Target Hook} {unsigned int} TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA (machine_mode @var{mode}, const_tree @var{type}, cumulative_args_t @var{ca})
> > > +This is the @code{cumulative_args_t}-based version of
> > > +@code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}. Define this hook if you need more
> > > +fine-grained control over argument size rounding, e.g. depending on whether
> > > +it is a named argument or not, or any other criteria that you choose to
> > > +place in the @var{ca} structure.
> > > +
> > > +The default hook will call @code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}.
> > > +@end deftypefn
> > > +
> > >  @defmac FUNCTION_ARG_REGNO_P (@var{regno})
> > >  A C expression that is nonzero if @var{regno} is the number of a hard
> > >  register in which function arguments are sometimes passed.  This does
> > > diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> > > index 4401550989e..933644dfa86 100644
> > > --- a/gcc/doc/tm.texi.in
> > > +++ b/gcc/doc/tm.texi.in
> > > @@ -3330,8 +3330,12 @@ required.
> > >
> > >  @hook TARGET_FUNCTION_ARG_BOUNDARY
> > >
> > > +@hook TARGET_FUNCTION_ARG_BOUNDARY_CA
> > > +
> > >  @hook TARGET_FUNCTION_ARG_ROUND_BOUNDARY
> > >
> > > +@hook TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA
> > > +
> > >  @defmac FUNCTION_ARG_REGNO_P (@var{regno})
> > >  A C expression that is nonzero if @var{regno} is the number of a hard
> > >  register in which function arguments are sometimes passed.  This does
> > > diff --git a/gcc/function.c b/gcc/function.c
> > > index 61b3bd036b8..1ebf3e0ffe5 100644
> > > --- a/gcc/function.c
> > > +++ b/gcc/function.c
> > > @@ -2610,7 +2610,9 @@ assign_parm_find_entry_rtl (struct assign_parm_data_all *all,
> > >
> > >    locate_and_pad_parm (data->arg.mode, data->arg.type, in_regs,
> > >                        all->reg_parm_stack_space,
> > > -                      entry_parm ? data->partial : 0, current_function_decl,
> > > +                      entry_parm ? data->partial : 0,
> > > +                      all->args_so_far,
> > > +                      current_function_decl,
> > >                        &all->stack_args_size, &data->locate);
> > >
> > >    /* Update parm_stack_boundary if this parameter is passed in the
> > > @@ -4027,6 +4029,7 @@ gimplify_parameters (gimple_seq *cleanup)
> > >  void
> > >  locate_and_pad_parm (machine_mode passed_mode, tree type, int in_regs,
> > >                      int reg_parm_stack_space, int partial,
> > > +                    cumulative_args_t ca,
> > >                      tree fndecl ATTRIBUTE_UNUSED,
> > >                      struct args_size *initial_offset_ptr,
> > >                      struct locate_and_pad_arg_data *locate)
> > > @@ -4064,9 +4067,23 @@ locate_and_pad_parm (machine_mode passed_mode, tree type, int in_regs,
> > >               ? arg_size_in_bytes (type)
> > >               : size_int (GET_MODE_SIZE (passed_mode)));
> > >    where_pad = targetm.calls.function_arg_padding (passed_mode, type);
> > > -  boundary = targetm.calls.function_arg_boundary (passed_mode, type);
> > > -  round_boundary = targetm.calls.function_arg_round_boundary (passed_mode,
> > > -                                                             type);
> > > +
> > > +  if (flag_stack_use_cumulative_args)
> > > +    {
> > > +      boundary = targetm.calls.function_arg_boundary_ca (passed_mode,
> > > +                                                        type,
> > > +                                                        ca);
> > > +      round_boundary = targetm.calls.function_arg_round_boundary_ca
> > > +       (passed_mode, type, ca);
> > > +    }
> > > +  else
> > > +    {
> > > +      boundary = targetm.calls.function_arg_boundary (passed_mode,
> > > +                                                     type);
> > > +      round_boundary = targetm.calls.function_arg_round_boundary
> > > +       (passed_mode, type);
> > > +    }
> > > +
> > >    locate->where_pad = where_pad;
> > >
> > >    /* Alignment can't exceed MAX_SUPPORTED_STACK_ALIGNMENT.  */
> > > diff --git a/gcc/function.h b/gcc/function.h
> > > index 899430833ce..26f342caba3 100644
> > > --- a/gcc/function.h
> > > +++ b/gcc/function.h
> > > @@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
> > >  #ifndef GCC_FUNCTION_H
> > >  #define GCC_FUNCTION_H
> > >
> > > +#include "cumulative-args.h"
> > >
> > >  /* Stack of pending (incomplete) sequences saved by `start_sequence'.
> > >     Each element describes one pending sequence.
> > > @@ -661,6 +662,7 @@ extern int aggregate_value_p (const_tree, const_tree);
> > >  extern bool use_register_for_decl (const_tree);
> > >  extern gimple_seq gimplify_parameters (gimple_seq *);
> > >  extern void locate_and_pad_parm (machine_mode, tree, int, int, int,
> > > +                                cumulative_args_t,
> > >                                  tree, struct args_size *,
> > >                                  struct locate_and_pad_arg_data *);
> > >  extern void generate_setjmp_warnings (void);
> > > diff --git a/gcc/target.def b/gcc/target.def
> > > index 51ea167172b..bb56afab539 100644
> > > --- a/gcc/target.def
> > > +++ b/gcc/target.def
> > > @@ -4959,6 +4959,18 @@ with the specified mode and type.  The default hook returns\n\
> > >   unsigned int, (machine_mode mode, const_tree type),
> > >   default_function_arg_boundary)
> > >
> > > +DEFHOOK
> > > +(function_arg_boundary_ca,
> > > + "This is the @code{cumulative_args_t}-based version of\n\
> > > +@code{TARGET_FUNCTION_ARG_BOUNDARY}. Define this hook if you need more\n\
> > > +fine-grained control over argument alignment, e.g. depending on whether\n\
> > > +it is a named argument or not, or any other criteria that you choose to\n\
> > > +place in the @var{ca} structure.\n\
> > > +\n\
> > > +The default hook will call @code{TARGET_FUNCTION_ARG_BOUNDARY}.",
> > > + unsigned int, (machine_mode mode, const_tree type, cumulative_args_t ca),
> > > + default_function_arg_boundary_ca)
> > > +
> > >  DEFHOOK
> > >  (function_arg_round_boundary,
> > >   "Normally, the size of an argument is rounded up to @code{PARM_BOUNDARY},\n\
> > > @@ -4968,6 +4980,18 @@ value.",
> > >   unsigned int, (machine_mode mode, const_tree type),
> > >   default_function_arg_round_boundary)
> > >
> > > +DEFHOOK
> > > +(function_arg_round_boundary_ca,
> > > + "This is the @code{cumulative_args_t}-based version of\n\
> > > +@code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}. Define this hook if you need more\n\
> > > +fine-grained control over argument size rounding, e.g. depending on whether\n\
> > > +it is a named argument or not, or any other criteria that you choose to\n\
> > > +place in the @var{ca} structure.\n\
> > > +\n\
> > > +The default hook will call @code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}.",
> > > + unsigned int, (machine_mode mode, const_tree type, cumulative_args_t ca),
> > > + default_function_arg_round_boundary_ca)
> > > +
> > >  /* Return the diagnostic message string if function without a prototype
> > >     is not allowed for this 'val' argument; NULL otherwise. */
> > >  DEFHOOK
> > > diff --git a/gcc/target.h b/gcc/target.h
> > > index d8f45fb99c3..5b28ece7e1c 100644
> > > --- a/gcc/target.h
> > > +++ b/gcc/target.h
> > > @@ -51,22 +51,7 @@
> > >  #include "insn-codes.h"
> > >  #include "tm.h"
> > >  #include "hard-reg-set.h"
> > > -
> > > -#if CHECKING_P
> > > -
> > > -struct cumulative_args_t { void *magic; void *p; };
> > > -
> > > -#else /* !CHECKING_P */
> > > -
> > > -/* When using a GCC build compiler, we could use
> > > -   __attribute__((transparent_union)) to get cumulative_args_t function
> > > -   arguments passed like scalars where the ABI would mandate a less
> > > -   efficient way of argument passing otherwise.  However, that would come
> > > -   at the cost of less type-safe !CHECKING_P compilation.  */
> > > -
> > > -union cumulative_args_t { void *p; };
> > > -
> > > -#endif /* !CHECKING_P */
> > > +#include "cumulative-args.h"
> > >
> > >  /* Types of memory operation understood by the "by_pieces" infrastructure.
> > >     Used by the TARGET_USE_BY_PIECES_INFRASTRUCTURE_P target hook and
> > > diff --git a/gcc/targhooks.c b/gcc/targhooks.c
> > > index 6f071f80231..d83f362a5ae 100644
> > > --- a/gcc/targhooks.c
> > > +++ b/gcc/targhooks.c
> > > @@ -850,6 +850,14 @@ default_function_arg_boundary (machine_mode mode ATTRIBUTE_UNUSED,
> > >    return PARM_BOUNDARY;
> > >  }
> > >
> > > +unsigned int
> > > +default_function_arg_boundary_ca (machine_mode mode ATTRIBUTE_UNUSED,
> > > +                                 const_tree type ATTRIBUTE_UNUSED,
> > > +                                 cumulative_args_t ca ATTRIBUTE_UNUSED)
> > > +{
> > > +  return default_function_arg_boundary (mode, type);
> > > +}
> > > +
> > >  unsigned int
> > >  default_function_arg_round_boundary (machine_mode mode ATTRIBUTE_UNUSED,
> > >                                      const_tree type ATTRIBUTE_UNUSED)
> > > @@ -857,6 +865,14 @@ default_function_arg_round_boundary (machine_mode mode ATTRIBUTE_UNUSED,
> > >    return PARM_BOUNDARY;
> > >  }
> > >
> > > +unsigned int
> > > +default_function_arg_round_boundary_ca (machine_mode mode ATTRIBUTE_UNUSED,
> > > +                                       const_tree type ATTRIBUTE_UNUSED,
> > > +                                       cumulative_args_t ca ATTRIBUTE_UNUSED)
> > > +{
> > > +  return default_function_arg_round_boundary (mode, type);
> > > +}
> > > +
> > >  void
> > >  hook_void_bitmap (bitmap regs ATTRIBUTE_UNUSED)
> > >  {
> > > diff --git a/gcc/targhooks.h b/gcc/targhooks.h
> > > index 11e9d7dd1a8..f33374ef073 100644
> > > --- a/gcc/targhooks.h
> > > +++ b/gcc/targhooks.h
> > > @@ -154,6 +154,12 @@ extern unsigned int default_function_arg_boundary (machine_mode,
> > >                                                    const_tree);
> > >  extern unsigned int default_function_arg_round_boundary (machine_mode,
> > >                                                          const_tree);
> > > +extern unsigned int default_function_arg_boundary_ca (machine_mode,
> > > +                                                     const_tree,
> > > +                                                     cumulative_args_t ca);
> > > +extern unsigned int default_function_arg_round_boundary_ca (machine_mode,
> > > +                                                           const_tree,
> > > +                                                           cumulative_args_t ca);
> > >  extern bool hook_bool_const_rtx_commutative_p (const_rtx, int);
> > >  extern rtx default_function_value (const_tree, const_tree, bool);
> > >  extern HARD_REG_SET default_zero_call_used_regs (HARD_REG_SET);
> > > --
> > > 2.30.1 (Apple Git-130)
> > >
Jeff Law Dec. 2, 2021, 9:08 p.m. UTC | #4
On 11/22/2021 8:15 AM, Richard Biener via Gcc-patches wrote:
> On Mon, Nov 22, 2021 at 3:40 PM Maxim Blinov <maxim.blinov@embecosm.com> wrote:
>> Hi Richard,
>>
>> The purpose of this patch is to give more of the target code access to
>> the cumulative_args_t structure in order to enable certain (otherwise
>> currently impossible) stack layout constraints, specifically for
>> parameters that are passed over the stack. For example, there are some
>> targets (not yet in GCC trunk) which explicitly require the
>> distinction between named and variadic parameters in order to enforce
>> different alignment rules (when passing over the stack.) Such a
>> constraint would be accommodated by this patch.
>>
>> The patch itself is very straightforward and simply adds the parameter
>> to the two functions which we'd like to extend. The motivation of
>> defining new target hooks was to minimize the patch size.
> I would prefer to adjust the existing hooks if there's currently no way to
> implement the aarch64 darwin ABI instead of optimizing for patch size.
Agreed.  Additionally I don't think we want any -f options controlling 
this behavior.


>
> I have no opinion on whether passing down cummulative args is the
> proper thing to do design-wise.  It might be that TARGET_FUNCTION_ARG_BOUNDARY
> is only one part that cummulative args using code should look at
> (given you don't show the other users of function_arg_boundary that do not
> conveniently have a CA struct around).
In the past I think we passed some additional parameters indicated where 
the last named parameter was so that it could be used to set up the CA 
struct.  But if the aarch64 darwin ABI needs that info somewhere we 
didn't before, then we'd likely need to add the CA structure.


Jeff
Iain Sandoe Dec. 2, 2021, 9:24 p.m. UTC | #5
> On 2 Dec 2021, at 21:08, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> On 11/22/2021 8:15 AM, Richard Biener via Gcc-patches wrote:
>> On Mon, Nov 22, 2021 at 3:40 PM Maxim Blinov <maxim.blinov@embecosm.com> wrote:
>>> Hi Richard,
>>> 
>>> The purpose of this patch is to give more of the target code access to
>>> the cumulative_args_t structure in order to enable certain (otherwise
>>> currently impossible) stack layout constraints, specifically for
>>> parameters that are passed over the stack. For example, there are some
>>> targets (not yet in GCC trunk) which explicitly require the
>>> distinction between named and variadic parameters in order to enforce
>>> different alignment rules (when passing over the stack.) Such a
>>> constraint would be accommodated by this patch.
>>> 
>>> The patch itself is very straightforward and simply adds the parameter
>>> to the two functions which we'd like to extend. The motivation of
>>> defining new target hooks was to minimize the patch size.
>> I would prefer to adjust the existing hooks if there's currently no way to
>> implement the aarch64 darwin ABI instead of optimizing for patch size.
> Agreed.  Additionally I don't think we want any -f options controlling this behavior.

Perhaps the choice of expression was not ideal - we were trying to minimize the
invasiveness of the change (my steer to Maxim, so mea culpa there).

The concern was that re-using existing hooks would touch every target, including
all those we have no way to test.

I wonder if C++ will allow us to have a default NULL argument to the hook
so that it’s a NOP change unless a target chooses to populate that arg.
(I guess an overload will not work since we are populating a fn pointer table).

>> I have no opinion on whether passing down cummulative args is the
>> proper thing to do design-wise.  It might be that TARGET_FUNCTION_ARG_BOUNDARY
>> is only one part that cummulative args using code should look at
>> (given you don't show the other users of function_arg_boundary that do not
>> conveniently have a CA struct around).
> In the past I think we passed some additional parameters indicated where the last named parameter was so that it could be used to set up the CA struct.  But if the aarch64 darwin ABI needs that info somewhere we didn't before, then we'd likely need to add the CA structure.

yeah, the problem is in knowing that we need to switch from natural alignment of
parms (so that parm N+1 is only padded sufficiently to be naturally aligned)
 to word boundary.

We’ll figure out a revised patch as soon as time permits.

Iain

> 
> 
> Jeff
Jeff Law Dec. 2, 2021, 9:37 p.m. UTC | #6
On 12/2/2021 2:24 PM, Iain Sandoe wrote:
>
>> On 2 Dec 2021, at 21:08, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>> On 11/22/2021 8:15 AM, Richard Biener via Gcc-patches wrote:
>>> On Mon, Nov 22, 2021 at 3:40 PM Maxim Blinov <maxim.blinov@embecosm.com> wrote:
>>>> Hi Richard,
>>>>
>>>> The purpose of this patch is to give more of the target code access to
>>>> the cumulative_args_t structure in order to enable certain (otherwise
>>>> currently impossible) stack layout constraints, specifically for
>>>> parameters that are passed over the stack. For example, there are some
>>>> targets (not yet in GCC trunk) which explicitly require the
>>>> distinction between named and variadic parameters in order to enforce
>>>> different alignment rules (when passing over the stack.) Such a
>>>> constraint would be accommodated by this patch.
>>>>
>>>> The patch itself is very straightforward and simply adds the parameter
>>>> to the two functions which we'd like to extend. The motivation of
>>>> defining new target hooks was to minimize the patch size.
>>> I would prefer to adjust the existing hooks if there's currently no way to
>>> implement the aarch64 darwin ABI instead of optimizing for patch size.
>> Agreed.  Additionally I don't think we want any -f options controlling this behavior.
> Perhaps the choice of expression was not ideal - we were trying to minimize the
> invasiveness of the change (my steer to Maxim, so mea culpa there).
No worries.  I was on your side of this problem eons ago, ironically in 
a closely related chunk of code.    I don't remember all the details 
anymore.  But we needed to pass around an additional argument to either 
FUNCTION_ARG or INIT_CUMULATIVE_ARGS for the PA to fix an obscure issue 
with passing FP values to static varargs/stdarg functions.



>
> The concern was that re-using existing hooks would touch every target, including
> all those we have no way to test.
True, but we actually do test most targets these days :-)  And if we're 
just adding an argument to the API for those functions, if we get it 
wrong, it should be fairly obvious.

>
> I wonder if C++ will allow us to have a default NULL argument to the hook
> so that it’s a NOP change unless a target chooses to populate that arg.
> (I guess an overload will not work since we are populating a fn pointer table).
>
>>> I have no opinion on whether passing down cummulative args is the
>>> proper thing to do design-wise.  It might be that TARGET_FUNCTION_ARG_BOUNDARY
>>> is only one part that cummulative args using code should look at
>>> (given you don't show the other users of function_arg_boundary that do not
>>> conveniently have a CA struct around).
>> In the past I think we passed some additional parameters indicated where the last named parameter was so that it could be used to set up the CA struct.  But if the aarch64 darwin ABI needs that info somewhere we didn't before, then we'd likely need to add the CA structure.
> yeah, the problem is in knowing that we need to switch from natural alignment of
> parms (so that parm N+1 is only padded sufficiently to be naturally aligned)
>   to word boundary.
>
> We’ll figure out a revised patch as soon as time permits.
Sounds good.

jeff
diff mbox series

Patch

diff --git a/gcc/calls.c b/gcc/calls.c
index 27b59f26ad3..cef612a6ef4 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1527,6 +1527,7 @@  initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
 #endif
 			     reg_parm_stack_space,
 			     args[i].pass_on_stack ? 0 : args[i].partial,
+			     args_so_far,
 			     fndecl, args_size, &args[i].locate);
 #ifdef BLOCK_REG_PADDING
       else
@@ -4205,6 +4206,7 @@  emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
 			   argvec[count].reg != 0,
 #endif
 			   reg_parm_stack_space, 0,
+			   args_so_far,
 			   NULL_TREE, &args_size, &argvec[count].locate);
 
       if (argvec[count].reg == 0 || argvec[count].partial != 0
@@ -4296,6 +4298,7 @@  emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
 			       argvec[count].reg != 0,
 #endif
 			       reg_parm_stack_space, argvec[count].partial,
+			       args_so_far,
 			       NULL_TREE, &args_size, &argvec[count].locate);
 	  args_size.constant += argvec[count].locate.size.constant;
 	  gcc_assert (!argvec[count].locate.size.var);
diff --git a/gcc/common.opt b/gcc/common.opt
index de9b848eda5..982417c1e39 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2705,6 +2705,10 @@  fstack-usage
 Common RejectNegative Var(flag_stack_usage)
 Output stack usage information on a per-function basis.
 
+fstack-use-cumulative-args
+Common RejectNegative Var(flag_stack_use_cumulative_args) Init(STACK_USE_CUMULATIVE_ARGS_INIT)
+Use cumulative args-based stack layout hooks.
+
 fstrength-reduce
 Common Ignore
 Does nothing.  Preserved for backward compatibility.
diff --git a/gcc/config.gcc b/gcc/config.gcc
index edd12655c4a..046d691af56 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1070,6 +1070,13 @@  case ${target} in
   ;;
 esac
 
+# Figure out if we need to enable -foff-stack-trampolines by default.
+case ${target} in
+*)
+  tm_defines="$tm_defines STACK_USE_CUMULATIVE_ARGS_INIT=0"
+  ;;
+esac
+
 case ${target} in
 aarch64*-*-elf | aarch64*-*-fuchsia* | aarch64*-*-rtems*)
 	tm_file="${tm_file} dbxelf.h elfos.h newlib-stdint.h"
diff --git a/gcc/cumulative-args.h b/gcc/cumulative-args.h
new file mode 100644
index 00000000000..b60928e37f9
--- /dev/null
+++ b/gcc/cumulative-args.h
@@ -0,0 +1,20 @@ 
+#ifndef GCC_CUMULATIVE_ARGS_H
+#define GCC_CUMULATIVE_ARGS_H
+
+#if CHECKING_P
+
+struct cumulative_args_t { void *magic; void *p; };
+
+#else /* !CHECKING_P */
+
+/* When using a GCC build compiler, we could use
+   __attribute__((transparent_union)) to get cumulative_args_t function
+   arguments passed like scalars where the ABI would mandate a less
+   efficient way of argument passing otherwise.  However, that would come
+   at the cost of less type-safe !CHECKING_P compilation.  */
+
+union cumulative_args_t { void *p; };
+
+#endif /* !CHECKING_P */
+
+#endif /* GCC_CUMULATIVE_ARGS_H */
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2aba4c70b44..7ffb2997c69 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -670,6 +670,7 @@  Objective-C and Objective-C++ Dialects}.
 -fverbose-asm  -fpack-struct[=@var{n}]  @gol
 -fleading-underscore  -ftls-model=@var{model} @gol
 -fstack-reuse=@var{reuse_level} @gol
+-fstack-use-cumulative-args @gol
 -ftrampolines  -ftrapv  -fwrapv @gol
 -fvisibility=@r{[}default@r{|}internal@r{|}hidden@r{|}protected@r{]} @gol
 -fstrict-volatile-bitfields  -fsync-libcalls}
@@ -16625,6 +16626,17 @@  the behavior of older compilers in which temporaries' stack space is
 not reused, the aggressive stack reuse can lead to runtime errors. This
 option is used to control the temporary stack reuse optimization.
 
+@item -fstack-use-cumulative-args
+@opindex fstack_use_cumulative_args
+This option instructs the compiler to use the
+@code{cumulative_args_t}-based stack layout target hooks,
+@code{TARGET_FUNCTION_ARG_BOUNDARY_CA} and
+@code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA}. If a given target does
+not define these hooks, the default behaviour is to fallback to using
+the standard non-@code{_CA} variants instead. Certain targets (such as
+AArch64 Darwin) require using the more advanced @code{_CA}-based
+hooks: For these targets this option should be enabled by default.
+
 @item -ftrapv
 @opindex ftrapv
 This option generates traps for signed overflow on addition, subtraction,
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 78a1af1ad4d..261eaf46ef5 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4322,6 +4322,16 @@  with the specified mode and type.  The default hook returns
 @code{PARM_BOUNDARY} for all arguments.
 @end deftypefn
 
+@deftypefn {Target Hook} {unsigned int} TARGET_FUNCTION_ARG_BOUNDARY_CA (machine_mode @var{mode}, const_tree @var{type}, cumulative_args_t @var{ca})
+This is the @code{cumulative_args_t}-based version of
+@code{TARGET_FUNCTION_ARG_BOUNDARY}. Define this hook if you need more
+fine-grained control over argument alignment, e.g. depending on whether
+it is a named argument or not, or any other criteria that you choose to
+place in the @var{ca} structure.
+
+The default hook will call @code{TARGET_FUNCTION_ARG_BOUNDARY}.
+@end deftypefn
+
 @deftypefn {Target Hook} {unsigned int} TARGET_FUNCTION_ARG_ROUND_BOUNDARY (machine_mode @var{mode}, const_tree @var{type})
 Normally, the size of an argument is rounded up to @code{PARM_BOUNDARY},
 which is the default value for this hook.  You can define this hook to
@@ -4329,6 +4339,16 @@  return a different value if an argument size must be rounded to a larger
 value.
 @end deftypefn
 
+@deftypefn {Target Hook} {unsigned int} TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA (machine_mode @var{mode}, const_tree @var{type}, cumulative_args_t @var{ca})
+This is the @code{cumulative_args_t}-based version of
+@code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}. Define this hook if you need more
+fine-grained control over argument size rounding, e.g. depending on whether
+it is a named argument or not, or any other criteria that you choose to
+place in the @var{ca} structure.
+
+The default hook will call @code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}.
+@end deftypefn
+
 @defmac FUNCTION_ARG_REGNO_P (@var{regno})
 A C expression that is nonzero if @var{regno} is the number of a hard
 register in which function arguments are sometimes passed.  This does
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 4401550989e..933644dfa86 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -3330,8 +3330,12 @@  required.
 
 @hook TARGET_FUNCTION_ARG_BOUNDARY
 
+@hook TARGET_FUNCTION_ARG_BOUNDARY_CA
+
 @hook TARGET_FUNCTION_ARG_ROUND_BOUNDARY
 
+@hook TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA
+
 @defmac FUNCTION_ARG_REGNO_P (@var{regno})
 A C expression that is nonzero if @var{regno} is the number of a hard
 register in which function arguments are sometimes passed.  This does
diff --git a/gcc/function.c b/gcc/function.c
index 61b3bd036b8..1ebf3e0ffe5 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -2610,7 +2610,9 @@  assign_parm_find_entry_rtl (struct assign_parm_data_all *all,
 
   locate_and_pad_parm (data->arg.mode, data->arg.type, in_regs,
 		       all->reg_parm_stack_space,
-		       entry_parm ? data->partial : 0, current_function_decl,
+		       entry_parm ? data->partial : 0,
+		       all->args_so_far,
+		       current_function_decl,
 		       &all->stack_args_size, &data->locate);
 
   /* Update parm_stack_boundary if this parameter is passed in the
@@ -4027,6 +4029,7 @@  gimplify_parameters (gimple_seq *cleanup)
 void
 locate_and_pad_parm (machine_mode passed_mode, tree type, int in_regs,
 		     int reg_parm_stack_space, int partial,
+		     cumulative_args_t ca,
 		     tree fndecl ATTRIBUTE_UNUSED,
 		     struct args_size *initial_offset_ptr,
 		     struct locate_and_pad_arg_data *locate)
@@ -4064,9 +4067,23 @@  locate_and_pad_parm (machine_mode passed_mode, tree type, int in_regs,
 	      ? arg_size_in_bytes (type)
 	      : size_int (GET_MODE_SIZE (passed_mode)));
   where_pad = targetm.calls.function_arg_padding (passed_mode, type);
-  boundary = targetm.calls.function_arg_boundary (passed_mode, type);
-  round_boundary = targetm.calls.function_arg_round_boundary (passed_mode,
-							      type);
+
+  if (flag_stack_use_cumulative_args)
+    {
+      boundary = targetm.calls.function_arg_boundary_ca (passed_mode,
+							 type,
+							 ca);
+      round_boundary = targetm.calls.function_arg_round_boundary_ca
+	(passed_mode, type, ca);
+    }
+  else
+    {
+      boundary = targetm.calls.function_arg_boundary (passed_mode,
+						      type);
+      round_boundary = targetm.calls.function_arg_round_boundary
+	(passed_mode, type);
+    }
+
   locate->where_pad = where_pad;
 
   /* Alignment can't exceed MAX_SUPPORTED_STACK_ALIGNMENT.  */
diff --git a/gcc/function.h b/gcc/function.h
index 899430833ce..26f342caba3 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -20,6 +20,7 @@  along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_FUNCTION_H
 #define GCC_FUNCTION_H
 
+#include "cumulative-args.h"
 
 /* Stack of pending (incomplete) sequences saved by `start_sequence'.
    Each element describes one pending sequence.
@@ -661,6 +662,7 @@  extern int aggregate_value_p (const_tree, const_tree);
 extern bool use_register_for_decl (const_tree);
 extern gimple_seq gimplify_parameters (gimple_seq *);
 extern void locate_and_pad_parm (machine_mode, tree, int, int, int,
+				 cumulative_args_t,
 				 tree, struct args_size *,
 				 struct locate_and_pad_arg_data *);
 extern void generate_setjmp_warnings (void);
diff --git a/gcc/target.def b/gcc/target.def
index 51ea167172b..bb56afab539 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -4959,6 +4959,18 @@  with the specified mode and type.  The default hook returns\n\
  unsigned int, (machine_mode mode, const_tree type),
  default_function_arg_boundary)
 
+DEFHOOK
+(function_arg_boundary_ca,
+ "This is the @code{cumulative_args_t}-based version of\n\
+@code{TARGET_FUNCTION_ARG_BOUNDARY}. Define this hook if you need more\n\
+fine-grained control over argument alignment, e.g. depending on whether\n\
+it is a named argument or not, or any other criteria that you choose to\n\
+place in the @var{ca} structure.\n\
+\n\
+The default hook will call @code{TARGET_FUNCTION_ARG_BOUNDARY}.",
+ unsigned int, (machine_mode mode, const_tree type, cumulative_args_t ca),
+ default_function_arg_boundary_ca)
+
 DEFHOOK
 (function_arg_round_boundary,
  "Normally, the size of an argument is rounded up to @code{PARM_BOUNDARY},\n\
@@ -4968,6 +4980,18 @@  value.",
  unsigned int, (machine_mode mode, const_tree type),
  default_function_arg_round_boundary)
 
+DEFHOOK
+(function_arg_round_boundary_ca,
+ "This is the @code{cumulative_args_t}-based version of\n\
+@code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}. Define this hook if you need more\n\
+fine-grained control over argument size rounding, e.g. depending on whether\n\
+it is a named argument or not, or any other criteria that you choose to\n\
+place in the @var{ca} structure.\n\
+\n\
+The default hook will call @code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}.",
+ unsigned int, (machine_mode mode, const_tree type, cumulative_args_t ca),
+ default_function_arg_round_boundary_ca)
+
 /* Return the diagnostic message string if function without a prototype
    is not allowed for this 'val' argument; NULL otherwise. */
 DEFHOOK
diff --git a/gcc/target.h b/gcc/target.h
index d8f45fb99c3..5b28ece7e1c 100644
--- a/gcc/target.h
+++ b/gcc/target.h
@@ -51,22 +51,7 @@ 
 #include "insn-codes.h"
 #include "tm.h"
 #include "hard-reg-set.h"
-
-#if CHECKING_P
-
-struct cumulative_args_t { void *magic; void *p; };
-
-#else /* !CHECKING_P */
-
-/* When using a GCC build compiler, we could use
-   __attribute__((transparent_union)) to get cumulative_args_t function
-   arguments passed like scalars where the ABI would mandate a less
-   efficient way of argument passing otherwise.  However, that would come
-   at the cost of less type-safe !CHECKING_P compilation.  */
-
-union cumulative_args_t { void *p; };
-
-#endif /* !CHECKING_P */
+#include "cumulative-args.h"
 
 /* Types of memory operation understood by the "by_pieces" infrastructure.
    Used by the TARGET_USE_BY_PIECES_INFRASTRUCTURE_P target hook and
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 6f071f80231..d83f362a5ae 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -850,6 +850,14 @@  default_function_arg_boundary (machine_mode mode ATTRIBUTE_UNUSED,
   return PARM_BOUNDARY;
 }
 
+unsigned int
+default_function_arg_boundary_ca (machine_mode mode ATTRIBUTE_UNUSED,
+				  const_tree type ATTRIBUTE_UNUSED,
+				  cumulative_args_t ca ATTRIBUTE_UNUSED)
+{
+  return default_function_arg_boundary (mode, type);
+}
+
 unsigned int
 default_function_arg_round_boundary (machine_mode mode ATTRIBUTE_UNUSED,
 				     const_tree type ATTRIBUTE_UNUSED)
@@ -857,6 +865,14 @@  default_function_arg_round_boundary (machine_mode mode ATTRIBUTE_UNUSED,
   return PARM_BOUNDARY;
 }
 
+unsigned int
+default_function_arg_round_boundary_ca (machine_mode mode ATTRIBUTE_UNUSED,
+					const_tree type ATTRIBUTE_UNUSED,
+					cumulative_args_t ca ATTRIBUTE_UNUSED)
+{
+  return default_function_arg_round_boundary (mode, type);
+}
+
 void
 hook_void_bitmap (bitmap regs ATTRIBUTE_UNUSED)
 {
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 11e9d7dd1a8..f33374ef073 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -154,6 +154,12 @@  extern unsigned int default_function_arg_boundary (machine_mode,
 						   const_tree);
 extern unsigned int default_function_arg_round_boundary (machine_mode,
 							 const_tree);
+extern unsigned int default_function_arg_boundary_ca (machine_mode,
+						      const_tree,
+						      cumulative_args_t ca);
+extern unsigned int default_function_arg_round_boundary_ca (machine_mode,
+							    const_tree,
+							    cumulative_args_t ca);
 extern bool hook_bool_const_rtx_commutative_p (const_rtx, int);
 extern rtx default_function_value (const_tree, const_tree, bool);
 extern HARD_REG_SET default_zero_call_used_regs (HARD_REG_SET);