diff mbox series

Account for vector splat GPR->XMM move cost

Message ID 20230523151845.D26C213A10@imap2.suse-dmz.suse.de
State New

Commit Message

Richard Biener May 23, 2023, 3:18 p.m. UTC
The following also accounts for the GPR->XMM move cost of splat
operations.  It elides that cost for operands loaded from memory
only when SSE4.1 is available or the operand is HImode or larger.
This doesn't fix the PR fully yet.

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

	PR target/109944
	* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
	For vector construction or splats apply GPR->XMM move
	costing.  QImode memory can be handled directly only
	with SSE4.1 pinsrb.
---
 gcc/config/i386/i386.cc | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Uros Bizjak May 23, 2023, 3:44 p.m. UTC | #1
On Tue, May 23, 2023 at 5:18 PM Richard Biener <rguenther@suse.de> wrote:
>
> The following also accounts for a GPR->XMM move cost for splat
> operations and properly guards eliding the cost when moving from
> memory only for SSE4.1 or HImode or larger operands.  This
> doesn't fix the PR fully yet.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
>
> Thanks,
> Richard.
>
>         PR target/109944
>         * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
>         For vector construction or splats apply GPR->XMM move
>         costing.  QImode memory can be handled directly only
>         with SSE4.1 pinsrb.

OK.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386.cc | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 38125ce284a..011a1fb0d6d 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -23654,7 +23654,7 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
>        stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
>        stmt_cost *= (TYPE_VECTOR_SUBPARTS (vectype) + 1);
>      }
> -  else if (kind == vec_construct
> +  else if ((kind == vec_construct || kind == scalar_to_vec)
>            && node
>            && SLP_TREE_DEF_TYPE (node) == vect_external_def
>            && INTEGRAL_TYPE_P (TREE_TYPE (vectype)))
> @@ -23687,7 +23687,9 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
>              Likewise with a BIT_FIELD_REF extracting from a vector
>              register we can hope to avoid using a GPR.  */
>           if (!is_gimple_assign (def)
> -             || (!gimple_assign_load_p (def)
> +             || ((!gimple_assign_load_p (def)
> +                  || (!TARGET_SSE4_1
> +                      && GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op))) == 1))
>                   && (gimple_assign_rhs_code (def) != BIT_FIELD_REF
>                       || !VECTOR_TYPE_P (TREE_TYPE
>                                 (TREE_OPERAND (gimple_assign_rhs1 (def), 0))))))
> --
> 2.35.3

Patch

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 38125ce284a..011a1fb0d6d 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -23654,7 +23654,7 @@  ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
       stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
       stmt_cost *= (TYPE_VECTOR_SUBPARTS (vectype) + 1);
     }
-  else if (kind == vec_construct
+  else if ((kind == vec_construct || kind == scalar_to_vec)
 	   && node
 	   && SLP_TREE_DEF_TYPE (node) == vect_external_def
 	   && INTEGRAL_TYPE_P (TREE_TYPE (vectype)))
@@ -23687,7 +23687,9 @@  ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
 	     Likewise with a BIT_FIELD_REF extracting from a vector
 	     register we can hope to avoid using a GPR.  */
 	  if (!is_gimple_assign (def)
-	      || (!gimple_assign_load_p (def)
+	      || ((!gimple_assign_load_p (def)
+		   || (!TARGET_SSE4_1
+		       && GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op))) == 1))
 		  && (gimple_assign_rhs_code (def) != BIT_FIELD_REF
 		      || !VECTOR_TYPE_P (TREE_TYPE
 				(TREE_OPERAND (gimple_assign_rhs1 (def), 0))))))
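
The patched condition in the second hunk is easier to check in isolation. The following is a minimal standalone sketch of that boolean logic, not GCC code: the `ElementDef` struct, its fields, and `needs_gpr_to_xmm_move` are hypothetical names introduced here for illustration, standing in for the GIMPLE queries and `TARGET_SSE4_1` used in the patch.

```cpp
// Hypothetical, simplified stand-in for what the patch queries about the
// statement defining one vector element.  None of these names exist in GCC.
struct ElementDef {
  bool is_assign;            // models is_gimple_assign (def)
  bool is_load;              // models gimple_assign_load_p (def)
  bool is_vector_extract;    // models a BIT_FIELD_REF whose operand has vector type
  unsigned mode_size_bytes;  // models GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op)))
};

// Returns true when the element is assumed to need a GPR->XMM move,
// mirroring the structure of the patched condition.
bool needs_gpr_to_xmm_move(const ElementDef &def, bool have_sse4_1) {
  // Anything that is not an assignment is costed as coming from a GPR.
  if (!def.is_assign)
    return true;
  // A byte-sized load can be inserted directly only with SSE4.1 (pinsrb),
  // so without SSE4.1 it is treated like a non-load.
  bool load_needs_gpr = !have_sse4_1 && def.mode_size_bytes == 1;
  // Extracts from a vector register are still hoped to avoid a GPR.
  return (!def.is_load || load_needs_gpr) && !def.is_vector_extract;
}
```

Under this model a one-byte load is costed as a GPR->XMM move only when SSE4.1 is unavailable, while loads of two bytes or more keep the elision either way, which matches the ChangeLog note about pinsrb.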