Message ID | 20230523151845.D26C213A10@imap2.suse-dmz.suse.de |
---|---|
State | New |
Series | Account for vector splat GPR->XMM move cost |

On Tue, May 23, 2023 at 5:18 PM Richard Biener <rguenther@suse.de> wrote:
>
> The following also accounts for a GPR->XMM move cost for splat
> operations and properly guards eliding the cost when moving from
> memory only for SSE4.1 or HImode or larger operands. This
> doesn't fix the PR fully yet.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
>
> Thanks,
> Richard.
>
>         PR target/109944
>         * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
>         For vector construction or splats apply GPR->XMM move
>         costing. QImode memory can be handled directly only
>         with SSE4.1 pinsrb.

OK.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386.cc | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 38125ce284a..011a1fb0d6d 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -23654,7 +23654,7 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
>        stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
>        stmt_cost *= (TYPE_VECTOR_SUBPARTS (vectype) + 1);
>      }
> -  else if (kind == vec_construct
> +  else if ((kind == vec_construct || kind == scalar_to_vec)
>             && node
>             && SLP_TREE_DEF_TYPE (node) == vect_external_def
>             && INTEGRAL_TYPE_P (TREE_TYPE (vectype)))
> @@ -23687,7 +23687,9 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
>               Likewise with a BIT_FIELD_REF extracting from a vector
>               register we can hope to avoid using a GPR. */
>            if (!is_gimple_assign (def)
> -              || (!gimple_assign_load_p (def)
> +              || ((!gimple_assign_load_p (def)
> +                   || (!TARGET_SSE4_1
> +                       && GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op))) == 1))
>                    && (gimple_assign_rhs_code (def) != BIT_FIELD_REF
>                        || !VECTOR_TYPE_P (TREE_TYPE
>                                  (TREE_OPERAND (gimple_assign_rhs1 (def), 0))))))
> --
> 2.35.3

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 38125ce284a..011a1fb0d6d 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -23654,7 +23654,7 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
       stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
       stmt_cost *= (TYPE_VECTOR_SUBPARTS (vectype) + 1);
     }
-  else if (kind == vec_construct
+  else if ((kind == vec_construct || kind == scalar_to_vec)
            && node
            && SLP_TREE_DEF_TYPE (node) == vect_external_def
            && INTEGRAL_TYPE_P (TREE_TYPE (vectype)))
@@ -23687,7 +23687,9 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
              Likewise with a BIT_FIELD_REF extracting from a vector
              register we can hope to avoid using a GPR. */
           if (!is_gimple_assign (def)
-              || (!gimple_assign_load_p (def)
+              || ((!gimple_assign_load_p (def)
+                   || (!TARGET_SSE4_1
+                       && GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op))) == 1))
                   && (gimple_assign_rhs_code (def) != BIT_FIELD_REF
                       || !VECTOR_TYPE_P (TREE_TYPE
                                 (TREE_OPERAND (gimple_assign_rhs1 (def), 0))))))
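
For readers who find the nested condition in the second hunk hard to parse, the following is a minimal standalone sketch of the same boolean logic. It is not GCC code: `DefInfo`, its fields, and `needs_gpr_to_xmm_move` are hypothetical names that stand in for the `gimple` queries in the patch (`is_gimple_assign`, `gimple_assign_load_p`, the `BIT_FIELD_REF`-of-vector check) and for `TARGET_SSE4_1`.

```cpp
#include <cstddef>

// Hypothetical summary of the statement defining one vector element.
// Each field mirrors one query made in the patched condition.
struct DefInfo {
  bool is_assign;                   // is_gimple_assign (def)
  bool is_load;                     // gimple_assign_load_p (def)
  bool is_bit_field_ref_of_vector;  // BIT_FIELD_REF extracting from a vector
  std::size_t elem_size;            // element mode size in bytes
};

// Returns true when the element is assumed to arrive in a GPR and so
// should be charged a GPR->XMM move. Mirrors the patched condition:
//   !assign || ((!load || (!SSE4.1 && size == 1)) && !bit_field_ref_of_vector)
bool needs_gpr_to_xmm_move(const DefInfo &def, bool have_sse4_1) {
  if (!def.is_assign)
    return true;
  // A byte-sized load can only go straight into a vector register with
  // SSE4.1 pinsrb; without it the load does not avoid the GPR.
  bool byte_without_pinsrb = !have_sse4_1 && def.elem_size == 1;
  bool direct_load = def.is_load && !byte_without_pinsrb;
  return !direct_load && !def.is_bit_field_ref_of_vector;
}
```

Under this model a QImode load is charged the move without SSE4.1 and not charged with it, while HImode or larger loads and vector `BIT_FIELD_REF` extracts are never charged.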