Thoughts on memcmp expansion (PR43052)
diff mbox

Message ID CAFiYyc3Nvb3Om81Fs6eGpc7oOkvQxJD7Qa_RP-AQh3x_-w-LRw@mail.gmail.com
State New
Headers show

Commit Message

Richard Biener May 2, 2016, 12:52 p.m. UTC
On Thu, Apr 28, 2016 at 8:36 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 01/18/2016 10:22 AM, Richard Biener wrote:
>>
>> See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52171 - the
>> inline expansion
>> for small sizes and equality compares should be done on GIMPLE.  Today the
>> strlen pass might be an appropriate place to do this given its
>> superior knowledge
>> about string lengths.
>>
>> The idea of turning eq feeding memcmp into a special memcmp_eq is good but
>> you have to avoid doing that too early - otherwise you'd lose on
>>
>>    res = memcmp (p, q, sz);
>>    if (memcmp (p, q, sz) == 0)
>>     ...
>>
>> that is, you have to make sure CSE got the chance to common the two calls.
>> This is why I think this kind of transform needs to happen in specific
>> places
>> (like during strlen opt) rather than in generic folding.
>
>
> Ok, here's an update. I kept pieces of your patch from that PR, but also
> translating memcmps larger than a single operation into memcmp_eq as in my
> previous patch.
>
> Then, I added by_pieces infrastructure for memcmp expansion. To avoid any
> more code duplication in this area, I abstracted the existing code and
> converted it to C++ classes since that seemed to fit pretty well.
>
> There are a few possible ways I could go with this, which is why I'm posting
> it more as a RFD at this point.
>  - should store_by_pieces be eliminated in favour of doing just
>    move_by_pieces with constfns?
>  - the C++ification could be extended, making move_by_pieces_d and
>    compare_by_pieces_d classes inheriting from a common base. This
>    would get rid of the callbacks, replacing them with virtuals,
>    and also make some of the current struct members private.
>  - could move all of the by_pieces stuff out into a separate file?
>
> Later, I think we'll also want to extend this to allow vector mode
> operations, but I think that's a separate patch.
>
> So, opinions what I should be doing with this patch? FWIW it bootstraps and
> tests OK on x86_64-linux.

+struct pieces_addr
+{
...
+  void *m_cfndata;
+public:
+  pieces_addr (rtx, bool, by_pieces_constfn, void *);

unless you strick private: somewhere the public: is redundant

Comments

Bernd Schmidt May 2, 2016, 12:57 p.m. UTC | #1
On 05/02/2016 02:52 PM, Richard Biener wrote:
> +struct pieces_addr
> +{
> ...
> +  void *m_cfndata;
> +public:
> +  pieces_addr (rtx, bool, by_pieces_constfn, void *);
>
> unless you strick private: somewhere the public: is redundant

Yeah, ideally I want to turn these into a classes rather than structs. 
Maybe that particular one can already be done, but I'm kind of wondering 
how far to take the C++ification of the other one.

> Index: gcc/tree-ssa-forwprop.c
> ===================================================================
> --- gcc/tree-ssa-forwprop.c     (revision 235474)
> +++ gcc/tree-ssa-forwprop.c     (working copy)
> @@ -1435,6 +1435,76 @@ simplify_builtin_call (gimple_stmt_itera
>              }
>          }
>         break;
> +
> +    case BUILT_IN_MEMCMP:
> +      {
>
> I think this doesn't belong in forwprop.  If we want to stick it into
> a pass rather than
> folding it should be in tree-ssa-strlen.c.

This part (and the other one you quoted) was essentially your prototype 
patch from PR52171. I can put it whereever you like, really.

> Note that we can handle size-1 memcmp even for ordered compares.

One would hope this doesn't occur very often...

> Jakub, where do you think this fits best?  Note that gimple-fold.c may
> not use immediate uses but would have to match this from the
> comparison (I still have to find a way to handle this in match.pd where
> the result expression contains virtual operands in the not toplevel stmt).


Bernd
Richard Biener May 2, 2016, 1:14 p.m. UTC | #2
On Mon, May 2, 2016 at 2:57 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 05/02/2016 02:52 PM, Richard Biener wrote:
>>
>> +struct pieces_addr
>> +{
>> ...
>> +  void *m_cfndata;
>> +public:
>> +  pieces_addr (rtx, bool, by_pieces_constfn, void *);
>>
>> unless you strick private: somewhere the public: is redundant
>
>
> Yeah, ideally I want to turn these into a classes rather than structs. Maybe
> that particular one can already be done, but I'm kind of wondering how far
> to take the C++ification of the other one.
>
>> Index: gcc/tree-ssa-forwprop.c
>> ===================================================================
>> --- gcc/tree-ssa-forwprop.c     (revision 235474)
>> +++ gcc/tree-ssa-forwprop.c     (working copy)
>> @@ -1435,6 +1435,76 @@ simplify_builtin_call (gimple_stmt_itera
>>              }
>>          }
>>         break;
>> +
>> +    case BUILT_IN_MEMCMP:
>> +      {
>>
>> I think this doesn't belong in forwprop.  If we want to stick it into
>> a pass rather than
>> folding it should be in tree-ssa-strlen.c.
>
>
> This part (and the other one you quoted) was essentially your prototype
> patch from PR52171. I can put it whereever you like, really.

I think it fits best in tree-ssa-strlen.c:strlen_optimize_stmt for the moment.

>> Note that we can handle size-1 memcmp even for ordered compares.
>
>
> One would hope this doesn't occur very often...

C++ templates .... but yes.

Richard.

>
>> Jakub, where do you think this fits best?  Note that gimple-fold.c may
>> not use immediate uses but would have to match this from the
>> comparison (I still have to find a way to handle this in match.pd where
>> the result expression contains virtual operands in the not toplevel stmt).
>
>
>
> Bernd

Patch
diff mbox

Index: gcc/tree-ssa-forwprop.c
===================================================================
--- gcc/tree-ssa-forwprop.c     (revision 235474)
+++ gcc/tree-ssa-forwprop.c     (working copy)
@@ -1435,6 +1435,76 @@  simplify_builtin_call (gimple_stmt_itera
            }
        }
       break;
+
+    case BUILT_IN_MEMCMP:
+      {

I think this doesn't belong in forwprop.  If we want to stick it into
a pass rather than
folding it should be in tree-ssa-strlen.c.

+       if (tree_fits_uhwi_p (len)
+           && (leni = tree_to_uhwi (len)) <= GET_MODE_SIZE (word_mode)
+           && exact_log2 (leni) != -1
+           && (align1 = get_pointer_alignment (arg1)) >= leni * BITS_PER_UNIT
+           && (align2 = get_pointer_alignment (arg2)) >= leni * BITS_PER_UNIT)
+         {
+           location_t loc = gimple_location (stmt2);
+           tree type, off;
+           type = build_nonstandard_integer_type (leni * BITS_PER_UNIT, 1);
+           gcc_assert (GET_MODE_SIZE (TYPE_MODE (type)) == leni);
+           off = build_int_cst
+            (build_pointer_type_for_mode (char_type_node, ptr_mode, true), 0);
+           arg1 = build2_loc (loc, MEM_REF, type, arg1, off);
+           arg2 = build2_loc (loc, MEM_REF, type, arg2, off);
+           res = fold_convert_loc (loc, TREE_TYPE (res),
+                                   fold_build2_loc (loc, NE_EXPR,
+                                                    boolean_type_node,
+                                                    arg1, arg2));
+           gimplify_and_update_call_from_tree (gsi_p, res);
+           return true;
+         }

for this part see gimple_fold_builtin_memory_op handling of

      /* If we can perform the copy efficiently with first doing all loads
         and then all stores inline it that way.  Currently efficiently
         means that we can load all the memory into a single integer
         register which is what MOVE_MAX gives us.  */

we might want to share a helper yielding the type of the load/store
or NULL if not possible.

Note that we can handle size-1 memcmp even for ordered compares.

Jakub, where do you think this fits best?  Note that gimple-fold.c may
not use immediate uses but would have to match this from the
comparison (I still have to find a way to handle this in match.pd where
the result expression contains virtual operands in the not toplevel stmt).

Richard.

>
> Bernd