Patchwork rtl expansion without zero/sign extension based on VRP

login
register
mail settings
Submitter Richard Guenther
Date May 13, 2013, 8:17 a.m.
Message ID <CAFiYyc2AXvWhZNwHv2EuKCZ5-v5aZFAkHV2fuDaHRK_-ZAHFJg@mail.gmail.com>
Download mbox | patch
Permalink /patch/243321/
State New
Headers show

Comments

Richard Guenther - May 13, 2013, 8:17 a.m.
On Mon, May 13, 2013 at 5:45 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> Hi,
>
> This patch removes some of the redundant sign/zero
> extensions using value ranges informations generated by VRP.
>
> When GIMPLE_ASSIGN stmts with LHS type smaller than word is expanded to
> rtl, if we can prove that RHS expression value can always fit in LHS
> type and there is no sign conversion, truncation and extension to fit
> the type is redundant. Subreg and Zero/sign extensions are therefore
> redundant.
>
> For example, when an expression is evaluated and it's value is assigned
> to variable of type short, the generated rtl would look something like
> the following.
>
> (set (reg:SI 110)
>           (zero_extend:SI (subreg:HI (reg:SI 117) 0)))
>
> However, if during value range propagation, if we can say for certain
> that the value of the expression which is present in register 117 is
> within the limits of short and there is no sign conversion, we don’t
> need to perform the subreg and zero_extend; instead we can generate the
> following rtl.
>
> (set (reg:SI 110)
>           (reg:SI 117)))
>
> Attached patch adds value range information to gimple stmts during value
> range propagation pass and expands rtl without subreg and zero/sign
> extension if the value range is within the limits of it's type.
>
> This change improve the geomean of spec2k int benchmark with ref by about
> ~3.5% on an arm chromebook.
>
> Tested  on X86_64 and ARM.
>
> I would like review comments on this.

A few comments on the way you preserve this information.


My original plan to preserve value-range information was to re-use
SSA_NAME_PTR_INFO which is where we already store value-range
information for pointers:

struct GTY(()) ptr_info_def
{
  /* The points-to solution.  */
  struct pt_solution pt;

  /* Alignment and misalignment of the pointer in bytes.  Together
     align and misalign specify low known bits of the pointer.
     ptr & (align - 1) == misalign.  */

  /* When known, this is the power-of-two byte alignment of the object this
     pointer points into.  This is usually DECL_ALIGN_UNIT for decls and
     MALLOC_ABI_ALIGNMENT for allocated storage.  When the alignment is not
     known, it is zero.  Do not access directly but use functions
     get_ptr_info_alignment, set_ptr_info_alignment,
     mark_ptr_info_alignment_unknown and similar.  */
  unsigned int align;

  /* When alignment is known, the byte offset this pointer differs from the
     above alignment.  Access only through the same helper functions as align
     above.  */
  unsigned int misalign;
};

what you'd do is to make the ptr_info_def reference from tree_ssa_name a
reference to a union of the above (for pointer SSA names) and an alternate
info like

struct range_info_def {
  double_int min;
  double_int max;
};

or a more compressed format (a reference to a mode or type it fits for example).

The very specific case in question also points at the fact that we have
two conversion tree codes, NOP_EXPR and CONVERT_EXPR, and we've
tried to unify them (but didn't finish up that task)...

+static void
+process_stmt_for_ext_elimination ()
+{

we already do all the analysis when looking for opportunities to eliminate
the intermediate cast in (T2)(T1)x in simplify_conversion_using_ranges,
that's the place that should handle this case.

Richard.

> Thanks,
> Kugan
>
>
> 2013-05-09  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>     * gcc/gimple.h (gimple_is_exp_fit_lhs, gimple_set_exp_fit_lhs): New
> function.
>     * gcc/tree-vrp.c (process_stmt_for_ext_elimination): Likewise.
>     * (is_msb_set, range_fits_type): Likewise.
>     * (vrp_finalize): Call process_stmt_for_ext_elimination.
>     * gcc/dojump.c (do_compare_and_jump): generates rtl without
> zero/sign extension
>     if process_stmt_for_ext_elimination tells so.
>     * gcc/cfgexpand.c (expand_gimple_stmt_1): Likewise.
>
>

Patch

--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -191,6 +191,11 @@  struct GTY((chain_next ("%h.next")))
gimple_statement_base {
      in there.  */
   unsigned int subcode         : 16;

+  /* if an assignment gimple statement has RHS expression that can fit
+     LHS type, zero/sign extension to truncate is redundant.
+     Set this if we detect extension as redundant during VRP.  */
+  unsigned sign_zero_ext_redundant : 1;
+

this enlarges all gimple statements by 8 bytes, so it's out of the question.