Patchwork Continue strict-volatile-bitfields fixes

Submitter Thomas Schwinge
Date April 27, 2012, 4:42 a.m.
Message ID <87ipglkcou.fsf@schwinge.name>
Permalink /patch/155378/
State Superseded, archived

Comments

Thomas Schwinge - April 27, 2012, 4:42 a.m.
Hi!

On Wed, 25 Apr 2012 13:51:16 +0200, Richard Guenther <richard.guenther@gmail.com> wrote:
> On Wed, Apr 25, 2012 at 1:27 PM, Thomas Schwinge
> <thomas@codesourcery.com> wrote:
> > On Thu, 19 Apr 2012 19:46:17 +0200, I wrote:
> >> Here is my current test case, reduced from gcc.dg/tree-ssa/20030922-1.c:
> >>
> >>     extern void abort (void);
> >>
> >>     enum tree_code
> >>     {
> >>       BIND_EXPR,
> >>     };
> >>     struct tree_common
> >>     {
> >>       enum tree_code code:8;
> >>     };
> >>     void
> >>     voidify_wrapper_expr (struct tree_common *common)
> >>     {
> >>       switch (common->code)
> >>         {
> >>         case BIND_EXPR:
> >>           if (common->code != BIND_EXPR)
> >>             abort ();
> >>         }
> >>     }
> >>
> >> The diff for -fdump-tree-all between compiling with
> >> -fno-strict-volatile-bitfields and -fstrict-volatile-bitfields begins
> >> with the following, right in 20030922-1.c.003t.original:
> >>
> >> diff -ru fnsvb/20030922-1.c.003t.original fsvb/20030922-1.c.003t.original
> >> --- fnsvb/20030922-1.c.003t.original    2012-04-19 16:51:18.322150866 +0200
> >> +++ fsvb/20030922-1.c.003t.original     2012-04-19 16:49:18.132088498 +0200
> >> @@ -7,7 +7,7 @@
> >>    switch ((int) common->code)
> >>      {
> >>        case 0:;
> >> -      if (common->code != 0)
> >> +      if ((BIT_FIELD_REF <*common, 32, 0> & 255) != 0)
> >>          {
> >>            abort ();
> >>          }
> >>
> >> That is, for -fno-strict-volatile-bitfields the second instance of
> >> »common->code« is a component_ref, whereas for
> >> -fstrict-volatile-bitfields it is a bit_field_ref.  The former will be
> >> optimized as intended, the latter will not.  So, what should be emitted
> >> in the -fstrict-volatile-bitfields case?  A component_ref -- but this is
> >> what Bernd's commit r182545 (for PR51200) changed, I think?  Or should
> >> later optimization stages learn to better deal with a bit_field_ref, and
> >> where would this have to happen?
> >
> > I figured out why this generic code is behaving differently for SH
> > targets.  I compared to ARM as well as x86, for which indeed the
> > optimization possibility is not lost even when compiling with
> > -fstrict-volatile-bitfields.
> >
> > The component_ref inside the if statement (but not the one in the switch
> > statement) is fed into optimize_bit_field_compare where it is optimized
> > to »BIT_FIELD_REF <*common, 32, 0> & 255« only for SH, because
> > get_best_mode for SH returns SImode (ARM, x86: QImode), because SH has
> > »#define SLOW_BYTE_ACCESS 1«, and thus the original QImode is widened to
> > SImode, hoping for better code generation »since it may eliminate
> > subsequent memory access if subsequent accesses occur to other fields in
> > the same word of the structure, but to different bytes«.  (Well, the
> > converse happens here...)  After that, in optimize_bit_field_compare, for
> > ARM and x86, »nbitsize == lbitsize«, and thus the optimization is
> > aborted, whereas for SH it will be performed, that is, the component_ref
> > is replaced with »BIT_FIELD_REF <*common, 32, 0> & 255«.
> >
> > If I manually force SH's SImode back to QImode (that is, abort
> > optimize_bit_field_compare), the later optimization passes work as they
> > do for ARM and x86.
> >
> > Before Bernd's r182545, optimize_bit_field_compare returned early (that
> > is, aborted this optimization attempt), because »lbitsize ==
> > GET_MODE_BITSIZE (lmode)« (8 bits).  (With the current sources, lmode is
> > VOIDmode.)
> >
> > Is emitting »BIT_FIELD_REF <*common, 32, 0> & 255« wrong in this case,
> > or should a later optimization pass be able to figure out that
> > »BIT_FIELD_REF <*common, 32, 0> & 255« is in fact the same as
> > common->code, and then be able to conflate these?  Any suggestions
> > where/how to tackle this?
> 
> The BIT_FIELD_REF is somewhat of a red-herring.  It is created by fold-const.c
> in optimize_bit_field_compare, code that I think should be removed completely.
> Or it needs to be made aware of strict-volatile bitfield and C++ memory model
> details.

As discussed with richi on IRC, I have now done testing (just gcc
testsuite) with optimize_bit_field_compare completely removed.  For SH,
this cures my testcase posted above as well as the following FAILs,
all as expected:

    -FAIL: gcc.dg/tree-ssa/20030922-1.c scan-tree-dump-times dom2 "if " 0
    +PASS: gcc.dg/tree-ssa/20030922-1.c scan-tree-dump-times dom2 "if " 0
    
    -FAIL: gcc.dg/tree-ssa/foldconst-3.c scan-tree-dump-not optimized "tree_code_type"
    +PASS: gcc.dg/tree-ssa/foldconst-3.c scan-tree-dump-not optimized "tree_code_type"
    
    -FAIL: gcc.dg/tree-ssa/vrp15.c scan-tree-dump-times vrp1 "tree_code_length.42." 1
    +PASS: gcc.dg/tree-ssa/vrp15.c scan-tree-dump-times vrp1 "tree_code_length.42." 1

No regressions for ARM and x86.  I also manually confirmed that the
x86-specific tests introduced for PR37248 (gcc.target/i386/pr37248-*.c)
still produce the optimized (intermediate and assembly) code as suggested
in the *.c files.  (So, this optimization is (also) done elsewhere,
instead of in optimize_bit_field_compare.)  So, is removing it indeed the
way to go?

Instead of removing it completely -- are the diagnostic messages
(warnings) that it emits worth preserving, or are these covered
elsewhere?

gcc/
	* fold-const.c (fold_binary_loc): Don't invoke
	optimize_bit_field_compare.
	(optimize_bit_field_compare): Remove function.



Regards,
 Thomas
Jakub Jelinek - April 27, 2012, 8:29 a.m.
On Fri, Apr 27, 2012 at 12:42:41PM +0800, Thomas Schwinge wrote:
> > > GET_MODE_BITSIZE (lmode)« (8 bits).  (With the current sources, lmode is
> > > VOIDmode.)
> > >
> > > Is emitting »BIT_FIELD_REF <*common, 32, 0> & 255« wrong in this case,
> > > or should a later optimization pass be able to figure out that
> > > »BIT_FIELD_REF <*common, 32, 0> & 255« is in fact the same as
> > > common->code, and then be able to conflate these?  Any suggestions
> > > where/how to tackle this?
> > 
> > The BIT_FIELD_REF is somewhat of a red-herring.  It is created by fold-const.c
> > in optimize_bit_field_compare, code that I think should be removed completely.
> > Or it needs to be made aware of strict-volatile bitfield and C++ memory model
> > details.

I'd actually very much prefer the latter, just disable
optimize_bit_field_compare for strict-volatile bitfield mode and when
avoiding load data races in C++ memory model (that isn't going to be
default, right?).  This optimization is useful, and it is solely about
loads, so even C++ memory model usually shouldn't care.

	Jakub
Richard Guenther - April 27, 2012, 9 a.m.
On Fri, Apr 27, 2012 at 10:29 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Fri, Apr 27, 2012 at 12:42:41PM +0800, Thomas Schwinge wrote:
>> > > GET_MODE_BITSIZE (lmode)« (8 bits).  (With the current sources, lmode is
>> > > VOIDmode.)
>> > >
>> > > Is emitting »BIT_FIELD_REF <*common, 32, 0> & 255« wrong in this case,
>> > > or should a later optimization pass be able to figure out that
>> > > »BIT_FIELD_REF <*common, 32, 0> & 255« is in fact the same as
>> > > common->code, and then be able to conflate these?  Any suggestions
>> > > where/how to tackle this?
>> >
>> > The BIT_FIELD_REF is somewhat of a red-herring.  It is created by fold-const.c
>> > in optimize_bit_field_compare, code that I think should be removed completely.
>> > Or it needs to be made aware of strict-volatile bitfield and C++ memory model
>> > details.
>
> I'd actually very much prefer the latter, just disable
> optimize_bit_field_compare for strict-volatile bitfield mode and when
> avoiding load data races in C++ memory model (that isn't going to be
> default, right?).  This optimization is useful, and it is solely about
> loads, so even C++ memory model usually shouldn't care.

But won't it re-introduce bugs like PR52080, 52097 or 48124?  Also the
proper place for this optimization is lowering and CSEing of bit-field loads
(not that this lowering exists yet).

Well, I'm obviously biased here - fold doing this kind of transform looks
out-of-place.

Richard.

>        Jakub
Jakub Jelinek - April 27, 2012, 9:05 a.m.
On Fri, Apr 27, 2012 at 11:00:19AM +0200, Richard Guenther wrote:
> But won't it re-introduce bugs like PR52080, 52097 or 48124?  Also the

No.  All those are about bitfield stores, not reads.  All extract_bit*
functions currently pass 0, 0 as bitrange_{start,end}.

> proper place for this optimization is lowering and CSEing of bit-field loads
> (not that this lowering exists yet).

I agree, but we don't have that yet.  When it is added,
optimize_bitfield* should be certainly moved there, but if we do that
before, we regress generated code quality.

	Jakub

Patch

Index: fold-const.c
===================================================================
--- fold-const.c	(revision 186856)
+++ fold-const.c	(working copy)
@@ -103,8 +103,6 @@  static tree pedantic_omit_one_operand_loc (locatio
 static tree distribute_bit_expr (location_t, enum tree_code, tree, tree, tree);
 static tree make_bit_field_ref (location_t, tree, tree,
 				HOST_WIDE_INT, HOST_WIDE_INT, int);
-static tree optimize_bit_field_compare (location_t, enum tree_code,
-					tree, tree, tree);
 static tree decode_field_reference (location_t, tree, HOST_WIDE_INT *,
 				    HOST_WIDE_INT *,
 				    enum machine_mode *, int *, int *,
@@ -3306,182 +3304,6 @@  make_bit_field_ref (location_t loc, tree inner, tr
 
   return result;
 }
-
-/* Optimize a bit-field compare.
-
-   There are two cases:  First is a compare against a constant and the
-   second is a comparison of two items where the fields are at the same
-   bit position relative to the start of a chunk (byte, halfword, word)
-   large enough to contain it.  In these cases we can avoid the shift
-   implicit in bitfield extractions.
-
-   For constants, we emit a compare of the shifted constant with the
-   BIT_AND_EXPR of a mask and a byte, halfword, or word of the operand being
-   compared.  For two fields at the same position, we do the ANDs with the
-   similar mask and compare the result of the ANDs.
-
-   CODE is the comparison code, known to be either NE_EXPR or EQ_EXPR.
-   COMPARE_TYPE is the type of the comparison, and LHS and RHS
-   are the left and right operands of the comparison, respectively.
-
-   If the optimization described above can be done, we return the resulting
-   tree.  Otherwise we return zero.  */
-
-static tree
-optimize_bit_field_compare (location_t loc, enum tree_code code,
-			    tree compare_type, tree lhs, tree rhs)
-{
-  HOST_WIDE_INT lbitpos, lbitsize, rbitpos, rbitsize, nbitpos, nbitsize;
-  tree type = TREE_TYPE (lhs);
-  tree signed_type, unsigned_type;
-  int const_p = TREE_CODE (rhs) == INTEGER_CST;
-  enum machine_mode lmode, rmode, nmode;
-  int lunsignedp, runsignedp;
-  int lvolatilep = 0, rvolatilep = 0;
-  tree linner, rinner = NULL_TREE;
-  tree mask;
-  tree offset;
-
-  /* Get all the information about the extractions being done.  If the bit size
-     if the same as the size of the underlying object, we aren't doing an
-     extraction at all and so can do nothing.  We also don't want to
-     do anything if the inner expression is a PLACEHOLDER_EXPR since we
-     then will no longer be able to replace it.  */
-  linner = get_inner_reference (lhs, &lbitsize, &lbitpos, &offset, &lmode,
-				&lunsignedp, &lvolatilep, false);
-  if (linner == lhs || lbitsize == GET_MODE_BITSIZE (lmode) || lbitsize < 0
-      || offset != 0 || TREE_CODE (linner) == PLACEHOLDER_EXPR)
-    return 0;
-
- if (!const_p)
-   {
-     /* If this is not a constant, we can only do something if bit positions,
-	sizes, and signedness are the same.  */
-     rinner = get_inner_reference (rhs, &rbitsize, &rbitpos, &offset, &rmode,
-				   &runsignedp, &rvolatilep, false);
-
-     if (rinner == rhs || lbitpos != rbitpos || lbitsize != rbitsize
-	 || lunsignedp != runsignedp || offset != 0
-	 || TREE_CODE (rinner) == PLACEHOLDER_EXPR)
-       return 0;
-   }
-
-  /* See if we can find a mode to refer to this field.  We should be able to,
-     but fail if we can't.  */
-  if (lvolatilep
-      && GET_MODE_BITSIZE (lmode) > 0
-      && flag_strict_volatile_bitfields > 0)
-    nmode = lmode;
-  else
-    nmode = get_best_mode (lbitsize, lbitpos, 0, 0,
-			   const_p ? TYPE_ALIGN (TREE_TYPE (linner))
-			   : MIN (TYPE_ALIGN (TREE_TYPE (linner)),
-				  TYPE_ALIGN (TREE_TYPE (rinner))),
-			   word_mode, lvolatilep || rvolatilep);
-  if (nmode == VOIDmode)
-    return 0;
-
-  /* Set signed and unsigned types of the precision of this mode for the
-     shifts below.  */
-  signed_type = lang_hooks.types.type_for_mode (nmode, 0);
-  unsigned_type = lang_hooks.types.type_for_mode (nmode, 1);
-
-  /* Compute the bit position and size for the new reference and our offset
-     within it. If the new reference is the same size as the original, we
-     won't optimize anything, so return zero.  */
-  nbitsize = GET_MODE_BITSIZE (nmode);
-  nbitpos = lbitpos & ~ (nbitsize - 1);
-  lbitpos -= nbitpos;
-  if (nbitsize == lbitsize)
-    return 0;
-
-  if (BYTES_BIG_ENDIAN)
-    lbitpos = nbitsize - lbitsize - lbitpos;
-
-  /* Make the mask to be used against the extracted field.  */
-  mask = build_int_cst_type (unsigned_type, -1);
-  mask = const_binop (LSHIFT_EXPR, mask, size_int (nbitsize - lbitsize));
-  mask = const_binop (RSHIFT_EXPR, mask,
-		      size_int (nbitsize - lbitsize - lbitpos));
-
-  if (! const_p)
-    /* If not comparing with constant, just rework the comparison
-       and return.  */
-    return fold_build2_loc (loc, code, compare_type,
-			fold_build2_loc (loc, BIT_AND_EXPR, unsigned_type,
-				     make_bit_field_ref (loc, linner,
-							 unsigned_type,
-							 nbitsize, nbitpos,
-							 1),
-				     mask),
-			fold_build2_loc (loc, BIT_AND_EXPR, unsigned_type,
-				     make_bit_field_ref (loc, rinner,
-							 unsigned_type,
-							 nbitsize, nbitpos,
-							 1),
-				     mask));
-
-  /* Otherwise, we are handling the constant case. See if the constant is too
-     big for the field.  Warn and return a tree of for 0 (false) if so.  We do
-     this not only for its own sake, but to avoid having to test for this
-     error case below.  If we didn't, we might generate wrong code.
-
-     For unsigned fields, the constant shifted right by the field length should
-     be all zero.  For signed fields, the high-order bits should agree with
-     the sign bit.  */
-
-  if (lunsignedp)
-    {
-      if (! integer_zerop (const_binop (RSHIFT_EXPR,
-					fold_convert_loc (loc,
-							  unsigned_type, rhs),
-					size_int (lbitsize))))
-	{
-	  warning (0, "comparison is always %d due to width of bit-field",
-		   code == NE_EXPR);
-	  return constant_boolean_node (code == NE_EXPR, compare_type);
-	}
-    }
-  else
-    {
-      tree tem = const_binop (RSHIFT_EXPR,
-			      fold_convert_loc (loc, signed_type, rhs),
-			      size_int (lbitsize - 1));
-      if (! integer_zerop (tem) && ! integer_all_onesp (tem))
-	{
-	  warning (0, "comparison is always %d due to width of bit-field",
-		   code == NE_EXPR);
-	  return constant_boolean_node (code == NE_EXPR, compare_type);
-	}
-    }
-
-  /* Single-bit compares should always be against zero.  */
-  if (lbitsize == 1 && ! integer_zerop (rhs))
-    {
-      code = code == EQ_EXPR ? NE_EXPR : EQ_EXPR;
-      rhs = build_int_cst (type, 0);
-    }
-
-  /* Make a new bitfield reference, shift the constant over the
-     appropriate number of bits and mask it with the computed mask
-     (in case this was a signed field).  If we changed it, make a new one.  */
-  lhs = make_bit_field_ref (loc, linner, unsigned_type, nbitsize, nbitpos, 1);
-  if (lvolatilep)
-    {
-      TREE_SIDE_EFFECTS (lhs) = 1;
-      TREE_THIS_VOLATILE (lhs) = 1;
-    }
-
-  rhs = const_binop (BIT_AND_EXPR,
-		     const_binop (LSHIFT_EXPR,
-				  fold_convert_loc (loc, unsigned_type, rhs),
-				  size_int (lbitpos)),
-		     mask);
-
-  lhs = build2_loc (loc, code, compare_type,
-		    build2 (BIT_AND_EXPR, unsigned_type, lhs, mask), rhs);
-  return lhs;
-}
 
 /* Subroutine for fold_truth_andor_1: decode a field reference.
 
@@ -12811,18 +12633,6 @@  fold_binary_loc (location_t loc,
 	    return omit_one_operand_loc (loc, type, rslt, arg0);
 	}
 
-      /* If this is a comparison of a field, we may be able to simplify it.  */
-      if ((TREE_CODE (arg0) == COMPONENT_REF
-	   || TREE_CODE (arg0) == BIT_FIELD_REF)
-	  /* Handle the constant case even without -O
-	     to make sure the warnings are given.  */
-	  && (optimize || TREE_CODE (arg1) == INTEGER_CST))
-	{
-	  t1 = optimize_bit_field_compare (loc, code, type, arg0, arg1);
-	  if (t1)
-	    return t1;
-	}
-
       /* Optimize comparisons of strlen vs zero to a compare of the
 	 first character of the string vs zero.  To wit,
 		strlen(ptr) == 0   =>  *ptr == 0