Patchwork [0/8] Add optabs alternatives for insv, extv and extzv

Submitter Richard Sandiford
Date Nov. 28, 2012, 1:58 p.m.
Message ID <87lidlvnh1.fsf@sandifor-thinkpad.stglab.manchester.uk.ibm.com>
Permalink /patch/202460/
State New

Comments

Richard Sandiford - Nov. 28, 2012, 1:58 p.m.
Ramana Radhakrishnan <ramana.gcc@googlemail.com> writes:
> On Tue, Nov 27, 2012 at 8:22 PM, Richard Sandiford
> <rdsandiford@googlemail.com> wrote:
>> Ramana Radhakrishnan <ramrad01@arm.com> writes:
>>>> Tested on x86_64-linux-gnu, powerpc64-linux-gnu and mipsisa64-elf.
>>>> Also tested by making sure that there were no changes in assembly
>>>> output for a set of gcc .ii files.  On the other hand, the -march=octeon
>>>> output for a set of mips64-linux-gnu gcc .ii files showed the optimisation
>>>> kicking in as hoped.
>>>
>>> This sequence of patches caused a regression in
>>> gcc.target/arm/volatile-bitfields-1.c . I haven't reviewed the patch
>>> stack in great detail yet, but it looks like this sequence of patches
>>> doesn't take the ARM volatile bitfields into account. (193600 is fine,
>>> 193606 is not).
>>
>> Looks like I was wrong to drop the conditions from patch 5.
>> Please could you try the attached?
>
> Fixes volatile-bitfields-1.c but appears to break volatile-bitfields-4.c :( .

This is because the structure we are given is:

    (mem/v/j:SI (reg/v/f:SI 110 [ t ]) [3 t_2(D)->a+0 S1 A32])

i.e. a 1-byte SImode reference.  The strange size comes from the
set_mem_attributes_minus_bitpos code that was mentioned earlier
in the series, where we set the size based on the type even if
it doesn't match the mode.

The original rules for forcing strict-volatile bitfields from memory
to registers were different (or at least written in a different way)
between store_bit_field_1 -- where we force to a register in an attempt
to use register insvs -- and store_fixed_bit_field -- where we use the
fallback shifts and masks -- even though logically I thought they'd be
the same.  In store_bit_field_1 the mode was chosen like this:

      /* Get the mode to use for inserting into this field.  If OP0 is
	 BLKmode, get the smallest mode consistent with the alignment. If
	 OP0 is a non-BLKmode object that is no wider than OP_MODE, use its
	 mode. Otherwise, use the smallest mode containing the field.  */

      if (GET_MODE (op0) == BLKmode
	  || (op_mode != MAX_MACHINE_MODE
	      && GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (op_mode)))
	bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0),
				  (op_mode == MAX_MACHINE_MODE
				   ? VOIDmode : op_mode),
				  MEM_VOLATILE_P (op0));
      else
	bestmode = GET_MODE (op0);

      if (bestmode != VOIDmode
	  && GET_MODE_SIZE (bestmode) >= GET_MODE_SIZE (fieldmode)
	  && !(SLOW_UNALIGNED_ACCESS (bestmode, MEM_ALIGN (op0))
	       && GET_MODE_BITSIZE (bestmode) > MEM_ALIGN (op0)))
	{

i.e. no explicit test of flag_strict_volatile_bitfields.
In store_fixed_bit_field we had:

      mode = GET_MODE (op0);
      if (GET_MODE_BITSIZE (mode) == 0
	  || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode))
	mode = word_mode;

      if (MEM_VOLATILE_P (op0)
          && GET_MODE_BITSIZE (GET_MODE (op0)) > 0
	  && flag_strict_volatile_bitfields > 0)
	mode = GET_MODE (op0);
      else
	mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
			      MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));

The same thing applied to extract_bit_field_1 and extract_fixed_bit_field,
with the latter using:

      if (MEM_VOLATILE_P (op0)
	  && flag_strict_volatile_bitfields > 0)
	{
	  if (GET_MODE_BITSIZE (GET_MODE (op0)) > 0)
	    mode = GET_MODE (op0);
	  else if (target && GET_MODE_BITSIZE (GET_MODE (target)) > 0)
	    mode = GET_MODE (target);
	  else
	    mode = tmode;
	}
      else
	mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
			      MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));

This patch seems to fix the volatile bitfield tests, both with the
default arm-linux-gnueabi (which already works) and with
-mcpu=cortex-a8.  Could you give it a proper test?

Richard


gcc/
	* expmed.c (adjust_bit_field_mem_for_reg): Add handling of volatile
	bitfields.
Eric Botcazou - Nov. 28, 2012, 11:16 p.m.
> This is because the structure we are given is:
> 
>     (mem/v/j:SI (reg/v/f:SI 110 [ t ]) [3 t_2(D)->a+0 S1 A32])
> 
> i.e. a 1-byte SImode reference.  The strange size comes from the
> set_mem_attributes_minus_bitpos code that was mentioned earlier
> in the series, where we set the size based on the type even if
> it doesn't match the mode.

I think that's wrong: we should have S4 and drop the MEM_EXPR, as we would do 
with adjust_bitfield_address.  In other words, if the size computed from 
the tree is smaller than the mode size, we don't use it and clear MEM_EXPR.
Richard Sandiford - Nov. 29, 2012, 10:31 a.m.
Eric Botcazou <ebotcazou@adacore.com> writes:
>> This is because the structure we are given is:
>> 
>>     (mem/v/j:SI (reg/v/f:SI 110 [ t ]) [3 t_2(D)->a+0 S1 A32])
>> 
>> i.e. a 1-byte SImode reference.  The strange size comes from the
>> set_mem_attributes_minus_bitpos code that was mentioned earlier
>> in the series, where we set the size based on the type even if
>> it doesn't match the mode.
>
> I think that's wrong: we should have S4 and drop the MEM_EXPR, as we would do 
> with adjust_bitfield_address.  In other words, if the size computed from 
> the tree is smaller than the mode size, we don't use it and clear MEM_EXPR.

I agree that this kind of MEM is less than ideal, but I thought:

	      set_mem_attributes_minus_bitpos (to_rtx, to, 0, bitpos);

said that the attributes of TO_RTX will be those of TO _once a pending
offset-and-narrowing operation has been applied_.  So we have:

  /* If we modified OFFSET based on T, then subtract the outstanding
     bit position offset.  Similarly, increase the size of the accessed
     object to contain the negative offset.  */
  if (apply_bitpos)
    {
      gcc_assert (attrs.offset_known_p);
      attrs.offset -= apply_bitpos / BITS_PER_UNIT;
      if (attrs.size_known_p)
	attrs.size += apply_bitpos / BITS_PER_UNIT;
    }

I didn't think we necessarily expected the width of the reference
(TO_RTX) and the width of the type (TO) to match at this stage.
That's different from adjust_bitfield_address, where the
offset-and-narrowing operation itself is applied.

The difference between the width of the reference and the width
of T is what led to:

   http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00262.html

As things stand, APPLY_BITPOS is only nonzero if we set both the
MEM_EXPR and MEM_SIZE from T.  There are also cases (like this one)
where we don't set the MEM_EXPR from T but do set the MEM_SIZE from T.
The bitpos will be applied either way, so I thought MEM_SIZE should be
the same in both cases.  That doesn't fix this problem of course, it's
just an argument that the relationship between the width of the reference
mode, the MEM_SIZE and the width of T seems pretty complicated with the
current interface.

Maybe set_mem_attributes_minus_bitpos (but not set_mem_attributes)
should only set the MEM_EXPR and leave the MEM_SIZE unchanged?

Before submitting the patch linked above, I tried getting rid
of set_mem_attributes_minus_bitpos and passing the tree down instead.
Then we could set the attributes at the time of the offset-and-narrowing
operation, where the size and offset of the final reference are known.
That didn't seem like an easy change to make though, and became a
bit of a distraction from the main patches.

Anyway, given the breakage that this series has already caused,
I'd prefer not to tackle stuff like this as well.  I'd only used
MEM_SIZE in the first attempted patch out of habit.  I think the
revised patch more obviously matches the *_fixed_bit_field functions
and is more generally in keeping with the existing checks.
(It's deliberately more conservative though, only using register
bitfields if both the bit_field_mode_iterator and strict volatile
bitfield rules are met.)

Richard
Eric Botcazou - Nov. 29, 2012, 3:28 p.m.
> I agree that this kind of MEM is less than ideal, but I thought:
> 
> 	      set_mem_attributes_minus_bitpos (to_rtx, to, 0, bitpos);
> 
> said that the attributes of TO_RTX will be those of TO _once a pending
> offset-and-narrowing operation has been applied_.  So we have:
> 
>   /* If we modified OFFSET based on T, then subtract the outstanding
>      bit position offset.  Similarly, increase the size of the accessed
>      object to contain the negative offset.  */
>   if (apply_bitpos)
>     {
>       gcc_assert (attrs.offset_known_p);
>       attrs.offset -= apply_bitpos / BITS_PER_UNIT;
>       if (attrs.size_known_p)
> 	attrs.size += apply_bitpos / BITS_PER_UNIT;
>     }
> 
> I didn't think we necessarily expected the width of the reference
> (TO_RTX) and the width of the type (TO) to match at this stage.
> That's different from adjust_bitfield_address, where the
> offset-and-narrowing operation itself is applied.

I was essentially thinking of the size adjustment just above that one: if the 
mode size is known, setting a smaller size without further ado seems awkward.

So the questionable MEM doesn't survive long?  OK, maybe...

> The difference between the width of the reference and the width
> of T is what led to:
> 
>    http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00262.html
> 
> As things stand, APPLY_BITPOS is only nonzero if we set both the
> MEM_EXPR and MEM_SIZE from T.  There are also cases (like this one)
> where we don't set the MEM_EXPR from T but do set the MEM_SIZE from T.
> The bitpos will be applied either way, so I thought MEM_SIZE should be
> the same in both cases.  That doesn't fix this problem of course, it's
> just an argument that the relationship between the width of the reference
> mode, the MEM_SIZE and the width of T seems pretty complicated with the
> current interface.

MEM_SIZE and MEM_EXPR are used alone by the aliasing machinery to disambiguate 
memory references, so they need to be conservative wrt the actual memory access.

> Maybe set_mem_attributes_minus_bitpos (but not set_mem_attributes)
> should only set the MEM_EXPR and leave the MEM_SIZE unchanged?
> 
> Before submitting the patch linked above, I tried getting rid
> of set_mem_attributes_minus_bitpos and passing the tree down instead.
> Then we could set the attributes at the time of the offset-and-narrowing
> operation, where the size and offset of the final reference are known.
> That didn't seem like an easy change to make though, and became a
> bit of a distraction from the main patches.
> 
> Anyway, given the breakage that this series has already caused,
> I'd prefer not to tackle stuff like this as well.  I'd only used
> MEM_SIZE in the first attempted patch out of habit.  I think the
> revised patch more obviously matches the *_fixed_bit_field functions
> and is more generally in keeping with the existing checks.
> (It's deliberately more conservative though, only using register
> bitfields if both the bit_field_mode_iterator and strict volatile
> bitfield rules are met.)

Well, rewriting the bitfield machinery of the middle-end is a once-in-a-decade 
undertaking, so some fallout is to be expected. :-)  That wasn't too bad in 
the end.  But I agree with the cautious approach from now on.

Patch

Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c	(revision 193881)
+++ gcc/expmed.c	(working copy)
@@ -372,6 +372,17 @@ 
 				bitregion_end, MEM_ALIGN (op0),
 				MEM_VOLATILE_P (op0));
   enum machine_mode best_mode;
+  if (MEM_VOLATILE_P (op0) && flag_strict_volatile_bitfields > 0)
+    {
+      unsigned int size = GET_MODE_BITSIZE (GET_MODE (op0));
+      if (size > 0)
+	while (iter.next_mode (&best_mode))
+	  if (GET_MODE_BITSIZE (best_mode) == size)
+	    return narrow_bit_field_mem (op0, best_mode, bitsize, bitnum,
+					 new_bitnum);
+      return NULL_RTX;
+    }
+
   if (iter.next_mode (&best_mode))
     {
       /* We can use a memory in BEST_MODE.  See whether this is true for