
[2/4] fix -fstrict-volatile-bitfields bugs, v3

Message ID 51D0F67D.6050803@codesourcery.com
State New

Commit Message

Sandra Loosemore July 1, 2013, 3:24 a.m. UTC
This patch is a rewrite of parts 2, 3, and 4 of the last version of the 
patch series.  It tries to address Bernd Edlinger's comments about the 
last version in PR56997 and my own dissatisfaction with how messy this 
code was!

The particular problem Bernd pointed out was that, upon finding an 
unaligned bitfield that had to be split across multiple accesses, the 
old code attempted to do each access in the mode of the container 
instead of giving up and doing what it does normally, with the end 
result that it could write past the end of the object.  This is because 
store_fixed_bit_field calls itself recursively for split fields, and 
likewise for extract_fixed_bit_field.  Also, by the time it recognizes 
that it can't follow the -fstrict-volatile-bitfields rules, it's already 
thrown away state relating to bit regions and skipped a bunch of 
possible optimizations.
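
For illustration (a hypothetical layout, not one of the PR test cases), 
consider something like this on a strict-alignment target such as ARM:

  /* Hypothetical example only.  The field f occupies bytes 3 and 4,
     straddling the boundary between the naturally aligned SImode words
     at bytes 0-3 and 4-7, so on a strict-alignment target the access
     has to be split.  If each half of the split is itself done as a
     full SImode access, the second one can reach beyond byte 5, i.e.
     past the end of the object.  */
  struct __attribute__ ((packed)) T
  {
    unsigned char pad[3];          /* bytes 0-2 */
    volatile unsigned int f : 16;  /* bytes 3-4 */
    unsigned char tail;            /* byte 5 */
  };                               /* sizeof (struct T) == 6 */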

After thinking about it for a while, I decided the best approach was to 
test whether -fstrict-volatile-bitfields applies at the very beginning 
and skip straight to the code that handles it.  Then the more deeply 
nested code need not check for -fstrict-volatile-bitfields at all, which 
makes it a lot simpler.

This version of the patch still overrides the bit region specified by 
the C/C++ memory model when -fstrict-volatile-bitfields applies, but 
that's now happening in store_bit_field and doesn't affect cases where 
it can't emit the single access anyway.

I think there is still a case where this version of the patch could 
end up reading or writing past the end of an object.  For example, 
consider a packed struct that is 3 bytes long, with a bitfield that has 
int type.  I am not sure how to detect this situation at this point in 
the code, but leaving just that case unfixed for now seems better than 
leaving the current mess in place.  Perhaps it could be addressed by a 
follow-up patch.  Also, assuming the change in part 4 of the new series 
to make -fstrict-volatile-bitfields no longer the default on any target 
is required as a condition of overriding the new memory model, 
programmers will at least be less likely to be taken by surprise when 
it does something weird with packed structures.
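
For concreteness, such a struct might look like this (a hypothetical 
example, not a test case from the series):

  /* Hypothetical example only.  The bitfield's declared type is int,
     but the packed struct is just 3 bytes long, so honoring the int
     container with -fstrict-volatile-bitfields means a 4-byte (SImode)
     access that extends one byte past the end of the object.  */
  struct __attribute__ ((packed)) S
  {
    volatile unsigned int f : 7;   /* int-typed bitfield, byte 0 */
    unsigned char pad[2];          /* bytes 1 and 2 */
  };                               /* sizeof (struct S) == 3 */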

Is this version OK to commit, or at least getting closer?  I'd like to 
wind this project up soon, somehow or another.

-Sandra

Comments

Bernd Edlinger July 1, 2013, 9:27 p.m. UTC | #1
Hi Sandra,


good work!


I tested with arm-none-eabi.


Now both variants, with and without -fstrict-volatile-bitfields,
produce the expected results for EABI-compliant structures,
and still create reasonable code with non-compliant packed structures.
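
The kind of EABI-compliant case covered is, for example (a minimal 
sketch, not one of the actual structures I tested):

  /* Minimal sketch.  The bitfield's container type is unsigned int and
     the struct is naturally aligned, so with -fstrict-volatile-bitfields
     every access to r.mode is expected to be a single 32-bit (SImode)
     load or store of the containing word, as the ARM EABI rules for
     volatile bitfields require.  */
  struct regs
  {
    volatile unsigned int mode   : 4;
    volatile unsigned int status : 28;
  };
  struct regs r;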


> I think there is still a case where this version of the patch could
> end up reading or writing past the end of an object. For example,
> consider a packed struct that is 3 bytes long, with a bitfield that has
> int type. I am not sure how to detect this situation at this point in
> the code, but leaving just that case unfixed for now seems better than
> leaving the current mess in place. Perhaps it could be addressed by a
> follow-up patch. Also, assuming the change in part 4 of the new series
> to make -fstrict-volatile-bitfields no longer the default on any target
> is required as a condition of overriding the new memory model,
> programmers will at least be less likely to be taken by surprise when
> it does something weird with packed structures.
>
> Is this version OK to commit, or at least getting closer? I'd like to
> wind this project up soon, somehow or another.
>


The patch fixes both aspects of PR56341 (the unaligned access fault and the
incorrect value being written), regardless of the -fstrict-volatile-bitfields
setting.


It also fixes PR48748 and PR56997, except for the accesses outside the
structure. Apparently these accesses do not happen if the struct is an
external object; they only happen if the struct is a comdat object (where
the compiler possibly knows the extra bytes exist and are not used by
anything?).


Nevertheless, this patch is a significant improvement, and those leftovers
should be fixed after it is committed.


Regards
Bernd.

Patch

diff -u gcc/expmed.c gcc/expmed.c
--- gcc/expmed.c	(working copy)
+++ gcc/expmed.c	(working copy)
@@ -415,6 +415,42 @@  lowpart_bit_field_p (unsigned HOST_WIDE_
     return bitnum % BITS_PER_WORD == 0;
 }
 
+/* Return true if -fstrict-volatile-bitfields applies to an access of OP0
+   containing BITSIZE bits starting at BITNUM, with field mode FIELDMODE.  */
+
+static bool
+strict_volatile_bitfield_p (rtx op0, unsigned HOST_WIDE_INT bitsize,
+			    unsigned HOST_WIDE_INT bitnum,
+			    enum machine_mode fieldmode)
+{
+  unsigned HOST_WIDE_INT modesize = GET_MODE_BITSIZE (fieldmode);
+
+  /* -fstrict-volatile-bitfields must be enabled and we must have a
+     volatile MEM.  */
+  if (!MEM_P (op0)
+      || !MEM_VOLATILE_P (op0)
+      || flag_strict_volatile_bitfields <= 0)
+    return false;
+
+  /* Non-integral modes likely only happen with packed structures.
+     Punt.  */
+  if (!SCALAR_INT_MODE_P (fieldmode))
+    return false;
+
+  /* The bit size must not be larger than the field mode, and
+     the field mode must not be larger than a word.  */
+  if (bitsize > modesize || modesize > BITS_PER_WORD)
+    return false;
+
+  /* Check for cases of unaligned fields that must be split.  */
+  if (bitnum % BITS_PER_UNIT + bitsize > modesize
+      || (STRICT_ALIGNMENT
+	  && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
+    return false;
+
+  return true;
+}
+
 /* Return true if OP is a memory and if a bitfield of size BITSIZE at
    bit number BITNUM can be treated as a simple value of mode MODE.  */
 
@@ -813,12 +849,8 @@  store_bit_field_1 (rtx str_rtx, unsigned
      cheap register alternative is available.  */
   if (MEM_P (op0))
     {
-      /* Do not use unaligned memory insvs for volatile bitfields when
-	 -fstrict-volatile-bitfields is in effect.  */
-      if (!(MEM_VOLATILE_P (op0)
-	    && flag_strict_volatile_bitfields > 0)
-	  && get_best_mem_extraction_insn (&insv, EP_insv, bitsize, bitnum,
-					   fieldmode)
+      if (get_best_mem_extraction_insn (&insv, EP_insv, bitsize, bitnum,
+					fieldmode)
 	  && store_bit_field_using_insv (&insv, op0, bitsize, bitnum, value))
 	return true;
 
@@ -871,6 +903,27 @@  store_bit_field (rtx str_rtx, unsigned H
 		 enum machine_mode fieldmode,
 		 rtx value)
 {
+  /* Handle -fstrict-volatile-bitfields in the cases where it applies.  */
+  if (strict_volatile_bitfield_p (str_rtx, bitsize, bitnum, fieldmode))
+    {
+
+      /* Storing any naturally aligned field can be done with a simple
+	 store.  For targets that support fast unaligned memory, any
+	 naturally sized, unit aligned field can be done directly.  */
+      if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, fieldmode))
+	{
+	  str_rtx = adjust_bitfield_address (str_rtx, fieldmode,
+					     bitnum / BITS_PER_UNIT);
+	  emit_move_insn (str_rtx, value);
+	}
+      else
+	/* Explicitly override the C/C++ memory model; ignore the
+	   bit range so that we can do the access in the mode mandated
+	   by -fstrict-volatile-bitfields instead.  */
+	store_fixed_bit_field (str_rtx, bitsize, bitnum, 0, 0, value);
+      return;
+    }
+
   /* Under the C++0x memory model, we must not touch bits outside the
      bit region.  Adjust the address to start at the beginning of the
      bit region.  */
@@ -923,29 +976,12 @@  store_fixed_bit_field (rtx op0, unsigned
 
   if (MEM_P (op0))
     {
-      unsigned HOST_WIDE_INT maxbits = MAX_FIXED_MODE_SIZE;
-
-      if (bitregion_end)
-	maxbits = bitregion_end - bitregion_start + 1;
-
-      /* Get the proper mode to use for this field.  We want a mode that
-	 includes the entire field.  If such a mode would be larger than
-	 a word, we won't be doing the extraction the normal way.
-	 We don't want a mode bigger than the destination.  */
-
       mode = GET_MODE (op0);
       if (GET_MODE_BITSIZE (mode) == 0
 	  || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode))
 	mode = word_mode;
-
-      if (MEM_VOLATILE_P (op0)
-          && GET_MODE_BITSIZE (GET_MODE (op0)) > 0
-	  && GET_MODE_BITSIZE (GET_MODE (op0)) <= maxbits
-	  && flag_strict_volatile_bitfields > 0)
-	mode = GET_MODE (op0);
-      else
-	mode = get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end,
-			      MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
+      mode = get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end,
+			    MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
 
       if (mode == VOIDmode)
 	{
@@ -1429,19 +1465,8 @@  extract_bit_field_1 (rtx str_rtx, unsign
      If that's wrong, the solution is to test for it and set TARGET to 0
      if needed.  */
 
-  /* If the bitfield is volatile, we need to make sure the access
-     remains on a type-aligned boundary.  */
-  if (GET_CODE (op0) == MEM
-      && MEM_VOLATILE_P (op0)
-      && GET_MODE_BITSIZE (GET_MODE (op0)) > 0
-      && flag_strict_volatile_bitfields > 0)
-    goto no_subreg_mode_swap;
-
-  /* Only scalar integer modes can be converted via subregs.  There is an
-     additional problem for FP modes here in that they can have a precision
-     which is different from the size.  mode_for_size uses precision, but
-     we want a mode based on the size, so we must avoid calling it for FP
-     modes.  */
+  /* Get the mode of the field to use for atomic access or subreg
+     conversion.  */
   mode1 = mode;
   if (SCALAR_INT_MODE_P (tmode))
     {
@@ -1474,8 +1499,6 @@  extract_bit_field_1 (rtx str_rtx, unsign
       return convert_extracted_bit_field (op0, mode, tmode, unsignedp);
     }
 
- no_subreg_mode_swap:
-
   /* Handle fields bigger than a word.  */
 
   if (bitsize > BITS_PER_WORD)
@@ -1595,11 +1618,8 @@  extract_bit_field_1 (rtx str_rtx, unsign
      cheap register alternative is available.  */
   if (MEM_P (op0))
     {
-      /* Do not use extv/extzv for volatile bitfields when
-         -fstrict-volatile-bitfields is in effect.  */
-      if (!(MEM_VOLATILE_P (op0) && flag_strict_volatile_bitfields > 0)
-	  && get_best_mem_extraction_insn (&extv, pattern, bitsize, bitnum,
-					   tmode))
+      if (get_best_mem_extraction_insn (&extv, pattern, bitsize, bitnum,
+					tmode))
 	{
 	  rtx result = extract_bit_field_using_extv (&extv, op0, bitsize,
 						     bitnum, unsignedp,
@@ -1665,6 +1685,31 @@  extract_bit_field (rtx str_rtx, unsigned
 		   unsigned HOST_WIDE_INT bitnum, int unsignedp, rtx target,
 		   enum machine_mode mode, enum machine_mode tmode)
 {
+  enum machine_mode mode1;
+
+  /* Handle -fstrict-volatile-bitfields in the cases where it applies.  */
+  if (GET_MODE_BITSIZE (GET_MODE (str_rtx)) > 0)
+    mode1 = GET_MODE (str_rtx);
+  else if (target && GET_MODE_BITSIZE (GET_MODE (target)) > 0)
+    mode1 = GET_MODE (target);
+  else
+    mode1 = tmode;
+
+  if (strict_volatile_bitfield_p (str_rtx, bitsize, bitnum, mode1))
+    {
+      rtx result;
+
+      /* Extraction of a full MODE1 value can be done with a load as long as
+	 the field is on a byte boundary and is sufficiently aligned.  */
+      if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, mode1))
+	result = adjust_bitfield_address (str_rtx, mode1,
+					  bitnum / BITS_PER_UNIT);
+      else
+	result = extract_fixed_bit_field (mode, str_rtx, bitsize, bitnum,
+					  target, unsignedp);
+      return convert_extracted_bit_field (result, mode, tmode, unsignedp);
+    }
+  
   return extract_bit_field_1 (str_rtx, bitsize, bitnum, unsignedp,
 			      target, mode, tmode, true);
 }
@@ -1691,45 +1736,19 @@  extract_fixed_bit_field (enum machine_mo
 	 includes the entire field.  If such a mode would be larger than
 	 a word, we won't be doing the extraction the normal way.  */
 
-      if (MEM_VOLATILE_P (op0)
-	  && flag_strict_volatile_bitfields > 0)
-	{
-	  if (GET_MODE_BITSIZE (GET_MODE (op0)) > 0)
-	    mode = GET_MODE (op0);
-	  else if (target && GET_MODE_BITSIZE (GET_MODE (target)) > 0)
-	    mode = GET_MODE (target);
-	  else
-	    mode = tmode;
-	}
-      else
-	mode = get_best_mode (bitsize, bitnum, 0, 0,
-			      MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));
+      mode = GET_MODE (op0);
+      if (GET_MODE_BITSIZE (mode) == 0
+	  || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode))
+	mode = word_mode;
+      mode = get_best_mode (bitsize, bitnum, 0, 0,
+			    MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
 
       if (mode == VOIDmode)
 	/* The only way this should occur is if the field spans word
 	   boundaries.  */
 	return extract_split_bit_field (op0, bitsize, bitnum, unsignedp);
 
-      unsigned int total_bits = GET_MODE_BITSIZE (mode);
-      HOST_WIDE_INT bit_offset = bitnum - bitnum % total_bits;
-
-      /* If we're accessing a volatile MEM, we can't apply BIT_OFFSET
-	 if it results in a multi-word access where we otherwise wouldn't
-	 have one.  So, check for that case here.  */
-      if (MEM_P (op0)
-	  && MEM_VOLATILE_P (op0)
-	  && flag_strict_volatile_bitfields > 0
-	  && bitnum % BITS_PER_UNIT + bitsize <= total_bits
-	  && bitnum % GET_MODE_BITSIZE (mode) + bitsize > total_bits)
-	{
-	  /* If the target doesn't support unaligned access, give up and
-	     split the access into two.  */
-	  if (STRICT_ALIGNMENT)
-	    return extract_split_bit_field (op0, bitsize, bitnum, unsignedp);
-	  bit_offset = bitnum - bitnum % BITS_PER_UNIT;
-	}
-      op0 = adjust_bitfield_address (op0, mode, bit_offset / BITS_PER_UNIT);
-      bitnum -= bit_offset;
+      op0 = narrow_bit_field_mem (op0, mode, bitsize, bitnum, &bitnum);
     }
 
   mode = GET_MODE (op0);
diff -u gcc/doc/invoke.texi gcc/doc/invoke.texi
--- gcc/doc/invoke.texi	(working copy)
+++ gcc/doc/invoke.texi	(working copy)
@@ -20887,6 +20887,12 @@  instruction, even though that accesses b
 any portion of the bit-field, or memory-mapped registers unrelated to
 the one being updated.
 
+In some cases, such as when the @code{packed} attribute is applied to a 
+structure field, it may not be possible to access the field with a single
+read or write that is correctly aligned for the target machine.  In this
+case GCC falls back to generating multiple accesses rather than code that 
+will fault or truncate the result at run time.
+
 The default value of this option is determined by the application binary
 interface for the target processor.