Patchwork: reimplement -fstrict-volatile-bitfields

Submitter Sandra Loosemore
Date June 12, 2013, 9:59 p.m.
Message ID <51B8EF2F.6090901@codesourcery.com>
Permalink /patch/250900/
State New

Comments

Sandra Loosemore - June 12, 2013, 9:59 p.m.
Background:  on ARM and some other targets, the ABI requires that 
volatile bit-fields be accessed atomically in a mode that corresponds to 
the declared type of the field, which conflicts with GCC's normal 
behavior of doing accesses in a mode that might correspond to the size 
of a general-purpose register, the size of the bit-field, or the bit 
range corresponding to the C++ memory model.  This is what the 
-fstrict-volatile-bitfields flag does, and it is the default on ARM and 
other targets where the ABI requires this behavior.

Both the original patch that added -fstrict-volatile-bitfields support 
and a subsequent followup patch that tried to unbreak handling of packed 
structures (where fields might not be properly aligned to do the single 
access otherwise mandated by -fstrict-volatile-bitfields) only handled 
bit-field reads, and not writes.  Last year I submitted a patch we've 
had locally for some time to extend the existing implementation to 
writes, but it was rejected on the grounds that the current 
implementation is too broken to fix or extend.  I didn't have time then 
to start over from scratch, and meanwhile, the bug reports about 
-fstrict-volatile-bitfields have continued to pile up.  So let's try 
again to fix this, this time working from the ground up.

From last year's discussion, it seemed that there were two primary 
objections to the current implementation:

(1) It was seen as inappropriate that warnings about conflicts between 
unaligned fields and -fstrict-volatile-bitfields were being emitted 
during expand.  It was suggested that any diagnostics ought to be 
emitted by the various language front ends instead.

(2) The way packed fields are being detected is buggy and an abstraction 
violation, and passing around a packedp flag to all the bit-field expand 
functions is ugly.

And, my own complaints about the current implementation:

(3) Users expect packed structures to work even on targets where 
-fstrict-volatile-bitfields is the default, so the compiler shouldn't 
generate code for accesses to unaligned fields that either faults at run 
time due to the unaligned access or silently produces an incorrect 
result (e.g., by only accessing part of the bit-field), with or without 
a warning at compile time.

(4) There's pointless divergence between the bit-field store and extract 
code that has led to a number of bugs.

I've come up with a new patch that tries to address all these issues.

For problem (1), I've eliminated the warnings from expand.  I'm not 
opposed to adding them back to the front ends, as previously suggested, 
but given that they were previously implemented only for reads and not 
writes, and were getting the different-warning-for-packed-fields 
behavior wrong in some cases, getting rid of the warnings is at least as 
correct as adding them for bit-field writes, too.  ;-)

I've killed the packedp flag from item (2) completely too.

For problem (3), my reading of the ARM ABI document is that the 
requirements for atomic access to volatile bit-fields only apply to 
bit-fields that are aligned according to the ABI requirements.  If a 
user has used GCC extensions to create a non-ABI-compliant packed 
structure where an atomic bit-field access of the correct size is not 
possible or valid on the target, then GCC ought to define some 
reasonable access behavior for those bit-fields too as a further 
extension -- whether or not it complies with the ABI requirements for 
unpacked structures -- rather than just generating invalid code.  In 
particular, generating the access using whatever technique it would 
fall back to if -fstrict-volatile-bitfields didn't apply, in cases where 
it *cannot* be applied, seems perfectly reasonable to me.

To address problem (4), I've tried to make the code for handling 
-fstrict-volatile-bitfields similar in the read and write cases.  I 
think there is probably more that can be done here in terms of 
refactoring some of the now-common code and changing the interfaces to 
be more consistent as well, but I think it would be more clear to 
separate changes that are just code cleanup from those that are intended 
to change behavior.  I'm willing to work on code refactoring as a 
followup patch if the maintainers recommend or require that.

I've regression tested the attached patch on arm-none-eabi as well as 
bootstrapping and regression testing on x86_64-linux-gnu.  I also did 
some spot testing on mipsisa32r2-sde-elf.  I verified that all the new 
test cases pass on these targets with this patch.  Without the patch, I 
saw these failures:

pr23623.c: ARM, x86_64, MIPS
pr48784-1.c: ARM, x86_64, MIPS
pr48784-2.c: none
pr56341-1.c: ARM, MIPS
pr56341-2.c: none
pr56997-1.c: ARM
pr56997-2.c: ARM, MIPS
pr56997-3.c: ARM, MIPS
volatile-bitfields-3.c: ARM, x86_64, MIPS

Here are some comments on specific parts of the patch.

The "meat" of the patch is rewriting the logic for 
-fstrict-volatile-bitfields in extract_fixed_bit_field, and making 
store_fixed_bit_field use parallel logic.

The change to make extract_bit_field_1 not skip over the
simple_mem_bitfield_p case was necessary to avoid introducing new 
failures of the pr56997-* test cases on x86_64.  store_bit_field_1 
already has parallel code, except for getting the field mode passed in 
explicitly as an argument.

The change to store_bit_field_1 and extract_bit_field_1 to make them 
skip both cheap register alternatives if -fstrict-volatile-bitfields 
applies, instead of just the first one, is for the 
volatile-bitfields-3.c test case.  As noted above, this example was 
previously failing on all three targets I tried, and I was specifically 
using x86_64 to debug this one.

The hack to get_bit_range fixes a regression in the pr23623 test case 
that has been present since the bit range support was added, I think. 
There might be a better place to do this, but it seems clear to me that 
when -fstrict-volatile-bitfields applies it has to override the bit 
range that would be used for normal accesses.  It seemed to me during my 
incremental testing and debugging that this change was independent of 
the others (e.g., it fixed the regression even with no other changes).

Anyway, what do you think of this patch?  It does fix a whole lot of 
bugs, and seems like at least an incremental improvement over the 
current code.  I realize this is probably going to take a couple 
iterations to get right.  If there are additional cleanups or 
refactoring of the bit-field code required, in addition to the 
functional changes, can they be done as followup patches or do I need to 
work out the entire series of patches in advance?

-Sandra
Richard Guenther - June 14, 2013, 12:31 p.m.
On Wed, Jun 12, 2013 at 11:59 PM, Sandra Loosemore
<sandra@codesourcery.com> wrote:
> [...]

I think we can split the patch up, so let me do piecewise approval of changes.

The changes that remove the packedp flag passing around and remove
the warning code are ok.  The store_bit_field_1 change is ok, as is
the similar extract_bit_field_1 change (guard the other register copying path).

Please split those out and commit them separately (I suppose they are
strict progressions in testing).

That leaves a much smaller set of changes for which I have one comment
for now:

@@ -4508,6 +4508,16 @@ get_bit_range (unsigned HOST_WIDE_INT *b

   gcc_assert (TREE_CODE (exp) == COMPONENT_REF);

+  /* If -fstrict-volatile-bitfields was given and this is a volatile
+     access, then we must ignore any DECL_BIT_FIELD_REPRESENTATIVE and
+     do the access in the mode of the field.  */
+  if (TREE_THIS_VOLATILE (exp)
+      && flag_strict_volatile_bitfields > 0)
+    {
+      *bitstart = *bitend = 0;
+      return;
+    }

In the past reviews where I said the implementation was broken anyway,
I was hinting that DECL_BIT_FIELD_REPRESENTATIVE should simply be
chosen in a way that adheres to flag_strict_volatile_bitfields.  I had the
impression that this alone should be enough to implement strict volatile
bitfields correctly and no code in the backend would need to check
for flag_strict_volatile_bitfields.  I may of course be wrong here, as
-fstrict-volatile-bitfields may apply to cases where the bitfield is _not_
declared volatile but only the decl?  Thus,

struct X {
  int i : 3;
};

volatile struct X x;

Is x.i an access that needs to adhere to strict volatile bitfield accesses?

Note that IIRC whether or not you short-cut get_bit_range in the above
way, you still get the access to use the mode of the FIELD_DECL.

If the above constitutes a strict volatile bitfield access then the very
correct implementation of strict volatile bitfield accesses is to
adjust the memory accesses the frontends generate (consider
LTO and a TU compiled with -fstrict-volatile-bitfields and another TU
compiled with -fno-strict-volatile-bitfields).  Which it probably does
reasonably well already (setting TREE_THIS_VOLATILE on the
outermost bit-field COMPONENT_REF).  If that is not preserved
then converting the accesses to BIT_FIELD_REFs is another
possibility.  That said, the behavior of GIMPLE IL should not change
dependent on a compile flag.

Richard.

Sandra Loosemore - June 14, 2013, 4:31 p.m.
On 06/14/2013 06:31 AM, Richard Biener wrote:

> I think we can split the patch up, so let me do piecewise approval of changes.
>
> The changes that remove the packedp flag passing around and remove
> the warning code are ok.  The store_bit_field_1 change is ok, as is
> the similar extract_bit_field_1 change (guard the other register copying path).
>
> Please split those out and commit them separately (I suppose they are
> strict progressions in testing).

I will work on splitting the patch, but I feel a little uncomfortable 
about starting to commit it piecewise without some more clear indication 
that this is the right direction.  Otherwise we'll just be back in the 
situation where things are still inconsistent and broken, but in a 
different way than before.

> That leaves a much smaller set of changes for which I have one comment
> for now:
>
> @@ -4508,6 +4508,16 @@ get_bit_range (unsigned HOST_WIDE_INT *b
>
>     gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
>
> +  /* If -fstrict-volatile-bitfields was given and this is a volatile
> +     access, then we must ignore any DECL_BIT_FIELD_REPRESENTATIVE and
> +     do the access in the mode of the field.  */
> +  if (TREE_THIS_VOLATILE (exp)
> +      && flag_strict_volatile_bitfields > 0)
> +    {
> +      *bitstart = *bitend = 0;
> +      return;
> +    }
>
> In the past reviews where I said the implementation was broken anyway,
> I was hinting that DECL_BIT_FIELD_REPRESENTATIVE should simply be
> chosen in a way that adheres to flag_strict_volatile_bitfields.  I had the
> impression that this alone should be enough to implement strict volatile
> bitfields correctly and no code in the backend would need to check
> for flag_strict_volatile_bitfields.  I may of course be wrong here, as
> -fstrict-volatile-bitfields may apply to cases where the bitfield is _not_
> declared volatile but only the decl?  Thus,
>
> struct X {
>    int i : 3;
> };
>
> volatile struct X x;
>
> Is x.i an access that needs to adhere to strict volatile bitfield accesses?

Yes, exactly; see the new pr23623.c test case included with the patch, 
for instance.  Or the volatile bitfield access could be through a 
volatile-qualified pointer, as in the volatile-bitfields-3.c test case. 
Bernd Schmidt and I had some internal discussion about this and 
neither of us saw how messing with DECL_BIT_FIELD_REPRESENTATIVE could 
help where the field is not declared volatile but the object being 
referenced is.

> Note that IIRC whether or not you short-cut get_bit_range in the above
> way, you still get the access to use the mode of the FIELD_DECL.

As I noted in my previous message, the pr23623.c test case was failing 
on all the targets I tried without this patch hunk, so the access was 
clearly not using the mode of the FIELD_DECL without it.

> If the above constitutes a strict volatile bitfield access then the very
> correct implementation of strict volatile bitfield accesses is to
> adjust the memory accesses the frontends generate (consider
> LTO and a TU compiled with -fstrict-volatile-bitfields and another TU
> compiled with -fno-strict-volatile-bitfields).  Which it probably does
> reasonably well already (setting TREE_THIS_VOLATILE on the
> outermost bit-field COMPONENT_REF).  If that is not preserved
> then converting the accesses to BIT_FIELD_REFs is another
> possibility.  That said, the behavior of GIMPLE IL should not change
> dependent on a compile flag.

I am kind of lost, here.  -fstrict-volatile-bitfields is a code 
generation option, not a front end option, and it seems like LTO should 
not need any special handling here any more than it does for any of the 
gazillion other code generation options supported by GCC.

-Sandra
Richard Guenther - June 17, 2013, 9:15 a.m.
On Fri, Jun 14, 2013 at 6:31 PM, Sandra Loosemore
<sandra@codesourcery.com> wrote:
> On 06/14/2013 06:31 AM, Richard Biener wrote:
>
>> [...] I may of course be wrong here, as
>> -fstrict-volatile-bitfields may apply to cases where the bitfield is
>> _not_ declared volatile but only the decl?  Thus,
>>
>> struct X {
>>    int i : 3;
>> };
>>
>> volatile struct X x;
>>
>> Is x.i an access that needs to adhere to strict volatile bitfield
>> accesses?
>
>
> Yes, exactly; see the new pr23623.c test case included with the patch, for
> instance.  Or the volatile bitfield access could be through a
> volatile-qualified pointer, as in the volatile-bitfields-3.c test case.
> Bernd Schmidt and I had some internal discussion about this and neither of
> us saw how messing with DECL_BIT_FIELD_REPRESENTATIVE could help where the
> field is not declared volatile but the object being referenced is.

Yeah, I can see that.

>> If the above constitutes a strict volatile bitfield access then the very
>> correct implementation of strict volatile bitfield accesses is to
>> adjust the memory accesses the frontends generate (consider
>> LTO and a TU compiled with -fstrict-volatile-bitfields and another TU
>> compiled with -fno-strict-volatile-bitfields).  Which it probably does
>> reasonably well already (setting TREE_THIS_VOLATILE on the
>> outermost bit-field COMPONENT_REF).  If that is not preserved
>> then converting the accesses to BIT_FIELD_REFs is another
>> possibility.  That said, the behavior of GIMPLE IL should not change
>> dependent on a compile flag.
>
>
> I am kind of lost, here.  -fstrict-volatile-bitfields is a code generation
> option, not a front end option, and it seems like LTO should not need any
> special handling here any more than it does for any of the gazillion other
> code generation options supported by GCC.

LTO doesn't have any handling of code-generation options, and it cannot
do so reliably.  It has a few hacks to support mixed -fstrict-aliasing
units, for example, but it will be completely lost if you mix
-fstrict-volatile-bitfields TUs with -fno-strict-volatile-bitfields TUs.

Of course the easiest solution here is always adhere to the ABI and
simply drop the command-line flag.  Make it a target hook instead.

That said, given the goal of lowering bitfield accesses to their final
memory access patterns well before RTL expansion, we'll end up using
BIT_FIELD_REF-like accesses for the strict volatile bitfields at some
point anyway.

Richard.



Patch

Index: gcc/expr.h
===================================================================
--- gcc/expr.h	(revision 199963)
+++ gcc/expr.h	(working copy)
@@ -704,7 +704,7 @@  extern void store_bit_field (rtx, unsign
 			     unsigned HOST_WIDE_INT,
 			     enum machine_mode, rtx);
 extern rtx extract_bit_field (rtx, unsigned HOST_WIDE_INT,
-			      unsigned HOST_WIDE_INT, int, bool, rtx,
+			      unsigned HOST_WIDE_INT, int, rtx,
 			      enum machine_mode, enum machine_mode);
 extern rtx extract_low_bits (enum machine_mode, enum machine_mode, rtx);
 extern rtx expand_mult (enum machine_mode, rtx, rtx, rtx, int);
Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c	(revision 199963)
+++ gcc/expmed.c	(working copy)
@@ -54,7 +54,7 @@  static void store_split_bit_field (rtx, 
 				   rtx);
 static rtx extract_fixed_bit_field (enum machine_mode, rtx,
 				    unsigned HOST_WIDE_INT,
-				    unsigned HOST_WIDE_INT, rtx, int, bool);
+				    unsigned HOST_WIDE_INT, rtx, int);
 static rtx mask_rtx (enum machine_mode, int, int, int);
 static rtx lshift_value (enum machine_mode, rtx, int, int);
 static rtx extract_split_bit_field (rtx, unsigned HOST_WIDE_INT,
@@ -810,15 +810,15 @@  store_bit_field_1 (rtx str_rtx, unsigned
     return true;
 
   /* If OP0 is a memory, try copying it to a register and seeing if a
-     cheap register alternative is available.  */
-  if (MEM_P (op0))
+     cheap register alternative is available.  Do not try these tricks if
+     -fstrict-volatile-bitfields is in effect, since they may not respect
+     the mode of the access.  */
+  if (MEM_P (op0)
+      && !(MEM_VOLATILE_P (op0)
+	   && flag_strict_volatile_bitfields > 0))
     {
-      /* Do not use unaligned memory insvs for volatile bitfields when
-	 -fstrict-volatile-bitfields is in effect.  */
-      if (!(MEM_VOLATILE_P (op0)
-	    && flag_strict_volatile_bitfields > 0)
-	  && get_best_mem_extraction_insn (&insv, EP_insv, bitsize, bitnum,
-					   fieldmode)
+      if (get_best_mem_extraction_insn (&insv, EP_insv, bitsize, bitnum,
+					fieldmode)
 	  && store_bit_field_using_insv (&insv, op0, bitsize, bitnum, value))
 	return true;
 
@@ -933,19 +933,35 @@  store_fixed_bit_field (rtx op0, unsigned
 	 a word, we won't be doing the extraction the normal way.
 	 We don't want a mode bigger than the destination.  */
 
-      mode = GET_MODE (op0);
-      if (GET_MODE_BITSIZE (mode) == 0
-	  || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode))
-	mode = word_mode;
-
       if (MEM_VOLATILE_P (op0)
           && GET_MODE_BITSIZE (GET_MODE (op0)) > 0
 	  && GET_MODE_BITSIZE (GET_MODE (op0)) <= maxbits
 	  && flag_strict_volatile_bitfields > 0)
-	mode = GET_MODE (op0);
+	{
+	  unsigned int modesize;
+
+	  mode = GET_MODE (op0);
+
+	  /* Check for oversized or unaligned field that we can't write
+	     with a single access of the required type, and fall back to
+	     the default behavior.  */
+	  modesize = GET_MODE_BITSIZE (mode);
+	  if (bitnum % BITS_PER_UNIT + bitsize > modesize
+	      || (STRICT_ALIGNMENT
+		  && bitnum % GET_MODE_ALIGNMENT (mode) + bitsize > modesize))
+	    mode = get_best_mode (bitsize, bitnum, 0, 0,
+				  MEM_ALIGN (op0), word_mode, true);
+	}
       else
-	mode = get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end,
-			      MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
+	{
+	  mode = GET_MODE (op0);
+	  if (GET_MODE_BITSIZE (mode) == 0
+	      || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode))
+	    mode = word_mode;
+
+	  mode = get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end,
+				MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
+	}
 
       if (mode == VOIDmode)
 	{
@@ -1128,7 +1144,7 @@  store_split_bit_field (rtx op0, unsigned
 		 endianness compensation) to fetch the piece we want.  */
 	      part = extract_fixed_bit_field (word_mode, value, thissize,
 					      total_bits - bitsize + bitsdone,
-					      NULL_RTX, 1, false);
+					      NULL_RTX, 1);
 	    }
 	}
       else
@@ -1140,7 +1156,7 @@  store_split_bit_field (rtx op0, unsigned
 			    & (((HOST_WIDE_INT) 1 << thissize) - 1));
 	  else
 	    part = extract_fixed_bit_field (word_mode, value, thissize,
-					    bitsdone, NULL_RTX, 1, false);
+					    bitsdone, NULL_RTX, 1);
 	}
 
       /* If OP0 is a register, then handle OFFSET here.
@@ -1301,8 +1317,7 @@  extract_bit_field_using_extv (const extr
 
 static rtx
 extract_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
-		     unsigned HOST_WIDE_INT bitnum,
-		     int unsignedp, bool packedp, rtx target,
+		     unsigned HOST_WIDE_INT bitnum, int unsignedp, rtx target,
 		     enum machine_mode mode, enum machine_mode tmode,
 		     bool fallback_p)
 {
@@ -1430,24 +1445,35 @@  extract_bit_field_1 (rtx str_rtx, unsign
      If that's wrong, the solution is to test for it and set TARGET to 0
      if needed.  */
 
-  /* If the bitfield is volatile, we need to make sure the access
-     remains on a type-aligned boundary.  */
+  /* Get the mode of the field to use for atomic access or subreg
+     conversion.  */
+  mode1 = mode;
+
+  /* For the -fstrict-volatile-bitfields case, we must use the mode of the
+     field instead.  */
   if (GET_CODE (op0) == MEM
       && MEM_VOLATILE_P (op0)
       && GET_MODE_BITSIZE (GET_MODE (op0)) > 0
       && flag_strict_volatile_bitfields > 0)
-    goto no_subreg_mode_swap;
+    {
+      if (GET_MODE_BITSIZE (GET_MODE (op0)) > 0)
+	mode1 = GET_MODE (op0);
+      else if (target && GET_MODE_BITSIZE (GET_MODE (target)) > 0)
+	mode1 = GET_MODE (target);
+      else
+	mode1 = tmode;
+    }
 
   /* Only scalar integer modes can be converted via subregs.  There is an
      additional problem for FP modes here in that they can have a precision
      which is different from the size.  mode_for_size uses precision, but
      we want a mode based on the size, so we must avoid calling it for FP
      modes.  */
-  mode1 = mode;
-  if (SCALAR_INT_MODE_P (tmode))
+  else if (SCALAR_INT_MODE_P (tmode))
     {
       enum machine_mode try_mode = mode_for_size (bitsize,
 						  GET_MODE_CLASS (tmode), 0);
+
       if (try_mode != BLKmode)
 	mode1 = try_mode;
     }
@@ -1475,8 +1501,6 @@  extract_bit_field_1 (rtx str_rtx, unsign
       return convert_extracted_bit_field (op0, mode, tmode, unsignedp);
     }
 
- no_subreg_mode_swap:
-
   /* Handle fields bigger than a word.  */
 
   if (bitsize > BITS_PER_WORD)
@@ -1517,7 +1541,7 @@  extract_bit_field_1 (rtx str_rtx, unsign
 	  rtx result_part
 	    = extract_bit_field_1 (op0, MIN (BITS_PER_WORD,
 					     bitsize - i * BITS_PER_WORD),
-				   bitnum + bit_offset, 1, false, target_part,
+				   bitnum + bit_offset, 1, target_part,
 				   mode, word_mode, fallback_p);
 
 	  gcc_assert (target_part);
@@ -1593,14 +1617,14 @@  extract_bit_field_1 (rtx str_rtx, unsign
     }
 
   /* If OP0 is a memory, try copying it to a register and seeing if a
-     cheap register alternative is available.  */
-  if (MEM_P (op0))
+     cheap register alternative is available.  Do not try these tricks if
+     -fstrict-volatile-bitfields is in effect, since they may not respect
+     the mode of the access.  */
+  if (MEM_P (op0)
+      && !(MEM_VOLATILE_P (op0) && flag_strict_volatile_bitfields > 0))
     {
-      /* Do not use extv/extzv for volatile bitfields when
-         -fstrict-volatile-bitfields is in effect.  */
-      if (!(MEM_VOLATILE_P (op0) && flag_strict_volatile_bitfields > 0)
-	  && get_best_mem_extraction_insn (&extv, pattern, bitsize, bitnum,
-					   tmode))
+      if (get_best_mem_extraction_insn (&extv, pattern, bitsize, bitnum,
+					tmode))
 	{
 	  rtx result = extract_bit_field_using_extv (&extv, op0, bitsize,
 						     bitnum, unsignedp,
@@ -1621,7 +1645,7 @@  extract_bit_field_1 (rtx str_rtx, unsign
 	{
 	  xop0 = copy_to_reg (xop0);
 	  rtx result = extract_bit_field_1 (xop0, bitsize, bitpos,
-					    unsignedp, packedp, target,
+					    unsignedp, target,
 					    mode, tmode, false);
 	  if (result)
 	    return result;
@@ -1641,7 +1665,7 @@  extract_bit_field_1 (rtx str_rtx, unsign
   gcc_assert (int_mode != BLKmode);
 
   target = extract_fixed_bit_field (int_mode, op0, bitsize, bitnum,
-				    target, unsignedp, packedp);
+				    target, unsignedp);
   return convert_extracted_bit_field (target, mode, tmode, unsignedp);
 }
 
@@ -1652,7 +1676,6 @@  extract_bit_field_1 (rtx str_rtx, unsign
 
    STR_RTX is the structure containing the byte (a REG or MEM).
    UNSIGNEDP is nonzero if this is an unsigned bit field.
-   PACKEDP is nonzero if the field has the packed attribute.
    MODE is the natural mode of the field value once extracted.
    TMODE is the mode the caller would like the value to have;
    but the value may be returned with type MODE instead.
@@ -1664,10 +1687,10 @@  extract_bit_field_1 (rtx str_rtx, unsign
 
 rtx
 extract_bit_field (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
-		   unsigned HOST_WIDE_INT bitnum, int unsignedp, bool packedp,
-		   rtx target, enum machine_mode mode, enum machine_mode tmode)
+		   unsigned HOST_WIDE_INT bitnum, int unsignedp, rtx target,
+		   enum machine_mode mode, enum machine_mode tmode)
 {
-  return extract_bit_field_1 (str_rtx, bitsize, bitnum, unsignedp, packedp,
+  return extract_bit_field_1 (str_rtx, bitsize, bitnum, unsignedp,
 			      target, mode, tmode, true);
 }
 
@@ -1675,8 +1698,6 @@  extract_bit_field (rtx str_rtx, unsigned
    from bit BITNUM of OP0.
 
    UNSIGNEDP is nonzero for an unsigned bit field (don't sign-extend value).
-   PACKEDP is true if the field has the packed attribute.
-
    If TARGET is nonzero, attempts to store the value there
    and return TARGET, but this is not guaranteed.
    If TARGET is not used, create a pseudo-reg of mode TMODE for the value.  */
@@ -1685,7 +1706,7 @@  static rtx
 extract_fixed_bit_field (enum machine_mode tmode, rtx op0,
 			 unsigned HOST_WIDE_INT bitsize,
 			 unsigned HOST_WIDE_INT bitnum, rtx target,
-			 int unsignedp, bool packedp)
+			 int unsignedp)
 {
   enum machine_mode mode;
 
@@ -1698,12 +1719,24 @@  extract_fixed_bit_field (enum machine_mo
       if (MEM_VOLATILE_P (op0)
 	  && flag_strict_volatile_bitfields > 0)
 	{
+	  unsigned int modesize;
+
 	  if (GET_MODE_BITSIZE (GET_MODE (op0)) > 0)
 	    mode = GET_MODE (op0);
 	  else if (target && GET_MODE_BITSIZE (GET_MODE (target)) > 0)
 	    mode = GET_MODE (target);
 	  else
 	    mode = tmode;
+
+	  /* Check for oversized or unaligned field that we can't read
+	     with a single access of the required type, and fall back to
+	     the default behavior.  */
+	  modesize = GET_MODE_BITSIZE (mode);
+	  if (bitnum % BITS_PER_UNIT + bitsize > modesize
+	      || (STRICT_ALIGNMENT
+		  && bitnum % GET_MODE_ALIGNMENT (mode) + bitsize > modesize))
+	    mode = get_best_mode (bitsize, bitnum, 0, 0,
+				  MEM_ALIGN (op0), word_mode, true);
 	}
       else
 	mode = get_best_mode (bitsize, bitnum, 0, 0,
@@ -1714,61 +1747,7 @@  extract_fixed_bit_field (enum machine_mo
 	   boundaries.  */
 	return extract_split_bit_field (op0, bitsize, bitnum, unsignedp);
 
-      unsigned int total_bits = GET_MODE_BITSIZE (mode);
-      HOST_WIDE_INT bit_offset = bitnum - bitnum % total_bits;
-
-      /* If we're accessing a volatile MEM, we can't apply BIT_OFFSET
-	 if it results in a multi-word access where we otherwise wouldn't
-	 have one.  So, check for that case here.  */
-      if (MEM_P (op0)
-	  && MEM_VOLATILE_P (op0)
-	  && flag_strict_volatile_bitfields > 0
-	  && bitnum % BITS_PER_UNIT + bitsize <= total_bits
-	  && bitnum % GET_MODE_BITSIZE (mode) + bitsize > total_bits)
-	{
-	  if (STRICT_ALIGNMENT)
-	    {
-	      static bool informed_about_misalignment = false;
-
-	      if (packedp)
-		{
-		  if (bitsize == total_bits)
-		    warning_at (input_location, OPT_fstrict_volatile_bitfields,
-				"multiple accesses to volatile structure"
-				" member because of packed attribute");
-		  else
-		    warning_at (input_location, OPT_fstrict_volatile_bitfields,
-				"multiple accesses to volatile structure"
-				" bitfield because of packed attribute");
-
-		  return extract_split_bit_field (op0, bitsize, bitnum,
-						  unsignedp);
-		}
-
-	      if (bitsize == total_bits)
-		warning_at (input_location, OPT_fstrict_volatile_bitfields,
-			    "mis-aligned access used for structure member");
-	      else
-		warning_at (input_location, OPT_fstrict_volatile_bitfields,
-			    "mis-aligned access used for structure bitfield");
-
-	      if (! informed_about_misalignment)
-		{
-		  informed_about_misalignment = true;
-		  inform (input_location,
-			  "when a volatile object spans multiple type-sized"
-			  " locations, the compiler must choose between using"
-			  " a single mis-aligned access to preserve the"
-			  " volatility, or using multiple aligned accesses"
-			  " to avoid runtime faults; this code may fail at"
-			  " runtime if the hardware does not allow this"
-			  " access");
-		}
-	    }
-	  bit_offset = bitnum - bitnum % BITS_PER_UNIT;
-	}
-      op0 = adjust_bitfield_address (op0, mode, bit_offset / BITS_PER_UNIT);
-      bitnum -= bit_offset;
+      op0 = narrow_bit_field_mem (op0, mode, bitsize, bitnum, &bitnum);
     }
 
   mode = GET_MODE (op0);
@@ -1939,7 +1918,7 @@  extract_split_bit_field (rtx op0, unsign
 	 whose meaning is determined by BYTES_PER_UNIT.
 	 OFFSET is in UNITs, and UNIT is in bits.  */
       part = extract_fixed_bit_field (word_mode, word, thissize,
-				      offset * unit + thispos, 0, 1, false);
+				      offset * unit + thispos, 0, 1);
       bitsdone += thissize;
 
       /* Shift this part into place for the result.  */
Index: gcc/expr.c
===================================================================
--- gcc/expr.c	(revision 199963)
+++ gcc/expr.c	(working copy)
@@ -1704,7 +1704,7 @@  emit_group_load_1 (rtx *tmps, rtx dst, r
 		  && (!REG_P (tmps[i]) || GET_MODE (tmps[i]) != mode))
 		tmps[i] = extract_bit_field (tmps[i], bytelen * BITS_PER_UNIT,
 					     (bytepos % slen0) * BITS_PER_UNIT,
-					     1, false, NULL_RTX, mode, mode);
+					     1, NULL_RTX, mode, mode);
 	    }
 	  else
 	    {
@@ -1714,7 +1714,7 @@  emit_group_load_1 (rtx *tmps, rtx dst, r
 	      mem = assign_stack_temp (GET_MODE (src), slen);
 	      emit_move_insn (mem, src);
 	      tmps[i] = extract_bit_field (mem, bytelen * BITS_PER_UNIT,
-					   0, 1, false, NULL_RTX, mode, mode);
+					   0, 1, NULL_RTX, mode, mode);
 	    }
 	}
       /* FIXME: A SIMD parallel will eventually lead to a subreg of a
@@ -1755,7 +1755,7 @@  emit_group_load_1 (rtx *tmps, rtx dst, r
 	tmps[i] = src;
       else
 	tmps[i] = extract_bit_field (src, bytelen * BITS_PER_UNIT,
-				     bytepos * BITS_PER_UNIT, 1, false, NULL_RTX,
+				     bytepos * BITS_PER_UNIT, 1, NULL_RTX,
 				     mode, mode);
 
       if (shift)
@@ -2198,7 +2198,7 @@  copy_blkmode_from_reg (rtx target, rtx s
 	 bitpos for the destination store (left justified).  */
       store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, 0, 0, copy_mode,
 		       extract_bit_field (src, bitsize,
-					  xbitpos % BITS_PER_WORD, 1, false,
+					  xbitpos % BITS_PER_WORD, 1,
 					  NULL_RTX, copy_mode, copy_mode));
     }
 }
@@ -2275,7 +2275,7 @@  copy_blkmode_to_reg (enum machine_mode m
       store_bit_field (dst_word, bitsize, xbitpos % BITS_PER_WORD,
 		       0, 0, word_mode,
 		       extract_bit_field (src_word, bitsize,
-					  bitpos % BITS_PER_WORD, 1, false,
+					  bitpos % BITS_PER_WORD, 1,
 					  NULL_RTX, word_mode, word_mode));
     }
 
@@ -3020,7 +3020,7 @@  read_complex_part (rtx cplx, bool imag_p
     }
 
   return extract_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0,
-			    true, false, NULL_RTX, imode, imode);
+			    true, NULL_RTX, imode, imode);
 }
 
 /* A subroutine of emit_move_insn_1.  Yet another lowpart generator.
@@ -4508,6 +4508,16 @@  get_bit_range (unsigned HOST_WIDE_INT *b
 
   gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
 
+  /* If -fstrict-volatile-bitfields was given and this is a volatile
+     access, then we must ignore any DECL_BIT_FIELD_REPRESENTATIVE and
+     do the access in the mode of the field.  */
+  if (TREE_THIS_VOLATILE (exp)
+      && flag_strict_volatile_bitfields > 0)
+    {
+      *bitstart = *bitend = 0;
+      return;
+    }
+
   field = TREE_OPERAND (exp, 1);
   repr = DECL_BIT_FIELD_REPRESENTATIVE (field);
   /* If we do not have a DECL_BIT_FIELD_REPRESENTATIVE there is no
@@ -6502,7 +6512,7 @@  store_field (rtx target, HOST_WIDE_INT b
 	      temp_target = gen_reg_rtx (mode);
 	      temp_target
 	        = extract_bit_field (temp, size * BITS_PER_UNIT, 0, 1,
-				     false, temp_target, mode, mode);
+				     temp_target, mode, mode);
 	      temp = temp_target;
 	    }
 	}
@@ -9695,8 +9705,8 @@  expand_expr_real_1 (tree exp, rtx target
 	    else if (SLOW_UNALIGNED_ACCESS (mode, align))
 	      temp = extract_bit_field (temp, GET_MODE_BITSIZE (mode),
 					0, TYPE_UNSIGNED (TREE_TYPE (exp)),
-					true, (modifier == EXPAND_STACK_PARM
-					       ? NULL_RTX : target),
+					(modifier == EXPAND_STACK_PARM
+					 ? NULL_RTX : target),
 					mode, mode);
 	  }
 	return temp;
@@ -9890,7 +9900,6 @@  expand_expr_real_1 (tree exp, rtx target
 	HOST_WIDE_INT bitsize, bitpos;
 	tree offset;
 	int volatilep = 0, must_force_mem;
-	bool packedp = false;
 	tree tem = get_inner_reference (exp, &bitsize, &bitpos, &offset,
 					&mode1, &unsignedp, &volatilep, true);
 	rtx orig_op0, memloc;
@@ -9901,11 +9910,6 @@  expand_expr_real_1 (tree exp, rtx target
 	   infinitely recurse.  */
 	gcc_assert (tem != exp);
 
-	if (TYPE_PACKED (TREE_TYPE (TREE_OPERAND (exp, 0)))
-	    || (TREE_CODE (TREE_OPERAND (exp, 1)) == FIELD_DECL
-		&& DECL_PACKED (TREE_OPERAND (exp, 1))))
-	  packedp = true;
-
 	/* If TEM's type is a union of variable size, pass TARGET to the inner
 	   computation, since it will need a temporary and TARGET is known
 	   to have to do.  This occurs in unchecked conversion in Ada.  */
@@ -10133,7 +10137,7 @@  expand_expr_real_1 (tree exp, rtx target
 	    if (MEM_P (op0) && REG_P (XEXP (op0, 0)))
 	      mark_reg_pointer (XEXP (op0, 0), MEM_ALIGN (op0));
 
-	    op0 = extract_bit_field (op0, bitsize, bitpos, unsignedp, packedp,
+	    op0 = extract_bit_field (op0, bitsize, bitpos, unsignedp,
 				     (modifier == EXPAND_STACK_PARM
 				      ? NULL_RTX : target),
 				     ext_mode, ext_mode);
Index: gcc/calls.c
===================================================================
--- gcc/calls.c	(revision 199963)
+++ gcc/calls.c	(working copy)
@@ -1026,7 +1026,7 @@  store_unaligned_arguments_into_pseudos (
 	    int bitsize = MIN (bytes * BITS_PER_UNIT, BITS_PER_WORD);
 
 	    args[i].aligned_regs[j] = reg;
-	    word = extract_bit_field (word, bitsize, 0, 1, false, NULL_RTX,
+	    word = extract_bit_field (word, bitsize, 0, 1, NULL_RTX,
 				      word_mode, word_mode);
 
 	    /* There is no need to restrict this code to loading items
Index: gcc/config/tilegx/tilegx.c
===================================================================
--- gcc/config/tilegx/tilegx.c	(revision 199963)
+++ gcc/config/tilegx/tilegx.c	(working copy)
@@ -1872,7 +1872,7 @@  tilegx_expand_unaligned_load (rtx dest_r
       rtx extracted =
 	extract_bit_field (gen_lowpart (DImode, wide_result),
 			   bitsize, bit_offset % BITS_PER_UNIT,
-			   !sign, false, gen_lowpart (DImode, dest_reg),
+			   !sign, gen_lowpart (DImode, dest_reg),
 			   DImode, DImode);
 
       if (extracted != dest_reg)
Index: gcc/config/tilepro/tilepro.c
===================================================================
--- gcc/config/tilepro/tilepro.c	(revision 199963)
+++ gcc/config/tilepro/tilepro.c	(working copy)
@@ -1676,7 +1676,7 @@  tilepro_expand_unaligned_load (rtx dest_
       rtx extracted =
 	extract_bit_field (gen_lowpart (SImode, wide_result),
 			   bitsize, bit_offset % BITS_PER_UNIT,
-			   !sign, false, gen_lowpart (SImode, dest_reg),
+			   !sign, gen_lowpart (SImode, dest_reg),
 			   SImode, SImode);
 
       if (extracted != dest_reg)
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 199963)
+++ gcc/doc/invoke.texi	(working copy)
@@ -20887,16 +20887,16 @@  instruction, even though that accesses b
 any portion of the bit-field, or memory-mapped registers unrelated to
 the one being updated.
 
-If the target requires strict alignment, and honoring the field
-type would require violating this alignment, a warning is issued.
-If the field has @code{packed} attribute, the access is done without
-honoring the field type.  If the field doesn't have @code{packed}
-attribute, the access is done honoring the field type.  In both cases,
-GCC assumes that the user knows something about the target hardware
-that it is unaware of.
+In some cases, such as when the @code{packed} attribute is applied to a 
+structure field, it may not be possible to access the field with a single
+read or write that is correctly aligned for the target machine.  In this
+case GCC falls back to generating multiple accesses rather than code that 
+will fault or truncate the result at run time, regardless of whether
+@option{-fstrict-volatile-bitfields} is in effect.
 
 The default value of this option is determined by the application binary
-interface for the target processor.
+interface for the target processor.  In particular, on ARM targets using
+AAPCS, @option{-fstrict-volatile-bitfields} is the default.
 
 @item -fsync-libcalls
 @opindex fsync-libcalls
Index: gcc/testsuite/gcc.dg/pr23623.c
===================================================================
--- gcc/testsuite/gcc.dg/pr23623.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr23623.c	(revision 0)
@@ -0,0 +1,45 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrict-volatile-bitfields -fdump-rtl-final" } */
+
+/* With -fstrict-volatile-bitfields, the volatile accesses to bf2.b
+   and bf3.b must do unsigned int reads/writes.  The non-volatile
+   accesses to bf1.b are not so constrained.  */
+
+extern struct
+{
+  unsigned int b : 1;
+} bf1;
+
+extern volatile struct
+{
+  unsigned int b : 1;
+} bf2;
+
+extern struct
+{
+  volatile unsigned int b : 1;
+} bf3;
+
+void writeb(void)
+{
+  bf1.b = 1;
+  bf2.b = 1;	/* volatile read + volatile write */
+  bf3.b = 1;	/* volatile read + volatile write */
+}
+
+extern unsigned int x1, x2, x3;
+
+void readb(void)
+{
+  x1 = bf1.b;
+  x2 = bf2.b;   /* volatile read */
+  x3 = bf3.b;   /* volatile read */
+}
+
+/* There should be 6 volatile MEMs total, but scan-rtl-dump-times counts
+   the number of match variables and not the number of matches.  Since
+   the parenthesized subexpression in the regexp introduces an extra match
+   variable, we need to give a count of 12 instead of 6 here.  */
+/* { dg-final { scan-rtl-dump-times "mem/v(/.)*:SI" 12 "final" } } */
+/* { dg-final { cleanup-rtl-dump "final" } } */
+
Index: gcc/testsuite/gcc.dg/pr48784-1.c
===================================================================
--- gcc/testsuite/gcc.dg/pr48784-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr48784-1.c	(revision 0)
@@ -0,0 +1,18 @@ 
+/* { dg-do run } */
+/* { dg-options "-fstrict-volatile-bitfields" } */
+
+#include <stdlib.h>
+
+#pragma pack(1)
+volatile struct S0 {
+   signed a : 7;
+   unsigned b : 28;  /* b can't be fetched with an aligned 32-bit access, */
+                     /* but it certainly can be fetched with an unaligned access */
+} g = {0,0xfffffff};
+
+int main() {
+  unsigned b = g.b;
+  if (b != 0xfffffff)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr48784-2.c
===================================================================
--- gcc/testsuite/gcc.dg/pr48784-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr48784-2.c	(revision 0)
@@ -0,0 +1,18 @@ 
+/* { dg-do run } */
+/* { dg-options "-fno-strict-volatile-bitfields" } */
+
+#include <stdlib.h>
+
+#pragma pack(1)
+volatile struct S0 {
+   signed a : 7;
+   unsigned b : 28;  /* b can't be fetched with an aligned 32-bit access, */
+                     /* but it certainly can be fetched with an unaligned access */
+} g = {0,0xfffffff};
+
+int main() {
+  unsigned b = g.b;
+  if (b != 0xfffffff)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr56341-1.c
===================================================================
--- gcc/testsuite/gcc.dg/pr56341-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr56341-1.c	(revision 0)
@@ -0,0 +1,41 @@ 
+/* { dg-do run } */
+/* { dg-options "-fstrict-volatile-bitfields" } */
+
+#include <stdlib.h>
+#include <stdio.h>
+
+struct test0
+{
+  unsigned char b1[2];
+} __attribute__((packed, aligned(2)));
+
+struct test1
+{
+  volatile unsigned long a1;
+  unsigned char b1[4];
+} __attribute__((packed, aligned(2)));
+
+struct test2
+{
+  struct test0 t0;
+  struct test1 t1;
+  struct test0 t2;
+} __attribute__((packed, aligned(2)));
+
+struct test2 xx;
+struct test2 *x1 = &xx;
+
+#define MAGIC 0x12345678
+
+void test0 (struct test2* x1)
+{
+  x1->t1.a1 = MAGIC;
+}
+
+int main()
+{
+  test0 (x1);
+  if (xx.t1.a1 != MAGIC)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr56341-2.c
===================================================================
--- gcc/testsuite/gcc.dg/pr56341-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr56341-2.c	(revision 0)
@@ -0,0 +1,41 @@ 
+/* { dg-do run } */
+/* { dg-options "-fno-strict-volatile-bitfields" } */
+
+#include <stdlib.h>
+#include <stdio.h>
+
+struct test0
+{
+  unsigned char b1[2];
+} __attribute__((packed, aligned(2)));
+
+struct test1
+{
+  volatile unsigned long a1;
+  unsigned char b1[4];
+} __attribute__((packed, aligned(2)));
+
+struct test2
+{
+  struct test0 t0;
+  struct test1 t1;
+  struct test0 t2;
+} __attribute__((packed, aligned(2)));
+
+struct test2 xx;
+struct test2 *x1 = &xx;
+
+#define MAGIC 0x12345678
+
+void test0 (struct test2* x1)
+{
+  x1->t1.a1 = MAGIC;
+}
+
+int main()
+{
+  test0 (x1);
+  if (xx.t1.a1 != MAGIC)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr56997-1.c
===================================================================
--- gcc/testsuite/gcc.dg/pr56997-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr56997-1.c	(revision 0)
@@ -0,0 +1,46 @@ 
+/* Test volatile access to unaligned field.  */
+/* { dg-do run } */
+/* { dg-options "-fstrict-volatile-bitfields" } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <stdint.h>
+
+#define test_type uint16_t
+#define MAGIC 0x102u
+
+typedef struct s{
+ unsigned char Prefix;
+ test_type Type;
+}__attribute((__packed__)) ss;
+
+volatile ss v;
+ss g;
+
+void __attribute__((noinline))
+foo (test_type u)
+{
+  v.Type = u;
+}
+
+test_type __attribute__((noinline))
+bar (void)
+{
+  return v.Type;
+}
+
+int main()
+{
+  test_type temp;
+  foo(MAGIC);
+  memcpy(&g, (void *)&v, sizeof(g));
+  if (g.Type != MAGIC)
+    abort ();
+
+  g.Type = MAGIC;
+  memcpy((void *)&v, &g, sizeof(v));
+  temp = bar();
+  if (temp != MAGIC)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr56997-2.c
===================================================================
--- gcc/testsuite/gcc.dg/pr56997-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr56997-2.c	(revision 0)
@@ -0,0 +1,46 @@ 
+/* Test volatile access to unaligned field.  */
+/* { dg-do run } */
+/* { dg-options "-fstrict-volatile-bitfields" } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <stdint.h>
+
+#define test_type uint32_t
+#define MAGIC 0x1020304u
+
+typedef struct s{
+ unsigned char Prefix;
+ test_type Type;
+}__attribute((__packed__)) ss;
+
+volatile ss v;
+ss g;
+
+void __attribute__((noinline))
+foo (test_type u)
+{
+  v.Type = u;
+}
+
+test_type __attribute__((noinline))
+bar (void)
+{
+  return v.Type;
+}
+
+int main()
+{
+  test_type temp;
+  foo(MAGIC);
+  memcpy(&g, (void *)&v, sizeof(g));
+  if (g.Type != MAGIC)
+    abort ();
+
+  g.Type = MAGIC;
+  memcpy((void *)&v, &g, sizeof(v));
+  temp = bar();
+  if (temp != MAGIC)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr56997-3.c
===================================================================
--- gcc/testsuite/gcc.dg/pr56997-3.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr56997-3.c	(revision 0)
@@ -0,0 +1,46 @@ 
+/* Test volatile access to unaligned field.  */
+/* { dg-do run } */
+/* { dg-options "-fstrict-volatile-bitfields" } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <stdint.h>
+
+#define test_type uint64_t
+#define MAGIC 0x102030405060708ul
+
+typedef struct s{
+ unsigned char Prefix;
+ test_type Type;
+}__attribute((__packed__)) ss;
+
+volatile ss v;
+ss g;
+
+void __attribute__((noinline))
+foo (test_type u)
+{
+  v.Type = u;
+}
+
+test_type __attribute__((noinline))
+bar (void)
+{
+  return v.Type;
+}
+
+int main()
+{
+  test_type temp;
+  foo(MAGIC);
+  memcpy(&g, (void *)&v, sizeof(g));
+  if (g.Type != MAGIC)
+    abort ();
+
+  g.Type = MAGIC;
+  memcpy((void *)&v, &g, sizeof(v));
+  temp = bar();
+  if (temp != MAGIC)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/volatile-bitfields-3.c
===================================================================
--- gcc/testsuite/gcc.dg/volatile-bitfields-3.c	(revision 0)
+++ gcc/testsuite/gcc.dg/volatile-bitfields-3.c	(revision 0)
@@ -0,0 +1,42 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrict-volatile-bitfields -fdump-rtl-final" } */
+
+/* On x86_64, this test case used to incorrectly generate SImode
+   accesses to the field y instead of HImode.  */
+
+#include <stdint.h>
+
+struct x
+{
+  int32_t x: 8;
+  char : 0;
+  int16_t y : 8;
+} z;
+
+int foo1 (volatile struct x *p)
+{
+  return p->x;  /* SImode read */
+}
+
+void foo2 (volatile struct x *p, int i)
+{
+  p->x = i;	/* SImode read + write */
+}
+
+int bar1 (volatile struct x *p)
+{
+  return p->y;  /* HImode read */
+}
+
+void bar2 (volatile struct x *p, int i)
+{
+  p->y = i;	/* HImode read + write */
+}
+
+/* There should be 3 SImode and 3 HImode volatile MEMs, but
+   scan-rtl-dump-times counts the number of match variables and not the
+   number of matches.  Since the parenthesized subexpression in the regexp
+   introduces an extra match variable, we need to double the count.  */
+/* { dg-final { scan-rtl-dump-times "mem/v(/.)*:SI" 6 "final" } } */
+/* { dg-final { scan-rtl-dump-times "mem/v(/.)*:HI" 6 "final" } } */
+/* { dg-final { cleanup-rtl-dump "final" } } */