Message ID: 52463CA4.7060303@codesourcery.com
On Sat, Sep 28, 2013 at 4:19 AM, Sandra Loosemore <sandra@codesourcery.com> wrote:
> This patch fixes various -fstrict-volatile-bitfields wrong-code bugs,
> including instances where bitfield values were being quietly truncated as
> well as bugs relating to using the wrong width.  The code changes are
> identical to those in the previous version of this patch series (originally
> posted in June and re-pinged many times since then), but for this version I
> have cleaned up the test cases to remove dependencies on header files, as
> Bernd requested.

Ok.

Thanks,
Richard.

> -Sandra
>
On 10/18/2013 04:50 AM, Richard Biener wrote: > On Sat, Sep 28, 2013 at 4:19 AM, Sandra Loosemore > <sandra@codesourcery.com> wrote: >> This patch fixes various -fstrict-volatile-bitfields wrong-code bugs, >> including instances where bitfield values were being quietly truncated as >> well as bugs relating to using the wrong width. The code changes are >> identical to those in the previous version of this patch series (originally >> posted in June and re-pinged many times since then), but for this version I >> have cleaned up the test cases to remove dependencies on header files, as >> Bernd requested. > > Ok. Just to clarify, is this approval unconditional, independent of part 2 and other patches or changes still under active discussion? -Sandra
Sandra Loosemore <sandra@codesourcery.com> wrote: >On 10/18/2013 04:50 AM, Richard Biener wrote: >> On Sat, Sep 28, 2013 at 4:19 AM, Sandra Loosemore >> <sandra@codesourcery.com> wrote: >>> This patch fixes various -fstrict-volatile-bitfields wrong-code >bugs, >>> including instances where bitfield values were being quietly >truncated as >>> well as bugs relating to using the wrong width. The code changes >are >>> identical to those in the previous version of this patch series >(originally >>> posted in June and re-pinged many times since then), but for this >version I >>> have cleaned up the test cases to remove dependencies on header >files, as >>> Bernd requested. >> >> Ok. > >Just to clarify, is this approval unconditional, independent of part 2 >and other patches or changes still under active discussion? Yes. Richard. >-Sandra
On 10/18/2013 10:38 AM, Richard Biener wrote: > Sandra Loosemore <sandra@codesourcery.com> wrote: >> On 10/18/2013 04:50 AM, Richard Biener wrote: >>> On Sat, Sep 28, 2013 at 4:19 AM, Sandra Loosemore >>> <sandra@codesourcery.com> wrote: >>>> This patch fixes various -fstrict-volatile-bitfields wrong-code >> bugs, >>>> including instances where bitfield values were being quietly >> truncated as >>>> well as bugs relating to using the wrong width. The code changes >> are >>>> identical to those in the previous version of this patch series >> (originally >>>> posted in June and re-pinged many times since then), but for this >> version I >>>> have cleaned up the test cases to remove dependencies on header >> files, as >>>> Bernd requested. >>> >>> Ok. >> >> Just to clarify, is this approval unconditional, independent of part 2 >> and other patches or changes still under active discussion? > > Yes. Hrmmmm. After some further testing, I'm afraid this patch is still buggy. :-( I tried a backport to GCC 4.8 and tested on arm-none-eabi. On the new pr56997-1.c testcase, it got stuck in infinite recursion between store_split_bit_field/store_fixed_bit_field and/or extract_split_bit_field/extract_fixed_bit_field. This didn't show up in my previous mainline testing. The difference between 4.8 and mainline head is the alignment of the incoming str_rtx passed to store_bit_field/extract_bit_field, due to the changes in r199898. The alignment is 8 bits on mainline, and 32 on 4.8. It seems to me that the bitfield code ought to handle store/extract from a more-aligned object and it's probably possible to construct an example that fails in this way on mainline as well. It looks like there are conflicting assumptions between get_best_mode, the places that call it in store_fixed_bit_field and extract_fixed_bit_field, and the code that actually does the splitting -- which uses a unit based on the MEM_ALIGN of the incoming rtx, rather than its mode. 
In the case where it's failing, get_best_mode is deciding it can't do a
HImode access without splitting, but then the split code is assuming
SImode units because of the 32-bit alignment, but not actually changing
the mode of the rtx to match that.

On top of that, this is one of the cases that strict_volatile_bitfield_p
checks for and returns false, but the callers of store_bit_field and
extract_bit_field in expr.c have already fiddled with the mode of the
incoming rtx assuming that -fstrict-volatile-bitfields does apply.  It
doesn't get into that infinite recursion if it's compiled with
-fno-strict-volatile-bitfields instead; in that case the incoming rtx
has BLKmode, get_best_mode chooses SImode, and it's able to do the
access without splitting at all.

Anyway....  I tried a couple different bandaids that solved the infinite
recursion problem but caused regressions elsewhere, and now I'm not sure
of the right place to fix this.  Given that there is also still ongoing
discussion about making this a 3-way switch (etc) I am going to hold off
on committing this patch for now.

-Sandra
Hi,

On Sun, 20 Oct 2013 20:23:49, Sandra Loosemore wrote:
> Hrmmmm.  After some further testing, I'm afraid this patch is still
> buggy. :-(
>
> I tried a backport to GCC 4.8 and tested on arm-none-eabi.  On the new
> pr56997-1.c testcase, it got stuck in infinite recursion between
> store_split_bit_field/store_fixed_bit_field and/or
> extract_split_bit_field/extract_fixed_bit_field.  This didn't show up in
> my previous mainline testing.
>
> The difference between 4.8 and mainline head is the alignment of the
> incoming str_rtx passed to store_bit_field/extract_bit_field, due to the
> changes in r199898.  The alignment is 8 bits on mainline, and 32 on 4.8.
> It seems to me that the bitfield code ought to handle store/extract
> from a more-aligned object and it's probably possible to construct an
> example that fails in this way on mainline as well.
>
> It looks like there are conflicting assumptions between get_best_mode,
> the places that call it in store_fixed_bit_field and
> extract_fixed_bit_field, and the code that actually does the splitting
> -- which uses a unit based on the MEM_ALIGN of the incoming rtx, rather
> than its mode.  In the case where it's failing, get_best_mode is
> deciding it can't do a HImode access without splitting, but then the
> split code is assuming SImode units because of the 32-bit alignment, but
> not actually changing the mode of the rtx to match that.
>
> On top of that, this is one of the cases that strict_volatile_bitfield_p
> checks for and returns false, but the callers of store_bit_field and
> extract_bit_field in expr.c have already fiddled with the mode of the
> incoming rtx assuming that -fstrict-volatile-bitfields does apply.  It
> doesn't get into that infinite recursion if it's compiled with
> -fno-strict-volatile-bitfields instead; in that case the incoming rtx
> has BLKmode, get_best_mode chooses SImode, and it's able to do the
> access without splitting at all.
>
> Anyway....
> I tried a couple different bandaids that solved the infinite
> recursion problem but caused regressions elsewhere, and now I'm not sure
> of the right place to fix this.  Given that there is also still ongoing
> discussion about making this a 3-way switch (etc) I am going to hold off
> on committing this patch for now.
>
> -Sandra
>

You are right; if I modify pr56997-1, the patch crashes on trunk:

typedef struct s{
  char Prefix;
  short Type;
}__attribute__((packed,aligned(4))) ss;

Please add a test case for this.

This way we have op0 aligned 32 and HImode selected, with bitoffset=8
and numbits=16.  It crashes only when reading this value, because the
access tries to use SImode.  For some reason the alignment seems to be
wrong in 4.8.

Thanks
Bernd.
Well, one more point where the current patch is probably wrong: the
AAPCS states that for volatile bit-field access:

"For a write operation the read must always occur even if the entire
contents of the container will be replaced"

That means:

  struct s { volatile int a:32; } ss;
  ss.a = 1;  /* needs to read the value exactly once
                and write the new value */

Currently we just store.

Bernd.
Hi Richard/Joseph,

I noticed this test case crashes on arm-eabi already without the patch:

extern void abort (void);

#define test_type unsigned short
#define MAGIC (unsigned short)0x102u

typedef struct s{
  unsigned char Prefix[1];
  test_type Type;
}__attribute((__packed__,__aligned__(4))) ss;

volatile ss v;
ss g;

void __attribute__((noinline))
foo (test_type u)
{
  v.Type = u;
}

test_type __attribute__((noinline))
bar (void)
{
  return v.Type;
}

However, when compiled with -fno-strict-volatile-bitfields it does not
crash, but AFAIK the generated code for foo() violates the C++ memory
model:

foo:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        ldr     r2, .L2
        ldr     r3, [r2]
        bic     r3, r3, #16711680
        bic     r3, r3, #65280
        orr     r3, r3, r0, asl #8
        str     r3, [r2]
        bx      lr

On Intel the generated code uses unaligned access, but is OK for the
memory model:

foo:
.LFB0:
        .cfi_startproc
        movw    %di, v+1(%rip)
        ret

Am I right, or is the code OK for the memory model?

Regards
Bernd.
On Wed, Oct 23, 2013 at 9:11 AM, Bernd Edlinger <bernd.edlinger@hotmail.de> wrote: > Hi Richard/Joseph, > > I noticed, this test case crashes on arm-eabi already witout the patch. > > extern void abort (void); > > #define test_type unsigned short > #define MAGIC (unsigned short)0x102u > > typedef struct s{ > unsigned char Prefix[1]; > test_type Type; > }__attribute((__packed__,__aligned__(4))) ss; > > volatile ss v; > ss g; > > void __attribute__((noinline)) > foo (test_type u) > { > v.Type = u; > } > > test_type __attribute__((noinline)) > bar (void) > { > return v.Type; > } > > > However when compiled with -fno-strict-volatile-bitfields it does not crash, > but AFAIK the generated code for foo() violates the C++ memory model: > > foo: > @ Function supports interworking. > @ args = 0, pretend = 0, frame = 0 > @ frame_needed = 0, uses_anonymous_args = 0 > @ link register save eliminated. > ldr r2, .L2 > ldr r3, [r2] > bic r3, r3, #16711680 > bic r3, r3, #65280 > orr r3, r3, r0, asl #8 > str r3, [r2] > bx lr > > > On Intel the generated code uses unaligned access, but is OK for the memory model: > > foo: > .LFB0: > .cfi_startproc > movw %di, v+1(%rip) > ret > > > Am I right, or is the code OK for the Memory model? The C++ memory model says that you may not introduce a data-race and thus you have to access Type without touching Prefix. Richard. > Regards > Bernd.
Hi,

On Wed, 23 Oct 2013 14:37:43, Richard Biener wrote:
> The C++ memory model says that you may not introduce a data-race
> and thus you have to access Type without touching Prefix.

Thanks.  Right now I see the following priorities:

1. Make -fno-strict-volatile-bitfields respect the C++ memory model.
   => I think I know how to do that and can prepare a patch for it.
   This patch should fix the recursion problem that Sandra pointed out.

2. Make Sandra's -fstrict-volatile-bitfields aware of the C++ memory model.
   => That means passing bitregion_start/end to strict_volatile_bitfield_p()
   and returning false if the access is outside the bit region.
   (This is -fstrict-volatile-bitfields=gnu.)

3. Another patch can add -fstrict-volatile-bitfields=aapcs later, but I
   don't think we have to hurry for that.

Bernd.
Hi Sandra, On Sun, 20 Oct 2013 20:23:49, Sandra Loosemore wrote: > > On 10/18/2013 10:38 AM, Richard Biener wrote: >> Sandra Loosemore <sandra@codesourcery.com> wrote: >>> On 10/18/2013 04:50 AM, Richard Biener wrote: >>>> On Sat, Sep 28, 2013 at 4:19 AM, Sandra Loosemore >>>> <sandra@codesourcery.com> wrote: >>>>> This patch fixes various -fstrict-volatile-bitfields wrong-code >>> bugs, >>>>> including instances where bitfield values were being quietly >>> truncated as >>>>> well as bugs relating to using the wrong width. The code changes >>> are >>>>> identical to those in the previous version of this patch series >>> (originally >>>>> posted in June and re-pinged many times since then), but for this >>> version I >>>>> have cleaned up the test cases to remove dependencies on header >>> files, as >>>>> Bernd requested. >>>> >>>> Ok. >>> >>> Just to clarify, is this approval unconditional, independent of part 2 >>> and other patches or changes still under active discussion? >> >> Yes. > > Hrmmmm. After some further testing, I'm afraid this patch is still > buggy. :-( > > I tried a backport to GCC 4.8 and tested on arm-none-eabi. On the new > pr56997-1.c testcase, it got stuck in infinite recursion between > store_split_bit_field/store_fixed_bit_field and/or > extract_split_bit_field/extract_fixed_bit_field. This didn't show up in > my previous mainline testing. > > The difference between 4.8 and mainline head is the alignment of the > incoming str_rtx passed to store_bit_field/extract_bit_field, due to the > changes in r199898. The alignment is 8 bits on mainline, and 32 on 4.8. > It seems to me that the bitfield code ought to handle store/extract > from a more-aligned object and it's probably possible to construct an > example that fails in this way on mainline as well. 
> > It looks like there are conflicting assumptions between get_best_mode, > the places that call it in store_fixed_bit_field and > extract_fixed_bit_field, and the code that actually does the splitting > -- which uses a unit based on the MEM_ALIGN of the incoming rtx, rather > than its mode. In the case where it's failing, get_best_mode is > deciding it can't do a HImode access without splitting, but then the > split code is assuming SImode units because of the 32-bit alignment, but > not actually changing the mode of the rtx to match that. > > On top of that, this is one of the cases that strict_volatile_bitfield_p > checks for and returns false, but the callers of store_bit_field and > extract_bit_field in expr.c have already fiddled with the mode of the > incoming rtx assuming that -fstrict-volatile-bitfields does apply. It > doesn't get into that infinite recursion if it's compiled with > -fno-strict-volatile-bitfields instead; in that case the incoming rtx > has BLKmode, get_best_mode chooses SImode, and it's able to do the > access without splitting at all. > > Anyway.... I tried a couple different bandaids that solved the infinite > recursion problem but caused regressions elsewhere, and now I'm not sure > of the right place to fix this. Given that there is also still ongoing > discussion about making this a 3-way switch (etc) I am going to hold off > on committing this patch for now. > > -Sandra > I have attached an update to your patch, that should a) fix the recursion problem. b) restrict the -fstrict-volatile-bitfields to not violate the C++ memory model. Bernd.
On 10/28/2013 03:20 AM, Bernd Edlinger wrote: > > On Sun, 20 Oct 2013 20:23:49, Sandra Loosemore wrote: >> >> I tried a backport to GCC 4.8 and tested on arm-none-eabi. On the new >> pr56997-1.c testcase, it got stuck in infinite recursion between >> store_split_bit_field/store_fixed_bit_field and/or >> extract_split_bit_field/extract_fixed_bit_field. This didn't show up in >> my previous mainline testing. >> >> [snip] > > I have attached an update to your patch, that should > a) fix the recursion problem. > b) restrict the -fstrict-volatile-bitfields to not violate the C++ memory model. > Thanks for picking up the ball on this -- I've been busy with other tasks for several days. I again tried backporting the patch series along with your fix to GCC 4.8 and tested on arm-none-eabi. I found that it was still getting stuck in infinite recursion unless the test from this patch hunk > @@ -1699,6 +1711,14 @@ extract_bit_field (rtx str_rtx, unsigned > { > enum machine_mode mode1; > > + /* Handle extraction of unaligned fields, > + this can happen in -fstrict-volatile-bitfields. */ > + if (GET_MODE_BITSIZE (GET_MODE (str_rtx)) != 0 > + && GET_MODE_BITSIZE (GET_MODE (str_rtx)) < GET_MODE_BITSIZE (word_mode) > + && bitnum % GET_MODE_BITSIZE (GET_MODE (str_rtx)) + bitsize > + > GET_MODE_BITSIZE (GET_MODE (str_rtx)) ) > + str_rtx = adjust_address (str_rtx, word_mode, 0); > + > /* Handle -fstrict-volatile-bitfields in the cases where it applies. */ > if (GET_MODE_BITSIZE (GET_MODE (str_rtx)) > 0) > mode1 = GET_MODE (str_rtx); was also inserted into store_bit_field. 
I also think that, minimally, this needs to be rewritten as something like

  if (MEM_P (str_rtx)
      && STRICT_ALIGNMENT
      && GET_MODE_BITSIZE (GET_MODE (str_rtx)) != 0
      && GET_MODE_BITSIZE (GET_MODE (str_rtx)) < GET_MODE_BITSIZE (word_mode)
      && (bitnum % MEM_ALIGN (str_rtx) + bitsize
          > GET_MODE_BITSIZE (GET_MODE (str_rtx))))
    str_rtx = adjust_address (str_rtx, word_mode, 0);

Otherwise, we'll end up using word_mode instead of the field mode on
targets that support unaligned accesses.  I tested that (again, 4.8 and
arm-none-eabi) and results looked good, but of course ARM isn't one of
the targets that supports unaligned accesses. :-S

I'm still not convinced this is the right fix, though.  It seems to me
that callers of store_bit_field and extract_bit_field in expr.c ought
not to be messing with the mode of the rtx when
flag_strict_volatile_bitfields > 0, because that is losing information
about the original mode that we are trying to patch up here by forcing
it to word_mode.  Instead, I think the callers ought to be passing the
declared field mode into store_bit_field and extract_bit_field, and
those functions ought to change the mode of the incoming rtx to match
the field mode only if strict_volatile_bitfield_p assures us that the
insertion/extraction can be done in that mode.

Alternatively, if strict_volatile_bitfield_p returns false but
flag_strict_volatile_bitfields > 0, then always force to word_mode and
change the -fstrict-volatile-bitfields documentation to indicate that's
the fallback if the insertion/extraction cannot be done in the declared
mode, rather than claiming that it tries to do the same thing as if
-fstrict-volatile-bitfields were not enabled at all.

Either way, still needs more work, and more thorough testing (more
targets, and obviously trunk as well as the 4.8 backport) before I'd
consider this ready to commit.  I might or might not be able to find
some more time to work on this in the next week....

-Sandra
Hi, On Mon, 28 Oct 2013 21:29:24, Sandra Loosemore wrote: > > On 10/28/2013 03:20 AM, Bernd Edlinger wrote: >> >> On Sun, 20 Oct 2013 20:23:49, Sandra Loosemore wrote: >>> >>> I tried a backport to GCC 4.8 and tested on arm-none-eabi. On the new >>> pr56997-1.c testcase, it got stuck in infinite recursion between >>> store_split_bit_field/store_fixed_bit_field and/or >>> extract_split_bit_field/extract_fixed_bit_field. This didn't show up in >>> my previous mainline testing. >>> >>> [snip] >> >> I have attached an update to your patch, that should >> a) fix the recursion problem. >> b) restrict the -fstrict-volatile-bitfields to not violate the C++ memory model. >> > > Thanks for picking up the ball on this -- I've been busy with other > tasks for several days. > > I again tried backporting the patch series along with your fix to GCC > 4.8 and tested on arm-none-eabi. I found that it was still getting > stuck in infinite recursion unless the test from this patch hunk > hmmm.... Actually I used your path on a clean 4.8.2 and built for --target=arm-eabi. I have seen the recursion on the extract_bit_field, but not on store_bit_field so far, maybe you could give me a hint which test case exposes the other flavour of this recursion problem. >> @@ -1699,6 +1711,14 @@ extract_bit_field (rtx str_rtx, unsigned >> { >> enum machine_mode mode1; >> >> + /* Handle extraction of unaligned fields, >> + this can happen in -fstrict-volatile-bitfields. */ >> + if (GET_MODE_BITSIZE (GET_MODE (str_rtx)) != 0 >> + && GET_MODE_BITSIZE (GET_MODE (str_rtx)) < GET_MODE_BITSIZE (word_mode) >> + && bitnum % GET_MODE_BITSIZE (GET_MODE (str_rtx)) + bitsize >> +> GET_MODE_BITSIZE (GET_MODE (str_rtx)) ) >> + str_rtx = adjust_address (str_rtx, word_mode, 0); >> + >> /* Handle -fstrict-volatile-bitfields in the cases where it applies. */ >> if (GET_MODE_BITSIZE (GET_MODE (str_rtx))> 0) >> mode1 = GET_MODE (str_rtx); > > was also inserted into store_bit_field. 
> I also think that, minimally, this needs to be rewritten as something like
>
>   if (MEM_P (str_rtx)
>       && STRICT_ALIGNMENT
>       && GET_MODE_BITSIZE (GET_MODE (str_rtx)) != 0
>       && GET_MODE_BITSIZE (GET_MODE (str_rtx)) < GET_MODE_BITSIZE (word_mode)
>       && (bitnum % MEM_ALIGN (str_rtx) + bitsize
>           > GET_MODE_BITSIZE (GET_MODE (str_rtx))))
>     str_rtx = adjust_address (str_rtx, word_mode, 0);

Yes, that looks fine.

> Otherwise, we'll end up using word_mode instead of the field mode on
> targets that support unaligned accesses.  I tested that (again, 4.8 and
> arm-none-eabi) and results looked good, but of course ARM isn't one of
> the targets that supports unaligned accesses. :-S
>
> I'm still not convinced this is the right fix, though.  It seems to me
> that callers of store_bit_field and extract_bit_field in expr.c ought
> not to be messing with the mode of the rtx when
> flag_strict_volatile_bitfields > 0, because that is losing information
> about the original mode that we are trying to patch up here by forcing
> it to word_mode.  Instead, I think the callers ought to be passing the
> declared field mode into store_bit_field and extract_bit_field, and
> those functions ought to change the mode of the incoming rtx to match
> the field mode only if strict_volatile_bitfield_p assures us that the
> insertion/extraction can be done in that mode.

The problem starts likely at expr.c when this is done:

  if (volatilep && flag_strict_volatile_bitfields > 0)
    to_rtx = adjust_address (to_rtx, mode1, 0);

This restricts the possible access mode not only for bit-fields but for
all possible volatile members.  But -fstrict-volatile-bitfields is
supposed to affect bit-fields only.
> Alternatively, if strict_volatile_bitfield_p returns false but
> flag_strict_volatile_bitfields > 0, then always force to word_mode and
> change the -fstrict-volatile-bitfields documentation to indicate that's
> the fallback if the insertion/extraction cannot be done in the declared
> mode, rather than claiming that it tries to do the same thing as if
> -fstrict-volatile-bitfields were not enabled at all.
>
> Either way, still needs more work, and more thorough testing (more
> targets, and obviously trunk as well as the 4.8 backport) before I'd
> consider this ready to commit.  I might or might not be able to find
> some more time to work on this in the next week....

Yes.

And it would be good to reach Richard's Nov-21 deadline for GCC-4.9.

Bernd.

> -Sandra
>
On 10/29/2013 02:51 AM, Bernd Edlinger wrote:
> On Mon, 28 Oct 2013 21:29:24, Sandra Loosemore wrote:
>>
>> I again tried backporting the patch series along with your fix to GCC
>> 4.8 and tested on arm-none-eabi.  I found that it was still getting
>> stuck in infinite recursion unless the test from this patch hunk
>>
>
> hmmm....
> Actually I used your path on a clean 4.8.2 and built for --target=arm-eabi.
> I have seen the recursion on the extract_bit_field, but not on store_bit_field
> so far, maybe you could give me a hint which test case exposes the other
> flavour of this recursion problem.

Sorry, I was going to describe that in more detail but I forgot.  It was
compiling pr56997-1.c with -mthumb, like:

  arm-none-eabi-gcc /path/to/gcc/testsuite/gcc.dg/pr56997-1.c -c \
    -fno-diagnostics-show-caret -fstrict-volatile-bitfields \
    -DSTACK_SIZE=16384 -mthumb

Other testing with -mthumb looked fine.

>> Either way, still needs more work, and more thorough testing (more
>> targets, and obviously trunk as well as the 4.8 backport) before I'd
>> consider this ready to commit.  I might or might not be able to find
>> some more time to work on this in the next week....
>
> Yes.
>
> And it would be good to reach Richard's Nov-21 deadline for GCC-4.9

Yeah, I know the clock is ticking. :-(

-Sandra
Index: gcc/expmed.c =================================================================== --- gcc/expmed.c (revision 203003) +++ gcc/expmed.c (working copy) @@ -415,6 +415,42 @@ lowpart_bit_field_p (unsigned HOST_WIDE_ return bitnum % BITS_PER_WORD == 0; } +/* Return true if -fstrict-volatile-bitfields applies an access of OP0 + containing BITSIZE bits starting at BITNUM, with field mode FIELDMODE. */ + +static bool +strict_volatile_bitfield_p (rtx op0, unsigned HOST_WIDE_INT bitsize, + unsigned HOST_WIDE_INT bitnum, + enum machine_mode fieldmode) +{ + unsigned HOST_WIDE_INT modesize = GET_MODE_BITSIZE (fieldmode); + + /* -fstrict-volatile-bitfields must be enabled and we must have a + volatile MEM. */ + if (!MEM_P (op0) + || !MEM_VOLATILE_P (op0) + || flag_strict_volatile_bitfields <= 0) + return false; + + /* Non-integral modes likely only happen with packed structures. + Punt. */ + if (!SCALAR_INT_MODE_P (fieldmode)) + return false; + + /* The bit size must not be larger than the field mode, and + the field mode must not be larger than a word. */ + if (bitsize > modesize || modesize > BITS_PER_WORD) + return false; + + /* Check for cases of unaligned fields that must be split. */ + if (bitnum % BITS_PER_UNIT + bitsize > modesize + || (STRICT_ALIGNMENT + && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize)) + return false; + + return true; +} + /* Return true if OP is a memory and if a bitfield of size BITSIZE at bit number BITNUM can be treated as a simple value of mode MODE. */ @@ -813,12 +849,8 @@ store_bit_field_1 (rtx str_rtx, unsigned cheap register alternative is available. */ if (MEM_P (op0)) { - /* Do not use unaligned memory insvs for volatile bitfields when - -fstrict-volatile-bitfields is in effect. 
*/ - if (!(MEM_VOLATILE_P (op0) - && flag_strict_volatile_bitfields > 0) - && get_best_mem_extraction_insn (&insv, EP_insv, bitsize, bitnum, - fieldmode) + if (get_best_mem_extraction_insn (&insv, EP_insv, bitsize, bitnum, + fieldmode) && store_bit_field_using_insv (&insv, op0, bitsize, bitnum, value)) return true; @@ -871,6 +903,27 @@ store_bit_field (rtx str_rtx, unsigned H enum machine_mode fieldmode, rtx value) { + /* Handle -fstrict-volatile-bitfields in the cases where it applies. */ + if (strict_volatile_bitfield_p (str_rtx, bitsize, bitnum, fieldmode)) + { + + /* Storing any naturally aligned field can be done with a simple + store. For targets that support fast unaligned memory, any + naturally sized, unit aligned field can be done directly. */ + if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, fieldmode)) + { + str_rtx = adjust_bitfield_address (str_rtx, fieldmode, + bitnum / BITS_PER_UNIT); + emit_move_insn (str_rtx, value); + } + else + /* Explicitly override the C/C++ memory model; ignore the + bit range so that we can do the access in the mode mandated + by -fstrict-volatile-bitfields instead. */ + store_fixed_bit_field (str_rtx, bitsize, bitnum, 0, 0, value); + return; + } + /* Under the C++0x memory model, we must not touch bits outside the bit region. Adjust the address to start at the beginning of the bit region. */ @@ -923,29 +976,12 @@ store_fixed_bit_field (rtx op0, unsigned if (MEM_P (op0)) { - unsigned HOST_WIDE_INT maxbits = MAX_FIXED_MODE_SIZE; - - if (bitregion_end) - maxbits = bitregion_end - bitregion_start + 1; - - /* Get the proper mode to use for this field. We want a mode that - includes the entire field. If such a mode would be larger than - a word, we won't be doing the extraction the normal way. - We don't want a mode bigger than the destination. 
*/ - mode = GET_MODE (op0); if (GET_MODE_BITSIZE (mode) == 0 || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode)) mode = word_mode; - - if (MEM_VOLATILE_P (op0) - && GET_MODE_BITSIZE (GET_MODE (op0)) > 0 - && GET_MODE_BITSIZE (GET_MODE (op0)) <= maxbits - && flag_strict_volatile_bitfields > 0) - mode = GET_MODE (op0); - else - mode = get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end, - MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0)); + mode = get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end, + MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0)); if (mode == VOIDmode) { @@ -1429,19 +1465,8 @@ extract_bit_field_1 (rtx str_rtx, unsign If that's wrong, the solution is to test for it and set TARGET to 0 if needed. */ - /* If the bitfield is volatile, we need to make sure the access - remains on a type-aligned boundary. */ - if (GET_CODE (op0) == MEM - && MEM_VOLATILE_P (op0) - && GET_MODE_BITSIZE (GET_MODE (op0)) > 0 - && flag_strict_volatile_bitfields > 0) - goto no_subreg_mode_swap; - - /* Only scalar integer modes can be converted via subregs. There is an - additional problem for FP modes here in that they can have a precision - which is different from the size. mode_for_size uses precision, but - we want a mode based on the size, so we must avoid calling it for FP - modes. */ + /* Get the mode of the field to use for atomic access or subreg + conversion. */ mode1 = mode; if (SCALAR_INT_MODE_P (tmode)) { @@ -1474,8 +1499,6 @@ extract_bit_field_1 (rtx str_rtx, unsign return convert_extracted_bit_field (op0, mode, tmode, unsignedp); } - no_subreg_mode_swap: - /* Handle fields bigger than a word. */ if (bitsize > BITS_PER_WORD) @@ -1595,11 +1618,8 @@ extract_bit_field_1 (rtx str_rtx, unsign cheap register alternative is available. */ if (MEM_P (op0)) { - /* Do not use extv/extzv for volatile bitfields when - -fstrict-volatile-bitfields is in effect. 
 	 */
-      if (!(MEM_VOLATILE_P (op0) && flag_strict_volatile_bitfields > 0)
-	  && get_best_mem_extraction_insn (&extv, pattern, bitsize, bitnum,
-					   tmode))
+      if (get_best_mem_extraction_insn (&extv, pattern, bitsize, bitnum,
+					tmode))
 	{
 	  rtx result = extract_bit_field_using_extv (&extv, op0, bitsize,
 						     bitnum, unsignedp,
@@ -1665,6 +1685,31 @@ extract_bit_field (rtx str_rtx, unsigned
 		   unsigned HOST_WIDE_INT bitnum, int unsignedp, rtx target,
 		   enum machine_mode mode, enum machine_mode tmode)
 {
+  enum machine_mode mode1;
+
+  /* Handle -fstrict-volatile-bitfields in the cases where it applies.  */
+  if (GET_MODE_BITSIZE (GET_MODE (str_rtx)) > 0)
+    mode1 = GET_MODE (str_rtx);
+  else if (target && GET_MODE_BITSIZE (GET_MODE (target)) > 0)
+    mode1 = GET_MODE (target);
+  else
+    mode1 = tmode;
+
+  if (strict_volatile_bitfield_p (str_rtx, bitsize, bitnum, mode1))
+    {
+      rtx result;
+
+      /* Extraction of a full MODE1 value can be done with a load as long as
+	 the field is on a byte boundary and is sufficiently aligned.  */
+      if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, mode1))
+	result = adjust_bitfield_address (str_rtx, mode1,
+					  bitnum / BITS_PER_UNIT);
+      else
+	result = extract_fixed_bit_field (mode, str_rtx, bitsize, bitnum,
+					  target, unsignedp);
+
+      return convert_extracted_bit_field (result, mode, tmode, unsignedp);
+    }
+
   return extract_bit_field_1 (str_rtx, bitsize, bitnum, unsignedp,
 			      target, mode, tmode, true);
 }
@@ -1691,45 +1736,19 @@ extract_fixed_bit_field (enum machine_mo
 	 includes the entire field.  If such a mode would be larger than
 	 a word, we won't be doing the extraction the normal way.  */
 
-  if (MEM_VOLATILE_P (op0)
-      && flag_strict_volatile_bitfields > 0)
-    {
-      if (GET_MODE_BITSIZE (GET_MODE (op0)) > 0)
-	mode = GET_MODE (op0);
-      else if (target && GET_MODE_BITSIZE (GET_MODE (target)) > 0)
-	mode = GET_MODE (target);
-      else
-	mode = tmode;
-    }
-  else
-    mode = get_best_mode (bitsize, bitnum, 0, 0,
-			  MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));
+      mode = GET_MODE (op0);
+      if (GET_MODE_BITSIZE (mode) == 0
+	  || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode))
+	mode = word_mode;
+      mode = get_best_mode (bitsize, bitnum, 0, 0,
+			    MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
 
       if (mode == VOIDmode)
 	/* The only way this should occur is if the field spans word
 	   boundaries.  */
 	return extract_split_bit_field (op0, bitsize, bitnum, unsignedp);
 
-      unsigned int total_bits = GET_MODE_BITSIZE (mode);
-      HOST_WIDE_INT bit_offset = bitnum - bitnum % total_bits;
-
-      /* If we're accessing a volatile MEM, we can't apply BIT_OFFSET
-	 if it results in a multi-word access where we otherwise wouldn't
-	 have one.  So, check for that case here.  */
-      if (MEM_P (op0)
-	  && MEM_VOLATILE_P (op0)
-	  && flag_strict_volatile_bitfields > 0
-	  && bitnum % BITS_PER_UNIT + bitsize <= total_bits
-	  && bitnum % GET_MODE_BITSIZE (mode) + bitsize > total_bits)
-	{
-	  /* If the target doesn't support unaligned access, give up and
-	     split the access into two.  */
-	  if (STRICT_ALIGNMENT)
-	    return extract_split_bit_field (op0, bitsize, bitnum, unsignedp);
-	  bit_offset = bitnum - bitnum % BITS_PER_UNIT;
-	}
-      op0 = adjust_bitfield_address (op0, mode, bit_offset / BITS_PER_UNIT);
-      bitnum -= bit_offset;
+      op0 = narrow_bit_field_mem (op0, mode, bitsize, bitnum, &bitnum);
     }
 
   mode = GET_MODE (op0);
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 203003)
+++ gcc/doc/invoke.texi	(working copy)
@@ -21169,6 +21169,12 @@ instruction, even though that accesses b
 any portion of the bit-field, or memory-mapped registers unrelated to the
 one being updated.
 
+In some cases, such as when the @code{packed} attribute is applied to a
+structure field, it may not be possible to access the field with a single
+read or write that is correctly aligned for the target machine.  In this
+case GCC falls back to generating multiple accesses rather than code that
+will fault or truncate the result at run time.
+
 The default value of this option is determined by the application binary
 interface for the target processor.
Index: gcc/testsuite/gcc.dg/pr23623.c
===================================================================
--- gcc/testsuite/gcc.dg/pr23623.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr23623.c	(revision 0)
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-options "-fstrict-volatile-bitfields -fdump-rtl-final" } */
+
+/* With -fstrict-volatile-bitfields, the volatile accesses to bf2.b
+   and bf3.b must do unsigned int reads/writes.  The non-volatile
+   accesses to bf1.b are not so constrained.  */
+
+extern struct
+{
+  unsigned int b : 1;
+} bf1;
+
+extern volatile struct
+{
+  unsigned int b : 1;
+} bf2;
+
+extern struct
+{
+  volatile unsigned int b : 1;
+} bf3;
+
+void writeb(void)
+{
+  bf1.b = 1;
+  bf2.b = 1;	/* volatile read + volatile write */
+  bf3.b = 1;	/* volatile read + volatile write */
+}
+
+extern unsigned int x1, x2, x3;
+
+void readb(void)
+{
+  x1 = bf1.b;
+  x2 = bf2.b;	/* volatile write */
+  x3 = bf3.b;	/* volatile write */
+}
+
+/* There should be 6 volatile MEMs total, but scan-rtl-dump-times counts
+   the number of match variables and not the number of matches.  Since
+   the parenthesized subexpression in the regexp introduces an extra match
+   variable, we need to give a count of 12 instead of 6 here.  */
+/* { dg-final { scan-rtl-dump-times "mem/v(/.)*:SI" 12 "final" } } */
+/* { dg-final { cleanup-rtl-dump "final" } } */
+
Index: gcc/testsuite/gcc.dg/pr48784-1.c
===================================================================
--- gcc/testsuite/gcc.dg/pr48784-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr48784-1.c	(revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do run } */
+/* { dg-options "-fstrict-volatile-bitfields" } */
+
+extern void abort (void);
+
+#pragma pack(1)
+volatile struct S0 {
+  signed a : 7;
+  unsigned b : 28;  /* b can't be fetched with an aligned 32-bit access, */
+                    /* but it certainly can be fetched with an unaligned
+                       access */
+} g = {0,0xfffffff};
+
+int main() {
+  unsigned b = g.b;
+  if (b != 0xfffffff)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr48784-2.c
===================================================================
--- gcc/testsuite/gcc.dg/pr48784-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr48784-2.c	(revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do run } */
+/* { dg-options "-fno-strict-volatile-bitfields" } */
+
+extern void abort (void);
+
+#pragma pack(1)
+volatile struct S0 {
+  signed a : 7;
+  unsigned b : 28;  /* b can't be fetched with an aligned 32-bit access, */
+                    /* but it certainly can be fetched with an unaligned
+                       access */
+} g = {0,0xfffffff};
+
+int main() {
+  unsigned b = g.b;
+  if (b != 0xfffffff)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr56341-1.c
===================================================================
--- gcc/testsuite/gcc.dg/pr56341-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr56341-1.c	(revision 0)
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+/* { dg-options "-fstrict-volatile-bitfields" } */
+
+extern void abort (void);
+
+struct test0
+{
+  unsigned char b1[2];
+} __attribute__((packed, aligned(2)));
+
+struct test1
+{
+  volatile unsigned long a1;
+  unsigned char b1[4];
+} __attribute__((packed, aligned(2)));
+
+struct test2
+{
+  struct test0 t0;
+  struct test1 t1;
+  struct test0 t2;
+} __attribute__((packed, aligned(2)));
+
+struct test2 xx;
+struct test2 *x1 = &xx;
+
+#define MAGIC 0x12345678
+
+void test0 (struct test2* x1)
+{
+  x1->t1.a1 = MAGIC;
+}
+
+int main()
+{
+  test0 (x1);
+  if (xx.t1.a1 != MAGIC)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr56341-2.c
===================================================================
--- gcc/testsuite/gcc.dg/pr56341-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr56341-2.c	(revision 0)
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+/* { dg-options "-fno-strict-volatile-bitfields" } */
+
+extern void abort (void);
+
+struct test0
+{
+  unsigned char b1[2];
+} __attribute__((packed, aligned(2)));
+
+struct test1
+{
+  volatile unsigned long a1;
+  unsigned char b1[4];
+} __attribute__((packed, aligned(2)));
+
+struct test2
+{
+  struct test0 t0;
+  struct test1 t1;
+  struct test0 t2;
+} __attribute__((packed, aligned(2)));
+
+struct test2 xx;
+struct test2 *x1 = &xx;
+
+#define MAGIC 0x12345678
+
+void test0 (struct test2* x1)
+{
+  x1->t1.a1 = MAGIC;
+}
+
+int main()
+{
+  test0 (x1);
+  if (xx.t1.a1 != MAGIC)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr56997-1.c
===================================================================
--- gcc/testsuite/gcc.dg/pr56997-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr56997-1.c	(revision 0)
@@ -0,0 +1,44 @@
+/* Test volatile access to unaligned field.  */
+/* { dg-do run } */
+/* { dg-options "-fstrict-volatile-bitfields" } */
+
+extern void abort (void);
+
+#define test_type unsigned short
+#define MAGIC (unsigned short)0x102u
+
+typedef struct s{
+ unsigned char Prefix;
+ test_type Type;
+}__attribute((__packed__)) ss;
+
+volatile ss v;
+ss g;
+
+void __attribute__((noinline))
+foo (test_type u)
+{
+  v.Type = u;
+}
+
+test_type __attribute__((noinline))
+bar (void)
+{
+  return v.Type;
+}
+
+int main()
+{
+  test_type temp;
+  foo(MAGIC);
+  __builtin_memcpy(&g, (void *)&v, sizeof(g));
+  if (g.Type != MAGIC)
+    abort ();
+
+  g.Type = MAGIC;
+  __builtin_memcpy((void *)&v, &g, sizeof(v));
+  temp = bar();
+  if (temp != MAGIC)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr56997-2.c
===================================================================
--- gcc/testsuite/gcc.dg/pr56997-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr56997-2.c	(revision 0)
@@ -0,0 +1,44 @@
+/* Test volatile access to unaligned field.  */
+/* { dg-do run } */
+/* { dg-options "-fstrict-volatile-bitfields" } */
+
+extern void abort (void);
+
+#define test_type unsigned int
+#define MAGIC 0x1020304u
+
+typedef struct s{
+ unsigned char Prefix;
+ test_type Type;
+}__attribute((__packed__)) ss;
+
+volatile ss v;
+ss g;
+
+void __attribute__((noinline))
+foo (test_type u)
+{
+  v.Type = u;
+}
+
+test_type __attribute__((noinline))
+bar (void)
+{
+  return v.Type;
+}
+
+int main()
+{
+  test_type temp;
+  foo(MAGIC);
+  __builtin_memcpy(&g, (void *)&v, sizeof(g));
+  if (g.Type != MAGIC)
+    abort ();
+
+  g.Type = MAGIC;
+  __builtin_memcpy((void *)&v, &g, sizeof(v));
+  temp = bar();
+  if (temp != MAGIC)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr56997-3.c
===================================================================
--- gcc/testsuite/gcc.dg/pr56997-3.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr56997-3.c	(revision 0)
@@ -0,0 +1,44 @@
+/* Test volatile access to unaligned field.  */
+/* { dg-do run } */
+/* { dg-options "-fstrict-volatile-bitfields" } */
+
+extern void abort (void);
+
+#define test_type unsigned long long
+#define MAGIC 0x102030405060708ull
+
+typedef struct s{
+ unsigned char Prefix;
+ test_type Type;
+}__attribute((__packed__)) ss;
+
+volatile ss v;
+ss g;
+
+void __attribute__((noinline))
+foo (test_type u)
+{
+  v.Type = u;
+}
+
+test_type __attribute__((noinline))
+bar (void)
+{
+  return v.Type;
+}
+
+int main()
+{
+  test_type temp;
+  foo(MAGIC);
+  __builtin_memcpy(&g, (void *)&v, sizeof(g));
+  if (g.Type != MAGIC)
+    abort ();
+
+  g.Type = MAGIC;
+  __builtin_memcpy((void *)&v, &g, sizeof(v));
+  temp = bar();
+  if (temp != MAGIC)
+    abort ();
+  return 0;
+}