
Fix another wrong-code bug with -fstrict-volatile-bitfields

Message ID DUB118-W37417CC2C8EA59261DEA78E41C0@phx.gbl
State New

Commit Message

Bernd Edlinger March 6, 2015, 9:29 a.m. UTC
Hi,

On Thu, 5 Mar 2015 16:36:48, Richard Biener wrote:
>
> On Thu, Mar 5, 2015 at 4:05 PM, Bernd Edlinger
> <bernd.edlinger@hotmail.de> wrote:
>>
>> every access is split into 4 QImode accesses.  But that is as
>> expected, because the structure is byte aligned.
>
> No, it is not expected because the CPU can handle unaligned SImode
> reads/writes just fine (even if not as an atomic operation).
> The C++ memory model allows an SImode read to s.y (-fstrict-volatile-bitfields
> would, as well, but the CPU doesn't guarantee atomic operation here)
>

Hmm, well.  I understand.

However, this is the normal way the non-strict-volatile-bitfields
code works.

But I can probably work out a patch that enables strict-volatile-bitfields
to generate unaligned SImode accesses when necessary on !STRICT_ALIGNMENT
targets.

Now I checked the ARM port again, and I am a bit surprised: with gcc 4.9.0
this structure gets 4 QImode accesses, which is correct, but with recent
trunk (gcc version 5.0.0 20150301) I get SImode accesses, even though the
structure is not aligned and the compiler can't possibly know how the memory
will be aligned.  Something must have changed in the meantime, but it wasn't
by me.  IIRC the field mode in this example was QImode, but now it seems to
be SImode.


struct s
{
  unsigned int y:31;
} __attribute__((packed));

int
test (volatile struct s* x)
{
  x->y=0x7FFFFFFF;
  return x->y;
}

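For illustration, the two access strategies discussed here could be
written out in plain C roughly as follows (an illustrative sketch only,
assuming a little-endian target; these helpers are not part of the
patch and this is not compiler output):

/* gcc 4.9.0 behavior: four separate QImode (byte) reads,
   safe for a byte-aligned structure.  */
int
test_bytewise (volatile struct s* x)
{
  volatile unsigned char *p = (volatile unsigned char *) x;
  unsigned int v = p[0] | (p[1] << 8) | (p[2] << 16)
                   | ((unsigned int) p[3] << 24);
  return v & 0x7FFFFFFF;  /* mask down to the 31-bit field */
}

/* Recent trunk behavior: one (possibly unaligned) SImode read.  */
int
test_simode (volatile struct s* x)
{
  return *(volatile unsigned int *) x & 0x7FFFFFFF;
}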

So what would you think of this change to strict_volatile_bitfield_p?



Of course this is incomplete and needs special handling for !STRICT_ALIGNMENT
in the strict-volatile-bitfields code path later.



Thanks
Bernd.

Comments

Bernd Edlinger March 6, 2015, 11:48 a.m. UTC | #1
Hi Richard,

here is my new proposal; it addresses your objections and generates
"better" code for this test case:

main:
.LFB0:
    .cfi_startproc
    pushq    %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movl    global+1(%rip), %eax
    orl    $2147483647, %eax
    movl    %eax, global+1(%rip)
    movl    global+1(%rip), %eax
    andl    $2147483647, %eax
    cmpl    $2147483647, %eax
    je    .L2
    call    abort
.L2:
    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc

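The test case itself is not quoted in this message; judging from the
assembly above it is presumably along these lines (a hypothetical
reconstruction, not necessarily the actual gcc.dg/20150306-1.c):

extern void abort (void);

struct s
{
  char x;
  volatile unsigned int y : 31;
} __attribute__((packed));

struct s global;

int
main (void)
{
  global.y = 0x7FFFFFFF;        /* the read-modify-write at global+1 (orl) */
  if (global.y != 0x7FFFFFFF)   /* the read and mask at global+1 (andl)    */
    abort ();
  return 0;
}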

I also tried to fix the comments.

Reg-tested on x86_64 successfully and ARM is still running.

Is it OK for trunk?



Thanks
Bernd.
gcc:
2015-03-06  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	* expmed.c (strict_volatile_bitfield_p): For STRICT_ALIGNMENT
	check that MEM_ALIGN (op0) allows a MODESIZE access.
	(store_bit_field, extract_bit_field): For !STRICT_ALIGNMENT explicitly
	generate an unaligned access if the field crosses a word boundary.

testsuite:
2015-03-06  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	* gcc.dg/20150306-1.c: New test.
Richard Biener March 9, 2015, 2:30 p.m. UTC | #2
On Fri, Mar 6, 2015 at 12:48 PM, Bernd Edlinger
<bernd.edlinger@hotmail.de> wrote:
> Hi Richard,
>
> here is my new proposal; it addresses your objections and generates
> "better" code for this test case:
>
> main:
> .LFB0:
>     .cfi_startproc
>     pushq    %rbp
>     .cfi_def_cfa_offset 16
>     .cfi_offset 6, -16
>     movq    %rsp, %rbp
>     .cfi_def_cfa_register 6
>     movl    global+1(%rip), %eax
>     orl    $2147483647, %eax
>     movl    %eax, global+1(%rip)
>     movl    global+1(%rip), %eax
>     andl    $2147483647, %eax
>     cmpl    $2147483647, %eax
>     je    .L2
>     call    abort
> .L2:
>     movl    $0, %eax
>     popq    %rbp
>     .cfi_def_cfa 7, 8
>     ret
>     .cfi_endproc
>
>
> I also tried to fix the comments.
>
> Reg-tested on x86_64 successfully and ARM is still running.
>
> Is it OK for trunk?

Looks ok to me apart from

   /* Check for cases of unaligned fields that must be split.  */
-  if (bitnum % BITS_PER_UNIT + bitsize > modesize
-      || (STRICT_ALIGNMENT
-         && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
+  if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
+      + bitsize > modesize
+      || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < modesize))
     return false;

where I'd use GET_MODE_ALIGNMENT (fieldmode) rather than modesize
(in both places).

Please leave Eric the chance to comment.

Thanks,
Richard.

>
>
> Thanks
> Bernd.
>
Bernd Edlinger March 10, 2015, 1:40 p.m. UTC | #3
Hi Richard and Eric,

On Mon, 9 Mar 2015 15:30:31, Richard Biener wrote:
>> Reg-tested on x86_64 successfully and ARM is still running.

ARM completed without regressions meanwhile.

>>
>> Is it OK for trunk?
>
> Looks ok to me apart from
>
> /* Check for cases of unaligned fields that must be split. */
> - if (bitnum % BITS_PER_UNIT + bitsize > modesize
> - || (STRICT_ALIGNMENT
> - && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
> + if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
> + + bitsize > modesize
> + || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < modesize))
> return false;
>
> where I'd use GET_MODE_ALIGNMENT (fieldmode) rather than modesize
> (in both places).
>
> Please leave Eric the chance to comment.
>

Just to clarify a few things here:

I am trying to make the checks in strict_volatile_bitfield_p
consistent with the strict volatile bit-field code path
that follows if we return true here.

I would summarize the current implementation of the
strict volatile bit-field code as follows:

For strict alignment, we access the structure as
if it were an array of fieldmode.  A multiple of modesize
is added to the base address, and one single read and/or
write access must be sufficient to do the job.  The access
must be Ok regarding the target's alignment restrictions.
That does not change; what changed with the previous patch
is a missed optimization with the EP_insv code pattern.

For non-strict alignment, a multiple of modesize is added
to the base address, but if the range [bitnum:bitnum+bitsize-1]
spans two fieldmode words, which should only happen if we
use packed structures, a byte offset is added to the base address.
The byte offset is chosen as small as possible, to not reach beyond
the bit field region.  That is new.  This change is irrelevant for the use case
of accessing a device register, but the generated code is more compact.

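For illustration, the offset choice described in the two paragraphs
above can be sketched as self-contained C (the names mirror the
expmed.c discussion, but this is illustrative pseudo-logic, not the
actual GCC code):

#define BITS_PER_UNIT 8

/* Pick the byte offset of the fieldmode word used to access a field
   of BITSIZE bits starting at bit BITNUM within the structure.  */
static unsigned int
field_byte_offset (unsigned int bitnum, unsigned int bitsize,
                   unsigned int modesize, int strict_alignment)
{
  /* Both cases: start at a multiple of modesize, i.e. treat the
     structure as an array of fieldmode.  */
  unsigned int offset = bitnum / modesize * (modesize / BITS_PER_UNIT);
  bitnum %= modesize;

  /* Non-strict alignment only: if the field still crosses a fieldmode
     word boundary (packed structures), add the smallest byte offset
     that brings the whole field into a single word.  */
  if (!strict_alignment && bitnum + bitsize > modesize)
    offset += (bitnum + bitsize + BITS_PER_UNIT - 1 - modesize)
              / BITS_PER_UNIT;

  return offset;
}

For a field with bitnum == 8, bitsize == 31 and modesize == 32, the
crossing case yields offset 1, matching the unaligned access at
global+1 in the assembly quoted earlier.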
Usually we have GET_MODE_ALIGNMENT(fieldmode)==modesize
for all SCALAR_INT_MODE_P(fieldmode).

The only exceptions are complex numbers, and targets where 
ADJUST_ALIGNMENT is used in the modes.def, right?

The targets that do that are few, and the modes are mostly vector modes.
So I did not find any target where the MODE_ALIGNMENT would make
a difference here.  Therefore I think it is more or less a matter of taste.
But please correct me if I am wrong.

If there are cases where MODE_ALIGNMENT < MODE_BITSIZE,
changing the second condition MEM_ALIGN (op0) < modesize to
MEM_ALIGN (op0) < GET_MODE_ALIGNMENT (fieldmode) would
still be consistent, and probably more correct.

But I think changing the first condition would allow cases where this
assertion in the patch no longer holds:
gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));



Thanks
Bernd.
Bernd Edlinger March 13, 2015, 9:26 a.m. UTC | #4
Hi,

are there any more comments on this?

I would like to apply the patch as is, unless we find a
way to get to a test case, maybe with a cross-compiler,
where the MODE_ALIGNMENT is different from MODE_BITSIZE.

Currently, I think that does not happen.

Thanks
Bernd.

> Date: Tue, 10 Mar 2015 14:40:52 +0100
>
> Hi Richard and Eric,
>
> On Mon, 9 Mar 2015 15:30:31, Richard Biener wrote:
>>> Reg-tested on x86_64 successfully and ARM is still running.
>
> ARM completed without regressions meanwhile.
>
>>>
>>> Is it OK for trunk?
>>
>> Looks ok to me apart from
>>
>> /* Check for cases of unaligned fields that must be split. */
>> - if (bitnum % BITS_PER_UNIT + bitsize > modesize
>> - || (STRICT_ALIGNMENT
>> - && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
>> + if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
>> + + bitsize > modesize
>> + || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < modesize))
>> return false;
>>
>> where I'd use GET_MODE_ALIGNMENT (fieldmode) rather than modesize
>> (in both places).
>>
>> Please leave Eric the chance to comment.
>>
>
> Just to clarify a few things here:
>
> I am trying to make the checks in strict_volatile_bitfield_p
> consistent with the strict volatile bit-field code path
> that follows if we return true here.
>
> I would summarize the current implementation of the
> strict volatile bit-field code as follows:
>
> For strict alignment, we access the structure as
> if it were an array of fieldmode.  A multiple of modesize
> is added to the base address, and one single read and/or
> write access must be sufficient to do the job.  The access
> must be Ok regarding the target's alignment restrictions.
> That does not change; what changed with the previous patch
> is a missed optimization with the EP_insv code pattern.
>
> For non-strict alignment, a multiple of modesize is added
> to the base address, but if the range [bitnum:bitnum+bitsize-1]
> spans two fieldmode words, which should only happen if we
> use packed structures, a byte offset is added to the base address.
> The byte offset is chosen as small as possible, to not reach beyond
> the bit field region.  That is new.  This change is irrelevant for the use case
> of accessing a device register, but the generated code is more compact.
>
> Usually we have GET_MODE_ALIGNMENT(fieldmode)==modesize
> for all SCALAR_INT_MODE_P(fieldmode).
>
> The only exceptions are complex numbers, and targets where
> ADJUST_ALIGNMENT is used in the modes.def, right?
>
> The targets that do that are few, and the modes are mostly vector modes.
> So I did not find any target where the MODE_ALIGNMENT would make
> a difference here.  Therefore I think it is more or less a matter of taste.
> But please correct me if I am wrong.
>
> If there are cases where MODE_ALIGNMENT < MODE_BITSIZE,
> changing the second condition MEM_ALIGN (op0) < modesize to
> MEM_ALIGN (op0) < GET_MODE_ALIGNMENT (fieldmode) would
> still be consistent, and probably more correct.
>
> But I think changing the first condition would allow cases where this
> assertion in the patch no longer holds:
> gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));
>
>
>
> Thanks
> Bernd.
>
Mikael Pettersson March 14, 2015, 12:24 p.m. UTC | #5
Bernd Edlinger writes:
 > Hi,
 > 
 > are there any more comments on this?
 > 
 > I would like to apply the patch as is, unless we find a
 > way to get to a test case, maybe with a cross-compiler,
 > where the MODE_ALIGNMENT is different from MODE_BITSIZE.
 > 
 > Currently, I think that does not happen.

On m68k-linux GET_MODE_ALIGNMENT (SImode) == 16 while
GET_MODE_BITSIZE (SImode) == 32.

I don't know what that means for your patch, just wanted
to inform you that such targets do exist.

/Mikael

 > 
 > Thanks
 > Bernd.
 > 
 > > Date: Tue, 10 Mar 2015 14:40:52 +0100
 > >
 > > Hi Richard and Eric,
 > >
 > > On Mon, 9 Mar 2015 15:30:31, Richard Biener wrote:
 > >>> Reg-tested on x86_64 successfully and ARM is still running.
 > >
 > > ARM completed without regressions meanwhile.
 > >
 > >>>
 > >>> Is it OK for trunk?
 > >>
 > >> Looks ok to me apart from
 > >>
 > >> /* Check for cases of unaligned fields that must be split. */
 > >> - if (bitnum % BITS_PER_UNIT + bitsize > modesize
 > >> - || (STRICT_ALIGNMENT
 > >> - && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
 > >> + if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
 > >> + + bitsize > modesize
 > >> + || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < modesize))
 > >> return false;
 > >>
 > >> where I'd use GET_MODE_ALIGNMENT (fieldmode) rather than modesize
 > >> (in both places).
 > >>
 > >> Please leave Eric the chance to comment.
 > >>
 > >
 > > Just to clarify a few things here:
 > >
 > > I am trying to make the checks in strict_volatile_bitfield_p
 > > consistent with the strict volatile bit-field code path
 > > that follows if we return true here.
 > >
 > > I would summarize the current implementation of the
 > > strict volatile bit-field code as follows:
 > >
 > > For strict alignment, we access the structure as
 > > if it were an array of fieldmode.  A multiple of modesize
 > > is added to the base address, and one single read and/or
 > > write access must be sufficient to do the job.  The access
 > > must be Ok regarding the target's alignment restrictions.
 > > That does not change; what changed with the previous patch
 > > is a missed optimization with the EP_insv code pattern.
 > >
 > > For non-strict alignment, a multiple of modesize is added
 > > to the base address, but if the range [bitnum:bitnum+bitsize-1]
 > > spans two fieldmode words, which should only happen if we
 > > use packed structures, a byte offset is added to the base address.
 > > The byte offset is chosen as small as possible, to not reach beyond
 > > the bit field region.  That is new.  This change is irrelevant for the use case
 > > of accessing a device register, but the generated code is more compact.
 > >
 > > Usually we have GET_MODE_ALIGNMENT(fieldmode)==modesize
 > > for all SCALAR_INT_MODE_P(fieldmode).
 > >
 > > The only exceptions are complex numbers, and targets where
 > > ADJUST_ALIGNMENT is used in the modes.def, right?
 > >
 > > The targets that do that are few, and the modes are mostly vector modes.
 > > So I did not find any target where the MODE_ALIGNMENT would make
 > > a difference here.  Therefore I think it is more or less a matter of taste.
 > > But please correct me if I am wrong.
 > >
 > > If there are cases where MODE_ALIGNMENT < MODE_BITSIZE,
 > > changing the second condition MEM_ALIGN (op0) < modesize to
 > > MEM_ALIGN (op0) < GET_MODE_ALIGNMENT (fieldmode) would
 > > still be consistent, and probably more correct.
 > >
 > > But I think changing the first condition would allow cases where this
 > > assertion in the patch no longer holds:
 > > gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));
 > >
 > >
 > >
 > > Thanks
 > > Bernd.
 > >
--
Bernd Edlinger March 14, 2015, 9:37 p.m. UTC | #6
Hi,

On Sat, 14 Mar 2015 13:24:33, Mikael Pettersson wrote:
>
> Bernd Edlinger writes:
>> Hi,
>>
>> are there any more comments on this?
>>
>> I would like to apply the patch as is, unless we find a
>> way to get to a test case, maybe with a cross-compiler,
>> where the MODE_ALIGNMENT is different from MODE_BITSIZE.
>>
>> Currently, I think that does not happen.
>
> On m68k-linux GET_MODE_ALIGNMENT (SImode) == 16 while
> GET_MODE_BITSIZE (SImode) == 32.
>
> I don't know what that means for your patch, just wanted
> to inform you that such targets do exist.
>

Oh I see, thanks.

This is due to BIGGEST_ALIGNMENT=16, STRICT_ALIGNMENT=1 by default on
that architecture.

If I change this check as suggested:

  if (bitnum % (STRICT_ALIGNMENT ? GET_MODE_ALIGNMENT (fieldmode) : BITS_PER_UNIT)
      + bitsize > modesize
      || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < GET_MODE_ALIGNMENT (fieldmode)))
    return false;


Then I get an assertion failure in store_bit_field:
  gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));

With this example:

cat test.c
struct s
{
  short y:16;
  int x:31;
};

void
f(volatile struct s* z, int x)
{
  z->x=x;
}

m68k-elf-gcc -fstrict-volatile-bitfields -mstrict-align -mno-align-int -O2 -S test.c
test.c: In function 'f':
test.c:10:7: internal compiler error: in store_bit_field, at expmed.c:1005
   z->x=x;
       ^

This is what I said before: without the patch, the test case just
generates invalid code which assigns only 16 bits.

There is also a problem in this check.  I had to make
short y:16 a bit field to bypass it; initially I wrote short y;

  /* Check for cases where the C++ memory model applies.  */
  if (bitregion_end != 0
      && (bitnum - bitnum % modesize < bitregion_start
          || bitnum - bitnum % modesize + modesize - 1 > bitregion_end))
    return false;

This also assumes that the access is at a modesize boundary.

If we have BIGGEST_ALIGNMENT=16, that likely means a 16-bit
architecture.  I doubt that the strict alignment code makes any sense for
modesize > BIGGEST_ALIGNMENT.

I think I should change this check

  /* The bit size must not be larger than the field mode, and
     the field mode must not be larger than a word.  */
  if (bitsize > modesize || modesize > BITS_PER_WORD)
    return false;

to this:

  if (bitsize > modesize || modesize > BITS_PER_WORD
      || modesize > BIGGEST_ALIGNMENT)
    return false;

This should avoid these oddities.


Bernd.
Eric Botcazou March 16, 2015, 8:49 a.m. UTC | #7
> If we have BIGGEST_ALIGNMENT=16, that likely means a 16-bit
> architecture.  I doubt that the strict alignment code makes any sense for
> modesize > BIGGEST_ALIGNMENT.

Note that m68k is a 32-bit port (UNITS_PER_WORD is 4) but, by definition of 
BIGGEST_ALIGNMENT, not honoring an alignment larger than it is perfectly OK.
Bernd Edlinger March 16, 2015, 10:53 a.m. UTC | #8
Hi,


when looking at the m68k I realized the following, which is
a general problem...

If the alignment of the structure is less than sizeof(field), the
strict volatile bitfields code may read beyond the end of the
structure!

Consider this example:

struct s
{
  char x : 8;
  volatile unsigned int y : 31;
  volatile unsigned int z : 1;
} __attribute__((packed));

struct s global;


Here we have sizeof(struct s) == 5 and alignment(global) == 1.
However, when we access global.z we read a 32-bit word
at offset 4, which touches 3 bytes that are not safe to use.

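To make the overrun concrete, here is the layout implied by the packed
structure above (an illustrative annotation, using little-endian bit
numbering):

  byte 0       x  (bits 0..7)
  bytes 1..4   y  (bits 8..38)
  byte 4       z  (bit 39, the top bit of byte 4)

The 32-bit word at offset 4 covers bytes 4..7, so such an access reads
bytes 5..7, which lie past the end of the 5-byte object.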
Something like that never happens with -fno-strict-volatile-bitfields,
because IIRC, with the only exception of the simple_mem_bitfield_p code path,
there is never an access mode used which is larger than MEM_ALIGN(x).

In this example, if I want to use the packed attribute,
I also have to use the aligned(4) attribute; this satisfies the
check "MEM_ALIGN (op0) < modesize", which is IMO always necessary
for strict volatile bitfields, not only on STRICT_ALIGNMENT targets.

On a target that has BIGGEST_ALIGNMENT < BITS_PER_WORD,
to use strict volatile bitfields you have to add the __attribute__((aligned(4)))
to the structure.

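For example, the problematic structure above would then have to be
written as follows (a sketch; the thread's test cases are not quoted
here verbatim):

struct s
{
  char x : 8;
  volatile unsigned int y : 31;
  volatile unsigned int z : 1;
} __attribute__((packed, aligned(4)));

With aligned(4) the structure is padded to sizeof(struct s) == 8, so
the SImode access to z at offset 4 stays within the object.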
I had to do that on the pr23623.c test case to make it pass on m68k, for instance.


I have attached the updated patch.  As explained before, the check
MEM_ALIGN (op0) < modesize should always be done in strict_volatile_bitfield_p.

For the targets that usually enable -fstrict-volatile-bitfields, nothing changes,
except when we use "packed" on the structure, where we also need to add an aligned(4)
attribute.  For m68k, where the natural alignment of any structure is <= 2, we need to
force aligned(4) if we want to ensure the access is in SImode.

Boot-strapped and reg-tested on x86_64-linux-gnu.
OK for trunk?


Thanks
Bernd.
gcc:
2015-03-15  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	* expmed.c (strict_volatile_bitfield_p): Check that MEM_ALIGN allows
	a MODESIZE access.
	(store_bit_field, extract_bit_field): For !STRICT_ALIGNMENT explicitly
	generate an unaligned access if the field crosses a word boundary.

testsuite:
2015-03-15  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	* gcc.dg/pr23623.c: Added aligned attribute.
	* gcc.dg/20141029-1.c: Likewise.
	* gcc.dg/20150306-1.c: New test.
Hans-Peter Nilsson March 17, 2015, 12:31 a.m. UTC | #9
On Mon, 16 Mar 2015, Eric Botcazou wrote:
> > If we have BIGGEST_ALIGNMENT=16, that likely means a 16-bit
> > architecture.  I doubt that the strict alignment code makes any sense for
> > modesize > BIGGEST_ALIGNMENT.
>
> Note that m68k is a 32-bit port (UNITS_PER_WORD is 4) but, by definition of
> BIGGEST_ALIGNMENT, not honoring an alignment larger than it is perfectly OK.

The context of the above statement is somewhat ambiguous: if the
statement is taken to include "no matter what is specified in
the program, including use of __attribute__ ((__aligned__ ...))"
then I have to object.  (I guess Eric, you didn't actually mean
that, though.)

The definition of BIGGEST_ALIGNMENT is (tm.texi.in):
"Biggest alignment that any data type can require on this
machine, in bits.  Note that this is not the biggest alignment
that is supported, just the biggest alignment that, when
violated, may cause a fault."

So, IMNSHO we'd better *support* a *larger* alignment, as in "if
the code specified it by explicit means" at least up to
MAX_OFILE_ALIGNMENT.  But, "perfectly OK" when the code didn't
explicitly say anything else.

BTW, unfortunately, BIGGEST_ALIGNMENT can't be blindly followed
for atomic accesses to undecorated (without explicit alignment
specifiers) code either.  Rather the minimum alignment is the
natural alignment *absolutely everywhere no matter what target*
which matters for targets where
 BIGGEST_ALIGNMENT < natural_alignment(<all supported types>)
(well, as long as the natural alignment is smaller than the
target page-size or cache-line, where at least one of the
concepts is usually applicable).

Q: So why not adjust the BIGGEST_ALIGNMENT definition in such
targets to be at least the natural alignment of supported
atomic types?
A: Because that unfortunately has bad effects on generated code
for all accesses to all sizes now <= BIGGEST_ALIGNMENT.

brgds, H-P
PS. It's very unfortunate that there's now __BIGGEST_ALIGNMENT__
visible to programs.
Andreas Schwab March 17, 2015, 12:34 a.m. UTC | #10
Hans-Peter Nilsson <hp@bitrange.com> writes:

> Q: So why not adjust the BIGGEST_ALIGNMENT definition in such
> targets to be at least the natural alignment of supported
> atomic types?

A: Because it's an ABI change.

Andreas.
Hans-Peter Nilsson March 17, 2015, 12:42 a.m. UTC | #11
On Tue, 17 Mar 2015, Andreas Schwab wrote:
> Hans-Peter Nilsson <hp@bitrange.com> writes:
>
> > Q: So why not adjust the BIGGEST_ALIGNMENT definition in such
> > targets to be at least the natural alignment of supported
> > atomic types?
>
> A: Because it's an ABI change.

I intended that to be included in "bad effects";
"A: Because that unfortunately has bad effects on generated code
for all accesses to all sizes now <= BIGGEST_ALIGNMENT."

but thanks for being specific (and I didn't remember exactly
*what* bad effects it was that I saw when I tried that long ago :)

brgds, H-P
Bernd Edlinger March 18, 2015, 7:33 a.m. UTC | #12
Hi,


On Mon, 16 Mar 2015 20:31:53, Hans-Peter Nilsson wrote:
>
> On Mon, 16 Mar 2015, Eric Botcazou wrote:
>>> If we have BIGGEST_ALIGNMENT=16, that likely means a 16-bit
>>> architecture.  I doubt that the strict alignment code makes any sense for
>>> modesize > BIGGEST_ALIGNMENT.
>>
>> Note that m68k is a 32-bit port (UNITS_PER_WORD is 4) but, by definition of
>> BIGGEST_ALIGNMENT, not honoring an alignment larger than it is perfectly OK.
>
> The context of the above statement is somewhat ambiguous: if the
> statement is taken to include "no matter what is specified in
> the program, including use of __attribute__ ((__aligned__ ...))"
> then I have to object. (I guess Eric, you didn't actually mean
> that, though.)
>
> The definition of BIGGEST_ALIGNMENT is (tm.texi.in):
> "Biggest alignment that any data type can require on this
> machine, in bits. Note that this is not the biggest alignment
> that is supported, just the biggest alignment that, when
> violated, may cause a fault."
>
> So, IMNSHO we'd better *support* a *larger* alignment, as in "if
> the code specified it by explicit means" at least up to
> MAX_OFILE_ALIGNMENT. But, "perfectly OK" when the code didn't
> explicitly say anything else.
>

Absolutely.

The idea of this patch is to *require* the use of __attribute__((__aligned__(x)))
for all structures that need the strict volatile bitfields semantics,
at least if the ABI has BIGGEST_ALIGNMENT < BITS_PER_WORD.

I don't see any alternative.


Thanks
Bernd.
Bernd Edlinger March 23, 2015, 9:09 a.m. UTC | #13
Hi,

I'd like to ping for this patch, which I hope can still go in the gcc-5 release:
See https://gcc.gnu.org/ml/gcc-patches/2015-03/msg00817.html for the
patch files.


Thanks,
Bernd.


> Date: Mon, 16 Mar 2015 11:53:00 +0100
>
>
> Hi,
>
>
> when looking at the m68k I realized the following, which is
> a general problem...
>
> If the alignment of the structure is less than sizeof(field), the
> strict volatile bitfields code may read beyond the end of the
> structure!
>
> Consider this example:
>
> struct s
> {
>   char x : 8;
>   volatile unsigned int y : 31;
>   volatile unsigned int z : 1;
> } __attribute__((packed));
>
> struct s global;
>
>
> Here we have sizeof(struct s) == 5 and alignment(global) == 1.
> However, when we access global.z we read a 32-bit word
> at offset 4, which touches 3 bytes that are not safe to use.
>
> Something like that never happens with -fno-strict-volatile-bitfields,
> because IIRC, with the only exception of the simple_mem_bitfield_p code path,
> there is never an access mode used which is larger than MEM_ALIGN(x).
>
> In this example, if I want to use the packed attribute,
> I also have to use the aligned(4) attribute; this satisfies the
> check "MEM_ALIGN (op0) < modesize", which is IMO always necessary
> for strict volatile bitfields, not only on STRICT_ALIGNMENT targets.
>
> On a target that has BIGGEST_ALIGNMENT < BITS_PER_WORD,
> to use strict volatile bitfields you have to add the __attribute__((aligned(4)))
> to the structure.
>
> I had to do that on the pr23623.c test case to make it pass on m68k, for instance.
>
>
> I have attached the updated patch.  As explained before, the check
> MEM_ALIGN (op0) < modesize should always be done in strict_volatile_bitfield_p.
>
> For the targets that usually enable -fstrict-volatile-bitfields, nothing changes,
> except when we use "packed" on the structure, where we also need to add an aligned(4)
> attribute.  For m68k, where the natural alignment of any structure is <= 2, we need to
> force aligned(4) if we want to ensure the access is in SImode.
>
> Boot-strapped and reg-tested on x86_64-linux-gnu.
> OK for trunk?
>
>
> Thanks
> Bernd.
>
Bernd Edlinger March 30, 2015, 4:42 a.m. UTC | #14
Ping again...

it's a wrong-code bug.

Thanks
Bernd.

> Date: Mon, 23 Mar 2015 10:09:35 +0100
>
> Hi,
>
> I'd like to ping for this patch, which I hope can still go in the gcc-5 release:
> See https://gcc.gnu.org/ml/gcc-patches/2015-03/msg00817.html for the
> patch files.
>
>
> Thanks,
> Bernd.
>
>
>> Date: Mon, 16 Mar 2015 11:53:00 +0100
>>
>>
>> Hi,
>>
>>
>> when looking at the m68k I realized the following, which is
>> a general problem...
>>
>> If the alignment of the structure is less than sizeof(field), the
>> strict volatile bitfields code may read beyond the end of the
>> structure!
>>
>> Consider this example:
>>
>> struct s
>> {
>> char x : 8;
>> volatile unsigned int y : 31;
>> volatile unsigned int z : 1;
>> } __attribute__((packed));
>>
>> struct s global;
>>
>>
>> Here we have sizeof(struct s) == 5 and alignment(global) == 1.
>> However, when we access global.z we read a 32-bit word
>> at offset 4, which touches 3 bytes that are not safe to use.
>>
>> Something like that never happens with -fno-strict-volatile-bitfields,
>> because IIRC, with the only exception of the simple_mem_bitfield_p code path,
>> there is never an access mode used which is larger than MEM_ALIGN(x).
>>
>> In this example, if I want to use the packed attribute,
>> I also have to use the aligned(4) attribute; this satisfies the
>> check "MEM_ALIGN (op0) < modesize", which is IMO always necessary
>> for strict volatile bitfields, not only on STRICT_ALIGNMENT targets.
>>
>> On a target that has BIGGEST_ALIGNMENT < BITS_PER_WORD,
>> to use strict volatile bitfields you have to add the __attribute__((aligned(4)))
>> to the structure.
>>
>> I had to do that on the pr23623.c test case to make it pass on m68k, for instance.
>>
>>
>> I have attached the updated patch. As explained before, the check
>> MEM_ALIGN (op0) < modesize should always be done in strict_volatile_bitfield_p.
>>
>> For the targets that usually enable -fstrict-volatile-bitfields, nothing changes,
>> except when we use "packed" on the structure, where we also need to add an aligned(4)
>> attribute.  For m68k, where the natural alignment of any structure is <= 2, we need to
>> force aligned(4) if we want to ensure the access is in SImode.
>>
>> Boot-strapped and reg-tested on x86_64-linux-gnu.
>> OK for trunk?
>>
>>
>> Thanks
>> Bernd.
>>
>
Richard Biener March 30, 2015, 10:33 a.m. UTC | #15
On Mon, Mar 16, 2015 at 11:53 AM, Bernd Edlinger
<bernd.edlinger@hotmail.de> wrote:
>
> Hi,
>
>
> when looking at the m68k I realized the following, which is
> a general problem...
>
> If the alignment of the structure is less than sizeof(field), the
> strict volatile bitfields code may read beyond the end of the
> structure!
>
> Consider this example:
>
> struct s
> {
>   char x : 8;
>   volatile unsigned int y : 31;
>   volatile unsigned int z : 1;
> } __attribute__((packed));
>
> struct s global;
>
>
> Here we have sizeof(struct s) == 5 and alignment(global) == 1.
> However, when we access global.z we read a 32-bit word
> at offset 4, which touches 3 bytes that are not safe to use.
>
> Something like that never happens with -fno-strict-volatile-bitfields,
> because IIRC, with the only exception of the simple_mem_bitfield_p code path,
> there is never an access mode used which is larger than MEM_ALIGN(x).

Are you sure?  There is still PR36043 ...

> In this example, if I want to use the packed attribute,
> I also have to use the aligned(4) attribute; this satisfies the
> check "MEM_ALIGN (op0) < modesize", which is IMO always necessary
> for strict volatile bitfields, not only on STRICT_ALIGNMENT targets.

But your patch still somehow special-cases them.

> On a target that has BIGGEST_ALIGNMENT < BITS_PER_WORD,
> to use strict volatile bitfields you have to add the __attribute__((aligned(4)))
> to the structure.
>
> I had to do that on the pr23623.c test case to make it pass on m68k, for instance.
>
>
> I have attached the updated patch.  As explained before, the check
> MEM_ALIGN (op0) < modesize should always be done in strict_volatile_bitfield_p.
>
> For the targets that usually enable -fstrict-volatile-bitfields, nothing changes,
> except when we use "packed" on the structure, where we also need to add an aligned(4)
> attribute.  For m68k, where the natural alignment of any structure is <= 2, we need to
> force aligned(4) if we want to ensure the access is in SImode.
>
> Boot-strapped and reg-tested on x86_64-linux-gnu.
> OK for trunk?

So - shouldn't the check be

  if (MEM_ALIGN (op0) < GET_MODE_ALIGNMENT (fieldmode))
    return false;

instead?  And looking at

   /* Check for cases of unaligned fields that must be split.  */
-  if (bitnum % BITS_PER_UNIT + bitsize > modesize
-      || (STRICT_ALIGNMENT
-         && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
+  if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
+      + bitsize > modesize)
     return false;

I wonder what the required semantics are - are they not that we need to access
the whole "underlying" field with the access (the C++ memory model only
requires we don't go beyond the field)?  It seems that information isn't readily
available here though.  So the check checks that we can access the field
with a single access using fieldmode.  Which means (again),

  if (bitnum % (STRICT_ALIGNMENT ? GET_MODE_ALIGNMENT (fieldmode) :
BITS_PER_UNIT)
      + bitsize > modesize)

Also this means that for !STRICT_ALIGNMENT platforms the MEM_ALIGN check isn't
sufficient which is what the other hunks in the patch are about to fix?

It seems at least the

@@ -988,6 +995,16 @@ store_bit_field (rtx str_rtx, unsigned HOST_WIDE_I

          str_rtx = narrow_bit_field_mem (str_rtx, fieldmode, bitsize, bitnum,
                                          &bitnum);
+         if (!STRICT_ALIGNMENT
+             && bitnum + bitsize > GET_MODE_BITSIZE (fieldmode))
+           {
+             unsigned HOST_WIDE_INT offset;
+             offset = (bitnum + bitsize + BITS_PER_UNIT - 1
+                       - GET_MODE_BITSIZE (fieldmode)) / BITS_PER_UNIT;
+             str_rtx = adjust_bitfield_address (str_rtx, fieldmode, offset);
+             bitnum -= offset * BITS_PER_UNIT;
+           }
+         gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));

hunks could do with a comment.

That said, I fail to see how the issue is specific to
strict-volatile bitfields?

I hope somebody else will also look at the patch - I'm not feeling like the
appropriate person to review changes in this area (even if I did so in
the past).

Thanks,
Richard.

>
> Thanks
> Bernd.
>
Bernd Edlinger March 30, 2015, 7:42 p.m. UTC | #16
Hi,

On Mon, 30 Mar 2015 12:33:47, Richard Biener wrote:
>
> So - shouldn't the check be
>
> if (MEM_ALIGN (op0) < GET_MODE_ALIGNMENT (fieldmode))
> return false;
>

No, because this example would access memory beyond the end of the structure
on m68k:

volatile struct s
{
  unsigned x:15;
} g;

int x = g.x;

but when MEM_ALIGN (op0) == modesize we know that
sizeof(struct s) == sizeof(int).


> instead? And looking at
>
> /* Check for cases of unaligned fields that must be split. */
> - if (bitnum % BITS_PER_UNIT + bitsize > modesize
> - || (STRICT_ALIGNMENT
> - && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
> + if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
> + + bitsize > modesize)
> return false;
>
> I wonder what the required semantics are - are they not that we need to access
> the whole "underlying" field with the access (the C++ memory model only
> requires we don't go beyond the field)? It seems that information isn't readily
> available here though. So the check checks that we can access the field
> with a single access using fieldmode. Which means (again),
>

And another requirement must of course be that we should never read anything
outside of the structure's limits.


> if (bitnum % (STRICT_ALIGNMENT ? GET_MODE_ALIGNMENT (fieldmode) :
> BITS_PER_UNIT)
> + bitsize > modesize)
>
> Also this means that for !STRICT_ALIGNMENT platforms the MEM_ALIGN check isn't
> sufficient which is what the other hunks in the patch are about to fix?
>
> It seems at least the
>
> @@ -988,6 +995,16 @@ store_bit_field (rtx str_rtx, unsigned HOST_WIDE_I
>
> str_rtx = narrow_bit_field_mem (str_rtx, fieldmode, bitsize, bitnum,
> &bitnum);
> + if (!STRICT_ALIGNMENT
> + && bitnum + bitsize > GET_MODE_BITSIZE (fieldmode))
> + {
> + unsigned HOST_WIDE_INT offset;
> + offset = (bitnum + bitsize + BITS_PER_UNIT - 1
> + - GET_MODE_BITSIZE (fieldmode)) / BITS_PER_UNIT;
> + str_rtx = adjust_bitfield_address (str_rtx, fieldmode, offset);
> + bitnum -= offset * BITS_PER_UNIT;
> + }
> + gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));
>
> hunks could do with a comment.
>
> That said, I fail to realize how the issue is specific to
> strict-volatile bitfields?
>

I will not try to defend this hunk at the moment, because it is actually not needed to
fix the wrong-code bug.


> I hope somebody else will also look at the patch - I'm not feeling like the
> appropriate person to review changes in this area (even if I did so in
> the past).
>


Sorry, but I just have to continue...

So now I have removed these unaligned access parts from the patch again.

Attached you will find the new reduced version of the patch.

Boot-strapped and regression-tested on x86_64-linux-gnu.

OK for trunk?


Thanks,
Bernd.
gcc:
2015-03-30  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	* expmed.c (strict_volatile_bitfield_p): Check that the access will
	not cross a MODESIZE boundary.
	(store_bit_field, extract_bit_field): Added assertions in the
	strict volatile bitfields code path.

testsuite:
2015-03-30  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	* gcc.dg/pr23623.c: Added aligned attribute.
	* gcc.dg/20141029-1.c: Likewise.
	* gcc.dg/20150306-1.c: New test.
Richard Biener March 31, 2015, 1:50 p.m. UTC | #17
On Mon, Mar 30, 2015 at 9:42 PM, Bernd Edlinger
<bernd.edlinger@hotmail.de> wrote:
> Hi,
>
> On Mon, 30 Mar 2015 12:33:47, Richard Biener wrote:
>>
>> So - shouldn't the check be
>>
>> if (MEM_ALIGN (op0) < GET_MODE_ALIGNMENT (fieldmode))
>> return false;
>>
>
> No, because this example would access memory beyond the end of the structure
> on m68k:
>
> volatile struct s
> {
>   unsigned x:15;
> } g;
>
> int x = g.x;
>
> but when MEM_ALIGN (op0) == modesize we know that
> sizeof(struct s) == sizeof(int).

We don't for

volatile struct s
{
  unsigned x:15 __attribute__((packed));
} g __attribute__((aligned(4)));


>> instead? And looking at
>>
>> /* Check for cases of unaligned fields that must be split. */
>> - if (bitnum % BITS_PER_UNIT + bitsize > modesize
>> - || (STRICT_ALIGNMENT
>> - && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
>> + if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
>> + + bitsize > modesize)
>> return false;
>>
>> I wonder what the required semantics are - are they not that we need to access
>> the whole "underlying" field with the access (the C++ memory model only
>> requires we don't go beyond the field)? It seems that information isn't readily
>> available here though. So the check checks that we can access the field
>> with a single access using fieldmode. Which means (again),
>>
>
> And another requirement must of course be that we should never read anything
> outside of the structure's limits.

Yes.  bitregion_end should provide a good hint here?

>> if (bitnum % (STRICT_ALIGNMENT ? GET_MODE_ALIGNMENT (fieldmode) :
>> BITS_PER_UNIT)
>> + bitsize > modesize)
>>
>> Also this means that for !STRICT_ALIGNMENT platforms the MEM_ALIGN check isn't
>> sufficient which is what the other hunks in the patch are about to fix?
>>
>> It seems at least the
>>
>> @@ -988,6 +995,16 @@ store_bit_field (rtx str_rtx, unsigned HOST_WIDE_I
>>
>> str_rtx = narrow_bit_field_mem (str_rtx, fieldmode, bitsize, bitnum,
>> &bitnum);
>> + if (!STRICT_ALIGNMENT
>> + && bitnum + bitsize > GET_MODE_BITSIZE (fieldmode))
>> + {
>> + unsigned HOST_WIDE_INT offset;
>> + offset = (bitnum + bitsize + BITS_PER_UNIT - 1
>> + - GET_MODE_BITSIZE (fieldmode)) / BITS_PER_UNIT;
>> + str_rtx = adjust_bitfield_address (str_rtx, fieldmode, offset);
>> + bitnum -= offset * BITS_PER_UNIT;
>> + }
>> + gcc_assert (bitnum + bitsize <= GET_MODE_BITSIZE (fieldmode));
>>
>> hunks could do with a comment.
>>
>> That said, I fail to realize how the issue is specific to
>> strict-volatile bitfields?
>>
>
> I will not try to defend this hunk at the moment, because it is actually not needed to
> fix the wrong code bug,
>
>
>> I also hope somebody else will also look at the patch - I'm not feeling like the
>> appropriate person to review changes in this area (even if I did so in
>> the past).
>>
>
>
> Sorry, but I just have to continue...
>
> So, now I removed these unaligned access parts again from the patch,
>
> Attached you will find the new reduced version of the patch.
>
> Boot-strapped and regression-tested on x86_64-linux-gnu.
>
> OK for trunk?

Ok if nobody else complains within 24h.

Thanks,
Richard.

>
> Thanks,
> Bernd.
>

Patch

diff -up expmed.c.jj expmed.c
--- expmed.c.jj    2015-01-16 11:20:40.000000000 +0100
+++ expmed.c    2015-03-06 10:07:14.362383274 +0100
@@ -472,9 +472,9 @@  strict_volatile_bitfield_p (rtx op0, uns
     return false;
 
   /* Check for cases of unaligned fields that must be split.  */
-  if (bitnum % BITS_PER_UNIT + bitsize > modesize
-      || (STRICT_ALIGNMENT
-      && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
+  if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT)
+      + bitsize > modesize
+      || (STRICT_ALIGNMENT && MEM_ALIGN (op0) < modesize))
     return false;
 
   /* Check for cases where the C++ memory model applies.  */