Patchwork [PR43814] Assume function arguments of pointer type are aligned.

login
register
mail settings
Submitter Tom de Vries
Date Sept. 20, 2011, 11:13 a.m.
Message ID <4E787543.1090009@codesourcery.com>
Download mbox | patch
Permalink /patch/115498/
State New
Headers show

Comments

Tom de Vries - Sept. 20, 2011, 11:13 a.m.
Hi Richard,

I have a patch for PR43814. It introduces an option that assumes that function
arguments of pointer type are aligned, and uses that information in
tree-ssa-ccp. This enables the memcpy in pr43814-2.c to be inlined.

I tested the patch successfully on-by-default on x86_64 and i686 (both gcc only
builds).

I also tested the patch on-by-default for ARM (gcc/glibc build). The patch
generated wrong code for uselocale.c:
...
glibc/locale/locale.h:
...
/* This value can be passed to `uselocale' and may be returned by
   it. Passing this value to any other function has undefined behavior.  */
# define LC_GLOBAL_LOCALE       ((__locale_t) -1L)
...
glibc/locale/uselocale.c:
...
locale_t
__uselocale (locale_t newloc)
{
  locale_t oldloc = _NL_CURRENT_LOCALE;

  if (newloc != NULL)
    {
      const locale_t locobj
        = newloc == LC_GLOBAL_LOCALE ? &_nl_global_locale : newloc;

...
The assumption that function arguments of pointer type are aligned, allowed the
test 'newloc == LC_GLOBAL_LOCALE' to evaluate to false.
But the usage of ((__locale_t) -1L) as function argument in uselocale violates
that assumption.

Fixing the definition of LC_GLOBAL_LOCALE allowed the gcc tests to run without
regressions for ARM.

Furthermore, the patch fixes ipa-sra-2.c and ipa-sra-6.c regressions on ARM,
discussed here:
- http://gcc.gnu.org/ml/gcc-patches/2011-08/msg00930.html
- http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00459.html

But, since glibc uses this construct currently, the option is off-by-default for
now.

OK for trunk?

Thanks,
- Tom

2011-09-20  Tom de Vries <tom@codesourcery.com>

	PR target/43814
	* tree-ssa-ccp.c (get_align_value): New function, factored out of
	get_value_from_alignment.
	(get_value_from_alignment): Use get_align_value.
	(get_value_for_expr): Use get_align_value to handle alignment of
	function argument pointers.
	* common.opt (faligned-pointer-argument): New option.
	* doc/invoke.texi (Optimization Options): Add
	-faligned-pointer-argument.
	(-faligned-pointer-argument): New item.

	* gcc/testsuite/gcc.dg/pr43814.c: New test.
	* gcc/testsuite/gcc.target/arm/pr43814-2.c: New test.
Richard Guenther - Sept. 24, 2011, 9:31 a.m.
On Tue, Sep 20, 2011 at 1:13 PM, Tom de Vries <vries@codesourcery.com> wrote:
> Hi Richard,
>
> I have a patch for PR43814. It introduces an option that assumes that function
> arguments of pointer type are aligned, and uses that information in
> tree-ssa-ccp. This enables the memcpy in pr43814-2.c to be inlined.
>
> I tested the patch successfully on-by-default on x86_64 and i686 (both gcc only
> builds).
>
> I also tested the patch on-by-default for ARM (gcc/glibc build). The patch
> generated wrong code for uselocale.c:
> ...
> glibc/locale/locale.h:
> ...
> /* This value can be passed to `uselocale' and may be returned by
>   it. Passing this value to any other function has undefined behavior.  */
> # define LC_GLOBAL_LOCALE       ((__locale_t) -1L)
> ...
> glibc/locale/uselocale.c:
> ...
> locale_t
> __uselocale (locale_t newloc)
> {
>  locale_t oldloc = _NL_CURRENT_LOCALE;
>
>  if (newloc != NULL)
>    {
>      const locale_t locobj
>        = newloc == LC_GLOBAL_LOCALE ? &_nl_global_locale : newloc;
>
> ...
> The assumption that function arguments of pointer type are aligned, allowed the
> test 'newloc == LC_GLOBAL_LOCALE' to evaluate to false.
> But the usage of ((__locale_t) -1L) as function argument in uselocale violates
> that assumption.
>
> Fixing the definition of LC_GLOBAL_LOCALE allowed the gcc tests to run without
> regressions for ARM.
>
> Furthermore, the patch fixes ipa-sra-2.c and ipa-sra-6.c regressions on ARM,
> discussed here:
> - http://gcc.gnu.org/ml/gcc-patches/2011-08/msg00930.html
> - http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00459.html
>
> But, since glibc uses this construct currently, the option is off-by-default for
> now.
>
> OK for trunk?

No, I don't like to have an option to control this.  And given the issue
you spotted it doesn't look like the best idea either.  This area in GCC
is simply too fragile right now :/

It would be nice if we could extend IPA-CP to propagate alignment
information though.

And at some point devise a reliable way for frontends to communicate
alignment constraints the middle-end can rely on (well, yes, you could
argue that's what TYPE_ALIGN is about, and I thought that maybe we
can unconditionally rely on TYPE_ALIGN for pointer PARM_DECLs
at least - your example shows we can't).

In the end I'd probably say the patch is ok without the option (thus
turned on by default), but if LC_GLOBAL_LOCALE is part of the
glibc ABI then we clearly can't do this.

Richard.

> Thanks,
> - Tom
>
> 2011-09-20  Tom de Vries <tom@codesourcery.com>
>
>        PR target/43814
>        * tree-ssa-ccp.c (get_align_value): New function, factored out of
>        get_value_from_alignment.
>        (get_value_from_alignment): Use get_align_value.
>        (get_value_for_expr): Use get_align_value to handle alignment of
>        function argument pointers.
>        * common.opt (faligned-pointer-argument): New option.
>        * doc/invoke.texi (Optimization Options): Add
>        -faligned-pointer-argument.
>        (-faligned-pointer-argument): New item.
>
>        * gcc/testsuite/gcc.dg/pr43814.c: New test.
>        * gcc/testsuite/gcc.target/arm/pr43814-2.c: New test.
>
Jakub Jelinek - Sept. 24, 2011, 9:40 a.m.
On Sat, Sep 24, 2011 at 11:31:25AM +0200, Richard Guenther wrote:
> In the end I'd probably say the patch is ok without the option (thus
> turned on by default), but if LC_GLOBAL_LOCALE is part of the
> glibc ABI then we clearly can't do this.

Yes, LC_GLOBAL_LOCALE is part of glibc ABI.  I guess we could only assume
the alignment if the pointer is actually dereferenced on the statement
that checks the ABI or in some stmt that dominates the spot where you want
to check the alignment.  It is IMHO quite common to pass arbitrary values
in pointer types, then cast them back or just compare.

	Jakub
Richard Guenther - Sept. 24, 2011, 10:42 a.m.
On Sat, Sep 24, 2011 at 11:40 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Sat, Sep 24, 2011 at 11:31:25AM +0200, Richard Guenther wrote:
>> In the end I'd probably say the patch is ok without the option (thus
>> turned on by default), but if LC_GLOBAL_LOCALE is part of the
>> glibc ABI then we clearly can't do this.
>
> Yes, LC_GLOBAL_LOCALE is part of glibc ABI.  I guess we could only assume
> the alignment if the pointer is actually dereferenced on the statement
> that checks the ABI or in some stmt that dominates the spot where you want
> to check the alignment.  It is IMHO quite common to pass arbitrary values
> in pointer types, then cast them back or just compare.

Yeah (even if technically invoking undefined behavior in C).  Checking if
there is a dereference post-dominating function entry with sth like

  FOR_EACH_IMM_USE_STMT (... ptr ...)
     if (stmt_post_dominates_entry && contains derefrence of ptr)
       alignment = TYPE_ALIGN (...);

and otherwise not assuming anything about parameter alignment might work.
Be careful to check the alignment of the dereference though,

typedef int int_unaligned __attribute__((aligned(1)));
int foo (int *p)
{
  int_unaligned *q = p;
  return *q;
}

will be MEM[p] but with (well, hopefully ;)) TYPE_ALIGN of TREE_TYPE (MEM[p])
being 1.  And yes, you'd have to look into handled-components as well.  I guess
you'll face similar problems as we do with tree-sra.c
tree_non_mode_aligned_mem_p
(you need to assume eventually misaligned accesses the same way expansion
does for the dereference, otherwise you'll run into issues on
strict-align targets).

As that de-refrence thing doesn't really fit the CCP propagation you
won't be able
to handle

int foo (int *p)
{
  int *q = (char *)p + 3;
  return *q;
}

and assume q is aligned (and p is misaligned by 1).

That is, if the definition of a pointer is post-dominated by a derefrence
we could assume proper alignment for that pointer (as opposed to just
special-casing its default definition).  Would be certainly interesting to
see what kind of fallout we would get from that ;)

Richard.

>        Jakub
>
Carrot Wei - Sept. 28, 2011, 1:57 p.m.
Hi Tom

What's the behavior of your patch to the following case

typedef int int_unaligned __attribute__((aligned(1)));
int foo (int_unaligned *p)
{
  return *p;
}

thanks
Carrot

On Tue, Sep 20, 2011 at 7:13 PM, Tom de Vries <vries@codesourcery.com> wrote:
> Hi Richard,
>
> I have a patch for PR43814. It introduces an option that assumes that function
> arguments of pointer type are aligned, and uses that information in
> tree-ssa-ccp. This enables the memcpy in pr43814-2.c to be inlined.
>
> I tested the patch successfully on-by-default on x86_64 and i686 (both gcc only
> builds).
>
> I also tested the patch on-by-default for ARM (gcc/glibc build). The patch
> generated wrong code for uselocale.c:
> ...
> glibc/locale/locale.h:
> ...
> /* This value can be passed to `uselocale' and may be returned by
>   it. Passing this value to any other function has undefined behavior.  */
> # define LC_GLOBAL_LOCALE       ((__locale_t) -1L)
> ...
> glibc/locale/uselocale.c:
> ...
> locale_t
> __uselocale (locale_t newloc)
> {
>  locale_t oldloc = _NL_CURRENT_LOCALE;
>
>  if (newloc != NULL)
>    {
>      const locale_t locobj
>        = newloc == LC_GLOBAL_LOCALE ? &_nl_global_locale : newloc;
>
> ...
> The assumption that function arguments of pointer type are aligned, allowed the
> test 'newloc == LC_GLOBAL_LOCALE' to evaluate to false.
> But the usage of ((__locale_t) -1L) as function argument in uselocale violates
> that assumption.
>
> Fixing the definition of LC_GLOBAL_LOCALE allowed the gcc tests to run without
> regressions for ARM.
>
> Furthermore, the patch fixes ipa-sra-2.c and ipa-sra-6.c regressions on ARM,
> discussed here:
> - http://gcc.gnu.org/ml/gcc-patches/2011-08/msg00930.html
> - http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00459.html
>
> But, since glibc uses this construct currently, the option is off-by-default for
> now.
>
> OK for trunk?
>
> Thanks,
> - Tom
>
> 2011-09-20  Tom de Vries <tom@codesourcery.com>
>
>        PR target/43814
>        * tree-ssa-ccp.c (get_align_value): New function, factored out of
>        get_value_from_alignment.
>        (get_value_from_alignment): Use get_align_value.
>        (get_value_for_expr): Use get_align_value to handle alignment of
>        function argument pointers.
>        * common.opt (faligned-pointer-argument): New option.
>        * doc/invoke.texi (Optimization Options): Add
>        -faligned-pointer-argument.
>        (-faligned-pointer-argument): New item.
>
>        * gcc/testsuite/gcc.dg/pr43814.c: New test.
>        * gcc/testsuite/gcc.target/arm/pr43814-2.c: New test.
>
Tom de Vries - Sept. 28, 2011, 4:22 p.m.
On 09/28/2011 03:57 PM, Carrot Wei wrote:
> Hi Tom
> 
> What's the behavior of your patch to the following case
> 
> typedef int int_unaligned __attribute__((aligned(1)));
> int foo (int_unaligned *p)
> {
>   return *p;
> }
> 

I modified the example slightly:

test.c:
...
typedef int __attribute__((aligned(2))) int_unaligned;
int foo (int_unaligned *p)
{
  return *(p+1);
}
...

test.c.023t.ccp1:
...
  # PT = anything
  # ALIGN = 2, MISALIGN = 0
  D.2723_2 = pD.1604_1(D) + 4;
...

Thanks,
- Tom

> thanks
> Carrot
> 
> On Tue, Sep 20, 2011 at 7:13 PM, Tom de Vries <vries@codesourcery.com> wrote:
>> Hi Richard,
>>
>> I have a patch for PR43814. It introduces an option that assumes that function
>> arguments of pointer type are aligned, and uses that information in
>> tree-ssa-ccp. This enables the memcpy in pr43814-2.c to be inlined.
>>
>> I tested the patch successfully on-by-default on x86_64 and i686 (both gcc only
>> builds).
>>
>> I also tested the patch on-by-default for ARM (gcc/glibc build). The patch
>> generated wrong code for uselocale.c:
>> ...
>> glibc/locale/locale.h:
>> ...
>> /* This value can be passed to `uselocale' and may be returned by
>>   it. Passing this value to any other function has undefined behavior.  */
>> # define LC_GLOBAL_LOCALE       ((__locale_t) -1L)
>> ...
>> glibc/locale/uselocale.c:
>> ...
>> locale_t
>> __uselocale (locale_t newloc)
>> {
>>  locale_t oldloc = _NL_CURRENT_LOCALE;
>>
>>  if (newloc != NULL)
>>    {
>>      const locale_t locobj
>>        = newloc == LC_GLOBAL_LOCALE ? &_nl_global_locale : newloc;
>>
>> ...
>> The assumption that function arguments of pointer type are aligned, allowed the
>> test 'newloc == LC_GLOBAL_LOCALE' to evaluate to false.
>> But the usage of ((__locale_t) -1L) as function argument in uselocale violates
>> that assumption.
>>
>> Fixing the definition of LC_GLOBAL_LOCALE allowed the gcc tests to run without
>> regressions for ARM.
>>
>> Furthermore, the patch fixes ipa-sra-2.c and ipa-sra-6.c regressions on ARM,
>> discussed here:
>> - http://gcc.gnu.org/ml/gcc-patches/2011-08/msg00930.html
>> - http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00459.html
>>
>> But, since glibc uses this construct currently, the option is off-by-default for
>> now.
>>
>> OK for trunk?
>>
>> Thanks,
>> - Tom
>>
>> 2011-09-20  Tom de Vries <tom@codesourcery.com>
>>
>>        PR target/43814
>>        * tree-ssa-ccp.c (get_align_value): New function, factored out of
>>        get_value_from_alignment.
>>        (get_value_from_alignment): Use get_align_value.
>>        (get_value_for_expr): Use get_align_value to handle alignment of
>>        function argument pointers.
>>        * common.opt (faligned-pointer-argument): New option.
>>        * doc/invoke.texi (Optimization Options): Add
>>        -faligned-pointer-argument.
>>        (-faligned-pointer-argument): New item.
>>
>>        * gcc/testsuite/gcc.dg/pr43814.c: New test.
>>        * gcc/testsuite/gcc.target/arm/pr43814-2.c: New test.
>>
Maxim Kuvyrkov - Sept. 28, 2011, 6:29 p.m.
On 24/09/2011, at 9:40 PM, Jakub Jelinek wrote:

> On Sat, Sep 24, 2011 at 11:31:25AM +0200, Richard Guenther wrote:
>> In the end I'd probably say the patch is ok without the option (thus
>> turned on by default), but if LC_GLOBAL_LOCALE is part of the
>> glibc ABI then we clearly can't do this.
> 
> Yes, LC_GLOBAL_LOCALE is part of glibc ABI.  I guess we could only assume
> the alignment if the pointer is actually dereferenced on the statement
> that checks the ABI or in some stmt that dominates the spot where you want
> to check the alignment.  It is IMHO quite common to pass arbitrary values
> in pointer types, then cast them back or just compare.

I can relate to camps of both GCC developers and GLIBC developers, and I understand the benefits and liabilities of Tom's optimization.  Ultimately, the problem we need to solve is to find a way to manage non-conforming code with a compiler that tries to squeeze out as much performance from valid code as possible.

I think Tom's suggestion to have an option (either enabled or disabled by default) is a very good solution given the circumstances.  I think we should document the option as a transitional measure designed to give time to GLIBC and others to catch up, and obsolete it in the next release.  GLIBC patch to fix locale_t definition is attached.

In this submission Tom is being punished for his thoroughness in testing the optimization.  Let this be a lesson to all of us to steer clear of GLIBC.  [Kidding.]

We had similar discussions several times already, and it seems the guiding principle of whether enabling new optimization that may break undefined code patterns is to balance expected performance benefits against how frequently the offending code pattern is used.  Returning to the case at hand: Is there code comparing a pointer to an integer?  Yes.  Is it common?  Comparing to a zero, absolutely; to a non-zero .... errr.  Probably not that much.

The performance benefits from the optimization are quite significant.  Pointer alignment has tremendous effect on expanding __builtin_{mem,str}* functions.

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics
David Miller - Sept. 28, 2011, 6:35 p.m.
From: Maxim Kuvyrkov <maxim@codesourcery.com>
Date: Thu, 29 Sep 2011 07:29:12 +1300

> GLIBC patch to fix locale_t definition is attached.

Isn't this going to result in byte loads being used to dereference
all locale_t pointers on targets like sparc and mips?
Maxim Kuvyrkov - Sept. 28, 2011, 6:40 p.m.
On 29/09/2011, at 7:35 AM, David Miller wrote:

> From: Maxim Kuvyrkov <maxim@codesourcery.com>
> Date: Thu, 29 Sep 2011 07:29:12 +1300
> 
>> GLIBC patch to fix locale_t definition is attached.
> 
> Isn't this going to result in byte loads being used to dereference
> all locale_t pointers on targets like sparc and mips?

Yes, that's the price for binary compatibility for comparing a pointer to -1.  I just hope that no common program has locale functions on its critical path.

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics
David Miller - Sept. 28, 2011, 6:41 p.m.
From: Maxim Kuvyrkov <maxim@codesourcery.com>
Date: Thu, 29 Sep 2011 07:40:55 +1300

> On 29/09/2011, at 7:35 AM, David Miller wrote:
> 
>> From: Maxim Kuvyrkov <maxim@codesourcery.com>
>> Date: Thu, 29 Sep 2011 07:29:12 +1300
>> 
>>> GLIBC patch to fix locale_t definition is attached.
>> 
>> Isn't this going to result in byte loads being used to dereference
>> all locale_t pointers on targets like sparc and mips?
> 
> Yes, that's the price for binary compatibility for comparing a pointer to -1.  I just hope that no common program has locale functions on its critical path.

I personally don't think this is acceptable, critical path or not.
Maxim Kuvyrkov - Sept. 28, 2011, 6:45 p.m.
On 29/09/2011, at 7:41 AM, David Miller wrote:

> From: Maxim Kuvyrkov <maxim@codesourcery.com>
> Date: Thu, 29 Sep 2011 07:40:55 +1300
> 
>> On 29/09/2011, at 7:35 AM, David Miller wrote:
>> 
>>> From: Maxim Kuvyrkov <maxim@codesourcery.com>
>>> Date: Thu, 29 Sep 2011 07:29:12 +1300
>>> 
>>>> GLIBC patch to fix locale_t definition is attached.
>>> 
>>> Isn't this going to result in byte loads being used to dereference
>>> all locale_t pointers on targets like sparc and mips?
>> 
>> Yes, that's the price for binary compatibility for comparing a pointer to -1.  I just hope that no common program has locale functions on its critical path.
> 
> I personally don't think this is acceptable, critical path or not.

OK.  Do you have an alternative suggestion that would fix non-portable use of locale_t?

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics
David Miller - Sept. 28, 2011, 6:48 p.m.
From: Maxim Kuvyrkov <maxim@codesourcery.com>
Date: Thu, 29 Sep 2011 07:45:17 +1300

> OK.  Do you have an alternative suggestion that would fix non-portable use of locale_t?

Don't optimize something that is invalidated by a quite common practice?

What about people who encode invalid pointers with "0xdeadbeef", do we need
to audit every source tree that does this too?  This invalidates the
optimization's preconditions as well.

You're not going to eradicate all the code in the world which uses unaligned
constants to encode pointers to make them have special meanings in certain
situations.

We use the "-1" thing in the Linux kernel too I believe.  I'd go so far as
to say this kind of thing is pervasive.
Maxim Kuvyrkov - Sept. 28, 2011, 6:58 p.m.
David,

To summarize, your opinion seems to be to not enable the optimization by default, correct?

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics


On 29/09/2011, at 7:48 AM, David Miller wrote:

> From: Maxim Kuvyrkov <maxim@codesourcery.com>
> Date: Thu, 29 Sep 2011 07:45:17 +1300
> 
>> OK.  Do you have an alternative suggestion that would fix non-portable use of locale_t?
> 
> Don't optimize something that is invalidated by a quite common practice?
> 
> What about people who encode invalid pointers with "0xdeadbeef", do we need
> to audit every source tree that does this too?  This invalidates the
> optimization's preconditions as well.
> 
> You're not going to eradicate all the code in the world which uses unaligned
> constants to encode pointers to make them have special meanings in certain
> situations.
> 
> We use the "-1" thing in the Linux kernel too I believe.  I'd go so far as
> to say this kind of thing is pervasive.
> 
>
David Miller - Sept. 28, 2011, 7 p.m.
From: Maxim Kuvyrkov <maxim@codesourcery.com>
Date: Thu, 29 Sep 2011 07:58:01 +1300

> To summarize, your opinion seems to be to not enable the
> optimization by default, correct?

Yes, and I don't think we ever could do so.

BTW, another common paradigm is using the "always clear" bits of a
pointer to encode state bits.
Jeff Law - Sept. 28, 2011, 7:06 p.m.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 09/28/11 13:00, David Miller wrote:
> From: Maxim Kuvyrkov <maxim@codesourcery.com> Date: Thu, 29 Sep
> 2011 07:58:01 +1300
> 
>> To summarize, your opinion seems to be to not enable the 
>> optimization by default, correct?
> 
> Yes, and I don't think we ever could do so.
> 
> BTW, another common paradigm is using the "always clear" bits of a 
> pointer to encode state bits.
The PA (and IA64 IIRC) store tidbits of info in the low order (always
zero) bits of function addresses.  May not be relevant for the
discussion at hand, but figured I'd mention it.

jeff
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJOg3AYAAoJEBRtltQi2kC7j3AIAIUdlIfYpDYagNio0sfX+OkD
7LnBhXH15IU6NyaPQ7JYDfcr98I42tmBIwRNfzVyZJBiWSLhEYnFZV9WrZ3tubha
7WGIfXqA4sxLapOrTwGwJTO5Z1lCGjIgz3H8WPv1nKxMeQ1hmeLVaXrKUc9fiJQ5
tAp7g0ZRXW8RbJ3GmA4IYogIRSLK9OmrG3UPBPL6ARUG1vYgLVFGfW/k6ddrEcWe
HaFy31S57rVH8jTDlY769FbJyHajQ86ZJByNQ6hIvTeRojomAacsjRAthrpGKqxj
/yd3Rj6dOvogqp3dpaKFLOwvn2Zo2SUDt/bQkPWs/iRUlyxNnICaiZfo0FsJgD4=
=w/UC
-----END PGP SIGNATURE-----
Jakub Jelinek - Sept. 28, 2011, 7:13 p.m.
On Wed, Sep 28, 2011 at 03:00:09PM -0400, David Miller wrote:
> From: Maxim Kuvyrkov <maxim@codesourcery.com>
> Date: Thu, 29 Sep 2011 07:58:01 +1300
> 
> > To summarize, your opinion seems to be to not enable the
> > optimization by default, correct?
> 
> Yes, and I don't think we ever could do so.
> 
> BTW, another common paradigm is using the "always clear" bits of a
> pointer to encode state bits.

Yeah, I think it is a bad optimization that is going to break lots of code
and the glibc patch is definitely unacceptable (and not doing what you think
it is doing, your change didn't do anything at all, as the aligned attribute
applies to the pointer type and not to the pointed type).

The alignment of the argument pointer can be IMHO only hard assumed
if there is a load or store using that type dominating the spot where you
are checking it, and the target is strict alignment target.
Then we can both assume that alignment, optimize away tests on the low bits
etc.
Otherwise, what you could do is just use the pointed type's alignment as
a hint, likely alignment, e.g. on non-strict alignment target you could
expand code optimistically assuming that it is very likely aligned, but
still shouldn't optimize away low bits tests.

	Jakub
Richard Guenther - Sept. 28, 2011, 7:56 p.m.
On Wed, Sep 28, 2011 at 9:13 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Sep 28, 2011 at 03:00:09PM -0400, David Miller wrote:
>> From: Maxim Kuvyrkov <maxim@codesourcery.com>
>> Date: Thu, 29 Sep 2011 07:58:01 +1300
>>
>> > To summarize, your opinion seems to be to not enable the
>> > optimization by default, correct?
>>
>> Yes, and I don't think we ever could do so.
>>
>> BTW, another common paradigm is using the "always clear" bits of a
>> pointer to encode state bits.
>
> Yeah, I think it is a bad optimization that is going to break lots of code
> and the glibc patch is definitely unacceptable (and not doing what you think
> it is doing, your change didn't do anything at all, as the aligned attribute
> applies to the pointer type and not to the pointed type).
>
> The alignment of the argument pointer can be IMHO only hard assumed
> if there is a load or store using that type dominating the spot where you
> are checking it, and the target is strict alignment target.
> Then we can both assume that alignment, optimize away tests on the low bits
> etc.
> Otherwise, what you could do is just use the pointed type's alignment as
> a hint, likely alignment, e.g. on non-strict alignment target you could
> expand code optimistically assuming that it is very likely aligned, but
> still shouldn't optimize away low bits tests.

There is nothing like "very likely aligned" ;)  Note that what is new is
that we now no longer assume alignment by default (we did in the past)
and that we derive properties about the pointer _value_ from alignment.

I think we can derive pointer values when we see dereferences, the
code wouldn't be portable to strict-alignment targets otherwise.  We
can offer a -fno-strict-alignment option to disable deriving alignment
from types.

Richard.

>        Jakub
>
Jakub Jelinek - Sept. 28, 2011, 8:08 p.m.
On Wed, Sep 28, 2011 at 09:56:27PM +0200, Richard Guenther wrote:
> There is nothing like "very likely aligned" ;)  Note that what is new is

On non-strict aligned targets there is no reason not to have something like
"very likely aligned".  You would expand stores/loads as if it was aligned
in that case, and if it isn't, all you'd need to ensure from that is that
you don't derive properties about the pointer value from the "likely
aligned" info, only from alignment.

"Very likely aligned" is interesting to the vectorizer too, if it is very
likely something is sufficiently aligned, the vectorizer could decide to
assume the alignment in the vectorized loop and add the check for the
alignment to the loop guards.  In the likely case the vectorized loop would
be used (if other guards were true too), in the unlikely case it is
unaligned it would just use a slower loop.

> that we now no longer assume alignment by default (we did in the past)
> and that we derive properties about the pointer _value_ from alignment.
> 
> I think we can derive pointer values when we see dereferences, the
> code wouldn't be portable to strict-alignment targets otherwise.  We

But any references?  If you have
int foo (int *p)
{
  memcpy (p, "a", 1);
  return ((uintptr_t) p & 3) == 0;
}
then even if p isn't aligned, this could work even on strict aligned
targets.

Anyway, the arbitrary value in a pointer thing is much more important then
the rest, so having the dominating dereference test is very important.

	Jakub
Mike Stump - Sept. 28, 2011, 10:19 p.m.
On Sep 20, 2011, at 4:13 AM, Tom de Vries wrote:
> I have a patch for PR43814. It introduces an option that assumes that function
> arguments of pointer type are aligned, and uses that information in
> tree-ssa-ccp. This enables the memcpy in pr43814-2.c to be inlined.

I'm not a huge fan of an option that is very hard to use.  I think this option would be hard to use.  I'd rather have a port just assert that no code will be compiled that is weird in this way, then the front end can check for weird values on int to pointer conversions with constants and complain about the code, if they tried it.  If Android is safe in this respect, then, they can just turn it on, and then force anyone porting software to their platform to `fix' their code.  For systems that are clean, and new systems, we can recommend they set the option on.  For legacy systems that don't want to change or just want to compile legacy software, well, they can opt out.
David Miller - Sept. 28, 2011, 10:43 p.m.
From: Mike Stump <mikestump@comcast.net>
Date: Wed, 28 Sep 2011 15:19:10 -0700

> If Android is safe in this respect, then, they can just turn it on,
> and then force anyone porting software to their platform to `fix'
> their code.

They'd have to then know to turn this option off when building the kernel,
which does use such constructs extensively.

I think this whole idea has too many downsides to be considered seriously.

People write problems like this, lots of people.  It's a pervasive
technique to encode boolean state into the low bits of a pointer or to
represent special "token" pointers using integers such as "-1".
Jakub Jelinek - Sept. 28, 2011, 10:52 p.m.
On Wed, Sep 28, 2011 at 06:43:04PM -0400, David Miller wrote:
> From: Mike Stump <mikestump@comcast.net>
> Date: Wed, 28 Sep 2011 15:19:10 -0700
> 
> > If Android is safe in this respect, then, they can just turn it on,
> > and then force anyone porting software to their platform to `fix'
> > their code.
> 
> They'd have to then know to turn this option off when building the kernel,
> which does use such constructs extensively.
> 
> I think this whole idea has too many downsides to be considered seriously.
> 
> People write problems like this, lots of people.  It's a pervasive
> technique to encode boolean state into the low bits of a pointer or to
> represent special "token" pointers using integers such as "-1".

Or "1".  Just do
grep '\*)[[:blank:]]*1' *.[chS]
to see how often an integer value is stored into a pointer.
And it is not just void * pointers, it is struct cgraph_node *
or struct varpool_node * too and the pointed types there certainly
have higher alignment than 1.
Now repeat the same with google codesearch.

	Jakub
Richard Guenther - Sept. 29, 2011, 9:11 a.m.
On Wed, Sep 28, 2011 at 10:08 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Sep 28, 2011 at 09:56:27PM +0200, Richard Guenther wrote:
>> There is nothing like "very likely aligned" ;)  Note that what is new is
>
> On non-strict aligned targets there is no reason not to have something like
> "very likely aligned".  You would expand stores/loads as if it was aligned
> in that case, and if it isn't, all you'd need to ensure from that is that
> you don't derive properties about the pointer value from the "likely
> aligned" info, only from alignment.

As if aligned, say, using movaps? ;)  Then we can as well derive
properties about the pointer value.

Note that I don't see a real difference between expanding a load/store
assuming the pointer is aligned from deriving properties about the pointer
value.  To support existing questionable code I'd do both only if
there is a dereference post-dominating the pointer assignment of course.

> "Very likely aligned" is interesting to the vectorizer too, if it is very
> likely something is sufficiently aligned, the vectorizer could decide to
> assume the alignment in the vectorized loop and add the check for the
> alignment to the loop guards.  In the likely case the vectorized loop would
> be used (if other guards were true too), in the unlikely case it is
> unaligned it would just use a slower loop.
>
>> that we now no longer assume alignment by default (we did in the past)
>> and that we derive properties about the pointer _value_ from alignment.
>>
>> I think we can derive pointer values when we see dereferences, the
>> code wouldn't be portable to strict-alignment targets otherwise.  We
>
> But any references?  If you have
> int foo (int *p)
> {
>  memcpy (p, "a", 1);
>  return ((uintptr_t) p & 3) == 0;
> }
> then even if p isn't aligned, this could work even on strict aligned
> targets.

Well, the above isn't a "dereference".  Or rather, if it is, it's a dereference
through a type with assumed alignment of 1 byte.

> Anyway, the arbitrary value in a pointer thing is much more important then
> the rest, so having the dominating dereference test is very important.

Yes.

Richard.

>        Jakub
>
Jakub Jelinek - Sept. 29, 2011, 9:20 a.m.
On Thu, Sep 29, 2011 at 11:11:12AM +0200, Richard Guenther wrote:
> On Wed, Sep 28, 2011 at 10:08 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> > On Wed, Sep 28, 2011 at 09:56:27PM +0200, Richard Guenther wrote:
> >> There is nothing like "very likely aligned" ;)  Note that what is new is
> >
> > On non-strict aligned targets there is no reason not to have something like
> > "very likely aligned".  You would expand stores/loads as if it was aligned
> > in that case, and if it isn't, all you'd need to ensure from that is that
> > you don't derive properties about the pointer value from the "likely
> > aligned" info, only from alignment.
> 
> As if aligned, say, using movaps? ;)  Then we can as well derive
> properties about the pointer value.
> 
> Note that I don't see a real difference between expanding a load/store
> assuming the pointer is aligned from deriving properties about the pointer
> value.  To support existing questionable code I'd do both only if
> there is a dereference post-dominating the pointer assignment of course.

I was talking about e.g. PR49442, where using the unaligned stores has
severe performance hit.  If compiler was hinted that the store pointers are
likely aligned, it could version on the alignment.

> > But any references?  If you have
> > int foo (int *p)
> > {
> >  memcpy (p, "a", 1);
> >  return ((uintptr_t) p & 3) == 0;
> > }
> > then even if p isn't aligned, this could work even on strict aligned
> > targets.
> 
> Well, the above isn't a "dereference".  Or rather, if it is, it's a dereference
> through a type with assumed alignment of 1 byte.

If the dereference is only through the declared type of the parameter, I
feel much safer about it.  But I thought this thread was exactly about
memcpy/memset with int * pointer not dereferenced in any other way.

	Jakub
Richard Guenther - Sept. 29, 2011, 9:26 a.m.
On Thu, Sep 29, 2011 at 11:20 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Sep 29, 2011 at 11:11:12AM +0200, Richard Guenther wrote:
>> On Wed, Sep 28, 2011 at 10:08 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> > On Wed, Sep 28, 2011 at 09:56:27PM +0200, Richard Guenther wrote:
>> >> There is nothing like "very likely aligned" ;)  Note that what is new is
>> >
>> > On non-strict aligned targets there is no reason not to have something like
>> > "very likely aligned".  You would expand stores/loads as if it was aligned
>> > in that case, and if it isn't, all you'd need to ensure from that is that
>> > you don't derive properties about the pointer value from the "likely
>> > aligned" info, only from alignment.
>>
>> As if aligned, say, using movaps? ;)  Then we can as well derive
>> properties about the pointer value.
>>
>> Note that I don't see a real difference between expanding a load/store
>> assuming the pointer is aligned from deriving properties about the pointer
>> value.  To support existing questionable code I'd do both only if
>> there is a dereference post-dominating the pointer assignment of course.
>
> I was talking about e.g. PR49442, where using the unaligned stores has
> severe performance hit.  If compiler was hinted that the store pointers are
> likely aligned, it could version on the alignment.

Well, it could do so anyway?  It just doesn't as the vectorizer cost model
and hw description says using misaligned moves is just fine.

>> > But any references?  If you have
>> > int foo (int *p)
>> > {
>> >  memcpy (p, "a", 1);
>> >  return ((uintptr_t) p & 3) == 0;
>> > }
>> > then even if p isn't aligned, this could work even on strict aligned
>> > targets.
>>
>> Well, the above isn't a "dereference".  Or rather, if it is, it's a dereference
>> through a type with assumed alignment of 1 byte.
>
> If the dereference is only through the declared type of the parameter, I
> feel much safer about it.  But I thought this thread was exactly about
> memcpy/memset with int * pointer not dereferenced in any other way.

Maybe that was the PR the patch is about, but surely we can't treat
a memcpy (p, ..) like *p.  Which means we'd use the declared type
of p only.  And we can do so only for parameters (intermediate conversions
are dropped), which would make the case apply to artificial testcases only
(and thus not worth).

Richard.

>        Jakub
>
Bernd Schmidt - Sept. 29, 2011, 10:51 a.m.
On 09/29/11 11:26, Richard Guenther wrote:
> Maybe that was the PR the patch is about, but surely we can't treat
> a memcpy (p, ..) like *p.  Which means we'd use the declared type
> of p only.  And we can do so only for parameters (intermediate conversions
> are dropped), which would make the case apply to artificial testcases only
> (and thus not worth).

Didn't we used to do exactly that, though?

Googling shows

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=325670

but I seem to remember PRs in gcc bugzilla marked not a bug. Or maybe my
memory is playing tricks on me.


Bernd
Richard Guenther - Sept. 29, 2011, 11:01 a.m.
On Thu, Sep 29, 2011 at 12:51 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 09/29/11 11:26, Richard Guenther wrote:
>> Maybe that was the PR the patch is about, but surely we can't treat
>> a memcpy (p, ..) like *p.  Which means we'd use the declared type
>> of p only.  And we can do so only for parameters (intermediate conversions
>> are dropped), which would make the case apply to artificial testcases only
>> (and thus not worth).
>
> Didn't we used to do exactly that, though?

Yes we did.  The point is this is all language semantics and we are not
good at expressing it given the various GCC extensions we have.

struct { int i; } __attribute__((aligned(1))) x;

and &x->i will have type int *, not int __attribute__((aligned(1))) *.
Basically because we share FIELD_DECLs for type variants and
alignment consitutes a variant, the FIELD_DECL has natural
alignment and whenever the context (that we access a component of x)
vanishes the alignment information vanishes.

Thus, the middle-end cannot reasonably derive anything from type
alignment.  Because the frontends get it wrong.  Which means we
used to miscompile valid code (using GCC extensions, of course).

Richard.

> Googling shows
>
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=325670
>
> but I seem to remember PRs in gcc bugzilla marked not a bug. Or maybe my
> memory is playing tricks on me.
>
>
> Bernd
>

Patch

Index: gcc/tree-ssa-ccp.c
===================================================================
--- gcc/tree-ssa-ccp.c (revision 178804)
+++ gcc/tree-ssa-ccp.c (working copy)
@@ -500,20 +500,14 @@  value_to_double_int (prop_value_t val)
     return double_int_zero;
 }
 
-/* Return the value for the address expression EXPR based on alignment
-   information.  */
+/* Return the value for an expr of type TYPE with alignment ALIGN and offset
+   BITPOS relative to the alignment.  */
 
 static prop_value_t
-get_value_from_alignment (tree expr)
+get_align_value (unsigned int align, tree type, unsigned HOST_WIDE_INT bitpos)
 {
-  tree type = TREE_TYPE (expr);
   prop_value_t val;
-  unsigned HOST_WIDE_INT bitpos;
-  unsigned int align;
-
-  gcc_assert (TREE_CODE (expr) == ADDR_EXPR);
 
-  align = get_object_alignment_1 (TREE_OPERAND (expr, 0), &bitpos);
   val.mask
     = double_int_and_not (POINTER_TYPE_P (type) || TYPE_UNSIGNED (type)
 			  ? double_int_mask (TYPE_PRECISION (type))
@@ -529,6 +523,21 @@  get_value_from_alignment (tree expr)
   return val;
 }
 
+/* Return the value for the address expression EXPR based on alignment
+   information.  */
+
+static prop_value_t
+get_value_from_alignment (tree expr)
+{
+  unsigned int align;
+  unsigned HOST_WIDE_INT bitpos;
+
+  gcc_assert (TREE_CODE (expr) == ADDR_EXPR);
+
+  align = get_object_alignment_1 (TREE_OPERAND (expr, 0), &bitpos);
+  return get_align_value (align, TREE_TYPE (expr), bitpos);
+}
+
 /* Return the value for the tree operand EXPR.  If FOR_BITS_P is true
    return constant bits extracted from alignment information for
    invariant addresses.  */
@@ -540,11 +549,21 @@  get_value_for_expr (tree expr, bool for_
 
   if (TREE_CODE (expr) == SSA_NAME)
     {
+      tree var = SSA_NAME_VAR (expr);
       val = *get_value (expr);
-      if (for_bits_p
-	  && val.lattice_val == CONSTANT
+      if (!for_bits_p)
+	return val;
+
+      if (val.lattice_val == CONSTANT
 	  && TREE_CODE (val.value) == ADDR_EXPR)
 	val = get_value_from_alignment (val.value);
+      else if (flag_aligned_pointer_argument
+	       && val.lattice_val == VARYING
+	       && SSA_NAME_IS_DEFAULT_DEF (expr)
+	       && TREE_CODE (var) == PARM_DECL
+	       && POINTER_TYPE_P (TREE_TYPE (var)))
+	val = get_align_value (TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))),
+			       TREE_TYPE (var), 0);
     }
   else if (is_gimple_min_invariant (expr)
 	   && (!for_bits_p || TREE_CODE (expr) != ADDR_EXPR))
Index: gcc/common.opt
===================================================================
--- gcc/common.opt (revision 178804)
+++ gcc/common.opt (working copy)
@@ -786,6 +786,10 @@  Driver Undocumented
 fabi-version=
 Common Joined RejectNegative UInteger Var(flag_abi_version) Init(2)
 
+faligned-pointer-argument
+Common Report Var(flag_aligned_pointer_argument) Optimization
+Assume function arguments of pointer type are aligned.
+
 falign-functions
 Common Report Var(align_functions,0) Optimization UInteger
 Align the start of functions
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi (revision 178804)
+++ gcc/doc/invoke.texi (working copy)
@@ -345,7 +345,8 @@  Objective-C and Objective-C++ Dialects}.
 
 @item Optimization Options
 @xref{Optimize Options,,Options that Control Optimization}.
-@gccoptlist{-falign-functions[=@var{n}] -falign-jumps[=@var{n}] @gol
+@gccoptlist{-faligned-pointer-argument -falign-functions[=@var{n}] @gol
+-falign-jumps[=@var{n}] @gol
 -falign-labels[=@var{n}] -falign-loops[=@var{n}] -fassociative-math @gol
 -fauto-inc-dec -fbranch-probabilities -fbranch-target-load-optimize @gol
 -fbranch-target-load-optimize2 -fbtr-bb-exclusive -fcaller-saves @gol
@@ -7552,6 +7553,10 @@  arithmetic on constants, the overflowed
 The @option{-fstrict-overflow} option is enabled at levels
 @option{-O2}, @option{-O3}, @option{-Os}.
 
+@item -faligned-pointer-argument
+@opindex faligned-pointer-argument
+Assume function arguments of pointer type are aligned.
+
 @item -falign-functions
 @itemx -falign-functions=@var{n}
 @opindex falign-functions
Index: gcc/testsuite/gcc.dg/pr43814.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/pr43814.c (revision 0)
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -faligned-pointer-argument -fdump-tree-ccp1-all" } */
+
+void f (int *a)
+{
+  *(a+1) = 0;
+}
+
+/* { dg-final { scan-tree-dump-times "ALIGN = \[2-9\]\[0-9\]*, MISALIGN = 0" 1 "ccp1"} } */
+/* { dg-final { cleanup-tree-dump "ccp1" } } */
+
Index: gcc/testsuite/gcc.target/arm/pr43814-2.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.target/arm/pr43814-2.c (revision 0)
@@ -0,0 +1,33 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -faligned-pointer-argument -fdump-rtl-expand" } */
+
+typedef unsigned int size_t;
+extern void* memcpy (void *, const void *, size_t);
+
+typedef union JValue {
+  void* l;
+} JValue;
+typedef struct Object {
+  int x;
+} Object;
+
+extern __inline__ long long
+dvmGetArgLong (const unsigned int* args, int elem)
+{
+  long long val;
+  memcpy (&val, &args[elem], 8);
+  return val;
+}
+
+void
+Dalvik_sun_misc_Unsafe_getObject (const unsigned int* args, JValue* pResult)
+{
+  Object* obj = (Object*) args[1];
+  long long offset = dvmGetArgLong (args, 2);
+  Object** address = (Object**) (((unsigned char*) obj) + offset);
+  pResult->l = ((void*) *address);
+}
+
+/* { dg-final { scan-rtl-dump-times "memcpy" 0 "expand"} } */
+/* { dg-final { cleanup-tree-dump "expand" } } */
+