diff mbox

Pass correct memory attributes for build constant

Message ID CA+yXCZCUgioKCimPVvvU10Zwc9a+TFBWBxjbU6pHLMkW8+YV3w@mail.gmail.com
State New
Headers show

Commit Message

Kito Cheng June 25, 2014, 3:35 p.m. UTC
Hi all:
  This patch is fix constant memory's symbol_ref don't have right
alignment info since `exp` don't set alignment (and should not set
alignment info for `exp`)  , use `decl` to set_mem_attributes for
right alignment info.

ChangLog
2014-06-25  Kito Cheng  <kito@0xlab.org>

        * varasm.c (build_constant_desc): Use decl to set mem
        attributes since exp don't have correct aligment info.

Comments

Jeff Law June 25, 2014, 9:01 p.m. UTC | #1
On 06/25/14 09:35, Kito Cheng wrote:
> Hi all:
>    This patch is fix constant memory's symbol_ref don't have right
> alignment info since `exp` don't set alignment (and should not set
> alignment info for `exp`)  , use `decl` to set_mem_attributes for
> right alignment info.
>
> ChangLog
> 2014-06-25  Kito Cheng  <kito@0xlab.org>
>
>          * varasm.c (build_constant_desc): Use decl to set mem
>          attributes since exp don't have correct aligment info.
Do you have a testcase for this?

jeff
Kito Cheng June 26, 2014, 3:38 a.m. UTC | #2
For example in arm-elf-eabi, movmem need word align, otherwise it will
expand a libcall:

And gcc configure with "--target=arm-elf-eabi --disable-nls
--disable-shared --enable-languages=c,c++ --enable-threads=single
--enable-lto --with-newlib"

test.c:
extern bar(unsigned char p[3][2]);
void foo(int i)
{
    unsigned char data[3][2] = {{1,1}, {1,0}, {1,1}};

    bar(data);
}


Without this patch:
...
        .text
        .align  2
        .global foo
        .type   foo, %function
foo:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 8
        @ frame_needed = 0, uses_anonymous_args = 0
        str     lr, [sp, #-4]!
        sub     sp, sp, #12
        mov     r2, #6
        ldr     r1, .L3
        mov     r0, sp
        bl      memcpy @ LC0 has 4 byte align, but arm_gen_movmemqi
can't get right alignment info
        mov     r0, sp
        bl      bar
        add     sp, sp, #12
        @ sp needed
        ldr     lr, [sp], #4
        bx      lr
.L4:
        .align  2
.L3:
        .word   .LANCHOR0
        .size   foo, .-foo
        .section        .rodata
        .align  2
        .set    .LANCHOR0,. + 0
.LC0:
        .byte   1
        .byte   1
        .byte   1
        .byte   0
        .byte   1
        .byte   1


With this patch:

...
foo:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 8
        @ frame_needed = 0, uses_anonymous_args = 0
        str     lr, [sp, #-4]!
        ldr     r3, .L3 @ LC0 has 4 byte align, arm_gen_movmemqi get
right alignment info, so expand it!
        ldmia   r3, {r0, r1}
        sub     sp, sp, #12
        str     r0, [sp]
        mov     r0, sp
        strh    r1, [sp, #4]    @ movhi
        bl      bar
        add     sp, sp, #12
        @ sp needed
        ldr     lr, [sp], #4
        bx      lr
...

On Thu, Jun 26, 2014 at 5:01 AM, Jeff Law <law@redhat.com> wrote:
> On 06/25/14 09:35, Kito Cheng wrote:
>>
>> Hi all:
>>    This patch is fix constant memory's symbol_ref don't have right
>> alignment info since `exp` don't set alignment (and should not set
>> alignment info for `exp`)  , use `decl` to set_mem_attributes for
>> right alignment info.
>>
>> ChangLog
>> 2014-06-25  Kito Cheng  <kito@0xlab.org>
>>
>>          * varasm.c (build_constant_desc): Use decl to set mem
>>          attributes since exp don't have correct aligment info.
>
> Do you have a testcase for this?
>
> jeff
>
Jeff Law June 27, 2014, 10:34 p.m. UTC | #3
On 06/25/14 21:38, Kito Cheng wrote:
> For example in arm-elf-eabi, movmem need word align, otherwise it will
> expand a libcall:
>
> And gcc configure with "--target=arm-elf-eabi --disable-nls
> --disable-shared --enable-languages=c,c++ --enable-threads=single
> --enable-lto --with-newlib"
>
> test.c:
> extern bar(unsigned char p[3][2]);
> void foo(int i)
> {
>      unsigned char data[3][2] = {{1,1}, {1,0}, {1,1}};
>
>      bar(data);
> }
First, note, I'm not an ARM expert.  However, the first question I have 
is are we sure the initializer is always going to be suitably aligned? 
   What guarantees this initializer is going to have 32 bit alignment 
like you want?  I can see how that *section* gets its alignment, but I 
don't offhand see what in the ARM backend ensures that a CONSTRUCTOR 
tree has larger than normal known alignment.

I think that needs to be settled first, then we need to verify that the 
trees are correctly carrying that alignment requirement around and that 
the code uses it appropriately (and I have my doubts because EXP is a 
CONSTRUCTOR element and those seem to be largely ignored in the code 
we're looking to change).

I would also strongly recommend turning your testcase into something we 
can add to the testsuite.

If you look in gcc/testsuite/gcc.target/arm you'll see several examples. 
  I think you just want to compile this down to assembly code with the 
optimizer enabled, then verify there is no call to memcpy in the 
resulting output.  20030909-1.c would seem to be a reasonable example of 
a test that does something similar.


Jeff
Jan Hubicka June 28, 2014, 12:14 a.m. UTC | #4
> On 06/25/14 21:38, Kito Cheng wrote:
> >For example in arm-elf-eabi, movmem need word align, otherwise it will
> >expand a libcall:
> >
> >And gcc configure with "--target=arm-elf-eabi --disable-nls
> >--disable-shared --enable-languages=c,c++ --enable-threads=single
> >--enable-lto --with-newlib"
> >
> >test.c:
> >extern bar(unsigned char p[3][2]);
> >void foo(int i)
> >{
> >     unsigned char data[3][2] = {{1,1}, {1,0}, {1,1}};
> >
> >     bar(data);
> >}
> First, note, I'm not an ARM expert.  However, the first question I
> have is are we sure the initializer is always going to be suitably
> aligned?   What guarantees this initializer is going to have 32 bit

On not so related note,  vectorizer knows everything about how to increase
alignment of a declaration (not in constant pool for now, because that one is
riddled by implementation defaults) on demand.   It would be nice to teach
string operation expanders to do so when they think it helps.
(see vect_can_force_dr_alignment_p)
> alignment like you want?  I can see how that *section* gets its
> alignment, but I don't offhand see what in the ARM backend ensures
> that a CONSTRUCTOR tree has larger than normal known alignment.

I think those boil down into CONSTANT_DECL that is aligned by DATA_ALIGNMENT.
But it would be great to have API to boos these up on the demand.
I find the whole macro macinery around memcpy/memset bit difficult - one think
I would like it do to is this trick, other thing is to allow vector registers to
be used by COPY_BY_PIECES.

Honza
> 
> I think that needs to be settled first, then we need to verify that
> the trees are correctly carrying that alignment requirement around
> and that the code uses it appropriately (and I have my doubts
> because EXP is a CONSTRUCTOR element and those seem to be largely
> ignored in the code we're looking to change).
> 
> I would also strongly recommend turning your testcase into something
> we can add to the testsuite.
> 
> If you look in gcc/testsuite/gcc.target/arm you'll see several
> examples.  I think you just want to compile this down to assembly
> code with the optimizer enabled, then verify there is no call to
> memcpy in the resulting output.  20030909-1.c would seem to be a
> reasonable example of a test that does something similar.
> 
> 
> Jeff
Kito Cheng June 30, 2014, 6:17 a.m. UTC | #5
>>test.c:
>>extern bar(unsigned char p[3][2]);
>>void foo(int i)
>>{
>>     unsigned char data[3][2] = {{1,1}, {1,0}, {1,1}};
>>
>>     bar(data);
>>}
> First, note, I'm not an ARM expert.  However, the first question I
> have is are we sure the initializer is always going to be suitably
> aligned?   What guarantees this initializer is going to have 32 bit

It's a ARRAY_TYPE for the data and ARM require 32-bit align for that.
(Aligned by DATA_ALIGNMENT as Jan say.)

> I think that needs to be settled first, then we need to verify that
> the trees are correctly carrying that alignment requirement around
> and that the code uses it appropriately (and I have my doubts
> because EXP is a CONSTRUCTOR element and those seem to be largely
> ignored in the code we're looking to change).

The key problem is `exp` don't have right alignment info, but `decl` have,
 we can observe this in the code:

varasm.c
3166   /* Construct the VAR_DECL associated with the constant.  */
3167   decl = build_decl (UNKNOWN_LOCATION, VAR_DECL, get_identifier (label),
3168                      TREE_TYPE (exp));
3169   DECL_ARTIFICIAL (decl) = 1;
3170   DECL_IGNORED_P (decl) = 1;
3171   TREE_READONLY (decl) = 1;
3172   TREE_STATIC (decl) = 1;
3173   TREE_ADDRESSABLE (decl) = 1;
...
3181   if (TREE_CODE (exp) == STRING_CST)
3182     {
3183 #ifdef CONSTANT_ALIGNMENT
3184       DECL_ALIGN (decl) = CONSTANT_ALIGNMENT (exp, DECL_ALIGN (decl));
3185 #endif
3186     }
3187   else
3188     align_variable (decl, 0);

`decl` get alignment info here but `exp` doesn't.

...
3203   rtl = gen_const_mem (TYPE_MODE (TREE_TYPE (exp)), symbol);
3204   set_mem_attributes (rtl, exp, 1);
but here, we use `exp` to set memory attribute for MEM rtl.

> I would also strongly recommend turning your testcase into something
> we can add to the testsuite.
>
> If you look in gcc/testsuite/gcc.target/arm you'll see several
> examples.  I think you just want to compile this down to assembly
> code with the optimizer enabled, then verify there is no call to
> memcpy in the resulting output.  20030909-1.c would seem to be a
> reasonable example of a test that does something similar.

Hmmm, it's not target dependent problem, but this problem only can
observe by some target since not every target use MEM_ALIGN info for code gen,
the most common case is movmem pattern, they use alignment info to
decide expand or not.

So I am not sure is does it reasonable to make a testcase for target?
diff mbox

Patch

diff --git a/gcc/varasm.c b/gcc/varasm.c
index 7be56f1..6e7ca5a 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -3201,7 +3201,7 @@  build_constant_desc (tree exp)
   TREE_CONSTANT_POOL_ADDRESS_P (symbol) = 1;
 
   rtl = gen_const_mem (TYPE_MODE (TREE_TYPE (exp)), symbol);
-  set_mem_attributes (rtl, exp, 1);
+  set_mem_attributes (rtl, decl, 1);
   set_mem_alias_set (rtl, 0);
   set_mem_alias_set (rtl, const_alias_set);