Message ID | 1570850535-46183-1-git-send-email-guojiufu@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | rs6000: -flto forgets 'no-vsx' function attributes (PR target/70010) | expand |
Hi! On Sat, Oct 12, 2019 at 11:22:15AM +0800, Jiufu Guo wrote: > As expected in the PR, when a function is marked with no-vsx, we would > assume user has good reason to disable VSX code generation for the function. > To avoid VSX code generation, this function should not be inlined into VSX > function. But your patch also disables inlining if the callee merely defaulted to no-vsx. Can you see if the option is explicitly disabled here? > PR target/70010 > * config/rs6000/rs6000.c (rs6000_can_inline_p): Check 'vsx' feature > for caller and callee Each sentence should end with a period, also in changelogs. > PR target/70010 > * gcc.target/powerpc/inline_error.c: New test Ditto. > - /* Callee's options should a subset of the caller's, i.e. a vsx function > - can inline an altivec function but a non-vsx function can't inline a > - vsx function. */ > - if ((caller_opts->x_rs6000_isa_flags & callee_opts->x_rs6000_isa_flags) > - == callee_opts->x_rs6000_isa_flags) > + /* Callee's options should a subset of the caller's. In addition, > + vsx function should not inline function with non-vsx by which > + programmer may intend to disable VSX code generation. */ Two spaces after a period. How about something like /* Callee's options should be a subset of the caller's. Also, a function without VSX enabled should not be inlined into one with VSX enabled, because it may be important it is disabled there; see PR70010. */ It's not clear to me why this is important, and what makes -mvsx different from all other similar options? Segher
On 10/12/19 3:46 AM, Segher Boessenkool wrote: > Two spaces after a period. How about something like > > /* Callee's options should be a subset of the caller's. Also, a function > without VSX enabled should not be inlined into one with VSX enabled, > because it may be important it is disabled there; see PR70010. */ > > It's not clear to me why this is important, and what makes -mvsx different > from all other similar options? I agree, there is nothing special about VSX here and the other similar options like Altivec, HTM, etc., etc. should all be handled similarly. I agree with your other comment that we should be looking at explicit option usage versus default options. However, the way we now implement default CPU, the gcc driver always passes a -mcpu= option to cc1 no matter if the user used -mcpu= or not, so -mcpu= will always looks like an explicit option. So when -mcpu=power[789] is passed to cc1 (via explicit user usage or default cpu), does that look like -mvsx was also explicitly used? I'm guessing not. So if we have a caller compiled with -mcpu=power8 (VSX and Altivec are implicitly enabled) and a callee compiled with -mcpu=power6 (VSX and Altivec is not enabled ...implicitly), do we allow inlining? I would say we shouldn't, but the VSX and Altivec flags in the callee are a subset of the caller's flags. It must be that the ISA* flags in rs6000_isa_flags that save us from not inlining? Therefore, I'd say that the callee's flags should be a subset of the caller's flags as the current code does now, but we should be also checking that the callee doesn't have an explicitly used option flag(s) that conflicts with the callers flags (implicit or explicit). That means the caller's flags must match exactly the callee's flags, for those flags that were explicitly set in the callee. Peter
On October 14, 2019 5:31:58 PM GMT+02:00, Peter Bergner <bergner@linux.ibm.com> wrote: >On 10/12/19 3:46 AM, Segher Boessenkool wrote: >> Two spaces after a period. How about something like >> >> /* Callee's options should be a subset of the caller's. Also, >a function >> without VSX enabled should not be inlined into one with VSX >enabled, >> because it may be important it is disabled there; see PR70010. */ >> >> It's not clear to me why this is important, and what makes -mvsx >different >> from all other similar options? > >I agree, there is nothing special about VSX here and the other similar >options >like Altivec, HTM, etc., etc. should all be handled similarly. > >I agree with your other comment that we should be looking at explicit >option >usage versus default options. However, the way we now implement >default CPU, >the gcc driver always passes a -mcpu= option to cc1 no matter if the >user >used -mcpu= or not, so -mcpu= will always looks like an explicit >option. >So when -mcpu=power[789] is passed to cc1 (via explicit user usage or >default >cpu), does that look like -mvsx was also explicitly used? I'm guessing >not. > >So if we have a caller compiled with -mcpu=power8 (VSX and Altivec are >implicitly >enabled) and a callee compiled with -mcpu=power6 (VSX and Altivec is >not enabled >...implicitly), do we allow inlining? I would say we shouldn't, but >the VSX >and Altivec flags in the callee are a subset of the caller's flags. It >must >be that the ISA* flags in rs6000_isa_flags that save us from not >inlining? > >Therefore, I'd say that the callee's flags should be a subset of the >caller's >flags as the current code does now, but we should be also checking that >the >callee doesn't have an explicitly used option flag(s) that conflicts >with >the callers flags (implicit or explicit). That means the caller's >flags >must match exactly the callee's flags, for those flags that were >explicitly >set in the callee. The general case should be that if the caller ISA supports the callee one then inlining is OK. If this is not wanted in some cases then there are options like using a noinline attribute. > >Peter
On Mon, Oct 14, 2019 at 10:31:58AM -0500, Peter Bergner wrote: > I agree with your other comment that we should be looking at explicit option > usage versus default options. However, the way we now implement default CPU, > the gcc driver always passes a -mcpu= option to cc1 no matter if the user > used -mcpu= or not, so -mcpu= will always looks like an explicit option. > So when -mcpu=power[789] is passed to cc1 (via explicit user usage or default > cpu), does that look like -mvsx was also explicitly used? I'm guessing not. -mdebug=reg tells you (answer: explicit means explicit, for everything I tried anyway). Segher
On Mon, Oct 14, 2019 at 06:35:06PM +0200, Richard Biener wrote:
> The general case should be that if the caller ISA supports the callee one then inlining is OK. If this is not wanted in some cases then there are options like using a noinline attribute.
I agree, and that is what we already do afaik.
But in this case, the callee explicitly disables something (-mno-vsx),
while the caller has it enabled (all modern cpus have VSX). If it ends
up being inlined, it will get VSX insns generated for it. Which is what
Jiu Fu's patch aims to prevent.
So you are saying the GCC policy is that you should use noinline on the
callee in such cases?
Segher
On 10/14/19 2:57 PM, Segher Boessenkool wrote: > On Mon, Oct 14, 2019 at 06:35:06PM +0200, Richard Biener wrote: >> The general case should be that if the caller ISA supports the callee one >> then inlining is OK. If this is not wanted in some cases then there are >> options like using a noinline attribute. > > I agree, and that is what we already do afaik. I agree on the making sure the caller's ISA supports the callee's ISA before allowing inlining...and I think that's what our code it "trying" to do. It just has some bugs. > But in this case, the callee explicitly disables something (-mno-vsx), > while the caller has it enabled (all modern cpus have VSX). If it ends > up being inlined, it will get VSX insns generated for it. Which is what > Jiu Fu's patch aims to prevent. > > So you are saying the GCC policy is that you should use noinline on the > callee in such cases? I don't think sprinkling noinline's around will work given LTO can inline random functions from one object file into a function in another object file and we have no idea what the options were used for both files. I think that would just force us to end up putting nolines on all fuctions that might be compiled with LTO. I think we just need to fix the bug in the current logic when checking whether the caller's ISA flags supports the callee's ISA flags. ...and for that, I think we just need to add a test that enforces that the caller's ISA flags match exactly the callee's flags, for those flags that were explicitly set in the callee. The patch below seems to fix the issue (regtesting now). Does this look like what we want? Peter gcc/ * config/rs6000/rs6000.c (rs6000_can_inline_p): Handle explicit options. gcc.testsuite/ * gcc.target/powerpc/pr70010.c: New test. Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 276975) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -23976,13 +23976,18 @@ rs6000_can_inline_p (tree caller, tree c else { struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree); + HOST_WIDE_INT caller_isa = caller_opts->x_rs6000_isa_flags; struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree); + HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags; + HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit; - /* Callee's options should a subset of the caller's, i.e. a vsx function - can inline an altivec function but a non-vsx function can't inline a - vsx function. */ - if ((caller_opts->x_rs6000_isa_flags & callee_opts->x_rs6000_isa_flags) - == callee_opts->x_rs6000_isa_flags) + /* The callee's options must be a subset of the caller's options, i.e. + a vsx function may inline an altivec function, but a non-vsx function + must not inline a vsx function. However, for those options that the + callee has explicitly set, then we must enforce that the callee's + and caller's options match exactly; see PR70010. */ + if (((caller_isa & callee_isa) == callee_isa) + && (caller_isa & explicit_isa) == (callee_isa & explicit_isa)) ret = true; } Index: gcc/testsuite/gcc.target/powerpc/pr70010.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/pr70010.c (nonexistent) +++ gcc/testsuite/gcc.target/powerpc/pr70010.c (working copy) @@ -0,0 +1,21 @@ +/* { dg-do compile { target { powerpc*-*-* } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-O2 -finline" } */ +/* { dg-final { scan-assembler "bl vadd_no_vsx" } } */ + +typedef int vec_t __attribute__((vector_size(16))); + +static vec_t +__attribute__((__target__("no-vsx"))) +vadd_no_vsx (vec_t a, vec_t b) +{ + return a + b; +} + +vec_t +__attribute__((__target__("vsx"))) +call_vadd_no_vsx (vec_t x, vec_t y, vec_t z) +{ + return vadd_no_vsx (x, y) - z; +} bergner@pike:~/gcc/BUGS/CPU$ cat pr70010.i typedef int vec_t __attribute__((vector_size(16))); static vec_t __attribute__((__target__("no-vsx"))) vadd_no_vsx (vec_t a, vec_t b) { return a + b; } vec_t __attribute__((__target__("vsx"))) call_vadd_no_vsx (vec_t x, vec_t y, vec_t z) { return vadd_no_vsx (x, y) - z; } bergner@pike:~/gcc/BUGS/CPU$ old-gcc -O2 -finline -S pr70010.i bergner@pike:~/gcc/BUGS/CPU$ cat pr70010.s call_vadd_no_vsx: vadduwm 2,2,3 vsubuwm 2,2,4 blr bergner@pike:~/gcc/BUGS/CPU$ /home/bergner/gcc/build/gcc-fsf-mainline-cpu-flags-debug/gcc/xgcc -B/home/bergner/gcc/build/gcc-fsf-mainline-cpu-flags-debug/gcc -O2 -finline -S pr70010.i bergner@pike:~/gcc/BUGS/CPU$ cat pr70010.s vadd_no_vsx: vadduwm 2,2,3 blr .long 0 call_vadd_no_vsx: 0: addis 2,12,.TOC.-.LCF1@ha addi 2,2,.TOC.-.LCF1@l mflr 0 std 0,16(1) stdu 1,-48(1) li 0,32 stvx 31,1,0 xxlor 63,36,36 bl vadd_no_vsx addi 1,1,48 li 0,-16 vsubuwm 2,2,31 lvx 31,1,0 ld 0,16(1) mtlr 0 blr
Thanks for all your reviews and comments, very helpful! Peter Bergner <bergner@linux.ibm.com> writes: > I think we just need to fix the bug in the current logic when checking > whether the caller's ISA flags supports the callee's ISA flags. ...and > for that, I think we just need to add a test that enforces that the > caller's ISA flags match exactly the callee's flags, for those flags > that were explicitly set in the callee. The patch below seems to fix > the issue (regtesting now). Does this look like what we want? Great sugguestions, thanks again! I was trying to figure out how to check 'explicitly disabled feature', after checking your sugguestion deeperly. I notice that, rs6000_isa_flags_explicit is set no matter a feature is "explicitly disabled" or "explicitly enabled". And to check a feature (e.g. VSX) is disabled explicitly, below code could help: (explicit_isa & VSX) && (callee_isa & explicit_isa & VSX == 0) This patch is indeed what I want. > > Peter > > > gcc/ > * config/rs6000/rs6000.c (rs6000_can_inline_p): Handle explicit > options. > > gcc.testsuite/ > * gcc.target/powerpc/pr70010.c: New test. > > > Index: gcc/config/rs6000/rs6000.c > =================================================================== > --- gcc/config/rs6000/rs6000.c (revision 276975) > +++ gcc/config/rs6000/rs6000.c (working copy) > @@ -23976,13 +23976,18 @@ rs6000_can_inline_p (tree caller, tree c > else > { > struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree); > + HOST_WIDE_INT caller_isa = caller_opts->x_rs6000_isa_flags; > struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree); > + HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags; > + HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit; > > - /* Callee's options should a subset of the caller's, i.e. a vsx function > - can inline an altivec function but a non-vsx function can't inline a > - vsx function. */ > - if ((caller_opts->x_rs6000_isa_flags & callee_opts->x_rs6000_isa_flags) > - == callee_opts->x_rs6000_isa_flags) > + /* The callee's options must be a subset of the caller's options, i.e. > + a vsx function may inline an altivec function, but a non-vsx function > + must not inline a vsx function. However, for those options that the > + callee has explicitly set, then we must enforce that the callee's > + and caller's options match exactly; see PR70010. */ > + if (((caller_isa & callee_isa) == callee_isa) > + && (caller_isa & explicit_isa) == (callee_isa & explicit_isa)) > ret = true; > } > > Index: gcc/testsuite/gcc.target/powerpc/pr70010.c > =================================================================== > --- gcc/testsuite/gcc.target/powerpc/pr70010.c (nonexistent) > +++ gcc/testsuite/gcc.target/powerpc/pr70010.c (working copy) > @@ -0,0 +1,21 @@ > +/* { dg-do compile { target { powerpc*-*-* } } } */ > +/* { dg-skip-if "" { powerpc*-*-darwin* } } */ > +/* { dg-require-effective-target powerpc_vsx_ok } */ > +/* { dg-options "-O2 -finline" } */ > +/* { dg-final { scan-assembler "bl vadd_no_vsx" } } */ > + > +typedef int vec_t __attribute__((vector_size(16))); > + > +static vec_t > +__attribute__((__target__("no-vsx"))) > +vadd_no_vsx (vec_t a, vec_t b) > +{ > + return a + b; > +} > + > +vec_t > +__attribute__((__target__("vsx"))) > +call_vadd_no_vsx (vec_t x, vec_t y, vec_t z) > +{ > + return vadd_no_vsx (x, y) - z; > +} > > > > bergner@pike:~/gcc/BUGS/CPU$ cat pr70010.i > typedef int vec_t __attribute__((vector_size(16))); > > static vec_t > __attribute__((__target__("no-vsx"))) > vadd_no_vsx (vec_t a, vec_t b) > { > return a + b; > } > > vec_t > __attribute__((__target__("vsx"))) > call_vadd_no_vsx (vec_t x, vec_t y, vec_t z) > { > return vadd_no_vsx (x, y) - z; > } > > bergner@pike:~/gcc/BUGS/CPU$ old-gcc -O2 -finline -S pr70010.i > bergner@pike:~/gcc/BUGS/CPU$ cat pr70010.s > call_vadd_no_vsx: > vadduwm 2,2,3 > vsubuwm 2,2,4 > blr > > bergner@pike:~/gcc/BUGS/CPU$ /home/bergner/gcc/build/gcc-fsf-mainline-cpu-flags-debug/gcc/xgcc -B/home/bergner/gcc/build/gcc-fsf-mainline-cpu-flags-debug/gcc -O2 -finline -S pr70010.i > bergner@pike:~/gcc/BUGS/CPU$ cat pr70010.s > vadd_no_vsx: > vadduwm 2,2,3 > blr > .long 0 > > call_vadd_no_vsx: > 0: addis 2,12,.TOC.-.LCF1@ha > addi 2,2,.TOC.-.LCF1@l > mflr 0 > std 0,16(1) > stdu 1,-48(1) > li 0,32 > stvx 31,1,0 > xxlor 63,36,36 > bl vadd_no_vsx > addi 1,1,48 > li 0,-16 > vsubuwm 2,2,31 > lvx 31,1,0 > ld 0,16(1) > mtlr 0 > blr
Jiufu Guo <guojiufu@linux.ibm.com> writes: > Thanks for all your reviews and comments, very helpful! > > Peter Bergner <bergner@linux.ibm.com> writes: > >> I think we just need to fix the bug in the current logic when checking >> whether the caller's ISA flags supports the callee's ISA flags. ...and >> for that, I think we just need to add a test that enforces that the >> caller's ISA flags match exactly the callee's flags, for those flags >> that were explicitly set in the callee. The patch below seems to fix >> the issue (regtesting now). Does this look like what we want? > Great sugguestions, thanks again! > I was trying to figure out how to check 'explicitly disabled feature', > after checking your sugguestion deeperly. I notice that, > rs6000_isa_flags_explicit is set no matter a feature is > "explicitly disabled" or "explicitly enabled". > And to check a feature (e.g. VSX) is disabled explicitly, below code > could help: > (explicit_isa & VSX) && (callee_isa & explicit_isa & VSX == 0) typo, it should be: (explicit_isa & VSX) && (callee_isa & explicit_isa & VSX) == 0 > > This patch is indeed what I want. >> >> Peter >> >>
Peter Bergner <bergner@linux.ibm.com> writes: .... > > I think we just need to fix the bug in the current logic when checking > whether the caller's ISA flags supports the callee's ISA flags. ...and > for that, I think we just need to add a test that enforces that the > caller's ISA flags match exactly the callee's flags, for those flags > that were explicitly set in the callee. The patch below seems to fix > the issue (regtesting now). Does this look like what we want? > > Peter > And another issue: Behavior is still inconsistent between "-mno-vsx -flto" and "-mno-vsx" for user code. Previous patch makes it consistent between "-mvsx -flto" and "-mvsx". cat novsx.c vector int c, a, b; static inline void __attribute__ ((__always_inline__, target ("no-vsx"))) foo () { c = a + b; } int main () { foo (); c = a + b; } As I expected, 'foo' would be able to inline into 'main'. -flto works as expect, but without -flto, it failed. $GCC_BUILD/gcc/xgcc -B$GCC_BUILD/gcc novsx.c -O2 -mno-vsx -flto ## ---> inlined $GCC_BUILD/gcc/xgcc -B$GCC_BUILD/gcc novsx.c -O2 -mno-vsx /home/guojiufu/temp/novsx.c: In function 'main': /home/guojiufu/temp/novsx.c:6:1: error: inlining failed in call to 'always_inline' 'foo': target specific option mismatch 6 | foo () /* { dg-error "inlining failed in call to .* target specific option mismatch" } */ | ^~~ /home/guojiufu/temp/novsx.c:14:3: note: called from here 14 | foo (); /* { dg-message "called from here" } */ | ^~~~~~ To handle this, we could fix the code a little more. diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index d2a9f26..1f26c58 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -23971,7 +23971,18 @@ rs6000_can_inline_p (tree caller, tree callee) /* If caller has no option attributes, but callee does then it is not ok to inline. */ else if (!caller_tree) - ret = false; + { + struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree); + HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags; + HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit; + + /* Propogate global flags to caller. */ + HOST_WIDE_INT caller_isa = rs6000_isa_flags; + + if (((caller_isa & callee_isa) == callee_isa) + && (caller_isa & explicit_isa) == (callee_isa & explicit_isa)) + ret = true; + } else { struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree); And combine with your patch as below: gcc/ * config/rs6000/rs6000.c (rs6000_can_inline_p): Handle explicit options. gcc.testsuite/ * gcc.target/powerpc/pr70010.c: New test. * gcc.target/powerpc/pr70010-1.c: New test. * gcc.target/powerpc/pr70010-2.c: New test. * gcc.target/powerpc/pr70010-3.c: New test. diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index d1434a9..68a9224 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -23968,21 +23968,25 @@ rs6000_can_inline_p (tree caller, tree callee) if (!callee_tree) ret = true; - /* If caller has no option attributes, but callee does then it is not ok to - inline. */ - else if (!caller_tree) - ret = false; - else { - struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree); struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree); - - /* Callee's options should a subset of the caller's, i.e. a vsx function - can inline an altivec function but a non-vsx function can't inline a - vsx function. */ - if ((caller_opts->x_rs6000_isa_flags & callee_opts->x_rs6000_isa_flags) - == callee_opts->x_rs6000_isa_flags) + HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags; + HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit; + + /* If caller has no option attributes (without -flto), propogate global + flags to caller, else use itself's option attributes. */ + HOST_WIDE_INT caller_isa + = caller_tree ? TREE_TARGET_OPTION (caller_tree)->x_rs6000_isa_flags + : rs6000_isa_flags; + + /* The callee's options must be a subset of the caller's options, i.e. + a vsx function may inline an altivec function, but a non-vsx function + must not inline a vsx function. However, for those options that the + callee has explicitly set, then we must enforce that the callee's + and caller's options match exactly; see PR70010. */ + if (((caller_isa & callee_isa) == callee_isa) + && (caller_isa & explicit_isa) == (callee_isa & explicit_isa)) ret = true; } diff --git a/gcc/testsuite/gcc.target/powerpc/pr70010-1.c b/gcc/testsuite/gcc.target/powerpc/pr70010-1.c new file mode 100644 index 0000000..78870db --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr70010-1.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -flto -mvsx" } */ + +vector int c, a, b; + +static inline void __attribute__ ((__always_inline__, target ("no-vsx"))) +foo () /* { dg-error "inlining failed in call to .* target specific option mismatch" } */ +{ + c = a + b; +} + +int +main () +{ + foo (); /* { dg-message "called from here" } */ + c = a + b; +} diff --git a/gcc/testsuite/gcc.target/powerpc/pr70010-2.c b/gcc/testsuite/gcc.target/powerpc/pr70010-2.c new file mode 100644 index 0000000..4c09b21 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr70010-2.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -flto -mno-vsx" } */ + +vector int c, a, b; + +static inline void __attribute__ ((__always_inline__, target ("no-vsx"))) +foo () +{ + c = a + b; +} + +int +main () +{ + foo (); + c = a + b; +} diff --git a/gcc/testsuite/gcc.target/powerpc/pr70010-3.c b/gcc/testsuite/gcc.target/powerpc/pr70010-3.c new file mode 100644 index 0000000..bca3187 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr70010-3.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mno-vsx" } */ + +vector int c, a, b; + +static inline void __attribute__ ((__always_inline__, target ("no-vsx"))) +foo () +{ + c = a + b; +} + +int +main () +{ + foo (); + c = a + b; +} diff --git a/gcc/testsuite/gcc.target/powerpc/pr70010.c b/gcc/testsuite/gcc.target/powerpc/pr70010.c new file mode 100644 index 0000000..2e6582c --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr70010.c @@ -0,0 +1,21 @@ +/* { dg-do compile { target { powerpc*-*-* } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-O2 -finline" } */ +/* { dg-final { scan-assembler "bl vadd_no_vsx" } } */ + +typedef int vec_t __attribute__((vector_size(16))); + +static vec_t +__attribute__((__target__("no-vsx"))) +vadd_no_vsx (vec_t a, vec_t b) +{ + return a + b; +} + +vec_t +__attribute__((__target__("vsx"))) +call_vadd_no_vsx (vec_t x, vec_t y, vec_t z) +{ + return vadd_no_vsx (x, y) - z; +}
Hi! On Mon, Oct 14, 2019 at 07:18:11PM -0500, Peter Bergner wrote: > On 10/14/19 2:57 PM, Segher Boessenkool wrote: > > On Mon, Oct 14, 2019 at 06:35:06PM +0200, Richard Biener wrote: > >> The general case should be that if the caller ISA supports the callee one > >> then inlining is OK. If this is not wanted in some cases then there are > >> options like using a noinline attribute. > > > > I agree, and that is what we already do afaik. > > I agree on the making sure the caller's ISA supports the callee's ISA > before allowing inlining...and I think that's what our code it "trying" > to do. It just has some bugs. It says that it wants the callee ISA to be a subset of the caller ISA, and that is what it does. Which is exactly what we want. But as the PR points out it is *also* useful to not inline callees that explicitly disabled some ISA feature the caller has. Which we never promissed, but perhaps we should. > I don't think sprinkling noinline's around will work given LTO can > inline random functions from one object file into a function in another > object file and we have no idea what the options were used for both files. > I think that would just force us to end up putting nolines on all fuctions > that might be compiled with LTO. I think LTO should handle noinline attributes just fine, it would be a much bigger problem than this PR if it doesn't! Missing date+name+address here; and missing PR ref. > gcc/ > * config/rs6000/rs6000.c (rs6000_can_inline_p): Handle explicit > options. "Prohibit inlining if the callee explicitly disables some isa_flags the caller has", or something like that? A few more words please. > --- gcc/testsuite/gcc.target/powerpc/pr70010.c (nonexistent) > +++ gcc/testsuite/gcc.target/powerpc/pr70010.c (working copy) > @@ -0,0 +1,21 @@ > +/* { dg-do compile { target { powerpc*-*-* } } } */ Just /* { dg-do compile } */ please: everything in gcc.target checks for powerpc*-*-* always (and compile is the default, but it's nice documentation). > +/* { dg-skip-if "" { powerpc*-*-darwin* } } */ You don't need this I think. If this test cannot work on Darwin for some reason, the Darwin maintainers will take care of it. Unless you *know* they want (or need) the skip here? > +/* { dg-require-effective-target powerpc_vsx_ok } */ > +/* { dg-options "-O2 -finline" } */ -finline does nothing; it isn't even documented, only -fno-inline is. > +/* { dg-final { scan-assembler "bl vadd_no_vsx" } } */ Please use /* { dg-final { scan-assembler "\mbl vadd_no_vsx\M" } } */ so that it won't match "rldibl vadd_no_vsx", etc. Doesn't matter much for this testcase, but it's a good habit (and using it makes me not comment about it, also useful ;-) ) Okay for trunk with those tweaks. Thanks! Segher
On Tue, Oct 15, 2019 at 2:18 AM Peter Bergner <bergner@linux.ibm.com> wrote: > > On 10/14/19 2:57 PM, Segher Boessenkool wrote: > > On Mon, Oct 14, 2019 at 06:35:06PM +0200, Richard Biener wrote: > >> The general case should be that if the caller ISA supports the callee one > >> then inlining is OK. If this is not wanted in some cases then there are > >> options like using a noinline attribute. > > > > I agree, and that is what we already do afaik. > > I agree on the making sure the caller's ISA supports the callee's ISA > before allowing inlining...and I think that's what our code it "trying" > to do. It just has some bugs. > > > > But in this case, the callee explicitly disables something (-mno-vsx), > > while the caller has it enabled (all modern cpus have VSX). If it ends > > up being inlined, it will get VSX insns generated for it. Which is what > > Jiu Fu's patch aims to prevent. > > > > So you are saying the GCC policy is that you should use noinline on the > > callee in such cases? > > I don't think sprinkling noinline's around will work given LTO can > inline random functions from one object file into a function in another > object file and we have no idea what the options were used for both files. > I think that would just force us to end up putting nolines on all fuctions > that might be compiled with LTO. There's a function attribute for this. > I think we just need to fix the bug in the current logic when checking > whether the caller's ISA flags supports the callee's ISA flags. ...and > for that, I think we just need to add a test that enforces that the > caller's ISA flags match exactly the callee's flags, for those flags > that were explicitly set in the callee. The patch below seems to fix > the issue (regtesting now). Does this look like what we want? I believe this is going to bite you exactly in the case you want the opposite behavior. If you have CUs compiled with defaults and a specialized one with VSX that calls into generic compiled functions you _do_ want to allow inlining into the VSX enabled routines. Just think of LTO, C++ and comdats - you'll get a random comdat entity at link time for inlining - either from the VSX CU or the non-VSX one. Richard. > Peter > > > gcc/ > * config/rs6000/rs6000.c (rs6000_can_inline_p): Handle explicit > options. > > gcc.testsuite/ > * gcc.target/powerpc/pr70010.c: New test. > > > Index: gcc/config/rs6000/rs6000.c > =================================================================== > --- gcc/config/rs6000/rs6000.c (revision 276975) > +++ gcc/config/rs6000/rs6000.c (working copy) > @@ -23976,13 +23976,18 @@ rs6000_can_inline_p (tree caller, tree c > else > { > struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree); > + HOST_WIDE_INT caller_isa = caller_opts->x_rs6000_isa_flags; > struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree); > + HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags; > + HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit; > > - /* Callee's options should a subset of the caller's, i.e. a vsx function > - can inline an altivec function but a non-vsx function can't inline a > - vsx function. */ > - if ((caller_opts->x_rs6000_isa_flags & callee_opts->x_rs6000_isa_flags) > - == callee_opts->x_rs6000_isa_flags) > + /* The callee's options must be a subset of the caller's options, i.e. > + a vsx function may inline an altivec function, but a non-vsx function > + must not inline a vsx function. However, for those options that the > + callee has explicitly set, then we must enforce that the callee's > + and caller's options match exactly; see PR70010. */ > + if (((caller_isa & callee_isa) == callee_isa) > + && (caller_isa & explicit_isa) == (callee_isa & explicit_isa)) > ret = true; > } > > Index: gcc/testsuite/gcc.target/powerpc/pr70010.c > =================================================================== > --- gcc/testsuite/gcc.target/powerpc/pr70010.c (nonexistent) > +++ gcc/testsuite/gcc.target/powerpc/pr70010.c (working copy) > @@ -0,0 +1,21 @@ > +/* { dg-do compile { target { powerpc*-*-* } } } */ > +/* { dg-skip-if "" { powerpc*-*-darwin* } } */ > +/* { dg-require-effective-target powerpc_vsx_ok } */ > +/* { dg-options "-O2 -finline" } */ > +/* { dg-final { scan-assembler "bl vadd_no_vsx" } } */ > + > +typedef int vec_t __attribute__((vector_size(16))); > + > +static vec_t > +__attribute__((__target__("no-vsx"))) > +vadd_no_vsx (vec_t a, vec_t b) > +{ > + return a + b; > +} > + > +vec_t > +__attribute__((__target__("vsx"))) > +call_vadd_no_vsx (vec_t x, vec_t y, vec_t z) > +{ > + return vadd_no_vsx (x, y) - z; > +} > > > > bergner@pike:~/gcc/BUGS/CPU$ cat pr70010.i > typedef int vec_t __attribute__((vector_size(16))); > > static vec_t > __attribute__((__target__("no-vsx"))) > vadd_no_vsx (vec_t a, vec_t b) > { > return a + b; > } > > vec_t > __attribute__((__target__("vsx"))) > call_vadd_no_vsx (vec_t x, vec_t y, vec_t z) > { > return vadd_no_vsx (x, y) - z; > } > > bergner@pike:~/gcc/BUGS/CPU$ old-gcc -O2 -finline -S pr70010.i > bergner@pike:~/gcc/BUGS/CPU$ cat pr70010.s > call_vadd_no_vsx: > vadduwm 2,2,3 > vsubuwm 2,2,4 > blr > > bergner@pike:~/gcc/BUGS/CPU$ /home/bergner/gcc/build/gcc-fsf-mainline-cpu-flags-debug/gcc/xgcc -B/home/bergner/gcc/build/gcc-fsf-mainline-cpu-flags-debug/gcc -O2 -finline -S pr70010.i > bergner@pike:~/gcc/BUGS/CPU$ cat pr70010.s > vadd_no_vsx: > vadduwm 2,2,3 > blr > .long 0 > > call_vadd_no_vsx: > 0: addis 2,12,.TOC.-.LCF1@ha > addi 2,2,.TOC.-.LCF1@l > mflr 0 > std 0,16(1) > stdu 1,-48(1) > li 0,32 > stvx 31,1,0 > xxlor 63,36,36 > bl vadd_no_vsx > addi 1,1,48 > li 0,-16 > vsubuwm 2,2,31 > lvx 31,1,0 > ld 0,16(1) > mtlr 0 > blr
Segher Boessenkool <segher@kernel.crashing.org> wrote: > Hi! > > On Mon, Oct 14, 2019 at 07:18:11PM -0500, Peter Bergner wrote: >> On 10/14/19 2:57 PM, Segher Boessenkool wrote: >>> On Mon, Oct 14, 2019 at 06:35:06PM +0200, Richard Biener wrote: >>>> The general case should be that if the caller ISA supports the callee one >>>> then inlining is OK. If this is not wanted in some cases then there are >>>> options like using a noinline attribute. >>> >>> I agree, and that is what we already do afaik. >> >> I agree on the making sure the caller's ISA supports the callee's ISA >> before allowing inlining...and I think that's what our code it "trying" >> to do. It just has some bugs. > > It says that it wants the callee ISA to be a subset of the caller ISA, > and that is what it does. Which is exactly what we want. But as the > PR points out it is *also* useful to not inline callees that explicitly > disabled some ISA feature the caller has. Which we never promissed, but > perhaps we should. > >> I don't think sprinkling noinline's around will work given LTO can >> inline random functions from one object file into a function in another >> object file and we have no idea what the options were used for both files. >> I think that would just force us to end up putting nolines on all fuctions >> that might be compiled with LTO. > > I think LTO should handle noinline attributes just fine, it would be a > much bigger problem than this PR if it doesn't! > > > Missing date+name+address here; and missing PR ref. > >> gcc/ >> * config/rs6000/rs6000.c (rs6000_can_inline_p): Handle explicit >> options. > > "Prohibit inlining if the callee explicitly disables some isa_flags the > caller has", or something like that? A few more words please. > >> --- gcc/testsuite/gcc.target/powerpc/pr70010.c (nonexistent) >> +++ gcc/testsuite/gcc.target/powerpc/pr70010.c (working copy) >> @@ -0,0 +1,21 @@ >> +/* { dg-do compile { target { powerpc*-*-* } } } */ > > Just > /* { dg-do compile } */ > please: everything in gcc.target checks for powerpc*-*-* always (and > compile is the default, but it's nice documentation). > >> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */ > > You don't need this I think. If this test cannot work on Darwin for > some reason, the Darwin maintainers will take care of it. Unless you > *know* they want (or need) the skip here? Right, it should not be needed because .. >> +/* { dg-require-effective-target powerpc_vsx_ok } */ … this ^ should DTRT for Darwin too. If not, then that’s the place I want fix it. >> +/* { dg-options "-O2 -finline" } */ > > -finline does nothing; it isn't even documented, only -fno-inline is. > >> +/* { dg-final { scan-assembler "bl vadd_no_vsx" } } */ > > Please use > /* { dg-final { scan-assembler "\mbl vadd_no_vsx\M" } } */ > so that it won't match "rldibl vadd_no_vsx", etc. Doesn't matter much > for this testcase, but it's a good habit (and using it makes me not > comment about it, also useful ;-) ) > > Okay for trunk with those tweaks. Thanks! > > > Segher
On Okt 15 2019, Segher Boessenkool <segher@kernel.crashing.org> wrote: > Please use > /* { dg-final { scan-assembler "\mbl vadd_no_vsx\M" } } */ ITYM /* { dg-final { scan-assembler {\mbl vadd_no_vsx\M} } } */ Andreas.
Hi! On Tue, Oct 15, 2019 at 05:15:07PM +0800, Jiufu Guo wrote: > And another issue: Behavior is still inconsistent between "-mno-vsx > -flto" and "-mno-vsx" for user code. Previous patch makes it consistent > between "-mvsx -flto" and "-mvsx". > $GCC_BUILD/gcc/xgcc -B$GCC_BUILD/gcc novsx.c -O2 -mno-vsx > /home/guojiufu/temp/novsx.c: In function 'main': > /home/guojiufu/temp/novsx.c:6:1: error: inlining failed in call to 'always_inline' 'foo': target specific option mismatch So what should we do about this? There are arguments for *both* behaviours, and apparently with LTO we do not know which flags are explicit? > + /* Propogate global flags to caller. */ > + HOST_WIDE_INT caller_isa = rs6000_isa_flags; I don't think that is right, or, I don't see why, anyway? > + > + if (((caller_isa & callee_isa) == callee_isa) > + && (caller_isa & explicit_isa) == (callee_isa & explicit_isa)) > + ret = true; > + } Segher
On Tue, Oct 15, 2019 at 11:52:26AM +0200, Andreas Schwab wrote: > On Okt 15 2019, Segher Boessenkool <segher@kernel.crashing.org> wrote: > > > Please use > > /* { dg-final { scan-assembler "\mbl vadd_no_vsx\M" } } */ > > ITYM > > /* { dg-final { scan-assembler {\mbl vadd_no_vsx\M} } } */ Ha yes, thanks :-) That's the problem with cut-and-paste (which is also the cause of all these strange dg lines, I think). Segher
On Tue, Oct 15, 2019 at 11:32:27AM +0200, Richard Biener wrote: > > I think we just need to fix the bug in the current logic when checking > > whether the caller's ISA flags supports the callee's ISA flags. ...and > > for that, I think we just need to add a test that enforces that the > > caller's ISA flags match exactly the callee's flags, for those flags > > that were explicitly set in the callee. The patch below seems to fix > > the issue (regtesting now). Does this look like what we want? > > I believe this is going to bite you exactly in the case you want the > opposite behavior. If you have CUs compiled with defaults and > a specialized one with VSX that calls into generic compiled functions > you _do_ want to allow inlining into the VSX enabled routines. Yes, but *not* inlining is relatively harmless, while inlining can be fatal. I don't see how we can handle both scenarios optimally. > Just > think of LTO, C++ and comdats - you'll get a random comdat entity > at link time for inlining - either from the VSX CU or the non-VSX one. This would make LTO totally unusable, with or without this patch? Something else must be going on? Segher
On Tue, Oct 15, 2019 at 12:07 PM Segher Boessenkool <segher@kernel.crashing.org> wrote: > > On Tue, Oct 15, 2019 at 11:32:27AM +0200, Richard Biener wrote: > > > I think we just need to fix the bug in the current logic when checking > > > whether the caller's ISA flags supports the callee's ISA flags. ...and > > > for that, I think we just need to add a test that enforces that the > > > caller's ISA flags match exactly the callee's flags, for those flags > > > that were explicitly set in the callee. The patch below seems to fix > > > the issue (regtesting now). Does this look like what we want? > > > > I believe this is going to bite you exactly in the case you want the > > opposite behavior. If you have CUs compiled with defaults and > > a specialized one with VSX that calls into generic compiled functions > > you _do_ want to allow inlining into the VSX enabled routines. > > Yes, but *not* inlining is relatively harmless, while inlining can be > fatal. I don't see how we can handle both scenarios optimally. How can it be fatal to inline a non-VSX function into a VSX one? > > Just > > think of LTO, C++ and comdats - you'll get a random comdat entity > > at link time for inlining - either from the VSX CU or the non-VSX one. > > This would make LTO totally unusable, with or without this patch? Something > else must be going on? It's the same without LTO - the linker will simply choose (randomly) one of the comdats from one of the CUs providing it, not caring about some built with and some without VSX. Richard. > > Segher
On Tue, Oct 15, 2019 at 01:19:51PM +0200, Richard Biener wrote: > On Tue, Oct 15, 2019 at 12:07 PM Segher Boessenkool > <segher@kernel.crashing.org> wrote: > > On Tue, Oct 15, 2019 at 11:32:27AM +0200, Richard Biener wrote: > > > > I think we just need to fix the bug in the current logic when checking > > > > whether the caller's ISA flags supports the callee's ISA flags. ...and > > > > for that, I think we just need to add a test that enforces that the > > > > caller's ISA flags match exactly the callee's flags, for those flags > > > > that were explicitly set in the callee. The patch below seems to fix > > > > the issue (regtesting now). Does this look like what we want? > > > > > > I believe this is going to bite you exactly in the case you want the > > > opposite behavior. If you have CUs compiled with defaults and > > > a specialized one with VSX that calls into generic compiled functions > > > you _do_ want to allow inlining into the VSX enabled routines. > > > > Yes, but *not* inlining is relatively harmless, while inlining can be > > fatal. I don't see how we can handle both scenarios optimally. > > How can it be fatal to inline a non-VSX function into a VSX one? Oh I misread, I thought it was the other way around. > > > Just > > > think of LTO, C++ and comdats - you'll get a random comdat entity > > > at link time for inlining - either from the VSX CU or the non-VSX one. > > > > This would make LTO totally unusable, with or without this patch? Something > > else must be going on? > > It's the same without LTO - the linker will simply choose (randomly) > one of the comdats from one of the CUs providing it, not caring about > some built with and some without VSX. Hrm, so how does that ever work? Segher
On Tue, Oct 15, 2019 at 1:33 PM Segher Boessenkool <segher@kernel.crashing.org> wrote: > > On Tue, Oct 15, 2019 at 01:19:51PM +0200, Richard Biener wrote: > > On Tue, Oct 15, 2019 at 12:07 PM Segher Boessenkool > > <segher@kernel.crashing.org> wrote: > > > On Tue, Oct 15, 2019 at 11:32:27AM +0200, Richard Biener wrote: > > > > > I think we just need to fix the bug in the current logic when checking > > > > > whether the caller's ISA flags supports the callee's ISA flags. ...and > > > > > for that, I think we just need to add a test that enforces that the > > > > > caller's ISA flags match exactly the callee's flags, for those flags > > > > > that were explicitly set in the callee. The patch below seems to fix > > > > > the issue (regtesting now). Does this look like what we want? > > > > > > > > I believe this is going to bite you exactly in the case you want the > > > > opposite behavior. If you have CUs compiled with defaults and > > > > a specialized one with VSX that calls into generic compiled functions > > > > you _do_ want to allow inlining into the VSX enabled routines. > > > > > > Yes, but *not* inlining is relatively harmless, while inlining can be > > > fatal. I don't see how we can handle both scenarios optimally. > > > > How can it be fatal to inline a non-VSX function into a VSX one? > > Oh I misread, I thought it was the other way around. > > > > > Just > > > > think of LTO, C++ and comdats - you'll get a random comdat entity > > > > at link time for inlining - either from the VSX CU or the non-VSX one. > > > > > > This would make LTO totally unusable, with or without this patch? Something > > > else must be going on? > > > > It's the same without LTO - the linker will simply choose (randomly) > > one of the comdats from one of the CUs providing it, not caring about > > some built with and some without VSX. > > Hrm, so how does that ever work? Possibly people are "not doing this"? Aka have a program with a runtime capability check for VSX, dispatch to std::vector/algorithm CUs with/without -mvsx and then link the result? Because any instantiated template therein will end up as a comdat... Guess we still live in a mostly C world ;) Richard. > > Segher
Segher Boessenkool <segher@kernel.crashing.org> writes: > Hi! > > On Tue, Oct 15, 2019 at 05:15:07PM +0800, Jiufu Guo wrote: >> And another issue: Behavior is still inconsistent between "-mno-vsx >> -flto" and "-mno-vsx" for user code. Previous patch makes it consistent >> between "-mvsx -flto" and "-mvsx". > >> $GCC_BUILD/gcc/xgcc -B$GCC_BUILD/gcc novsx.c -O2 -mno-vsx >> /home/guojiufu/temp/novsx.c: In function 'main': >> /home/guojiufu/temp/novsx.c:6:1: error: inlining failed in call to >> 'always_inline' 'foo': target specific option mismatch > > So what should we do about this? There are arguments for *both* > behaviours, and apparently with LTO we do not know which flags are > explicit? With LTO, rs6000_isa_flags_explicit and rs6000_isa_flags are also checkable. rs6000_isa_flags indicates effective features; bits of rs6000_isa_flags_explicit indicate if it is specified explicitly. The difference between 'with LTO' and 'without LTO' is: Wit LTO, if a function does not have target attribute in source code, then using option attribute from command line(e.g. -mcpu=xxx) as isa_flags; and if features (e.g. -mno-vsx, -mvsx) are specified on command line, isa_flags_explicit is also set. If a function has target attribute in source code explicitly, then isa_flags_explicit is set to reflect feature is specified explicitly besides explicit features from command line; and isa_flags is set as the effective features. Without LTO, if a function does not have target attribute in source code, then it's isa_flags_explicit and isa_flags are as NULL (target_options == NULL). If a function has target attribute in source code, it is same as 'with LTO'. > >> + /* Propogate global flags to caller. */ >> + HOST_WIDE_INT caller_isa = rs6000_isa_flags; > > I don't think that is right, or, I don't see why, anyway? Propogating global flags is exiting behavior for function during LTO compiling; and global flags are also used to caculate isa_flags/isa_flags_explicit for function which has target attribute in source code. So, it would make sense to using global flags for caller here. One more exmaple to explain this: For example a function "int foo () {xxxxx;}", to compile it using command "gcc -mcpu=power8 -mno-vsx"; we should using power8+no-vsx instructions for this function. > >> + >> + if (((caller_isa & callee_isa) == callee_isa) >> + && (caller_isa & explicit_isa) == (callee_isa & explicit_isa)) >> + ret = true; >> + } > > > Segher
On 10/15/19 4:32 AM, Richard Biener wrote: > I believe this is going to bite you exactly in the case you want the > opposite behavior. If you have CUs compiled with defaults and > a specialized one with VSX that calls into generic compiled functions > you _do_ want to allow inlining into the VSX enabled routines. First off, there's nothing special about VSX/non-VSX here, so when I talk about VSX below, I'm really just using it as a stand-in for any option. So what you're saying is that a VSX enabled function that calls a non-VSX enabled function should be able to inline that non-VSX function. That is what the current code allows and I agree that should still be allowed. My extra code only disallows that scenario *IF* the user explicitly said DO NOT compile the callee function with VSX. It still allows it to be inlined if the callee was compiled without VSX implicitly / by default, so I don't think my patch disallows the scenario you mention above. If the user explicitly said not to compile a function with a particular option, how can we justify ignoring that request just because we're inlining it? We don't do that for the out of line version of that callee function. > How can it be fatal to inline a non-VSX function into a VSX one? I can think of scenarios where it could be fatal (again, VSX is just a stand-in for any option), but maybe the user used -mno-vsx for performance reasons or maybe this is kernel code and the user knows this function will be run with VSX hardware disabled or ... Peter
On October 15, 2019 5:09:52 PM GMT+02:00, Peter Bergner <bergner@linux.ibm.com> wrote: >On 10/15/19 4:32 AM, Richard Biener wrote: >> I believe this is going to bite you exactly in the case you want the >> opposite behavior. If you have CUs compiled with defaults and >> a specialized one with VSX that calls into generic compiled functions >> you _do_ want to allow inlining into the VSX enabled routines. > >First off, there's nothing special about VSX/non-VSX here, so when I >talk >about VSX below, I'm really just using it as a stand-in for any option. > >So what you're saying is that a VSX enabled function that calls a >non-VSX >enabled function should be able to inline that non-VSX function. That >is >what the current code allows and I agree that should still be allowed. >My extra code only disallows that scenario *IF* the user explicitly >said >DO NOT compile the callee function with VSX. It still allows it to be >inlined if the callee was compiled without VSX implicitly / by default, >so I don't think my patch disallows the scenario you mention above. > >If the user explicitly said not to compile a function with a particular >option, how can we justify ignoring that request just because we're >inlining it? We don't do that for the out of line version of that >callee >function. You can probably tell whether there's an explicit -mno-vsx on the command line I wonder how you can tell apart explicit vs. Implicit in the LTO context where the option is represented as target attribute on the function. >> How can it be fatal to inline a non-VSX function into a VSX one? > >I can think of scenarios where it could be fatal (again, VSX is just a >stand-in for any option), but maybe the user used -mno-vsx for >performance >reasons or maybe this is kernel code and the user knows this function >will >be run with VSX hardware disabled or ... > > >Peter
On 10/15/19 10:44 AM, Richard Biener wrote: > On October 15, 2019 5:09:52 PM GMT+02:00, Peter Bergner <bergner@linux.ibm.com> wrote: >> If the user explicitly said not to compile a function with a particular >> option, how can we justify ignoring that request just because we're >> inlining it? We don't do that for the out of line version of that >> callee function. > > I wonder how you can tell apart explicit vs. Implicit in the LTO context > where the option is represented as target attribute on the function. Ah, so all of the options used to compile the callee in an LTO context, whether they were implicit or explicit will look explicit? I agree, that would be a problem! Jiufu, can you see if there is a way to determine whether a callee option in the LTO context really was an explicit option versus an implicit/default option? ...or does your follow on patch to my patch do that? Peter
On 10/15/19 4:56 AM, Segher Boessenkool wrote: > On Tue, Oct 15, 2019 at 05:15:07PM +0800, Jiufu Guo wrote: >> And another issue: Behavior is still inconsistent between "-mno-vsx >> -flto" and "-mno-vsx" for user code. Previous patch makes it consistent >> between "-mvsx -flto" and "-mvsx". > >> $GCC_BUILD/gcc/xgcc -B$GCC_BUILD/gcc novsx.c -O2 -mno-vsx >> /home/guojiufu/temp/novsx.c: In function 'main': >> /home/guojiufu/temp/novsx.c:6:1: error: inlining failed in call to 'always_inline' 'foo': target specific option mismatch > > So what should we do about this? There are arguments for *both* > behaviours, and apparently with LTO we do not know which flags are > explicit? I'd say this is user error, telling the compiler it has to inline the callee function, but then using incompatible options on the caller and the callee, so that it cannot. I think the error message is the correct thing here. Peter
On Tue, Oct 15, 2019 at 12:47:02PM -0500, Peter Bergner wrote: > On 10/15/19 4:56 AM, Segher Boessenkool wrote: > > On Tue, Oct 15, 2019 at 05:15:07PM +0800, Jiufu Guo wrote: > >> And another issue: Behavior is still inconsistent between "-mno-vsx > >> -flto" and "-mno-vsx" for user code. Previous patch makes it consistent > >> between "-mvsx -flto" and "-mvsx". > > > >> $GCC_BUILD/gcc/xgcc -B$GCC_BUILD/gcc novsx.c -O2 -mno-vsx > >> /home/guojiufu/temp/novsx.c: In function 'main': > >> /home/guojiufu/temp/novsx.c:6:1: error: inlining failed in call to 'always_inline' 'foo': target specific option mismatch > > > > So what should we do about this? There are arguments for *both* > > behaviours, and apparently with LTO we do not know which flags are > > explicit? > > I'd say this is user error, telling the compiler it has to inline the callee > function, but then using incompatible options on the caller and the callee, > so that it cannot. I think the error message is the correct thing here. Everything here is -mno-vsx though? The always_inline "foo" has it as target attribute, but everyone (also) has it from the command line arg? Segher
On 10/15/19 1:21 PM, Segher Boessenkool wrote: > On Tue, Oct 15, 2019 at 12:47:02PM -0500, Peter Bergner wrote: >> I'd say this is user error, telling the compiler it has to inline the callee >> function, but then using incompatible options on the caller and the callee, >> so that it cannot. I think the error message is the correct thing here. > > Everything here is -mno-vsx though? The always_inline "foo" has it as > target attribute, but everyone (also) has it from the command line arg? Oh, I thought main() was compiled with -mvsx. My bad! :-) Peter
On 10/15/19 11:12 AM, Peter Bergner wrote: > On 10/15/19 10:44 AM, Richard Biener wrote: >> On October 15, 2019 5:09:52 PM GMT+02:00, Peter Bergner <bergner@linux.ibm.com> wrote: >>> If the user explicitly said not to compile a function with a particular >>> option, how can we justify ignoring that request just because we're >>> inlining it? We don't do that for the out of line version of that >>> callee function. >> >> I wonder how you can tell apart explicit vs. Implicit in the LTO context >> where the option is represented as target attribute on the function. > > Ah, so all of the options used to compile the callee in an LTO context, > whether they were implicit or explicit will look explicit? I agree, that > would be a problem! > > Jiufu, can you see if there is a way to determine whether a callee > option in the LTO context really was an explicit option versus an > implicit/default option? ...or does your follow on patch to my > patch do that? So I set a break point on rs6000_can_inline_p() when it's called from lto1, using a few different unit test cases, and it seems the callee's ->x_rs6000_isa_flags_explicit is set correctly! That means we can differentiate between implicitly and explicitly set options. Peter
Segher Boessenkool <segher@kernel.crashing.org> writes: > So what should we do about this? There are arguments for *both* > behaviours, and apparently with LTO we do not know which flags are > explicit? Actually, from my testing, it seems the rs6000_isa_flags_explicit flags are set correctly in LTO! On 10/15/19 7:45 AM, Jiufu Guo wrote: > The difference between 'with LTO' and 'without LTO' is: > Wit LTO, if a function does not have target attribute in source code, > then using option attribute from command line(e.g. -mcpu=xxx) as > isa_flags; and if features (e.g. -mno-vsx, -mvsx) are specified on > command line, isa_flags_explicit is also set. > If a function has target attribute in source code explicitly, > then isa_flags_explicit is set to reflect feature is specified > explicitly besides explicit features from command line; and isa_flags is > set as the effective features. > > Without LTO, if a function does not have target attribute in source code, > then it's isa_flags_explicit and isa_flags are as NULL (target_options > == NULL). > If a function has target attribute in source code, it is same as 'with > LTO'. After reading this a few times and compiling a few unit tests cases in gdb/cc1/lto1, I agree with your explanation above and with your addition to my patch. However, how about a slight modification to your change with some updated comments? Updated patch below. This also incorporates Segher and Andreas's changes to my test case. I have not looked at your extra test cases, so I have not added them to this patch, but I like the idea of adding test cases that exercise this code in LTO context. I am going on vacation tomorrow, so Jiufu, can you take over ownership of this combined patch, along with your extra test cases? I have not bootstrapped or regtested this, so that still needs doing. If you're busy, I can pick this up when I return to work on Monday. Peter gcc/ PR target/70010 * config/rs6000/rs6000.c (rs6000_can_inline_p): Prohibit inlining if the callee explicitly disables some isa_flags the caller is using. gcc.testsuite/ PR target/70010 * gcc.target/powerpc/pr70010.c: New test. Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 276975) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -23964,25 +23964,31 @@ rs6000_can_inline_p (tree caller, tree c tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller); tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee); - /* If callee has no option attributes, then it is ok to inline. */ + /* If the callee has no option attributes, then it is ok to inline. */ if (!callee_tree) ret = true; - /* If caller has no option attributes, but callee does then it is not ok to - inline. */ - else if (!caller_tree) - ret = false; - else { - struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree); + HOST_WIDE_INT caller_isa; struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree); + HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags; + HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit; + + /* If the caller has option attributes, then use them. + Otherwise, use the command line options. */ + if (caller_tree) + caller_isa = TREE_TARGET_OPTION (caller_tree)->x_rs6000_isa_flags; + else + caller_isa = rs6000_isa_flags; - /* Callee's options should a subset of the caller's, i.e. a vsx function - can inline an altivec function but a non-vsx function can't inline a - vsx function. */ - if ((caller_opts->x_rs6000_isa_flags & callee_opts->x_rs6000_isa_flags) - == callee_opts->x_rs6000_isa_flags) + /* The callee's options must be a subset of the caller's options, i.e. + a vsx function may inline an altivec function, but a non-vsx function + must not inline a vsx function. However, for those options that the + callee has explicitly set, then we must enforce that the callee's + and caller's options match exactly; see PR70010. */ + if (((caller_isa & callee_isa) == callee_isa) + && (caller_isa & explicit_isa) == (callee_isa & explicit_isa)) ret = true; } Index: gcc/testsuite/gcc.target/powerpc/pr70010.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/pr70010.c (nonexistent) +++ gcc/testsuite/gcc.target/powerpc/pr70010.c (working copy) @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-O2 -finline-functions" } */ +/* { dg-final { scan-assembler {\mbl vadd_no_vsx\M} } } */ + +typedef int vec_t __attribute__((vector_size(16))); + +static vec_t +__attribute__((__target__("no-vsx"))) +vadd_no_vsx (vec_t a, vec_t b) +{ + return a + b; +} + +vec_t +__attribute__((__target__("vsx"))) +call_vadd_no_vsx (vec_t x, vec_t y, vec_t z) +{ + return vadd_no_vsx (x, y) - z; +}
Peter Bergner <bergner@linux.ibm.com> writes: > Segher Boessenkool <segher@kernel.crashing.org> writes: >> So what should we do about this? There are arguments for *both* >> behaviours, and apparently with LTO we do not know which flags are >> explicit? > > Actually, from my testing, it seems the rs6000_isa_flags_explicit > flags are set correctly in LTO! > > > > On 10/15/19 7:45 AM, Jiufu Guo wrote: >> The difference between 'with LTO' and 'without LTO' is: >> Wit LTO, if a function does not have target attribute in source code, >> then using option attribute from command line(e.g. -mcpu=xxx) as >> isa_flags; and if features (e.g. -mno-vsx, -mvsx) are specified on >> command line, isa_flags_explicit is also set. >> If a function has target attribute in source code explicitly, >> then isa_flags_explicit is set to reflect feature is specified >> explicitly besides explicit features from command line; and isa_flags is >> set as the effective features. >> >> Without LTO, if a function does not have target attribute in source code, >> then it's isa_flags_explicit and isa_flags are as NULL (target_options >> == NULL). >> If a function has target attribute in source code, it is same as 'with >> LTO'. > > After reading this a few times and compiling a few unit tests cases in > gdb/cc1/lto1, I agree with your explanation above and with your addition > to my patch. However, how about a slight modification to your change > with some updated comments? Updated patch below. Great update. > > This also incorporates Segher and Andreas's changes to my test case. > I have not looked at your extra test cases, so I have not added them > to this patch, but I like the idea of adding test cases that exercise > this code in LTO context. > > I am going on vacation tomorrow, so Jiufu, can you take over ownership > of this combined patch, along with your extra test cases? I have not > bootstrapped or regtested this, so that still needs doing. If you're > busy, I can pick this up when I return to work on Monday. Sure, I will add test cases and do bootstrap/regtest. Jiufu BR > > Peter > > > gcc/ > PR target/70010 > * config/rs6000/rs6000.c (rs6000_can_inline_p): Prohibit inlining if > the callee explicitly disables some isa_flags the caller is using. > > gcc.testsuite/ > PR target/70010 > * gcc.target/powerpc/pr70010.c: New test. > > Index: gcc/config/rs6000/rs6000.c > =================================================================== > --- gcc/config/rs6000/rs6000.c (revision 276975) > +++ gcc/config/rs6000/rs6000.c (working copy) > @@ -23964,25 +23964,31 @@ rs6000_can_inline_p (tree caller, tree c > tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller); > tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee); > > - /* If callee has no option attributes, then it is ok to inline. */ > + /* If the callee has no option attributes, then it is ok to inline. */ > if (!callee_tree) > ret = true; > > - /* If caller has no option attributes, but callee does then it is not ok to > - inline. */ > - else if (!caller_tree) > - ret = false; > - > else > { > - struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree); > + HOST_WIDE_INT caller_isa; > struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree); > + HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags; > + HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit; > + > + /* If the caller has option attributes, then use them. > + Otherwise, use the command line options. */ > + if (caller_tree) > + caller_isa = TREE_TARGET_OPTION (caller_tree)->x_rs6000_isa_flags; > + else > + caller_isa = rs6000_isa_flags; > > - /* Callee's options should a subset of the caller's, i.e. a vsx function > - can inline an altivec function but a non-vsx function can't inline a > - vsx function. */ > - if ((caller_opts->x_rs6000_isa_flags & callee_opts->x_rs6000_isa_flags) > - == callee_opts->x_rs6000_isa_flags) > + /* The callee's options must be a subset of the caller's options, i.e. > + a vsx function may inline an altivec function, but a non-vsx function > + must not inline a vsx function. However, for those options that the > + callee has explicitly set, then we must enforce that the callee's > + and caller's options match exactly; see PR70010. */ > + if (((caller_isa & callee_isa) == callee_isa) > + && (caller_isa & explicit_isa) == (callee_isa & explicit_isa)) > ret = true; > } > > Index: gcc/testsuite/gcc.target/powerpc/pr70010.c > =================================================================== > --- gcc/testsuite/gcc.target/powerpc/pr70010.c (nonexistent) > +++ gcc/testsuite/gcc.target/powerpc/pr70010.c (working copy) > @@ -0,0 +1,20 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target powerpc_vsx_ok } */ > +/* { dg-options "-O2 -finline-functions" } */ > +/* { dg-final { scan-assembler {\mbl vadd_no_vsx\M} } } */ > + > +typedef int vec_t __attribute__((vector_size(16))); > + > +static vec_t > +__attribute__((__target__("no-vsx"))) > +vadd_no_vsx (vec_t a, vec_t b) > +{ > + return a + b; > +} > + > +vec_t > +__attribute__((__target__("vsx"))) > +call_vadd_no_vsx (vec_t x, vec_t y, vec_t z) > +{ > + return vadd_no_vsx (x, y) - z; > +}
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index d1434a9..c38dc87 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -23978,11 +23978,15 @@ rs6000_can_inline_p (tree caller, tree callee) struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree); struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree); - /* Callee's options should a subset of the caller's, i.e. a vsx function - can inline an altivec function but a non-vsx function can't inline a - vsx function. */ - if ((caller_opts->x_rs6000_isa_flags & callee_opts->x_rs6000_isa_flags) - == callee_opts->x_rs6000_isa_flags) + /* Callee's options should a subset of the caller's. In addition, + vsx function should not inline function with non-vsx by which + programmer may intend to disable VSX code generation. */ + if ((caller_opts->x_rs6000_isa_flags & OPTION_MASK_VSX) + && (callee_opts->x_rs6000_isa_flags & OPTION_MASK_VSX) == 0) + ret = false; + else if ((caller_opts->x_rs6000_isa_flags + & callee_opts->x_rs6000_isa_flags) + == callee_opts->x_rs6000_isa_flags) ret = true; } diff --git a/gcc/testsuite/gcc.target/powerpc/inline_error.c b/gcc/testsuite/gcc.target/powerpc/inline_error.c new file mode 100644 index 0000000..78870db --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/inline_error.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -flto -mvsx" } */ + +vector int c, a, b; + +static inline void __attribute__ ((__always_inline__, target ("no-vsx"))) +foo () /* { dg-error "inlining failed in call to .* target specific option mismatch" } */ +{ + c = a + b; +} + +int +main () +{ + foo (); /* { dg-message "called from here" } */ + c = a + b; +}