Message ID | 3b0ef8c4-45df-1d84-158d-f6d98a493a32@suse.cz |
---|---|
State | New |
Headers | show |
Series | IPA: fix reproducibility in IPA MOD REF | expand |
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin > > gcc/ChangeLog: > > * ipa-modref.c (analyze_function): Do not execute the code > only if dump_file != NULL. > --- > gcc/ipa-modref.c | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c > index a3c7c6d6a1f..6cacf9c8ab1 100644 > --- a/gcc/ipa-modref.c > +++ b/gcc/ipa-modref.c > @@ -2868,15 +2868,15 @@ analyze_function (function *f, bool ipa) > optimization_summaries = modref_summaries::create_ggc (symtab); > else /* Remove existing summary if we are re-running the pass. */ > { > - if (dump_file > - && (summary > - = optimization_summaries->get (fnode)) > - != NULL > + summary = optimization_summaries->get (fnode); > + if (summary != NULL How does this affect reproducibility? Honza
On 11/18/21 19:11, Jan Hubicka wrote: >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests. >> >> Ready to be installed? >> Thanks, >> Martin >> >> gcc/ChangeLog: >> >> * ipa-modref.c (analyze_function): Do not execute the code >> only if dump_file != NULL. >> --- >> gcc/ipa-modref.c | 14 +++++++------- >> 1 file changed, 7 insertions(+), 7 deletions(-) >> >> diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c >> index a3c7c6d6a1f..6cacf9c8ab1 100644 >> --- a/gcc/ipa-modref.c >> +++ b/gcc/ipa-modref.c >> @@ -2868,15 +2868,15 @@ analyze_function (function *f, bool ipa) >> optimization_summaries = modref_summaries::create_ggc (symtab); >> else /* Remove existing summary if we are re-running the pass. */ >> { >> - if (dump_file >> - && (summary >> - = optimization_summaries->get (fnode)) >> - != NULL >> + summary = optimization_summaries->get (fnode); >> + if (summary != NULL > How does this affect reproducibility? In the following way: $ cat 1.i __ckd_calloc___n_elem; __ckd_calloc___elem_size; *__ckd_calloc__() { void *mem = calloc(__ckd_calloc___n_elem, __ckd_calloc___elem_size); return mem; } _E__die_error() { exit(1); } $ cat 2.i typedef struct { int *lmclass } lm_t; lm_read_dump(int *lmclass) { lm_t lm; { int i; for (; i; i++) lm.lmclass[i] = lmclass[i]; } lm_set_param(); } lm_read_ctl_dict_size_n_lmclass_used; lm_read_ctl_dict_size() { int *lmclass = __ckd_calloc__(); while (strcmp()) { lmclass[lm_read_ctl_dict_size_n_lmclass_used] = 0; _E__die_error(); } lm_read_dump(lmclass); for (; ; ) ; } $ cat cmd rm -f *.o xxx* yyy* gcc -v gcc [12].i -O2 -flto=auto -c -w gcc [12].o -O2 -flto=auto -shared --save-temps -o xxx -w -fdump-tree-optimized rm -f *.o gcc [12].i -O2 -flto=auto -c -w gcc [12].o -O2 -flto=auto -shared --save-temps -o yyy -fdump-tree-modref -w diff -u -U30 xxx.ltrans0.ltrans*optimized yyy.ltrans0.ltrans*optimized diff xxx.ltrans0.ltrans.s yyy.ltrans0.ltrans.s if test $? = 1; then exit 0 else exit 1 fi $ ./cmd Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/home/marxin/bin/gcc/libexec/gcc/x86_64-pc-linux-gnu/12.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /home/marxin/Programming/gcc/configure --enable-languages=c,c++,fortran,jit --prefix=/home/marxin/bin/gcc --disable-multilib --enable-host-shared --disable-libsanitizer --enable-valgrind-annotations --disable-bootstrap Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 12.0.0 20211118 (experimental) (GCC) /usr/bin/ld: ./xxx.ltrans0.ltrans.o: warning: relocation against `lm_read_ctl_dict_size_n_lmclass_used' in read-only section `.text' /usr/bin/ld: ./xxx.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC /usr/bin/ld: final link failed: bad value collect2: error: ld returned 1 exit status /usr/bin/ld: ./yyy.ltrans0.ltrans.o: warning: relocation against `__ckd_calloc___n_elem' in read-only section `.text' /usr/bin/ld: ./yyy.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC /usr/bin/ld: final link failed: bad value collect2: error: ld returned 1 exit status diff: yyy.ltrans0.ltrans*optimized: No such file or directory 54,55d53 < movslq lm_read_ctl_dict_size_n_lmclass_used(%rip), %rax < movl $0, 0(%rbp,%rax,4) I tracked that it differs in tree DSE dump. May I install the patch? Thanks, Martin > > Honza >
> Supported LTO compression algorithms: zlib zstd > gcc version 12.0.0 20211118 (experimental) (GCC) > /usr/bin/ld: ./xxx.ltrans0.ltrans.o: warning: relocation against `lm_read_ctl_dict_size_n_lmclass_used' in read-only section `.text' > /usr/bin/ld: ./xxx.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC > /usr/bin/ld: final link failed: bad value > collect2: error: ld returned 1 exit status > /usr/bin/ld: ./yyy.ltrans0.ltrans.o: warning: relocation against `__ckd_calloc___n_elem' in read-only section `.text' > /usr/bin/ld: ./yyy.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC > /usr/bin/ld: final link failed: bad value > collect2: error: ld returned 1 exit status > diff: yyy.ltrans0.ltrans*optimized: No such file or directory > 54,55d53 > < movslq lm_read_ctl_dict_size_n_lmclass_used(%rip), %rax > < movl $0, 0(%rbp,%rax,4) > > I tracked that it differs in tree DSE dump. > > May I install the patch? No, I think it is bug of symbol_summary that get is causing a difference. It should be pure function. I think it is: /* Getter for summary callgraph node pointer. */ T* get (cgraph_node *node) ATTRIBUTE_PURE { return exists (node) ? (*m_vector)[node->get_summary_id ()] : NULL; } It should not be using get_summary_id (which allocates it for no good reaosn) and simply check that summary_id is non-negative. Still I wonder how this can make code different - that looks like another bug somewhere. Honza
On 11/18/21 19:22, Jan Hubicka wrote: >> Supported LTO compression algorithms: zlib zstd >> gcc version 12.0.0 20211118 (experimental) (GCC) >> /usr/bin/ld: ./xxx.ltrans0.ltrans.o: warning: relocation against `lm_read_ctl_dict_size_n_lmclass_used' in read-only section `.text' >> /usr/bin/ld: ./xxx.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC >> /usr/bin/ld: final link failed: bad value >> collect2: error: ld returned 1 exit status >> /usr/bin/ld: ./yyy.ltrans0.ltrans.o: warning: relocation against `__ckd_calloc___n_elem' in read-only section `.text' >> /usr/bin/ld: ./yyy.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC >> /usr/bin/ld: final link failed: bad value >> collect2: error: ld returned 1 exit status >> diff: yyy.ltrans0.ltrans*optimized: No such file or directory >> 54,55d53 >> < movslq lm_read_ctl_dict_size_n_lmclass_used(%rip), %rax >> < movl $0, 0(%rbp,%rax,4) >> >> I tracked that it differs in tree DSE dump. >> >> May I install the patch? > > No, I think it is bug of symbol_summary that get is causing a > difference. Isn't problem that the following code past_flags.reserve_exact (summary->arg_flags.length ()); past_flags.splice (summary->arg_flags); past_retslot_flags = summary->retslot_flags; is guarded with if (dump_file && ... )? It should be pure function. I think it is: > /* Getter for summary callgraph node pointer. */ > T* get (cgraph_node *node) ATTRIBUTE_PURE > { > return exists (node) ? (*m_vector)[node->get_summary_id ()] : NULL; > } > It should not be using get_summary_id (which allocates it for no good > reaosn) and simply check that summary_id is non-negative. /* Get summary id of the node. */ inline int get_summary_id () { return m_summary_id; } Where do you see the allocation? Martin > > Still I wonder how this can make code different - that looks like > another bug somewhere. > > Honza >
> On 11/18/21 19:22, Jan Hubicka wrote: > > > Supported LTO compression algorithms: zlib zstd > > > gcc version 12.0.0 20211118 (experimental) (GCC) > > > /usr/bin/ld: ./xxx.ltrans0.ltrans.o: warning: relocation against `lm_read_ctl_dict_size_n_lmclass_used' in read-only section `.text' > > > /usr/bin/ld: ./xxx.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC > > > /usr/bin/ld: final link failed: bad value > > > collect2: error: ld returned 1 exit status > > > /usr/bin/ld: ./yyy.ltrans0.ltrans.o: warning: relocation against `__ckd_calloc___n_elem' in read-only section `.text' > > > /usr/bin/ld: ./yyy.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC > > > /usr/bin/ld: final link failed: bad value > > > collect2: error: ld returned 1 exit status > > > diff: yyy.ltrans0.ltrans*optimized: No such file or directory > > > 54,55d53 > > > < movslq lm_read_ctl_dict_size_n_lmclass_used(%rip), %rax > > > < movl $0, 0(%rbp,%rax,4) > > > > > > I tracked that it differs in tree DSE dump. > > > > > > May I install the patch? > > > > No, I think it is bug of symbol_summary that get is causing a > > difference. > > Isn't problem that the following code > > past_flags.reserve_exact (summary->arg_flags.length ()); > past_flags.splice (summary->arg_flags); > past_retslot_flags = summary->retslot_flags; Aha, that makes sense. Sorry. It used to be saved for dumping only while we now use it to merge old summaries with new. So it is indeed a (quite stupid) bug. The patch is OK. Thanks for finding it. Honza
On 11/18/21 19:29, Jan Hubicka wrote: >> On 11/18/21 19:22, Jan Hubicka wrote: >>>> Supported LTO compression algorithms: zlib zstd >>>> gcc version 12.0.0 20211118 (experimental) (GCC) >>>> /usr/bin/ld: ./xxx.ltrans0.ltrans.o: warning: relocation against `lm_read_ctl_dict_size_n_lmclass_used' in read-only section `.text' >>>> /usr/bin/ld: ./xxx.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC >>>> /usr/bin/ld: final link failed: bad value >>>> collect2: error: ld returned 1 exit status >>>> /usr/bin/ld: ./yyy.ltrans0.ltrans.o: warning: relocation against `__ckd_calloc___n_elem' in read-only section `.text' >>>> /usr/bin/ld: ./yyy.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC >>>> /usr/bin/ld: final link failed: bad value >>>> collect2: error: ld returned 1 exit status >>>> diff: yyy.ltrans0.ltrans*optimized: No such file or directory >>>> 54,55d53 >>>> < movslq lm_read_ctl_dict_size_n_lmclass_used(%rip), %rax >>>> < movl $0, 0(%rbp,%rax,4) >>>> >>>> I tracked that it differs in tree DSE dump. >>>> >>>> May I install the patch? >>> >>> No, I think it is bug of symbol_summary that get is causing a >>> difference. >> >> Isn't problem that the following code >> >> past_flags.reserve_exact (summary->arg_flags.length ()); >> past_flags.splice (summary->arg_flags); >> past_retslot_flags = summary->retslot_flags; > > Aha, that makes sense. Sorry. It used to be saved for dumping only > while we now use it to merge old summaries with new. So it is indeed a > (quite stupid) bug. Good :) Good. I thought I overlooked something. > > The patch is OK. Thanks for finding it. > Honza > You're welcome. Thanks for fast review. Martin
> > > > > > Isn't problem that the following code > > > > > > past_flags.reserve_exact (summary->arg_flags.length ()); > > > past_flags.splice (summary->arg_flags); > > > past_retslot_flags = summary->retslot_flags; > > > > Aha, that makes sense. Sorry. It used to be saved for dumping only > > while we now use it to merge old summaries with new. So it is indeed a > > (quite stupid) bug. > > Good :) Good. I thought I overlooked something. Hehe, I overlooked a hunk while breaking the patches into more independent changes. I planed a cleanup of this code after pushing out the new features. Pehraps a trivial parts of the cleanup can be done even in stage3. Honza
diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c index a3c7c6d6a1f..6cacf9c8ab1 100644 --- a/gcc/ipa-modref.c +++ b/gcc/ipa-modref.c @@ -2868,15 +2868,15 @@ analyze_function (function *f, bool ipa) optimization_summaries = modref_summaries::create_ggc (symtab); else /* Remove existing summary if we are re-running the pass. */ { - if (dump_file - && (summary - = optimization_summaries->get (fnode)) - != NULL + summary = optimization_summaries->get (fnode); + if (summary != NULL && summary->loads) { - fprintf (dump_file, "Past summary:\n"); - optimization_summaries->get - (fnode)->dump (dump_file); + if (dump_file) + { + fprintf (dump_file, "Past summary:\n"); + optimization_summaries->get (fnode)->dump (dump_file); + } past_flags.reserve_exact (summary->arg_flags.length ()); past_flags.splice (summary->arg_flags); past_retslot_flags = summary->retslot_flags;