Message ID | 20230525151632.3567825-1-maskray@google.com |
---|---|
State | New |
Series | [v2] i386: Allow -mlarge-data-threshold with -mcmodel=large |
On 25.05.2023 17:16, Fangrui Song wrote:
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -32942,9 +32942,10 @@ the cache line size. @samp{compat} is the default.
>
>  @opindex mlarge-data-threshold
>  @item -mlarge-data-threshold=@var{threshold}
> -When @option{-mcmodel=medium} is specified, data objects larger than
> -@var{threshold} are placed in the large data section. This value must be the
> -same across all objects linked into the binary, and defaults to 65535.
> +When @option{-mcmodel=medium} or @option{-mcmodel=large} is specified, data
> +objects larger than @var{threshold} are placed in large data sections. This
> +value must be the same across all objects linked into the binary, and defaults
> +to 65535.

Where's the "must be the same" requirement coming from?

As to the default - to remain compatible with earlier versions, shouldn't
large model code default to "infinity"?

Jan

On 2023-05-25, Jan Beulich wrote:
>On 25.05.2023 17:16, Fangrui Song wrote:
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -32942,9 +32942,10 @@ the cache line size. @samp{compat} is the default.
>>
>>  @opindex mlarge-data-threshold
>>  @item -mlarge-data-threshold=@var{threshold}
>> -When @option{-mcmodel=medium} is specified, data objects larger than
>> -@var{threshold} are placed in the large data section. This value must be the
>> -same across all objects linked into the binary, and defaults to 65535.
>> +When @option{-mcmodel=medium} or @option{-mcmodel=large} is specified, data
>> +objects larger than @var{threshold} are placed in large data sections. This
>> +value must be the same across all objects linked into the binary, and defaults
>> +to 65535.
>
>Where's the "must be the same" requirement coming from?

It's an existing requirement. I think it may be related to discouraging
different COMDAT section names due to different -mlarge-data-threshold=
values. I don't think it makes sense, but I did not feel strongly about
dropping it.

Happy to drop the requirement if I revise this patch.

>As to the default - to remain compatible with earlier versions, shouldn't
>large model code default to "infinity"?
>
>Jan

I have thought about this compatibility need and feel that it is very
unlikely to be needed. GNU ld has supported large data sections since
2005
(https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=3b22753a67cf616514de804ef6d5ed5e90a7d883).
Users' programs with the internal linker scripts will still work,
and -fdata-sections sections will be combined.

First, -mcmodel=large use cases are rare enough. So rare, perhaps, that
-mcmodel=large was considered a theoretical exercise in
trying to reach feature completion
(https://groups.google.com/g/x86-64-abi/c/jnQdJeabxiU/m/NNuA0P7pAQAJ).
Without this patch, -mcmodel=large object files don't interact well with
existing -mcmodel=small object files.

Moreover, if a user expects a specific section prefix with
-mcmodel=large, that's a brittle assumption. I think it's fair to say
that the fault is on the user side and GCC doesn't need to work around
their issues.

On 25.05.2023 18:11, Fangrui Song wrote:
> On 2023-05-25, Jan Beulich wrote:
>> On 25.05.2023 17:16, Fangrui Song wrote:
>>> --- a/gcc/doc/invoke.texi
>>> +++ b/gcc/doc/invoke.texi
>>> @@ -32942,9 +32942,10 @@ the cache line size. @samp{compat} is the default.
>>>
>>>  @opindex mlarge-data-threshold
>>>  @item -mlarge-data-threshold=@var{threshold}
>>> -When @option{-mcmodel=medium} is specified, data objects larger than
>>> -@var{threshold} are placed in the large data section. This value must be the
>>> -same across all objects linked into the binary, and defaults to 65535.
>>> +When @option{-mcmodel=medium} or @option{-mcmodel=large} is specified, data
>>> +objects larger than @var{threshold} are placed in large data sections. This
>>> +value must be the same across all objects linked into the binary, and defaults
>>> +to 65535.
>>
>> Where's the "must be the same" requirement coming from?
>
> It's an existing requirement. I think it may be related to discouraging
> different COMDAT section names due to different -mlarge-data-threshold=
> values. I don't think it makes sense, but I did not feel strongly about
> dropping it.
>
> Happy to drop the requirement if I revise this patch.

I understand that this isn't something you introduce, but it still struck
me as odd. Therefore I thought I'd suggest taking the opportunity to at
least soften the language, unless of course there's a real reason behind
it.

>> As to the default - to remain compatible with earlier versions, shouldn't
>> large model code default to "infinity"?
>>
>> Jan
>
> I have thought about this compatibility need and feel that it is very
> unlikely to be needed. GNU ld has supported large data sections since
> 2005
> (https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=3b22753a67cf616514de804ef6d5ed5e90a7d883).
> Users' programs with the internal linker scripts will still work,
> and -fdata-sections sections will be combined.

Well, the concern clearly is about custom scripts. Imo ...

> First, -mcmodel=large use cases are rare enough. So rare, perhaps, that
> -mcmodel=large was considered a theoretical exercise in
> trying to reach feature completion
> (https://groups.google.com/g/x86-64-abi/c/jnQdJeabxiU/m/NNuA0P7pAQAJ).
> Without this patch, -mcmodel=large object files don't interact well with
> existing -mcmodel=small object files.

... the more exotic a project, the more likely it is that they're using
custom scripts.

> Moreover, if a user expects a specific section prefix with
> -mcmodel=large, that's a brittle assumption. I think it's fair to say
> that the fault is on the user side and GCC doesn't need to work around
> their issues.

I guess I don't really see what you base this on. Without any special
options, expecting data to end up in .data/.bss/.rodata (and variants
thereof) looks like quite reasonable an assumption to me.

Jan

On Fri, May 26, 2023 at 12:11 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 25.05.2023 18:11, Fangrui Song wrote:
> > On 2023-05-25, Jan Beulich wrote:
> >> On 25.05.2023 17:16, Fangrui Song wrote:
> >>> --- a/gcc/doc/invoke.texi
> >>> +++ b/gcc/doc/invoke.texi
> >>> @@ -32942,9 +32942,10 @@ the cache line size. @samp{compat} is the default.
> >>>
> >>>  @opindex mlarge-data-threshold
> >>>  @item -mlarge-data-threshold=@var{threshold}
> >>> -When @option{-mcmodel=medium} is specified, data objects larger than
> >>> -@var{threshold} are placed in the large data section. This value must be the
> >>> -same across all objects linked into the binary, and defaults to 65535.
> >>> +When @option{-mcmodel=medium} or @option{-mcmodel=large} is specified, data
> >>> +objects larger than @var{threshold} are placed in large data sections. This
> >>> +value must be the same across all objects linked into the binary, and defaults
> >>> +to 65535.
> >>
> >> Where's the "must be the same" requirement coming from?
> >
> > It's an existing requirement. I think it may be related to discouraging
> > different COMDAT section names due to different -mlarge-data-threshold=
> > values. I don't think it makes sense, but I did not feel strongly about
> > dropping it.
> >
> > Happy to drop the requirement if I revise this patch.
>
> I understand that this isn't something you introduce, but it still struck
> me as odd. Therefore I thought I'd suggest taking the opportunity to at
> least soften the language, unless of course there's a real reason behind
> it.

Dropping "This value must be the same across all objects linked into the
binary" looks good to me.

> >> As to the default - to remain compatible with earlier versions, shouldn't
> >> large model code default to "infinity"?
> >>
> >> Jan
> >
> > I have thought about this compatibility need and feel that it is very
> > unlikely to be needed. GNU ld has supported large data sections since
> > 2005
> > (https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=3b22753a67cf616514de804ef6d5ed5e90a7d883).
> > Users' programs with the internal linker scripts will still work,
> > and -fdata-sections sections will be combined.
>
> Well, the concern clearly is about custom scripts. Imo ...
>
> > First, -mcmodel=large use cases are rare enough. So rare, perhaps, that
> > -mcmodel=large was considered a theoretical exercise in
> > trying to reach feature completion
> > (https://groups.google.com/g/x86-64-abi/c/jnQdJeabxiU/m/NNuA0P7pAQAJ).
> > Without this patch, -mcmodel=large object files don't interact well with
> > existing -mcmodel=small object files.
>
> ... the more exotic a project, the more likely it is that they're using
> custom scripts.
>
> > Moreover, if a user expects a specific section prefix with
> > -mcmodel=large, that's a brittle assumption. I think it's fair to say
> > that the fault is on the user side and GCC doesn't need to work around
> > their issues.
>
> I guess I don't really see what you base this on. Without any special
> options, expecting data to end up in .data/.bss/.rodata (and variants
> thereof) looks like quite reasonable an assumption to me.
>
> Jan

Making the -mlarge-data-threshold= default value differ between
-mcmodel=medium and -mcmodel=large seems quite odd to me.

The default value is 65536, which is larger than most data objects that
we may encounter in practice.

I wanted to investigate how often users use -mcmodel=large, but that is
quite difficult. Many uses are for AIX and/or PowerPC.

I have tried to be considerate, but I am not sure we have users in the
intersection of the three sets: using -mcmodel=large, having data objects
larger than 65536, and using a linker script in such a way that orphan
.ldata sections will cause trouble.

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 202abf0b39c..3568da4f053 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -637,7 +637,8 @@ ix86_can_inline_p (tree caller, tree callee)
 static bool
 ix86_in_large_data_p (tree exp)
 {
-  if (ix86_cmodel != CM_MEDIUM && ix86_cmodel != CM_MEDIUM_PIC)
+  if (ix86_cmodel != CM_MEDIUM && ix86_cmodel != CM_MEDIUM_PIC &&
+      ix86_cmodel != CM_LARGE && ix86_cmodel != CM_LARGE_PIC)
     return false;
 
   if (exp == NULL_TREE)
@@ -848,8 +849,9 @@ x86_elf_aligned_decl_common (FILE *file, tree decl,
                              const char *name, unsigned HOST_WIDE_INT size,
                              unsigned align)
 {
-  if ((ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_MEDIUM_PIC)
-      && size > (unsigned int)ix86_section_threshold)
+  if ((ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_MEDIUM_PIC ||
+       ix86_cmodel == CM_LARGE || ix86_cmodel == CM_LARGE_PIC) &&
+      size > (unsigned int)ix86_section_threshold)
     {
       switch_to_section (get_named_section (decl, ".lbss", 0));
       fputs (LARGECOMM_SECTION_ASM_OP, file);
@@ -869,9 +871,10 @@ void
 x86_output_aligned_bss (FILE *file, tree decl, const char *name,
                         unsigned HOST_WIDE_INT size, unsigned align)
 {
-  if ((ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_MEDIUM_PIC)
-      && size > (unsigned int)ix86_section_threshold)
-    switch_to_section (get_named_section (decl, ".lbss", 0));
+  if ((ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_MEDIUM_PIC ||
+       ix86_cmodel == CM_LARGE || ix86_cmodel == CM_LARGE_PIC) &&
+      size > (unsigned int)ix86_section_threshold)
+    switch_to_section(get_named_section(decl, ".lbss", 0));
   else
     switch_to_section (bss_section);
   ASM_OUTPUT_ALIGN (file, floor_log2 (align / BITS_PER_UNIT));
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index d74f6b1f8fc..de8e722cd62 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -282,7 +282,7 @@ Branches are this expensive (arbitrary units).
 
 mlarge-data-threshold=
 Target RejectNegative Joined UInteger Var(ix86_section_threshold) Init(DEFAULT_LARGE_SECTION_THRESHOLD)
--mlarge-data-threshold=<number> Data greater than given threshold will go into .ldata section in x86-64 medium model.
+-mlarge-data-threshold=<number> Data greater than given threshold will go into a large data section in x86-64 medium and large code models.
 
 mcmodel=
 Target RejectNegative Joined Enum(cmodel) Var(ix86_cmodel) Init(CM_32)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ee78591c73e..4b5391e12b5 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -32942,9 +32942,10 @@ the cache line size. @samp{compat} is the default.
 
 @opindex mlarge-data-threshold
 @item -mlarge-data-threshold=@var{threshold}
-When @option{-mcmodel=medium} is specified, data objects larger than
-@var{threshold} are placed in the large data section. This value must be the
-same across all objects linked into the binary, and defaults to 65535.
+When @option{-mcmodel=medium} or @option{-mcmodel=large} is specified, data
+objects larger than @var{threshold} are placed in large data sections. This
+value must be the same across all objects linked into the binary, and defaults
+to 65535.
 
 @opindex mrtd
 @item -mrtd
diff --git a/gcc/testsuite/gcc.target/i386/large-data.c b/gcc/testsuite/gcc.target/i386/large-data.c
new file mode 100644
index 00000000000..09a917431d4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/large-data.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O2 -mcmodel=large -mlarge-data-threshold=4" } */
+/* { dg-final { scan-assembler ".lbss" } } */
+/* { dg-final { scan-assembler ".bss" } } */
+/* { dg-final { scan-assembler ".ldata" } } */
+/* { dg-final { scan-assembler ".data" } } */
+/* { dg-final { scan-assembler ".lrodata" } } */
+/* { dg-final { scan-assembler ".rodata" } } */
+
+const char rodata_a[] = "abc", rodata_b[] = "abcd";
+char data_a[4] = {1}, data_b[5] = {1};
+char bss_a[4], bss_b[5];

When using -mcmodel=medium, data objects larger than the
-mlarge-data-threshold value are placed into large data sections
(.lrodata, .ldata, .lbss and some variants). GNU ld and ld.lld 17 place
.l* sections into separate output sections. If small and medium code
model object files are mixed, the .l* sections won't exert relocation
overflow pressure on sections in object files built with -mcmodel=small.

However, when using -mcmodel=large, -mlarge-data-threshold doesn't
apply. This means that the .rodata/.data/.bss sections may exert
relocation overflow pressure on sections in -mcmodel=small object files.

This patch allows -mcmodel=large to generate .l* sections.

Link: https://groups.google.com/g/x86-64-abi/c/jnQdJeabxiU ("Large data sections for the large code model")

Signed-off-by: Fangrui Song <maskray@google.com>

---
Changes from v1 (https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616947.html):
* Clarify commit message. Add link to https://groups.google.com/g/x86-64-abi/c/jnQdJeabxiU
---
 gcc/config/i386/i386.cc                    | 15 +++++++++------
 gcc/config/i386/i386.opt                   |  2 +-
 gcc/doc/invoke.texi                        |  7 ++++---
 gcc/testsuite/gcc.target/i386/large-data.c | 13 +++++++++++++
 4 files changed, 27 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/large-data.c