Message ID | orin4miwjg.fsf@lxoliva.fsfla.org |
---|---|
State | New |
Headers | show |
Series | [RFC] Introduce -msdata=explicit for powerpc | expand |
Hi! On Tue, Aug 07, 2018 at 02:18:59AM -0300, Alexandre Oliva wrote: > This option allows users to manually select what goes in the limited > small data range, and still get smaller and faster small data access > sequences for the selected data. > We've considered adding a new attribute, say "sdata", to let the > compiler pick the sdata/sbss section name itself, but that's not > strictly necessary since attribute section already does what we need. > I'm not sure how the semantics of conflicting attributes should be > implemented, but if others think it's a good idea, I could give it a > shot. Like, if attribute((sdata, section("data"))) is given, should the > variable be placed in .data but be accessed using sdata addressing > modes? Should we reject that with an error? Should we warn and ignore > one of the attributes? Something else? > I saw comments, docs and init code that suggested the possibility of > using r2/.sdata2 for small data, but I couldn't get code to be generated > for such access, even with very old toolchains. I'm not sure I'm just > missing how this magic comes about, or whether it hasn't been available > for a very long time but nobody removed the comments/docs. Assuming I'm > missing something, I put in the possibility of using r2 in the test in > the patch, but I'm sure I have not exercised it to make sure I got it > right. Help? attribute(section("sdata2"))) works, perhaps. Nothing is put there implicitly afais. What do the ABIs say? > I have not YET given this much testing. I'm posting it so as to give > ppc maintainers an opportunity to comment on the proposed approach, in > hopes of getting buy-in for the idea, if not necessarily for the patch, > but I welcome alternate suggestions to enable users to choose what goes > in faster sdata when there's too much data for size-based assignment to > place interesting variables in sdata without overflowing its range. There is 64kB of sdata, and the maximum size of an object to be put there is 8 bytes by default. That will require an awful lot of manual markup. You can use different -msdata options per object file, maybe tune -G a bit, and change that section attribute (or the target attribute) where you want to (or does that not work?) The approach looks like it should work, but it does not seem all that convenient to me. Well I'm sure you try it out on a reasonably sized project, and you'll find out if it is handy or not :-) Segher
On Aug 7, 2018, Segher Boessenkool <segher@kernel.crashing.org> wrote: > Hi! Hi! > On Tue, Aug 07, 2018 at 02:18:59AM -0300, Alexandre Oliva wrote: >> I saw comments, docs and init code that suggested the possibility of >> using r2/.sdata2 for small data, but I couldn't get code to be generated >> for such access, even with very old toolchains. I'm not sure I'm just >> missing how this magic comes about, or whether it hasn't been available >> for a very long time but nobody removed the comments/docs. Assuming I'm >> missing something, I put in the possibility of using r2 in the test in >> the patch, but I'm sure I have not exercised it to make sure I got it >> right. Help? > attribute(section("sdata2"))) works, perhaps. Nothing is put there > implicitly afais. Yeah, that's the point. Docs say: @item -msdata=eabi @opindex msdata=eabi On System V.4 and embedded PowerPC systems, put small initialized @code{const} global and static data in the @code{.sdata2} section, which is pointed to by register @code{r2}. Put small initialized non-@code{const} global and static data in the @code{.sdata} section, which is pointed to by register @code{r13}. [...] and rs6000.c contains: rs6000_elf_asm_init_sections (void) { [...] sdata2_section = get_unnamed_section (SECTION_WRITE, output_section_asm_op, SDATA2_SECTION_ASM_OP); but sdata2_section seems to be unused, and I don't see anything that selects r2 over SMALL_DATA_REG. >> I have not YET given this much testing. I'm posting it so as to give >> ppc maintainers an opportunity to comment on the proposed approach, in >> hopes of getting buy-in for the idea, if not necessarily for the patch, >> but I welcome alternate suggestions to enable users to choose what goes >> in faster sdata when there's too much data for size-based assignment to >> place interesting variables in sdata without overflowing its range. > There is 64kB of sdata, and the maximum size of an object to be put there > is 8 bytes by default. That will require an awful lot of manual markup. In this case, it's machine-generated code we're talking about. IIUC one large-ish data set is performance critical, and another data set that would take them both over the edge isn't, though they're both used together in the same translation units. I suppose defining a fastint C macro / Fast_Integer Ada subtype in some C header file or Ada package could make it a lot more appealing and easy to use ;-) > Well I'm sure you try it out on a reasonably sized > project, and you'll find out if it is handy or not :-) I wanted to run this design through port maintainers to check for objections before offering it to the interested customer. If you tell me there's no fundamental objection, I will have it run through the customer to confirm it will indeed meet their needs. Thanks,
On Wed, Aug 08, 2018 at 12:38:54AM -0300, Alexandre Oliva wrote: > On Aug 7, 2018, Segher Boessenkool <segher@kernel.crashing.org> wrote: > > On Tue, Aug 07, 2018 at 02:18:59AM -0300, Alexandre Oliva wrote: > > >> I saw comments, docs and init code that suggested the possibility of > >> using r2/.sdata2 for small data, but I couldn't get code to be generated > >> for such access, even with very old toolchains. I'm not sure I'm just > >> missing how this magic comes about, or whether it hasn't been available > >> for a very long time but nobody removed the comments/docs. Assuming I'm > >> missing something, I put in the possibility of using r2 in the test in > >> the patch, but I'm sure I have not exercised it to make sure I got it > >> right. Help? > > > attribute(section("sdata2"))) works, perhaps. Nothing is put there > > implicitly afais. > > Yeah, that's the point. Docs say: > > @item -msdata=eabi > @opindex msdata=eabi > On System V.4 and embedded PowerPC systems, put small initialized > @code{const} global and static data in the @code{.sdata2} section, which > is pointed to by register @code{r2}. Put small initialized > non-@code{const} global and static data in the @code{.sdata} section, > which is pointed to by register @code{r13}. [...] > > and rs6000.c contains: > > rs6000_elf_asm_init_sections (void) > { > [...] > sdata2_section > = get_unnamed_section (SECTION_WRITE, output_section_asm_op, > SDATA2_SECTION_ASM_OP); > > but sdata2_section seems to be unused, and I don't see anything that > selects r2 over SMALL_DATA_REG. So, you need to build a powerpc-eabi- toolchain, and use -msdata=eabi. Then you get sdata2 used (via srodata in generic code), and it is accessed via GPR2 (via the sda21 reloc and linker magic). It is hard to trace down :-) > >> I have not YET given this much testing. I'm posting it so as to give > >> ppc maintainers an opportunity to comment on the proposed approach, in > >> hopes of getting buy-in for the idea, if not necessarily for the patch, > >> but I welcome alternate suggestions to enable users to choose what goes > >> in faster sdata when there's too much data for size-based assignment to > >> place interesting variables in sdata without overflowing its range. > > > There is 64kB of sdata, and the maximum size of an object to be put there > > is 8 bytes by default. That will require an awful lot of manual markup. > > In this case, it's machine-generated code we're talking about. IIUC one > large-ish data set is performance critical, and another data set that > would take them both over the edge isn't, though they're both used > together in the same translation units. > > I suppose defining a fastint C macro / Fast_Integer Ada subtype in some > C header file or Ada package could make it a lot more appealing and easy > to use ;-) > > > Well I'm sure you try it out on a reasonably sized > > project, and you'll find out if it is handy or not :-) > > I wanted to run this design through port maintainers to check for > objections before offering it to the interested customer. If you tell > me there's no fundamental objection, I will have it run through the > customer to confirm it will indeed meet their needs. It looks fine, not intrusive, and your use case is reasonable as well. Shouldn't be a problem. Segher
On Aug 7, 2018, Segher Boessenkool <segher@kernel.crashing.org> wrote: > The approach looks like it should work, but it does not seem all that > convenient to me. And there's more, it's actually redundant. -msdata -G 0 is equivalent to the proposed -msdata=explicit. Patch withdrawn. Sorry about the noise.
On Aug 8, 2018, Segher Boessenkool <segher@kernel.crashing.org> wrote: > Then you get sdata2 used (via srodata in generic code), and it is accessed > via GPR2 (via the sda21 reloc and linker magic). It is hard to trace down :-) Aah, it didn't occur to me that the r2 uses could be introduced by the linker. I was cutting corners and looking at the asm output, after building only gcc, without binutils. Thanks for the explanation. Mind if I add a comment about that next to... hmm... how about the SMALL_DATA_* macros? for gcc/ChangeLog * config/rs6000/rs6000.c (SMALL_DATA_RELOC, SMALL_DATA_REG): Add a comment about how uses of r2 for .sdata2 come about. diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index ddc61bdaffe7..bbc564de41ae 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -21240,6 +21240,9 @@ rs6000_output_function_entry (FILE *file, const char *fname) /* Print an operand. Recognize special options, documented below. */ #if TARGET_ELF +/* Access to .sdata2 through r2 (see -msdata=eabi in invoke.texi) is + only introduced by the linker, when applying the sda21 + relocation. */ #define SMALL_DATA_RELOC ((rs6000_sdata == SDATA_EABI) ? "sda21" : "sdarel") #define SMALL_DATA_REG ((rs6000_sdata == SDATA_EABI) ? 0 : 13) #else
Hi Alexandre, On Thu, Aug 09, 2018 at 03:23:12AM -0300, Alexandre Oliva wrote: > On Aug 8, 2018, Segher Boessenkool <segher@kernel.crashing.org> wrote: > > > Then you get sdata2 used (via srodata in generic code), and it is accessed > > via GPR2 (via the sda21 reloc and linker magic). It is hard to trace down :-) > > Aah, it didn't occur to me that the r2 uses could be introduced by the > linker. I was cutting corners and looking at the asm output, after > building only gcc, without binutils. Thanks for the explanation. Mind > if I add a comment about that next to... hmm... how about the > SMALL_DATA_* macros? This is fine, please commit to trunk. Thanks! Segher
diff --git a/gcc/config/rs6000/rs6000-opts.h b/gcc/config/rs6000/rs6000-opts.h index a8194783e018..cc780f569308 100644 --- a/gcc/config/rs6000/rs6000-opts.h +++ b/gcc/config/rs6000/rs6000-opts.h @@ -120,7 +120,8 @@ enum rs6000_sdata_type { SDATA_NONE, /* No small data support. */ SDATA_DATA, /* Just put data in .sbss/.sdata, don't use relocs. */ SDATA_SYSV, /* Use r13 to point to .sdata/.sbss. */ - SDATA_EABI /* Use r13 like above, r2 points to .sdata2/.sbss2. */ + SDATA_EABI, /* Use r13 like above, r2 points to .sdata2/.sbss2. */ + SDATA_EXPLICIT /* Use r13 (or r2?) for explicit sdata. */ }; /* Type of traceback to use. */ diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index ddc61bdaffe7..a679709a055f 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -2819,6 +2819,10 @@ rs6000_debug_reg_global (void) fprintf (stderr, DEBUG_FMT_S, "sdata", "eabi"); break; + case SDATA_EXPLICIT: + fprintf (stderr, DEBUG_FMT_S, "sdata", "explicit"); + break; + } switch (rs6000_traceback) @@ -6098,6 +6102,8 @@ rs6000_file_start (void) case SDATA_DATA: fprintf (file, "%s -msdata=data", start); start = ""; break; case SDATA_SYSV: fprintf (file, "%s -msdata=sysv", start); start = ""; break; case SDATA_EABI: fprintf (file, "%s -msdata=eabi", start); start = ""; break; + case SDATA_EXPLICIT: + fprintf (file, "%s -msdata=explicit", start); start = ""; break; } if (rs6000_sdata && g_switch_value) @@ -21240,8 +21246,10 @@ rs6000_output_function_entry (FILE *file, const char *fname) /* Print an operand. Recognize special options, documented below. */ #if TARGET_ELF -#define SMALL_DATA_RELOC ((rs6000_sdata == SDATA_EABI) ? "sda21" : "sdarel") -#define SMALL_DATA_REG ((rs6000_sdata == SDATA_EABI) ? 0 : 13) +#define SMALL_DATA_EABI_P (rs6000_sdata == SDATA_EABI \ + || (rs6000_sdata == SDATA_EXPLICIT && TARGET_EABI)) +#define SMALL_DATA_RELOC (SMALL_DATA_EABI_P ? "sda21" : "sdarel") +#define SMALL_DATA_REG (SMALL_DATA_EABI_P ? 0 : 13) #else #define SMALL_DATA_RELOC "sda21" #define SMALL_DATA_REG 0 @@ -33221,6 +33229,11 @@ rs6000_elf_in_small_data_p (const_tree decl) } else { + /* Explicit mode means no implicit assignment to small data + sections. */ + if (rs6000_sdata == SDATA_EXPLICIT) + return false; + /* If we are told not to put readonly data in sdata, then don't. */ if (TREE_READONLY (decl) && rs6000_sdata != SDATA_EABI && !rs6000_readonly_in_sdata) diff --git a/gcc/config/rs6000/sysv4.h b/gcc/config/rs6000/sysv4.h index bb19d0dcd411..83f527c57af1 100644 --- a/gcc/config/rs6000/sysv4.h +++ b/gcc/config/rs6000/sysv4.h @@ -119,6 +119,8 @@ do { \ rs6000_sdata = SDATA_DATA; \ else if (!strcmp (rs6000_sdata_name, "default")) \ rs6000_sdata = (TARGET_EABI) ? SDATA_EABI : SDATA_SYSV; \ + else if (!strcmp (rs6000_sdata_name, "explicit")) \ + rs6000_sdata = SDATA_EXPLICIT; \ else if (!strcmp (rs6000_sdata_name, "sysv")) \ rs6000_sdata = SDATA_SYSV; \ else if (!strcmp (rs6000_sdata_name, "eabi")) \ diff --git a/gcc/config/rs6000/sysv4.opt b/gcc/config/rs6000/sysv4.opt index 34fea0ddd08a..d11053e4bdaf 100644 --- a/gcc/config/rs6000/sysv4.opt +++ b/gcc/config/rs6000/sysv4.opt @@ -25,7 +25,7 @@ Target RejectNegative Joined Var(rs6000_abi_name) msdata= Target RejectNegative Joined Var(rs6000_sdata_name) --msdata=[none,data,sysv,eabi] Select method for sdata handling. +-msdata=[none,data,sysv,eabi,explicit] Select method for sdata handling. mreadonly-in-sdata Target Report Var(rs6000_readonly_in_sdata) Init(1) Save diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 444159178a76..86dbb1d8700d 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -22824,6 +22824,16 @@ On embedded PowerPC systems, put all initialized global and static data in the @code{.data} section, and all uninitialized data in the @code{.bss} section. +@item -msdata=explicit +@itemx -msdata +@opindex msdata=explicit +@opindex msdata +Put all initialized global and static data that is not assigned to an +explicit section in the @code{.data} section, and all uninitialized data +that is not assigned to an explicit section in the @code{.bss} section. +Use small data access patterns to access variables explicitly assigned +to small data sections. + @item -mblock-move-inline-limit=@var{num} @opindex mblock-move-inline-limit Inline all block moves (such as calls to @code{memcpy} or structure diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-sdata-3.c b/gcc/testsuite/gcc.target/powerpc/ppc-sdata-3.c new file mode 100644 index 000000000000..750e5ad234af --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/ppc-sdata-3.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-msdata=explicit" } */ +/* { dg-require-effective-target powerpc_eabi_ok } */ +/* { dg-require-effective-target nonpic } */ +/* { dg-final { scan-assembler "\\.section\[ \t\]\\.sdata," } } */ +/* { dg-final { scan-assembler "\\.section\[ \t\]\\.sbss," } } */ +/* { dg-final { scan-assembler "\\.section\[ \t\]\\.sdata2," } } */ +/* { dg-final { scan-assembler "\\.section\[ \t\]\\.sbss2," } } */ +/* { dg-final { scan-assembler "i@ha" } } */ +/* { dg-final { scan-assembler "j@sda(rel|21)\\((13|0)\\)" } } */ +/* { dg-final { scan-assembler "k@sda(rel|21)\\((13|0)\\)" } } */ +/* { dg-final { scan-assembler "l@sda(rel|21)\\((2|13|0)\\)" } } */ +/* { dg-final { scan-assembler "m@sda(rel|21)\\((2|13|0)\\)" } } */ + +static int i; +static int j __attribute__((section(".sbss"))); +static int k __attribute__((section(".sdata"))) = 2; +static int l __attribute__((section(".sbss2"))); +static int m __attribute__((section(".sdata2"))) = 4; +int f(void) { + return i + j + k + l + m; +}