diff mbox series

[RFC] Introduce -msdata=explicit for powerpc

Message ID orin4miwjg.fsf@lxoliva.fsfla.org
State New
Headers show
Series [RFC] Introduce -msdata=explicit for powerpc | expand

Commit Message

Alexandre Oliva Aug. 7, 2018, 5:18 a.m. UTC
This option allows users to manually select what goes in the limited
small data range, and still get smaller and faster small data access
sequences for the selected data.


We've considered adding a new attribute, say "sdata", to let the
compiler pick the sdata/sbss section name itself, but that's not
strictly necessary since attribute section already does what we need.
I'm not sure how the semantics of conflicting attributes should be
implemented, but if others think it's a good idea, I could give it a
shot.  Like, if attribute((sdata, section("data"))) is given, should the
variable be placed in .data but be accessed using sdata addressing
modes?  Should we reject that with an error?  Should we warn and ignore
one of the attributes?  Something else?


I saw comments, docs and init code that suggested the possibility of
using r2/.sdata2 for small data, but I couldn't get code to be generated
for such access, even with very old toolchains.  I'm not sure I'm just
missing how this magic comes about, or whether it hasn't been available
for a very long time but nobody removed the comments/docs.  Assuming I'm
missing something, I put in the possibility of using r2 in the test in
the patch, but I'm sure I have not exercised it to make sure I got it
right.  Help?


I have not YET given this much testing.  I'm posting it so as to give
ppc maintainers an opportunity to comment on the proposed approach, in
hopes of getting buy-in for the idea, if not necessarily for the patch,
but I welcome alternate suggestions to enable users to choose what goes
in faster sdata when there's too much data for size-based assignment to
place interesting variables in sdata without overflowing its range.


for  gcc/ChangeLog

	* config/rs6000/rs6000-opts.h (SDATA_EXPLICIT): New enumerator.
	* config/rs6000/rs6000.c (rs6000_debug_reg_global): Handle it.
	(rs6000_file_start): Likewise.
	(rs6000_elf_in_small_data_p): Likewise.
	(SMALL_DATA_EABI_P): New, likewise.
	(SMALL_DATA_RELOC, SMALL_DATA_REG): Use it.
	* config/rs6000/sysv4.h (SUBTARGET_OVERRIDE_OPTIONS): Check for
	and set up SDATA_EXPLICIT.
	* config/rs6000/sysv4.opt: Add explicit to -msdata.
	* doc/invoke.texi (-msdata=explicit): Document it.

for  gcc/testsuite/ChangeLog

	* gcc.target/powerpc/ppc-sdata-3.c: New.

Comments

Segher Boessenkool Aug. 7, 2018, 10:50 a.m. UTC | #1
Hi!

On Tue, Aug 07, 2018 at 02:18:59AM -0300, Alexandre Oliva wrote:
> This option allows users to manually select what goes in the limited
> small data range, and still get smaller and faster small data access
> sequences for the selected data.

> We've considered adding a new attribute, say "sdata", to let the
> compiler pick the sdata/sbss section name itself, but that's not
> strictly necessary since attribute section already does what we need.
> I'm not sure how the semantics of conflicting attributes should be
> implemented, but if others think it's a good idea, I could give it a
> shot.  Like, if attribute((sdata, section("data"))) is given, should the
> variable be placed in .data but be accessed using sdata addressing
> modes?  Should we reject that with an error?  Should we warn and ignore
> one of the attributes?  Something else?

> I saw comments, docs and init code that suggested the possibility of
> using r2/.sdata2 for small data, but I couldn't get code to be generated
> for such access, even with very old toolchains.  I'm not sure I'm just
> missing how this magic comes about, or whether it hasn't been available
> for a very long time but nobody removed the comments/docs.  Assuming I'm
> missing something, I put in the possibility of using r2 in the test in
> the patch, but I'm sure I have not exercised it to make sure I got it
> right.  Help?

attribute(section("sdata2")))  works, perhaps.  Nothing is put there
implicitly afais.  What do the ABIs say?

> I have not YET given this much testing.  I'm posting it so as to give
> ppc maintainers an opportunity to comment on the proposed approach, in
> hopes of getting buy-in for the idea, if not necessarily for the patch,
> but I welcome alternate suggestions to enable users to choose what goes
> in faster sdata when there's too much data for size-based assignment to
> place interesting variables in sdata without overflowing its range.

There is 64kB of sdata, and the maximum size of an object to be put there
is 8 bytes by default.  That will require an awful lot of manual markup.

You can use different -msdata options per object file, maybe tune -G a
bit, and change that section attribute (or the target attribute) where
you want to (or does that not work?)

The approach looks like it should work, but it does not seem all that
convenient to me.  Well I'm sure you try it out on a reasonably sized
project, and you'll find out if it is handy or not :-)


Segher
Alexandre Oliva Aug. 8, 2018, 3:38 a.m. UTC | #2
On Aug  7, 2018, Segher Boessenkool <segher@kernel.crashing.org> wrote:

> Hi!

Hi!

> On Tue, Aug 07, 2018 at 02:18:59AM -0300, Alexandre Oliva wrote:

>> I saw comments, docs and init code that suggested the possibility of
>> using r2/.sdata2 for small data, but I couldn't get code to be generated
>> for such access, even with very old toolchains.  I'm not sure I'm just
>> missing how this magic comes about, or whether it hasn't been available
>> for a very long time but nobody removed the comments/docs.  Assuming I'm
>> missing something, I put in the possibility of using r2 in the test in
>> the patch, but I'm sure I have not exercised it to make sure I got it
>> right.  Help?

> attribute(section("sdata2")))  works, perhaps.  Nothing is put there
> implicitly afais.

Yeah, that's the point.  Docs say:

@item -msdata=eabi
@opindex msdata=eabi
On System V.4 and embedded PowerPC systems, put small initialized
@code{const} global and static data in the @code{.sdata2} section, which
is pointed to by register @code{r2}.  Put small initialized
non-@code{const} global and static data in the @code{.sdata} section,
which is pointed to by register @code{r13}.  [...]

and rs6000.c contains:

rs6000_elf_asm_init_sections (void)
{
[...]
  sdata2_section
    = get_unnamed_section (SECTION_WRITE, output_section_asm_op,
			   SDATA2_SECTION_ASM_OP);

but sdata2_section seems to be unused, and I don't see anything that
selects r2 over SMALL_DATA_REG.


>> I have not YET given this much testing.  I'm posting it so as to give
>> ppc maintainers an opportunity to comment on the proposed approach, in
>> hopes of getting buy-in for the idea, if not necessarily for the patch,
>> but I welcome alternate suggestions to enable users to choose what goes
>> in faster sdata when there's too much data for size-based assignment to
>> place interesting variables in sdata without overflowing its range.

> There is 64kB of sdata, and the maximum size of an object to be put there
> is 8 bytes by default.  That will require an awful lot of manual markup.

In this case, it's machine-generated code we're talking about.  IIUC one
large-ish data set is performance critical, and another data set that
would take them both over the edge isn't, though they're both used
together in the same translation units.

I suppose defining a fastint C macro / Fast_Integer Ada subtype in some
C header file or Ada package could make it a lot more appealing and easy
to use ;-)

> Well I'm sure you try it out on a reasonably sized
> project, and you'll find out if it is handy or not :-)

I wanted to run this design through port maintainers to check for
objections before offering it to the interested customer.  If you tell
me there's no fundamental objection, I will have it run through the
customer to confirm it will indeed meet their needs.

Thanks,
Segher Boessenkool Aug. 8, 2018, 1:45 p.m. UTC | #3
On Wed, Aug 08, 2018 at 12:38:54AM -0300, Alexandre Oliva wrote:
> On Aug  7, 2018, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> > On Tue, Aug 07, 2018 at 02:18:59AM -0300, Alexandre Oliva wrote:
> 
> >> I saw comments, docs and init code that suggested the possibility of
> >> using r2/.sdata2 for small data, but I couldn't get code to be generated
> >> for such access, even with very old toolchains.  I'm not sure I'm just
> >> missing how this magic comes about, or whether it hasn't been available
> >> for a very long time but nobody removed the comments/docs.  Assuming I'm
> >> missing something, I put in the possibility of using r2 in the test in
> >> the patch, but I'm sure I have not exercised it to make sure I got it
> >> right.  Help?
> 
> > attribute(section("sdata2")))  works, perhaps.  Nothing is put there
> > implicitly afais.
> 
> Yeah, that's the point.  Docs say:
> 
> @item -msdata=eabi
> @opindex msdata=eabi
> On System V.4 and embedded PowerPC systems, put small initialized
> @code{const} global and static data in the @code{.sdata2} section, which
> is pointed to by register @code{r2}.  Put small initialized
> non-@code{const} global and static data in the @code{.sdata} section,
> which is pointed to by register @code{r13}.  [...]
> 
> and rs6000.c contains:
> 
> rs6000_elf_asm_init_sections (void)
> {
> [...]
>   sdata2_section
>     = get_unnamed_section (SECTION_WRITE, output_section_asm_op,
> 			   SDATA2_SECTION_ASM_OP);
> 
> but sdata2_section seems to be unused, and I don't see anything that
> selects r2 over SMALL_DATA_REG.

So, you need to build a powerpc-eabi- toolchain, and use -msdata=eabi.
Then you get sdata2 used (via srodata in generic code), and it is accessed
via GPR2 (via the sda21 reloc and linker magic).  It is hard to trace down :-)

> >> I have not YET given this much testing.  I'm posting it so as to give
> >> ppc maintainers an opportunity to comment on the proposed approach, in
> >> hopes of getting buy-in for the idea, if not necessarily for the patch,
> >> but I welcome alternate suggestions to enable users to choose what goes
> >> in faster sdata when there's too much data for size-based assignment to
> >> place interesting variables in sdata without overflowing its range.
> 
> > There is 64kB of sdata, and the maximum size of an object to be put there
> > is 8 bytes by default.  That will require an awful lot of manual markup.
> 
> In this case, it's machine-generated code we're talking about.  IIUC one
> large-ish data set is performance critical, and another data set that
> would take them both over the edge isn't, though they're both used
> together in the same translation units.
> 
> I suppose defining a fastint C macro / Fast_Integer Ada subtype in some
> C header file or Ada package could make it a lot more appealing and easy
> to use ;-)
> 
> > Well I'm sure you try it out on a reasonably sized
> > project, and you'll find out if it is handy or not :-)
> 
> I wanted to run this design through port maintainers to check for
> objections before offering it to the interested customer.  If you tell
> me there's no fundamental objection, I will have it run through the
> customer to confirm it will indeed meet their needs.

It looks fine, not intrusive, and your use case is reasonable as well.
Shouldn't be a problem.


Segher
Alexandre Oliva Aug. 9, 2018, 5:06 a.m. UTC | #4
On Aug  7, 2018, Segher Boessenkool <segher@kernel.crashing.org> wrote:

> The approach looks like it should work, but it does not seem all that
> convenient to me.

And there's more, it's actually redundant.
-msdata -G 0 is equivalent to the proposed -msdata=explicit.
Patch withdrawn.  Sorry about the noise.
Alexandre Oliva Aug. 9, 2018, 6:23 a.m. UTC | #5
On Aug  8, 2018, Segher Boessenkool <segher@kernel.crashing.org> wrote:

> Then you get sdata2 used (via srodata in generic code), and it is accessed
> via GPR2 (via the sda21 reloc and linker magic).  It is hard to trace down :-)

Aah, it didn't occur to me that the r2 uses could be introduced by the
linker.  I was cutting corners and looking at the asm output, after
building only gcc, without binutils.  Thanks for the explanation.  Mind
if I add a comment about that next to... hmm...  how about the
SMALL_DATA_* macros?

for  gcc/ChangeLog

	* config/rs6000/rs6000.c (SMALL_DATA_RELOC, SMALL_DATA_REG): Add
	a comment about how uses of r2 for .sdata2 come about.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ddc61bdaffe7..bbc564de41ae 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -21240,6 +21240,9 @@ rs6000_output_function_entry (FILE *file, const char *fname)
 /* Print an operand.  Recognize special options, documented below.  */
 
 #if TARGET_ELF
+/* Access to .sdata2 through r2 (see -msdata=eabi in invoke.texi) is
+   only introduced by the linker, when applying the sda21
+   relocation.  */
 #define SMALL_DATA_RELOC ((rs6000_sdata == SDATA_EABI) ? "sda21" : "sdarel")
 #define SMALL_DATA_REG ((rs6000_sdata == SDATA_EABI) ? 0 : 13)
 #else
Segher Boessenkool Aug. 10, 2018, 2:01 p.m. UTC | #6
Hi Alexandre,

On Thu, Aug 09, 2018 at 03:23:12AM -0300, Alexandre Oliva wrote:
> On Aug  8, 2018, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> 
> > Then you get sdata2 used (via srodata in generic code), and it is accessed
> > via GPR2 (via the sda21 reloc and linker magic).  It is hard to trace down :-)
> 
> Aah, it didn't occur to me that the r2 uses could be introduced by the
> linker.  I was cutting corners and looking at the asm output, after
> building only gcc, without binutils.  Thanks for the explanation.  Mind
> if I add a comment about that next to... hmm...  how about the
> SMALL_DATA_* macros?

This is fine, please commit to trunk.  Thanks!


Segher
diff mbox series

Patch

diff --git a/gcc/config/rs6000/rs6000-opts.h b/gcc/config/rs6000/rs6000-opts.h
index a8194783e018..cc780f569308 100644
--- a/gcc/config/rs6000/rs6000-opts.h
+++ b/gcc/config/rs6000/rs6000-opts.h
@@ -120,7 +120,8 @@  enum rs6000_sdata_type {
   SDATA_NONE,			/* No small data support.  */
   SDATA_DATA,			/* Just put data in .sbss/.sdata, don't use relocs.  */
   SDATA_SYSV,			/* Use r13 to point to .sdata/.sbss.  */
-  SDATA_EABI			/* Use r13 like above, r2 points to .sdata2/.sbss2.  */
+  SDATA_EABI,			/* Use r13 like above, r2 points to .sdata2/.sbss2.  */
+  SDATA_EXPLICIT		/* Use r13 (or r2?) for explicit sdata.  */
 };
 
 /* Type of traceback to use.  */
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ddc61bdaffe7..a679709a055f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -2819,6 +2819,10 @@  rs6000_debug_reg_global (void)
       fprintf (stderr, DEBUG_FMT_S, "sdata", "eabi");
       break;
 
+    case SDATA_EXPLICIT:
+      fprintf (stderr, DEBUG_FMT_S, "sdata", "explicit");
+      break;
+
     }
 
   switch (rs6000_traceback)
@@ -6098,6 +6102,8 @@  rs6000_file_start (void)
 	case SDATA_DATA: fprintf (file, "%s -msdata=data", start); start = ""; break;
 	case SDATA_SYSV: fprintf (file, "%s -msdata=sysv", start); start = ""; break;
 	case SDATA_EABI: fprintf (file, "%s -msdata=eabi", start); start = ""; break;
+	case SDATA_EXPLICIT:
+	  fprintf (file, "%s -msdata=explicit", start); start = ""; break;
 	}
 
       if (rs6000_sdata && g_switch_value)
@@ -21240,8 +21246,10 @@  rs6000_output_function_entry (FILE *file, const char *fname)
 /* Print an operand.  Recognize special options, documented below.  */
 
 #if TARGET_ELF
-#define SMALL_DATA_RELOC ((rs6000_sdata == SDATA_EABI) ? "sda21" : "sdarel")
-#define SMALL_DATA_REG ((rs6000_sdata == SDATA_EABI) ? 0 : 13)
+#define SMALL_DATA_EABI_P (rs6000_sdata == SDATA_EABI \
+			   || (rs6000_sdata == SDATA_EXPLICIT && TARGET_EABI))
+#define SMALL_DATA_RELOC (SMALL_DATA_EABI_P ? "sda21" : "sdarel")
+#define SMALL_DATA_REG (SMALL_DATA_EABI_P ? 0 : 13)
 #else
 #define SMALL_DATA_RELOC "sda21"
 #define SMALL_DATA_REG 0
@@ -33221,6 +33229,11 @@  rs6000_elf_in_small_data_p (const_tree decl)
     }
   else
     {
+      /* Explicit mode means no implicit assignment to small data
+	 sections.  */
+      if (rs6000_sdata == SDATA_EXPLICIT)
+	return false;
+
       /* If we are told not to put readonly data in sdata, then don't.  */
       if (TREE_READONLY (decl) && rs6000_sdata != SDATA_EABI
 	  && !rs6000_readonly_in_sdata)
diff --git a/gcc/config/rs6000/sysv4.h b/gcc/config/rs6000/sysv4.h
index bb19d0dcd411..83f527c57af1 100644
--- a/gcc/config/rs6000/sysv4.h
+++ b/gcc/config/rs6000/sysv4.h
@@ -119,6 +119,8 @@  do {									\
 	rs6000_sdata = SDATA_DATA;					\
       else if (!strcmp (rs6000_sdata_name, "default"))			\
 	rs6000_sdata = (TARGET_EABI) ? SDATA_EABI : SDATA_SYSV;		\
+      else if (!strcmp (rs6000_sdata_name, "explicit"))			\
+	rs6000_sdata = SDATA_EXPLICIT;					\
       else if (!strcmp (rs6000_sdata_name, "sysv"))			\
 	rs6000_sdata = SDATA_SYSV;					\
       else if (!strcmp (rs6000_sdata_name, "eabi"))			\
diff --git a/gcc/config/rs6000/sysv4.opt b/gcc/config/rs6000/sysv4.opt
index 34fea0ddd08a..d11053e4bdaf 100644
--- a/gcc/config/rs6000/sysv4.opt
+++ b/gcc/config/rs6000/sysv4.opt
@@ -25,7 +25,7 @@  Target RejectNegative Joined Var(rs6000_abi_name)
 
 msdata=
 Target RejectNegative Joined Var(rs6000_sdata_name)
--msdata=[none,data,sysv,eabi]	Select method for sdata handling.
+-msdata=[none,data,sysv,eabi,explicit]	Select method for sdata handling.
 
 mreadonly-in-sdata
 Target Report Var(rs6000_readonly_in_sdata) Init(1) Save
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 444159178a76..86dbb1d8700d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -22824,6 +22824,16 @@  On embedded PowerPC systems, put all initialized global and static data
 in the @code{.data} section, and all uninitialized data in the
 @code{.bss} section.
 
+@item -msdata=explicit
+@itemx -msdata
+@opindex msdata=explicit
+@opindex msdata
+Put all initialized global and static data that is not assigned to an
+explicit section in the @code{.data} section, and all uninitialized data
+that is not assigned to an explicit section in the @code{.bss} section.
+Use small data access patterns to access variables explicitly assigned
+to small data sections.
+
 @item -mblock-move-inline-limit=@var{num}
 @opindex mblock-move-inline-limit
 Inline all block moves (such as calls to @code{memcpy} or structure
diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-sdata-3.c b/gcc/testsuite/gcc.target/powerpc/ppc-sdata-3.c
new file mode 100644
index 000000000000..750e5ad234af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/ppc-sdata-3.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-options "-msdata=explicit" } */
+/* { dg-require-effective-target powerpc_eabi_ok } */
+/* { dg-require-effective-target nonpic } */
+/* { dg-final { scan-assembler "\\.section\[ \t\]\\.sdata," } } */
+/* { dg-final { scan-assembler "\\.section\[ \t\]\\.sbss," } } */
+/* { dg-final { scan-assembler "\\.section\[ \t\]\\.sdata2," } } */
+/* { dg-final { scan-assembler "\\.section\[ \t\]\\.sbss2," } } */
+/* { dg-final { scan-assembler "i@ha" } } */
+/* { dg-final { scan-assembler "j@sda(rel|21)\\((13|0)\\)" } } */
+/* { dg-final { scan-assembler "k@sda(rel|21)\\((13|0)\\)" } } */
+/* { dg-final { scan-assembler "l@sda(rel|21)\\((2|13|0)\\)" } } */
+/* { dg-final { scan-assembler "m@sda(rel|21)\\((2|13|0)\\)" } } */
+
+static int i;
+static int j __attribute__((section(".sbss")));
+static int k __attribute__((section(".sdata"))) = 2;
+static int l __attribute__((section(".sbss2")));
+static int m __attribute__((section(".sdata2"))) = 4;
+int f(void) {
+  return i + j + k + l + m;
+}