diff mbox series

[v3] Make function clone name numbering independent.

Message ID ded14873-6181-849f-6a3e-faf804d3f097@oracle.com
State New
Headers show
Series [v3] Make function clone name numbering independent. | expand

Commit Message

Michael Ploujnikov Nov. 28, 2018, 9:09 p.m. UTC
Continuing from https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00228.html

It took longer than expected, but I've finally rebased this on top of
the new clone_function_name* API and included the requested
optimizations.

There are a few remaining spots where we could probably apply similar
optimizations:

- gcc/multiple_target.c:create_target_clone
- gcc/multiple_target.c:create_dispatcher_calls
- gcc/omp-expand.c:grid_expand_target_grid_body

But I've yet to figure out how these work in detail and with the new
API these shouldn't block the main change from being merged.

I've also included a small change to rs6000 which I'm pretty sure is
safe, but I have no way of testing.

I'm not sure what's the consensus on what's more appropriate, but I
thought that it would be a good idea to keep these changes as separate
patches since only the first one really changes behaviour, while the
rest are optimizations. It's conceivable that someone who is
backporting this to an older release might want to just backport the
core bits of the change. I can re-submit it as one patch if that's
more appropriate.

Everything in these patches was bootstrapped and regression tested on
x86_64.

Ok for trunk?

- Michael

Comments

Segher Boessenkool Nov. 28, 2018, 10:49 p.m. UTC | #1
Hi!

On Wed, Nov 28, 2018 at 04:09:14PM -0500, Michael Ploujnikov wrote:
> I've also included a small change to rs6000 which I'm pretty sure is
> safe, but I have no way of testing.

Do you have an account on the GCC Compile Farm?
https://cfarm.tetaneutral.net/
There are some nice Power machines in there!

Does this patch dependent on the rest of the series?

If it tests okay, it is okay for trunk of course.  Thanks!

One comment about your changelog:

> 2018-11-28  Michael Ploujnikov  <michael.ploujnikov@oracle.com>
> 
> 	* config/rs6000/rs6000.c (make_resolver_func): Generate
> 	  resolver symbol with clone_function_name instead of
> 	  clone_function_name_numbered.

Those last two lines should not start with the spaces.  It should be:

2018-11-28  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	* config/rs6000/rs6000.c (make_resolver_func): Generate
	resolver symbol with clone_function_name instead of
	clone_function_name_numbered.


Segher
Michael Ploujnikov Nov. 29, 2018, 2:13 p.m. UTC | #2
On 2018-11-28 5:49 p.m., Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Nov 28, 2018 at 04:09:14PM -0500, Michael Ploujnikov wrote:
>> I've also included a small change to rs6000 which I'm pretty sure is
>> safe, but I have no way of testing.
> 
> Do you have an account on the GCC Compile Farm?
> https://cfarm.tetaneutral.net/
> There are some nice Power machines in there!
> 
> Does this patch dependent on the rest of the series?
> 
> If it tests okay, it is okay for trunk of course.  Thanks!
> 
> One comment about your changelog:
> 
>> 2018-11-28  Michael Ploujnikov  <michael.ploujnikov@oracle.com>
>>
>> 	* config/rs6000/rs6000.c (make_resolver_func): Generate
>> 	  resolver symbol with clone_function_name instead of
>> 	  clone_function_name_numbered.
> 
> Those last two lines should not start with the spaces.  It should be:
> 
> 2018-11-28  Michael Ploujnikov  <michael.ploujnikov@oracle.com>
> 
> 	* config/rs6000/rs6000.c (make_resolver_func): Generate
> 	resolver symbol with clone_function_name instead of
> 	clone_function_name_numbered.
> 
> 
> Segher
> 

Thanks for the review and suggestion to use the cfarm! I've
successfully regtested this on power9 and committed the stand-alone
change to rs6000.c after fixing the changelog.

- Michael
Richard Biener Nov. 30, 2018, 7:25 a.m. UTC | #3
On Wed, Nov 28, 2018 at 10:09 PM Michael Ploujnikov
<michael.ploujnikov@oracle.com> wrote:
>
> Continuing from https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00228.html
>
> It took longer than expected, but I've finally rebased this on top of
> the new clone_function_name* API and included the requested
> optimizations.
>
> There are a few remaining spots where we could probably apply similar
> optimizations:
>
> - gcc/multiple_target.c:create_target_clone
> - gcc/multiple_target.c:create_dispatcher_calls
> - gcc/omp-expand.c:grid_expand_target_grid_body
>
> But I've yet to figure out how these work in detail and with the new
> API these shouldn't block the main change from being merged.
>
> I've also included a small change to rs6000 which I'm pretty sure is
> safe, but I have no way of testing.
>
> I'm not sure what's the consensus on what's more appropriate, but I
> thought that it would be a good idea to keep these changes as separate
> patches since only the first one really changes behaviour, while the
> rest are optimizations. It's conceivable that someone who is
> backporting this to an older release might want to just backport the
> core bits of the change. I can re-submit it as one patch if that's
> more appropriate.
>
> Everything in these patches was bootstrapped and regression tested on
> x86_64.
>
> Ok for trunk?

+  unsigned &clone_number = lto_clone_numbers->get_or_insert (
+     IDENTIFIER_POINTER (DECL_NAME (decl)));
   name = maybe_rewrite_identifier (name);
   symtab->change_decl_assembler_name (decl,
-      clone_function_name_numbered (
-  name, "lto_priv"));
+      clone_function_name (
+  name, "lto_priv", clone_number));

since we use 'name' from maybe_rewrite_identifier in the end wouldn't it
make more sense to use that as a key for lto_clone_numbers?

For the ipa-hsa.c changes since we iterate over all defined functions
we at most create a single clone per cgraph_node.  That means your
hash-map keyed on cgraph_node is useless and you could use
an unnumbered clone (or a static zero number).

-  SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name_numbered (old_decl,
-   suffix));
+  SET_DECL_ASSEMBLER_NAME (new_decl,
+   clone_function_name (
+     IDENTIFIER_POINTER (
+       DECL_ASSEMBLER_NAME (old_decl)),
+     suffix,
+     num_suffix));

can you please hide the implementation detail of accessing the assembler name
by adding an overload to clone_function_name_numbered with an explicit
number?

OK with those changes.

Thanks,
Richard.



>
> - Michael
Michael Ploujnikov Nov. 30, 2018, 9:10 p.m. UTC | #4
Hi,

On 2018-11-30 2:25 a.m., Richard Biener wrote:
> +  unsigned &clone_number = lto_clone_numbers->get_or_insert (
> +     IDENTIFIER_POINTER (DECL_NAME (decl)));
>    name = maybe_rewrite_identifier (name);
>    symtab->change_decl_assembler_name (decl,
> -      clone_function_name_numbered (
> -  name, "lto_priv"));
> +      clone_function_name (
> +  name, "lto_priv", clone_number));
> 
> since we use 'name' from maybe_rewrite_identifier in the end wouldn't it
> make more sense to use that as a key for lto_clone_numbers?

Yup, that looks much better. Fixed it.

> For the ipa-hsa.c changes since we iterate over all defined functions
> we at most create a single clone per cgraph_node.  That means your
> hash-map keyed on cgraph_node is useless and you could use
> an unnumbered clone (or a static zero number).

Good catch. I was thinking of doing this, but it somehow fell through the cracks during the rebase.

> 
> -  SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name_numbered (old_decl,
> -   suffix));
> +  SET_DECL_ASSEMBLER_NAME (new_decl,
> +   clone_function_name (
> +     IDENTIFIER_POINTER (
> +       DECL_ASSEMBLER_NAME (old_decl)),
> +     suffix,
> +     num_suffix));
> 
> can you please hide the implementation detail of accessing the assembler name
> by adding an overload to clone_function_name_numbered with an explicit
> number?

Done.


> 
> OK with those changes.
> 
> Thanks,
> Richard.

Thank you for the review. I will commit as soon as my last test run finishes.

- Michael
From ac1f1579d37804c97833d460ec6cd5b87d6184c7 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov <michael.ploujnikov@oracle.com>
Date: Thu, 1 Nov 2018 12:57:30 -0400
Subject: [PATCH 1/3] Make function assembly more independent.

This is achieved by having clone_function_name assign unique clone
numbers for each function independently.

gcc:

2018-11-30  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	* cgraphclones.c: Replaced clone_fn_id_num with clone_fn_ids;
	hash map.
	(clone_function_name_numbered): Use clone_fn_ids.

gcc/testsuite:

2018-11-30  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	* gcc.dg/independent-cloneids-1.c: New test.
---
 gcc/cgraphclones.c                            | 10 ++++-
 gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 +++++++++++++++++++
 2 files changed, 46 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c

diff --git gcc/cgraphclones.c gcc/cgraphclones.c
index e17959c0ca..fdd84b60d3 100644
--- gcc/cgraphclones.c
+++ gcc/cgraphclones.c
@@ -513,7 +513,7 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count,
   return new_node;
 }
 
-static GTY(()) unsigned int clone_fn_id_num;
+static GTY(()) hash_map<const char *, unsigned> *clone_fn_ids;
 
 /* Return a new assembler name for a clone of decl named NAME.  Apart
    from the string SUFFIX, the new name will end with a unique (for
@@ -525,7 +525,13 @@ static GTY(()) unsigned int clone_fn_id_num;
 tree
 clone_function_name_numbered (const char *name, const char *suffix)
 {
-  return clone_function_name (name, suffix, clone_fn_id_num++);
+  /* Initialize the function->counter mapping the first time it's
+     needed.  */
+  if (!clone_fn_ids)
+    clone_fn_ids = hash_map<const char *, unsigned int>::create_ggc (64);
+  unsigned int &suffix_counter = clone_fn_ids->get_or_insert (
+				   IDENTIFIER_POINTER (get_identifier (name)));
+  return clone_function_name (name, suffix, suffix_counter++);
 }
 
 /* Return a new assembler name for a clone of DECL.  Apart from string
diff --git gcc/testsuite/gcc.dg/independent-cloneids-1.c gcc/testsuite/gcc.dg/independent-cloneids-1.c
new file mode 100644
index 0000000000..3203895492
--- /dev/null
+++ gcc/testsuite/gcc.dg/independent-cloneids-1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fipa-cp -fipa-cp-clone"  } */
+
+extern int printf (const char *, ...);
+
+static int __attribute__ ((noinline))
+foo (int arg)
+{
+  return 7 * arg;
+}
+
+static int __attribute__ ((noinline))
+bar (int arg)
+{
+  return arg * arg;
+}
+
+int
+baz (int arg)
+{
+  printf("%d\n", bar (3));
+  printf("%d\n", bar (4));
+  printf("%d\n", foo (5));
+  printf("%d\n", foo (6));
+  /* adding or removing the following call should not affect foo
+     function's clone numbering */
+  printf("%d\n", bar (7));
+  return foo (8);
+}
+
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]0:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]1:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]2:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]0:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]1:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]2:} 1 } } */
+/* { dg-final { scan-assembler-not {(?n)\m_*foo[.$_]constprop[.$_]3:} } } */
+/* { dg-final { scan-assembler-not {(?n)\m_*foo[.$_]constprop[.$_]4:} } } */
H.J. Lu Dec. 1, 2018, 4:29 p.m. UTC | #5
On Fri, Nov 30, 2018 at 1:11 PM Michael Ploujnikov
<michael.ploujnikov@oracle.com> wrote:
>
> Hi,
>
> On 2018-11-30 2:25 a.m., Richard Biener wrote:
> > +  unsigned &clone_number = lto_clone_numbers->get_or_insert (
> > +     IDENTIFIER_POINTER (DECL_NAME (decl)));
> >    name = maybe_rewrite_identifier (name);
> >    symtab->change_decl_assembler_name (decl,
> > -      clone_function_name_numbered (
> > -  name, "lto_priv"));
> > +      clone_function_name (
> > +  name, "lto_priv", clone_number));
> >
> > since we use 'name' from maybe_rewrite_identifier in the end wouldn't it
> > make more sense to use that as a key for lto_clone_numbers?
>
> Yup, that looks much better. Fixed it.
>
> > For the ipa-hsa.c changes since we iterate over all defined functions
> > we at most create a single clone per cgraph_node.  That means your
> > hash-map keyed on cgraph_node is useless and you could use
> > an unnumbered clone (or a static zero number).
>
> Good catch. I was thinking of doing this, but it somehow fell through the cracks during the rebase.
>
> >
> > -  SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name_numbered (old_decl,
> > -   suffix));
> > +  SET_DECL_ASSEMBLER_NAME (new_decl,
> > +   clone_function_name (
> > +     IDENTIFIER_POINTER (
> > +       DECL_ASSEMBLER_NAME (old_decl)),
> > +     suffix,
> > +     num_suffix));
> >
> > can you please hide the implementation detail of accessing the assembler name
> > by adding an overload to clone_function_name_numbered with an explicit
> > number?
>
> Done.
>
>
> >
> > OK with those changes.
> >
> > Thanks,
> > Richard.
>
> Thank you for the review. I will commit as soon as my last test run finishes.
>

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88297
Michael Ploujnikov Dec. 3, 2018, 5 p.m. UTC | #6
On 2018-12-01 11:29 a.m., H.J. Lu wrote:
> This caused:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88297
> 

Sorry about that. Looks like I should have been testing with
--with-build-config=bootstrap-lto rather than just --enable-bootstrap.

The quick fix would be to undo the patch to create_virtual_clone or to
just change clone_num_suffixes to key off of DECL_ASSEMBLER_NAME
(node->decl) instead of node pointers. Any preferences?

The harder fix would be to figure out why some nodes share the same
names and fix that, but maybe that's just inevitable with LTO?

- Michael
Michael Ploujnikov Dec. 4, 2018, 3:06 a.m. UTC | #7
On 2018-12-03 12:00 p.m., Michael Ploujnikov wrote:
> On 2018-12-01 11:29 a.m., H.J. Lu wrote:
>> This caused:
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88297
>>
> 
> Sorry about that. Looks like I should have been testing with
> --with-build-config=bootstrap-lto rather than just --enable-bootstrap.
> 
> The quick fix would be to undo the patch to create_virtual_clone or to
> just change clone_num_suffixes to key off of DECL_ASSEMBLER_NAME
> (node->decl) instead of node pointers. Any preferences?
> 
> The harder fix would be to figure out why some nodes share the same
> names and fix that, but maybe that's just inevitable with LTO?
> 
> - Michael
> 

Here's a quick fix while the issue is being investigated.

Bootstrapped (--with-build-config=bootstrap-lto) and regtested on x86_64.

Ok for trunk?
From f5e2500f30ad337e85e0b53eaa15c724657966a2 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov <michael.ploujnikov@oracle.com>
Date: Mon, 3 Dec 2018 18:19:18 -0500
Subject: [PATCH] PR ipa/88297

gcc/ChangeLog:

2018-12-03  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	PR ipa/88297
	* ipa-cp.c (create_specialized_node): Store clone counters by
	node assembler names.
	(ipcp_driver): Change type of clone_num_suffixes key type to
	const char*.
---
 gcc/ipa-cp.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git gcc/ipa-cp.c gcc/ipa-cp.c
index e0cd1bc45b..f82473e37c 100644
--- gcc/ipa-cp.c
+++ gcc/ipa-cp.c
@@ -376,8 +376,8 @@ static profile_count max_count;
 
 static long overall_size, max_new_size;
 
-/* Node to unique clone suffix number map.  */
-static hash_map<cgraph_node *, unsigned> *clone_num_suffixes;
+/* Node name to unique clone suffix number map.  */
+static hash_map<const char *, unsigned> *clone_num_suffixes;
 
 /* Return the param lattices structure corresponding to the Ith formal
    parameter of the function described by INFO.  */
@@ -3831,7 +3831,9 @@ create_specialized_node (struct cgraph_node *node,
 	}
     }
 
-  unsigned &suffix_counter = clone_num_suffixes->get_or_insert (node);
+  unsigned &suffix_counter = clone_num_suffixes->get_or_insert (
+			       IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (
+				 node->decl)));
   new_node = node->create_virtual_clone (callers, replace_trees,
 					 args_to_skip, "constprop",
 					 suffix_counter);
@@ -5049,7 +5051,7 @@ ipcp_driver (void)
 
   ipa_check_create_node_params ();
   ipa_check_create_edge_args ();
-  clone_num_suffixes = new hash_map<cgraph_node *, unsigned>;
+  clone_num_suffixes = new hash_map<const char *, unsigned>;
 
   if (dump_file)
     {
Jan Hubicka Dec. 4, 2018, 2:06 p.m. UTC | #8
> On 2018-12-03 12:00 p.m., Michael Ploujnikov wrote:
> > On 2018-12-01 11:29 a.m., H.J. Lu wrote:
> >> This caused:
> >>
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88297
> >>
> > 
> > Sorry about that. Looks like I should have been testing with
> > --with-build-config=bootstrap-lto rather than just --enable-bootstrap.
> > 
> > The quick fix would be to undo the patch to create_virtual_clone or to
> > just change clone_num_suffixes to key off of DECL_ASSEMBLER_NAME
> > (node->decl) instead of node pointers. Any preferences?
> > 
> > The harder fix would be to figure out why some nodes share the same
> > names and fix that, but maybe that's just inevitable with LTO?
> > 
> > - Michael
> > 
> 
> Here's a quick fix while the issue is being investigated.
> 
> Bootstrapped (--with-build-config=bootstrap-lto) and regtested on x86_64.
> 
> Ok for trunk?

> From f5e2500f30ad337e85e0b53eaa15c724657966a2 Mon Sep 17 00:00:00 2001
> From: Michael Ploujnikov <michael.ploujnikov@oracle.com>
> Date: Mon, 3 Dec 2018 18:19:18 -0500
> Subject: [PATCH] PR ipa/88297
> 
> gcc/ChangeLog:
> 
> 2018-12-03  Michael Ploujnikov  <michael.ploujnikov@oracle.com>
> 
> 	PR ipa/88297
> 	* ipa-cp.c (create_specialized_node): Store clone counters by
> 	node assembler names.
> 	(ipcp_driver): Change type of clone_num_suffixes key type to
> 	const char*.

OK,
thanks!
Honza
diff mbox series

Patch

From a822759e422f0cea11640ec3d6d6fba94bcd4ee9 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov <michael.ploujnikov@oracle.com>
Date: Wed, 28 Nov 2018 09:24:30 -0500
Subject: [PATCH 4/4] There can be at most one .resolver clone per function

gcc:

2018-11-28  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	* config/rs6000/rs6000.c (make_resolver_func): Generate
	  resolver symbol with clone_function_name instead of
	  clone_function_name_numbered.
---
 gcc/config/rs6000/rs6000.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git gcc/config/rs6000/rs6000.c gcc/config/rs6000/rs6000.c
index 4f113cb025..a9d038829b 100644
--- gcc/config/rs6000/rs6000.c
+++ gcc/config/rs6000/rs6000.c
@@ -36997,7 +36997,7 @@  make_resolver_func (const tree default_decl,
 {
   /* Make the resolver function static.  The resolver function returns
      void *.  */
-  tree decl_name = clone_function_name_numbered (default_decl, "resolver");
+  tree decl_name = clone_function_name (default_decl, "resolver");
   const char *resolver_name = IDENTIFIER_POINTER (decl_name);
   tree type = build_function_type_list (ptr_type_node, NULL_TREE);
   tree decl = build_fn_decl (resolver_name, type);
-- 
2.19.1