Speed up qsort in IPA ICF
diff mbox series

Message ID e787c43e-2be4-ee92-7820-b8968856a3ea@suse.cz
State New
Headers show
Series
  • Speed up qsort in IPA ICF
Related show

Commit Message

Martin Liška Sept. 19, 2019, 9:06 a.m. UTC
Hi.

As Alexander pointed out, the sort_congruence_class_groups_by_decl_uid is the most
expensive qsort operator in tramp3d compilation. It does unfortunate 7 pointer dereferences
via: DECL_UID (classes[i]->classes[0]->members[0]->decl). I'm suggesting to cache that
in congruence_class_group.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-09-19  Martin Liska  <mliska@suse.cz>

	* ipa-icf.c (sort_sem_items_by_decl_uid): Simplify comparator.
	(sort_congruence_classes_by_decl_uid): Likewise.
	(sort_congruence_class_groups_by_decl_uid): Use
	congruence_class_group::uid.
	(sem_item_optimizer::merge_classes): Cache
	DECL_UID (classes[i]->classes[0]->members[0]->decl).
	* ipa-icf.h (struct congruence_class_group): New field.
---
 gcc/ipa-icf.c | 29 +++++------------------------
 gcc/ipa-icf.h |  1 +
 2 files changed, 6 insertions(+), 24 deletions(-)

Comments

Richard Biener Sept. 19, 2019, 11:01 a.m. UTC | #1
On Thu, Sep 19, 2019 at 11:06 AM Martin Liška <mliska@suse.cz> wrote:
>
> Hi.
>
> As Alexander pointed out, the sort_congruence_class_groups_by_decl_uid is the most
> expensive qsort operator in tramp3d compilation. It does unfortunate 7 pointer dereferences
> via: DECL_UID (classes[i]->classes[0]->members[0]->decl). I'm suggesting to cache that
> in congruence_class_group.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

Since we're populating the classes vector right before you could elide
the new field and do it in the same iteration by having

auto_vec <std::pair<congruence_class_group *, unsigned> > classes

and pushing *it, DECL_UID and then sorting by the readily available
UID in the data?  That makes the sort use a linear memory region
w/o dereferences.

Richard.

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> 2019-09-19  Martin Liska  <mliska@suse.cz>
>
>         * ipa-icf.c (sort_sem_items_by_decl_uid): Simplify comparator.
>         (sort_congruence_classes_by_decl_uid): Likewise.
>         (sort_congruence_class_groups_by_decl_uid): Use
>         congruence_class_group::uid.
>         (sem_item_optimizer::merge_classes): Cache
>         DECL_UID (classes[i]->classes[0]->members[0]->decl).
>         * ipa-icf.h (struct congruence_class_group): New field.
> ---
>  gcc/ipa-icf.c | 29 +++++------------------------
>  gcc/ipa-icf.h |  1 +
>  2 files changed, 6 insertions(+), 24 deletions(-)
>
>
Martin Liška Sept. 19, 2019, 1:02 p.m. UTC | #2
On 9/19/19 1:01 PM, Richard Biener wrote:
> On Thu, Sep 19, 2019 at 11:06 AM Martin Liška <mliska@suse.cz> wrote:
>>
>> Hi.
>>
>> As Alexander pointed out, the sort_congruence_class_groups_by_decl_uid is the most
>> expensive qsort operator in tramp3d compilation. It does unfortunate 7 pointer dereferences
>> via: DECL_UID (classes[i]->classes[0]->members[0]->decl). I'm suggesting to cache that
>> in congruence_class_group.
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
> 
> Since we're populating the classes vector right before you could elide
> the new field and do it in the same iteration by having
> 
> auto_vec <std::pair<congruence_class_group *, unsigned> > classes
> 
> and pushing *it, DECL_UID and then sorting by the readily available
> UID in the data?  That makes the sort use a linear memory region
> w/o dereferences.

Good point, there's a tested patch.

Martin

> 
> Richard.
> 
>> Thanks,
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2019-09-19  Martin Liska  <mliska@suse.cz>
>>
>>         * ipa-icf.c (sort_sem_items_by_decl_uid): Simplify comparator.
>>         (sort_congruence_classes_by_decl_uid): Likewise.
>>         (sort_congruence_class_groups_by_decl_uid): Use
>>         congruence_class_group::uid.
>>         (sem_item_optimizer::merge_classes): Cache
>>         DECL_UID (classes[i]->classes[0]->members[0]->decl).
>>         * ipa-icf.h (struct congruence_class_group): New field.
>> ---
>>  gcc/ipa-icf.c | 29 +++++------------------------
>>  gcc/ipa-icf.h |  1 +
>>  2 files changed, 6 insertions(+), 24 deletions(-)
>>
>>
Richard Biener Sept. 19, 2019, 1:08 p.m. UTC | #3
On Thu, Sep 19, 2019 at 3:02 PM Martin Liška <mliska@suse.cz> wrote:
>
> On 9/19/19 1:01 PM, Richard Biener wrote:
> > On Thu, Sep 19, 2019 at 11:06 AM Martin Liška <mliska@suse.cz> wrote:
> >>
> >> Hi.
> >>
> >> As Alexander pointed out, the sort_congruence_class_groups_by_decl_uid is the most
> >> expensive qsort operator in tramp3d compilation. It does unfortunate 7 pointer dereferences
> >> via: DECL_UID (classes[i]->classes[0]->members[0]->decl). I'm suggesting to cache that
> >> in congruence_class_group.
> >>
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> Ready to be installed?
> >
> > Since we're populating the classes vector right before you could elide
> > the new field and do it in the same iteration by having
> >
> > auto_vec <std::pair<congruence_class_group *, unsigned> > classes
> >
> > and pushing *it, DECL_UID and then sorting by the readily available
> > UID in the data?  That makes the sort use a linear memory region
> > w/o dereferences.
>
> Good point, there's a tested patch.

OK.

Thanks,
Richard.

> Martin
>
> >
> > Richard.
> >
> >> Thanks,
> >> Martin
> >>
> >> gcc/ChangeLog:
> >>
> >> 2019-09-19  Martin Liska  <mliska@suse.cz>
> >>
> >>         * ipa-icf.c (sort_sem_items_by_decl_uid): Simplify comparator.
> >>         (sort_congruence_classes_by_decl_uid): Likewise.
> >>         (sort_congruence_class_groups_by_decl_uid): Use
> >>         congruence_class_group::uid.
> >>         (sem_item_optimizer::merge_classes): Cache
> >>         DECL_UID (classes[i]->classes[0]->members[0]->decl).
> >>         * ipa-icf.h (struct congruence_class_group): New field.
> >> ---
> >>  gcc/ipa-icf.c | 29 +++++------------------------
> >>  gcc/ipa-icf.h |  1 +
> >>  2 files changed, 6 insertions(+), 24 deletions(-)
> >>
> >>
>
Alexander Monakov Sept. 19, 2019, 1:15 p.m. UTC | #4
On Thu, 19 Sep 2019, Richard Biener wrote:

> > Good point, there's a tested patch.
> 
> OK.

Hold on, is the new comparator really correct?


@@ -3384,20 +3372,11 @@ sort_congruence_classes_by_decl_uid (const void *a, const void *b)
 static int
 sort_congruence_class_groups_by_decl_uid (const void *a, const void *b)
 {
-  const congruence_class_group *g1
-    = *(const congruence_class_group * const *)a;
-  const congruence_class_group *g2
-    = *(const congruence_class_group * const *)b;
-
-  int uid1 = DECL_UID (g1->classes[0]->members[0]->decl);
-  int uid2 = DECL_UID (g2->classes[0]->members[0]->decl);
-
-  if (uid1 < uid2)
-    return -1;
-  else if (uid1 > uid2)
-    return 1;
-  else
-    return 0;
+  const std::pair<congruence_class_group *, int> *g1
+    = *(const std::pair<congruence_class_group *, int> *const *) a;
+  const std::pair<congruence_class_group *, int> *g2
+    = *(const std::pair<congruence_class_group *, int> *const *) b;
+  return g1->second - g2->second;
 }


AFAICT the vec holds pairs, not pointers to pairs, so why does the comparator
cast to a pointer-to-pointer-to-pair? Shouldn't it read

  const std::pair<congruence_class_group *, int> *g1
    = (const std::pair<congruence_class_group *, int> *) a;

Thanks.
Alexander
Richard Biener Sept. 19, 2019, 1:18 p.m. UTC | #5
On Thu, Sep 19, 2019 at 3:15 PM Alexander Monakov <amonakov@ispras.ru> wrote:
>
> On Thu, 19 Sep 2019, Richard Biener wrote:
>
> > > Good point, there's a tested patch.
> >
> > OK.
>
> Hold on, is the new comparator really correct?
>
>
> @@ -3384,20 +3372,11 @@ sort_congruence_classes_by_decl_uid (const void *a, const void *b)
>  static int
>  sort_congruence_class_groups_by_decl_uid (const void *a, const void *b)
>  {
> -  const congruence_class_group *g1
> -    = *(const congruence_class_group * const *)a;
> -  const congruence_class_group *g2
> -    = *(const congruence_class_group * const *)b;
> -
> -  int uid1 = DECL_UID (g1->classes[0]->members[0]->decl);
> -  int uid2 = DECL_UID (g2->classes[0]->members[0]->decl);
> -
> -  if (uid1 < uid2)
> -    return -1;
> -  else if (uid1 > uid2)
> -    return 1;
> -  else
> -    return 0;
> +  const std::pair<congruence_class_group *, int> *g1
> +    = *(const std::pair<congruence_class_group *, int> *const *) a;
> +  const std::pair<congruence_class_group *, int> *g2
> +    = *(const std::pair<congruence_class_group *, int> *const *) b;
> +  return g1->second - g2->second;
>  }
>
>
> AFAICT the vec holds pairs, not pointers to pairs, so why does the comparator
> cast to a pointer-to-pointer-to-pair? Shouldn't it read
>
>   const std::pair<congruence_class_group *, int> *g1
>     = (const std::pair<congruence_class_group *, int> *) a;

Indeed.  We need static type correctness for the comparator :/

Richard.

> Thanks.
> Alexander
Martin Liška Sept. 19, 2019, 1:32 p.m. UTC | #6
On 9/19/19 3:18 PM, Richard Biener wrote:
> On Thu, Sep 19, 2019 at 3:15 PM Alexander Monakov <amonakov@ispras.ru> wrote:
>>
>> On Thu, 19 Sep 2019, Richard Biener wrote:
>>
>>>> Good point, there's a tested patch.
>>>
>>> OK.
>>
>> Hold on, is the new comparator really correct?
>>
>>
>> @@ -3384,20 +3372,11 @@ sort_congruence_classes_by_decl_uid (const void *a, const void *b)
>>  static int
>>  sort_congruence_class_groups_by_decl_uid (const void *a, const void *b)
>>  {
>> -  const congruence_class_group *g1
>> -    = *(const congruence_class_group * const *)a;
>> -  const congruence_class_group *g2
>> -    = *(const congruence_class_group * const *)b;
>> -
>> -  int uid1 = DECL_UID (g1->classes[0]->members[0]->decl);
>> -  int uid2 = DECL_UID (g2->classes[0]->members[0]->decl);
>> -
>> -  if (uid1 < uid2)
>> -    return -1;
>> -  else if (uid1 > uid2)
>> -    return 1;
>> -  else
>> -    return 0;
>> +  const std::pair<congruence_class_group *, int> *g1
>> +    = *(const std::pair<congruence_class_group *, int> *const *) a;
>> +  const std::pair<congruence_class_group *, int> *g2
>> +    = *(const std::pair<congruence_class_group *, int> *const *) b;
>> +  return g1->second - g2->second;
>>  }
>>
>>
>> AFAICT the vec holds pairs, not pointers to pairs, so why does the comparator
>> cast to a pointer-to-pointer-to-pair? Shouldn't it read
>>
>>   const std::pair<congruence_class_group *, int> *g1
>>     = (const std::pair<congruence_class_group *, int> *) a;
> 
> Indeed.  We need static type correctness for the comparator :/

Ooops, sorry for the mistake. I'm testing the following patch.
I checked in debugger that uids are valid for a simple test-case.

Martin

> 
> Richard.
> 
>> Thanks.
>> Alexander

Patch
diff mbox series

diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index c9c3cb4a331..d78635ad6b3 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -3350,13 +3350,7 @@  sort_sem_items_by_decl_uid (const void *a, const void *b)
 
   int uid1 = DECL_UID (i1->decl);
   int uid2 = DECL_UID (i2->decl);
-
-  if (uid1 < uid2)
-    return -1;
-  else if (uid1 > uid2)
-    return 1;
-  else
-    return 0;
+  return uid1 - uid2;
 }
 
 /* Sort pair of congruence_classes A and B by DECL_UID of the first member.  */
@@ -3369,13 +3363,7 @@  sort_congruence_classes_by_decl_uid (const void *a, const void *b)
 
   int uid1 = DECL_UID (c1->members[0]->decl);
   int uid2 = DECL_UID (c2->members[0]->decl);
-
-  if (uid1 < uid2)
-    return -1;
-  else if (uid1 > uid2)
-    return 1;
-  else
-    return 0;
+  return uid1 - uid2;
 }
 
 /* Sort pair of congruence_class_groups A and B by
@@ -3388,16 +3376,7 @@  sort_congruence_class_groups_by_decl_uid (const void *a, const void *b)
     = *(const congruence_class_group * const *)a;
   const congruence_class_group *g2
     = *(const congruence_class_group * const *)b;
-
-  int uid1 = DECL_UID (g1->classes[0]->members[0]->decl);
-  int uid2 = DECL_UID (g2->classes[0]->members[0]->decl);
-
-  if (uid1 < uid2)
-    return -1;
-  else if (uid1 > uid2)
-    return 1;
-  else
-    return 0;
+  return g1->uid - g2->uid;
 }
 
 /* After reduction is done, we can declare all items in a group
@@ -3450,6 +3429,8 @@  sem_item_optimizer::merge_classes (unsigned int prev_class_count)
        it != m_classes.end (); ++it)
     classes.quick_push (*it);
 
+  for (unsigned i = 0; i < classes.length (); i++)
+    classes[i]->uid = DECL_UID (classes[i]->classes[0]->members[0]->decl);
   classes.qsort (sort_congruence_class_groups_by_decl_uid);
 
   if (dump_file)
diff --git a/gcc/ipa-icf.h b/gcc/ipa-icf.h
index 2bf0f156ef6..e7bb4afcfee 100644
--- a/gcc/ipa-icf.h
+++ b/gcc/ipa-icf.h
@@ -461,6 +461,7 @@  struct congruence_class_group
 {
   hashval_t hash;
   sem_item_type type;
+  int uid;
   vec <congruence_class *> classes;
 };