Patchwork [RFC] Re-write LTO type merging again, do tree merging

Submitter Jan Hubicka
Date June 13, 2013, 9:37 p.m.
Message ID <20130613213705.GA1358@atrey.karlin.mff.cuni.cz>
Permalink /patch/251176/
State New

Comments

Jan Hubicka - June 13, 2013, 9:37 p.m.
> 
> Ok, not streaming and comparing TREE_USED gets it improved to

I will try to gather better data tomorrow. My mozilla build died on disk space,
but according to stats we are now at about 7GB of GGC memory after merging.
I was playing with the following patch that implements testing whether types
are same in my (probably naive and wrong) understanding of ODR rule in C++

It prints type pairs that seem to be the same and then verifies that they have
the same names and are in the same namespaces and records. On Javascript there
are 5000 types that the devirtualization code finds to be the same this way but
that do not have the same MAIN VARIANT.
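
The name and context check described above can be sketched outside of GCC as
follows. This is only a toy model under stated assumptions: struct scope,
enum scope_kind and scopes_same_for_odr are hypothetical names, not GCC's
tree representation, and the rule that any two translation units match is
taken from how the patch treats TRANSLATION_UNIT_DECL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a named scope (namespace, record, or the
   translation unit).  This is not GCC's tree representation.  */
enum scope_kind { SCOPE_TU, SCOPE_NAMESPACE, SCOPE_RECORD };

struct scope
{
  enum scope_kind kind;
  const char *name;            /* NULL for anonymous scopes.  */
  const struct scope *context; /* Enclosing scope, NULL at the top.  */
};

/* Return true when S1 and S2 have the same kind and name and sit in
   the same chain of enclosing scopes.  Two translation units always
   match; distinct anonymous scopes never match.  */
static bool
scopes_same_for_odr (const struct scope *s1, const struct scope *s2)
{
  while (s1 != s2)
    {
      if (!s1 || !s2)
        return false;           /* Nesting depth differs.  */
      if (s1->kind != s2->kind)
        return false;
      if (s1->kind == SCOPE_TU)
        return true;
      if (!s1->name || !s2->name || strcmp (s1->name, s2->name) != 0)
        return false;
      s1 = s1->context;
      s2 = s2->context;
    }
  return true;
}
```

Two records named the same way in two different translation units compare
equal here only if every enclosing namespace or record also matches by name.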

I guess those trees may be a good starting point for you to look at why they are not merged.

I suppose that once we have a maintainable code base we can get into more
aggressive merging in special cases.  Requiring trees to be exactly the same is
a good default behaviour.  We may, however, take advantage of extra knowledge.
The FE may tag types/decls that are subject to the ODR rule, and for those we
can reduce the hash to be based only on name+context and we can even output a
sane diagnostic on mismatches.
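
A hash based only on name+context could look roughly like the sketch below.
It is a standalone illustration, not code from the patch: struct named_scope
and odr_name_hash are hypothetical, and FNV-1a is used only because it is
short, not because GCC uses it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical named entity with a link to its enclosing scope.  */
struct named_scope
{
  const char *name;                   /* NULL when anonymous.  */
  const struct named_scope *context;  /* NULL at the outermost level.  */
};

/* Fold the bytes of S into the running 32-bit FNV-1a hash H.  */
static uint32_t
hash_string (uint32_t h, const char *s)
{
  for (; *s; s++)
    {
      h ^= (unsigned char) *s;
      h *= 16777619u;
    }
  return h;
}

/* Hash T by its own name and the names of all enclosing scopes,
   ignoring every other property of the type.  */
static uint32_t
odr_name_hash (const struct named_scope *t)
{
  uint32_t h = 2166136261u;
  for (; t; t = t->context)
    {
      h = hash_string (h, t->name ? t->name : "");
      /* Separator, so that "a" inside "bc" differs from "ab" inside "c".  */
      h = hash_string (h, "::");
    }
  return h;
}
```

Two copies of a type that differ only in flags or in which TYPE_DECLs hang off
them would land in the same bucket, and the full comparison could then report
a diagnostic when the bodies disagree.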

Similarly, I think it would help a lot if we proactively merged !can_prevail_p
decls with matching types into those that can prevail by hashing PUBLIC decls
only by their assembler name.  Merging those should subsequently allow
collapsing the types that are otherwise kept separate just because the
associated vtables differ in the EXTERNAL and PUBLIC flags on the methods and
such.
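
As a rough illustration of that merging step, here is a toy version. All names
(struct toy_decl, merge_nonprevailing_decls, the type_id field) are made up for
the example, and a linear scan stands in for the hash lookup by assembler name
to keep it short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified declaration record; not GCC's tree or
   symtab node.  TYPE_ID stands in for "has a matching type".  */
struct toy_decl
{
  const char *asm_name;
  int type_id;
  bool is_public;
  bool can_prevail;
  struct toy_decl *merged_into;   /* NULL while the decl stands alone.  */
};

/* Redirect every public decl that cannot prevail to a prevailing public
   decl with the same assembler name and type.  Returns the number of
   decls merged.  A linear scan replaces the hash lookup for brevity.  */
static size_t
merge_nonprevailing_decls (struct toy_decl *decls, size_t n)
{
  size_t merged = 0;
  for (size_t i = 0; i < n; i++)
    {
      struct toy_decl *d = &decls[i];
      if (!d->is_public || d->can_prevail)
        continue;
      for (size_t j = 0; j < n; j++)
        {
          struct toy_decl *p = &decls[j];
          if (p->is_public && p->can_prevail
              && p->type_id == d->type_id
              && strcmp (p->asm_name, d->asm_name) == 0)
            {
              d->merged_into = p;
              merged++;
              break;
            }
        }
    }
  return merged;
}
```

Decls with the same assembler name but a different type are deliberately left
alone in this sketch, matching the "with matching types" condition above.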
Jan Hubicka - June 13, 2013, 10:16 p.m.
> > 
> > Ok, not streaming and comparing TREE_USED gets it improved to
> 
> I will try to gather better data tomorrow. My mozilla build died on disk space,
> but according to stats we are now at about 7GB of GGC memory after merging.
> I was playing with the following patch that implements testing whether types
> are same in my (probably naive and wrong) understanding of ODR rule in C++

So I can confirm that we now need 3GB of TMP space instead of 8GB with the
earlier version of the patch.  I will compare to mainline tomorrow, but I think
it is about the same.
 phase opt and generate  :  96.39 ( 9%) usr  40.45 (45%) sys 136.91 (12%) wall  271042 kB ( 7%) ggc
 phase stream in         : 457.87 (43%) usr   8.38 ( 9%) sys 466.44 (40%) wall 3798844 kB (93%) ggc
 phase stream out        : 509.39 (48%) usr  40.82 (46%) sys 550.88 (48%) wall    7149 kB ( 0%) ggc
 ipa cp                  :  13.62 ( 1%) usr   5.00 ( 6%) sys  18.61 ( 2%) wall  425204 kB (10%) ggc
 ipa inlining heuristics :  60.52 ( 6%) usr  36.15 (40%) sys  96.71 ( 8%) wall 1353370 kB (33%) ggc
 ipa lto decl in         : 346.94 (33%) usr   5.49 ( 6%) sys 352.60 (31%) wall    7042 kB ( 0%) ggc
 ipa lto decl out        : 481.19 (45%) usr  23.28 (26%) sys 504.68 (44%) wall       0 kB ( 0%) ggc
 TOTAL                 :1063.67            89.65          1154.26            4078436 kB

So we are still bound by streaming. I am running -flto-report overnight.

My ODR patch finds 36377 matches and also weird-looking mismatches such as:
 <record_type 0x7fbd30d46dc8 sockaddr_storage BLK
    size <integer_cst 0x7fbd416bc1e0 type <integer_type 0x7fbd415660a8 bitsizetype> constant 1024>
    unit size <integer_cst 0x7fbd416bc700 type <integer_type 0x7fbd41566000 sizetype> constant 128>
    align 64 symtab 0 alias set -1 canonical type 0x7fbd30f0bc78
    fields <field_decl 0x7fbd30e99ed8 ss_family
        type <integer_type 0x7fbd3b98c000 sa_family_t public unsigned HI
            size <integer_cst 0x7fbd41555fe0 constant 16>
            unit size <integer_cst 0x7fbd4156a000 constant 2>
            align 16 symtab 0 alias set -1 canonical type 0x7fbd41566540 precision 16 min <integer_cst 0x7fbd4156a020 0> max <integer_cst 0x7fbd41555fc0 65535>>
        unsigned nonlocal HI file /usr/include/bits/socket.h line 189 col 0 size <integer_cst 0x7fbd41555fe0 16> unit size <integer_cst 0x7fbd4156a000 2>
        align 16 offset_align 128
        offset <integer_cst 0x7fbd41555d60 constant 0>
        bit offset <integer_cst 0x7fbd41555de0 constant 0> context <record_type 0x7fbd30d46dc8 sockaddr_storage>
        chain <field_decl 0x7fbd30e99000 __ss_align type <integer_type 0x7fbd415667e0 long unsigned int>
            unsigned nonlocal DI file /usr/include/bits/socket.h line 190 col 0
            size <integer_cst 0x7fbd41555d20 constant 64>
            unit size <integer_cst 0x7fbd41555d40 constant 8>
            align 64 offset_align 128 offset <integer_cst 0x7fbd41555d60 0> bit offset <integer_cst 0x7fbd41555d20 64> context <record_type 0x7fbd30d46dc8 sockaddr_storage> chain <field_decl 0x7fbd30e99e40 __ss_padding>>> context <translation_unit_decl 0x7fbd30cbc2e0 D.967968>
    chain <type_decl 0x7fbd30d47da8 sockaddr_storage>>
 <record_type 0x7fbd30f0bc78 sockaddr_storage BLK
    size <integer_cst 0x7fbd416bc1e0 type <integer_type 0x7fbd415660a8 bitsizetype> constant 1024>
    unit size <integer_cst 0x7fbd416bc700 type <integer_type 0x7fbd41566000 sizetype> constant 128>
    align 64 symtab 0 alias set -1 canonical type 0x7fbd30f0bc78
    fields <field_decl 0x7fbd30ef9558 ss_family
        type <integer_type 0x7fbd3b98c000 sa_family_t public unsigned HI
            size <integer_cst 0x7fbd41555fe0 constant 16>
            unit size <integer_cst 0x7fbd4156a000 constant 2>
            align 16 symtab 0 alias set -1 canonical type 0x7fbd41566540 precision 16 min <integer_cst 0x7fbd4156a020 0> max <integer_cst 0x7fbd41555fc0 65535>>
        unsigned HI file /usr/include/bits/socket.h line 189 col 0 size <integer_cst 0x7fbd41555fe0 16> unit size <integer_cst 0x7fbd4156a000 2>
        align 16 offset_align 128
        offset <integer_cst 0x7fbd41555d60 constant 0>
        bit offset <integer_cst 0x7fbd41555de0 constant 0> context <record_type 0x7fbd30f0bc78 sockaddr_storage>
        chain <field_decl 0x7fbd30ef94c0 __ss_align type <integer_type 0x7fbd415667e0 long unsigned int>
            unsigned DI file /usr/include/bits/socket.h line 190 col 0
            size <integer_cst 0x7fbd41555d20 constant 64>
            unit size <integer_cst 0x7fbd41555d40 constant 8>
            align 64 offset_align 128 offset <integer_cst 0x7fbd41555d60 0> bit offset <integer_cst 0x7fbd41555d20 64> context <record_type 0x7fbd30f0bc78 sockaddr_storage> chain <field_decl 0x7fbd30ef9428 __ss_padding>>> context <translation_unit_decl 0x7fbd30ea9f18 D.936417>
    pointer_to_this <pointer_type 0x7fbd30f0bd20> chain <type_decl 0x7fbd30ea9398 D.938243>>

which mismatch because we run into the following difference:
 <type_decl 0x7fbd30d47da8 sockaddr_storage
    type <record_type 0x7fbd30d46dc8 sockaddr_storage BLK
        size <integer_cst 0x7fbd416bc1e0 constant 1024>
        unit size <integer_cst 0x7fbd416bc700 constant 128>
        align 64 symtab 0 alias set -1 canonical type 0x7fbd30f0bc78
        fields <field_decl 0x7fbd30e99ed8 ss_family type <integer_type 0x7fbd3b98c000 sa_family_t>
            unsigned nonlocal HI file /usr/include/bits/socket.h line 189 col 0
            size <integer_cst 0x7fbd41555fe0 constant 16>
            unit size <integer_cst 0x7fbd4156a000 constant 2>
            align 16 offset_align 128
            offset <integer_cst 0x7fbd41555d60 constant 0>
            bit offset <integer_cst 0x7fbd41555de0 constant 0> context <record_type 0x7fbd30d46dc8 sockaddr_storage> chain <field_decl 0x7fbd30e99000 __ss_align>> context <translation_unit_decl 0x7fbd30cbc2e0 D.967968>
        chain <type_decl 0x7fbd30d47da8 sockaddr_storage>>
    public VOID file /usr/include/bits/socket.h line 187 col 0
    align 8 context <translation_unit_decl 0x7fbd30cbc2e0 D.967968>>
 <identifier_node 0x7fbd30f06d70 sockaddr_storage>

I am not sure what it means that one type has more TYPE_DECLs stacked than the other.

Honza
Jan Hubicka - June 14, 2013, 5:45 a.m.
> So we are still bound by streaming. I am running -flto-report overnight.
[WPA] read 43363300 SCCs of average size 2.264113
[WPA] 98179403 tree bodies read in total
[WPA] tree SCC table: size 16777213, 6422251 elements, collision ratio: 0.811639
[WPA] tree SCC max chain length 88 (size 1)
[WPA] Compared 16544560 SCCs, 275298 collisions (0.016640)
[WPA] Merged 16458553 SCCs
[WPA] Merged 46453870 tree bodies
[WPA] Merged 9535385 types
[WPA] 6771259 types prevailed (21348860 associated trees)
[WPA] Old merging code merges an additional 1759918 types of which 379059 are in the same SCC with their prevailing variant (19696849 and 15301625 associated trees)
[WPA] GIMPLE canonical type table: size 131071, 77875 elements, 6771394 searches, 1528380 collisions (ratio: 0.225711)
[WPA] GIMPLE canonical type hash table: size 16777213, 6771339 elements, 23174504 searches, 21075518 collisions (ratio: 0.909427)
....
[LTRANS] read 228296 SCCs of average size 11.882460
[LTRANS] 2712718 tree bodies read in total
[LTRANS] GIMPLE canonical type table: size 16381, 7025 elements, 704670 searches, 24040 collisions (ratio: 0.034115)
[LTRANS] GIMPLE canonical type hash table: size 1048573, 704613 elements, 2269381 searches, 2021919 collisions (ratio: 0.890956)
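
For reading the report: the ratios printed above are consistent with
collisions divided by searches (or by compared SCCs), as the small check below
shows. collision_ratio is a hypothetical helper written for this note, not a
GCC function, and the SCC table line cannot be rechecked this way because it
prints no search count.

```c
/* Return COLLISIONS / SEARCHES, or 0 when there were no searches.  */
static double
collision_ratio (unsigned long collisions, unsigned long searches)
{
  return searches ? (double) collisions / (double) searches : 0.0;
}
```

For example, collision_ratio (275298, 16544560) comes out at about 0.01664,
matching the "Compared ... SCCs" line, and collision_ratio (21075518, 23174504)
at about 0.9094, matching the WPA canonical type hash table line.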

We manage to get stuck in LRA in one of the ltranses:
 LRA hard reg assignment : 476.07 (44%) usr   0.03 ( 0%) sys 476.08 (44%) wall       0 kB ( 0%) ggc

28607    12.1151  lto1                     alloc_page(unsigned int)
3564      1.5094  lto1                     record_reg_classes(int, int, rtx_def**, machine_mode*, char const**, rtx_def*, reg_class*)
3235      1.3700  libc-2.11.1.so           _int_malloc
3056      1.2942  lto1                     ggc_set_mark(void const*)
2646      1.1206  lto1                     gt_ggc_mx_lang_tree_node(void*)
2539      1.0753  lto1                     bitmap_set_bit(bitmap_head_def*, int)
2333      0.9880  opreport                 /usr/bin/opreport
2210      0.9359  lto1                     for_each_rtx_1(rtx_def*, int, int (*)(rtx_def**, void*), void*)
2133      0.9033  lto1                     constrain_operands(int)
2128      0.9012  lto1                     lookup_page_table_entry(void const*)
1586      0.6717  lto1                     preprocess_constraints()

While GGC  memory is now under 7GB after type streaming and we GGC just once in WPA, the TOP usage still goes to about 12GB.

With the ODR patch there are 424 devirtualizations happening during WPA and
some extra (I do not have stats for those) during ltrans.

Honza
Richard Guenther - June 14, 2013, 8:31 a.m.
On Fri, 14 Jun 2013, Jan Hubicka wrote:

> I am not sure what it means that one type has more TYPE_DECLs stacked than the other.

I think that's the usual case of two nearly identical types in the
_same_ SCC, linked through the DECL_ORIGINAL_TYPE of one of the types'
TYPE_DECL.  The C++ frontend likes to produce those ...

They are accounted as the "are in the same SCC" in the stats (for the
latest updated patch):

[WPA] Old merging code merges an additional 41493 types of which 21633 are 
in the same SCC with their prevailing variant (415188 and 324371 
associated trees)

I'm looking into other things now and for example see TYPE_ALIAS_SET
issues (we do stream only alias sets -1 and 0, but whether -1 got
transformed to 0 at compile-time depends on whether get_alias_set
was called on it).  That's one case where we can either forcefully
call get_alias_set at streaming time, introduce a predicate
type_would_have_alias_set_zero that doesn't run the whole get_alias_set
machinery, or ignore this for merging and make sure to merge -1 and 0
as 0 (the whole 0 vs. -1 special-case is to support TUs compiled
with -fno-strict-aliasing linked together with -fstrict-aliasing TUs).
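
The last option (ignore the difference for merging and merge -1 and 0 as 0)
amounts to something like the following. This is a sketch only:
merge_streamed_alias_set is a hypothetical helper, and it assumes, as stated
above, that only the values -1 and 0 are ever streamed.

```c
#include <assert.h>

/* Combine the streamed alias sets of two types that are otherwise
   identical.  -1 means "not computed yet", 0 means "aliases
   everything"; if either copy already has 0, the merged type gets 0.  */
static int
merge_streamed_alias_set (int a, int b)
{
  assert (a == -1 || a == 0);
  assert (b == -1 || b == 0);
  return (a == 0 || b == 0) ? 0 : -1;
}
```

With this rule the alias set never has to be part of the hash or the
comparison; it is only fixed up on the prevailing type after the merge.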

Richard.
Jan Hubicka - June 14, 2013, 8:45 a.m.
> On Fri, 14 Jun 2013, Jan Hubicka wrote:
> 
> I think that's the usual case of two nearly identical types in the
> _same_ SCC, linked through the DECL_ORIGINAL_TYPE of one of the types'
> TYPE_DECL.  The C++ frontend likes to produce those ...

Hmm, so I guess I should walk into DECL_ORIGINAL_TYPE when seeing a TYPE_DECL?

Honza

Patch

Index: tree.c
===================================================================
--- tree.c	(revision 200064)
+++ tree.c	(working copy)
@@ -11618,6 +11711,91 @@  lhd_gcc_personality (void)
   return gcc_eh_personality_decl;
 }
 
+/* For languages with One Definition Rule, work out if
+   decls are actually the same even if the tree representation
+   differs.  This handles only decls appearing in TYPE_NAME
+   and TYPE_CONTEXT.  That is NAMESPACE_DECL, TYPE_DECL,
+   RECORD_TYPE and IDENTIFIER_NODE.  */
+
+static bool
+decls_same_for_odr (tree decl1, tree decl2)
+{
+  if (decl1 == decl2)
+    return true;
+  if (!decl1 || !decl2)
+    {
+      fprintf (stderr, "Nesting mismatch\n");
+      debug_tree (decl1);
+      debug_tree (decl2);
+      return false;
+    }
+  if (TREE_CODE (decl1) != TREE_CODE (decl2))
+    {
+      fprintf (stderr, "Code mismatch\n");
+      debug_tree (decl1);
+      debug_tree (decl2);
+      return false;
+    }
+  if (TREE_CODE (decl1) == TRANSLATION_UNIT_DECL)
+    return true;
+  if (TREE_CODE (decl1) != NAMESPACE_DECL
+      && TREE_CODE (decl1) != RECORD_TYPE
+      && TREE_CODE (decl1) != TYPE_DECL)
+    {
+      fprintf (stderr, "Decl type mismatch\n");
+      debug_tree (decl1);
+      return false;
+    }
+  if (!DECL_NAME (decl1))
+    {
+      fprintf (stderr, "Anonymous; name mismatch\n");
+      debug_tree (decl1);
+      return false;
+    }
+  if (!decls_same_for_odr (DECL_NAME (decl1), DECL_NAME (decl2)))
+    return false;
+  return decls_same_for_odr (DECL_CONTEXT (decl1),
+		             DECL_CONTEXT (decl2));
+}
+
+/* For languages with One Definition Rule, work out if
+   types are the same even if the tree representation differs.
+   This is non-trivial for LTO where minor differences in
+   the type representation may have prevented type merging
+   from merging two copies of an otherwise equivalent type.  */
+
+static bool
+types_same_for_odr (tree type1, tree type2)
+{
+  type1 = TYPE_MAIN_VARIANT (type1);
+  type2 = TYPE_MAIN_VARIANT (type2);
+  if (type1 == type2)
+    return true;
+  if (!type1 || !type2)
+    return false;
+
+  /* If types are not structurally the same, do not bother to continue.
+     A match in the remainder of the code would mean an ODR violation.  */
+  if (!types_compatible_p (type1, type2))
+    return false;
+
+  debug_tree (type1);
+  debug_tree (type2);
+  if (!TYPE_NAME (type1))
+    {
+      fprintf (stderr, "Anonymous; name mismatch\n");
+      return false;
+    }
+  if (!decls_same_for_odr (TYPE_NAME (type1), TYPE_NAME (type2)))
+    return false;
+  if (!decls_same_for_odr (TYPE_CONTEXT (type1), TYPE_CONTEXT (type2)))
+    return false;
+  fprintf (stderr, "type match!\n");
+  gcc_assert (in_lto_p);
+    
+  return true;
+}
+
 /* Try to find a base info of BINFO that would have its field decl at offset
    OFFSET within the BINFO type and which is of EXPECTED_TYPE.  If it can be
    found, return, otherwise return NULL_TREE.  */
@@ -11633,8 +11811,8 @@  get_binfo_at_offset (tree binfo, HOST_WI
       tree fld;
       int i;
 
-      if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (expected_type))
-	  return binfo;
+      if (types_same_for_odr (type, expected_type))
+        return binfo;
       if (offset < 0)
 	return NULL_TREE;
 
@@ -11663,7 +11841,7 @@  get_binfo_at_offset (tree binfo, HOST_WI
 	{
 	  tree base_binfo, found_binfo = NULL_TREE;
 	  for (i = 0; BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
-	    if (TREE_TYPE (base_binfo) == TREE_TYPE (fld))
+	    if (types_same_for_odr (TREE_TYPE (base_binfo), TREE_TYPE (fld)))
 	      {
 		found_binfo = base_binfo;
 		break;