Message ID | 4a6e1c28-6a18-0084-e602-57a0c259d676@suse.cz |
---|---|
State | New |
Headers | show |
Series | [RFC] Radically simplify emission of balanced tree for switch statements. | expand |
On 09/14/2017 06:17 AM, Martin Liška wrote: > Hello. > > As mentioned at Cauldron 2017, second step in switch lowering should be massive > simplification in code that does expansion of balanced tree. Basically it includes > VRP and DCE, which we can for obvious reason do by our own. > > The patch does that, and introduces a separate pass for -O0 that's responsible > for lowering at the end of tree pass. > > There's a small fallback that I would like to discuss: > > 1) vrp105.c - there's no longer catches threading opportunity in between default cases: > adding Patrick who can probably help why is the opportunity skipped with expanded tree Well, VRP knows how to analyze a switch statement and determine when paths out of one switch imply a particular case will be taken in a later switch. In this particular case we're trying to verify that the default case in the first switch threads directly to the default case in the second (though it seems we ought to be able to totally thread the cases). Obviously we don't have switches anymore after your lowering pass. But ISTM we still ought to be able to evaluate how your lowering affects jump threading on this case. In theory the lowered switch statements should be easier to thread. Reality is sadly different. There's two cases to consider. One when i < 0 (should go to first default, then directly to second default). The other when i > 1 which should also go to the first default, then directly to the second default). When i < 0 we do in fact thread through the second switch and go from the first default case directly to the second default case. When i > 1 we still go through the test for the second switch. These are the key blocks: ;; basic block 2, loop depth 0, freq 10000, maybe hot ;; prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED) ;; pred: ENTRY [always] (FALLTHRU,EXECUTABLE) i.0_1 = i; if (i.0_1 > 0) goto <bb 4>; [50.01%] [count: INV] else goto <bb 3>; [49.99%] [count: INV] ;; succ: 3 [50.0% (guessed)] (FALSE_VALUE,EXECUTABLE) ;; 4 [50.0% (guessed)] (TRUE_VALUE,EXECUTABLE) ;; basic block 3, loop depth 0, freq 10000, maybe hot ;; Invalid sum of incoming frequencies 4999, should be 10000 ;; prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED) ;; pred: 2 [50.0% (guessed)] (FALSE_VALUE,EXECUTABLE) if (i.0_1 == 0) goto <bb 5>; [40.00%] [count: INV] else goto <bb 14>; [60.00%] [count: INV] ;; succ: 14 [60.0% (guessed)] (FALSE_VALUE,EXECUTABLE) ;; 5 [40.0% (guessed)] (TRUE_VALUE,EXECUTABLE) ;; basic block 4, loop depth 0, freq 10000, maybe hot ;; Invalid sum of incoming frequencies 5001, should be 10000 ;; prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED) ;; pred: 2 [50.0% (guessed)] (TRUE_VALUE,EXECUTABLE) _13 = i.0_1 != 1; if (_13 != 0) goto <bb 13>; [16.68%] [count: INV] else goto <bb 6>; [83.32%] [count: INV] ;; succ: 6 [83.3% (guessed)] (FALSE_VALUE,EXECUTABLE) ;; 13 [16.7% (guessed)] (TRUE_VALUE,EXECUTABLE) [ ... ] ;; basic block 9, loop depth 0, freq 9999, maybe hot ;; Invalid sum of incoming frequencies 3999, should be 9999 ;; prev block 8, next block 10, flags: (NEW, REACHABLE, VISITED) ;; pred: 13 [always] (FALLTHRU,EXECUTABLE) ;; 7 [always (guessed)] (TRUE_VALUE,EXECUTABLE) # prephitmp_19 = PHI <prephitmp_15(13), prephitmp_12(7)> _2 = prephitmp_19 != 1; if (_2 != 0) goto <bb 12>; [16.68%] [count: INV] else goto <bb 11>; [83.32%] [count: INV] ;; succ: 11 [83.3% (guessed)] (FALSE_VALUE,EXECUTABLE) ;; 12 [16.7% (guessed)] (TRUE_VALUE,EXECUTABLE) [ ... ] ;; basic block 13, loop depth 0, freq 1668, maybe hot ;; prev block 12, next block 14, flags: (NEW, REACHABLE, VISITED) ;; pred: 4 [16.7% (guessed)] (TRUE_VALUE,EXECUTABLE) # prephitmp_15 = PHI <i.0_1(4)> goto <bb 9>; [100.00%] [count: INV] ;; succ: 9 [always] (FALLTHRU,EXECUTABLE) WHen bb9 is reached by bb13 we know by back substitution that the assignemnt _2 = prehitmp_19 != 1 _2 = prehitmp_15 != 1 _2 = i.0_1 != 1 And we should know enough about the range of i.0_1 to determine that is true which would allow us to thread the jump. But a few things get in the way. First, the VRP thread doesn't try hard at all to simplify gimple assignments. It really just tries to simplify gimple_cond and gimple_switch. This could be trivially improved. So if I do the right things by hand we end up trying to evaluate i.0_1 != 1. So that's good. But we don't get a useful result back from vrp_evaluate_conditional. WHy? Because the recorded range for i.0_1 is [1,INF] -- that's the global range for i.0_1 not the range on the path. Now it turns out this is precisely the problem that I've got code to address in my queue which fixes a regression I was working on earlier this year in the gcc-7 cycle. It's backed up behind some improvements to tree-ssa-uninit.c that Richi and I still need to hash through. This would likely also be fixed by Andrew's work on ranges. It's precisely the kind of query its designed to answer. In summary, your switch lowering does regress the code we get for this test. But the regression is more a failing of the jump threader than anything. > > 2) uninit-18.c where we currently report: > > /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/uninit-18.c:13:12: warning: ‘tmp’ may be used uninitialized in this function [-Wmaybe-uninitialized] > tmp[5] = 7; /* { dg-bogus "may be used uninitialized" } */ > > Am I right that the pass uses VRP? No, it doesn't use VRP. If you look at the history of uninit-18 it shows: commit cac06b024306772e2be2d357064f7ca2aa82bde8 Author: rguenth <rguenth@138bc75d-0d04-0410-961f-82ee72b054a4> Date: Fri Jan 16 13:26:10 2015 +0000 2015-01-16 Richard Biener <rguenther@suse.de> PR middle-end/64614 * tree-ssa-uninit.c: Include tree-cfg.h. (MAX_SWITCH_CASES): New define. (convert_control_dep_chain_into_preds): Handle switch statements. (is_pred_expr_subset_of): Handle x == CST vs. (x & CST) != 0. (normalize_one_pred_1): Do not split bit-manipulations. Record (x & CST). * gcc.dg/uninit-18.c: New testcase. Essentially tree-ssa-uninit.c has some specialized code to handle switch statements in predicates with BIT_AND_EXPR and uninit-18.c tests that capability. Your new switch lowering breaks that relatively specialized code. You could further enhance uninit to detect this. From the .uninit dump: MEM[(char *)tmp_2 + 5B] = 7; is guarded by : bar_5(D) > 0 (.AND.) (.NOT.) bar_5(D) > 1 [CHECK]: Found unguarded use: MEM[(char *)tmp_2 + 5B] = 7; So that check is really bar_5 == 1 and in theory if bar_5 == 1, then we would have assigned a value to tmp. VRP is capable of making that determination and I think when VRP does that, things simplify enough that the general predicate code in tree-ssa-uninit.c will DTRT. I wonder if we could get the tighter test coming out of switch lowering, or discover it in DOM, then there's a reasonable change tree-ssa-uninit.c would just DTRT. Looking at the DOM dumps we have: Optimizing block #5 1>>> STMT 1 = bar_5(D) le_expr 1 1>>> STMT 0 = bar_5(D) gt_expr 1 Optimizing statement if (bar_5(D) >= 1) LKUP STMT bar_5(D) ge_expr 1 DOM doesn't usually try to do any kind of range optimization. But this is a case where it might make sense. So the first two lines indicate that we know bar5 <= 1 is true and that bar5 > 1 must be false. Those are expressions we can look up in the hash table. So when the lookup bar5 >= 1 fails we could in theory lookup bar5 <= 1 and if that's true, then we know the test could be simplified to bar5 == 1 I don't know how often this would hit in general and when it hits how often the resulting code is actually better. I can try to do some instrumentation on that... I wonder if that might help the first case as well. > > Last question is whether the pass is properly moved in optimization pipeline? > Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Well, that's a *real* interesting question. THe two cases above argue that it's too early. However, I could easily make an argument that the spot you chose was reasonable -- namely it's after the first set of threading/vrp/dom passes, but before the second set of threading/vrp/dom passes. Let me poke a bit with DOM and see if it can reasonably clean up this stuff and whether or not the transformation is generally useful. Jeff
On 09/16/2017 12:19 AM, Jeff Law wrote: > On 09/14/2017 06:17 AM, Martin Liška wrote: >> Hello. >> >> As mentioned at Cauldron 2017, second step in switch lowering should be massive >> simplification in code that does expansion of balanced tree. Basically it includes >> VRP and DCE, which we can for obvious reason do by our own. >> >> The patch does that, and introduces a separate pass for -O0 that's responsible >> for lowering at the end of tree pass. >> >> There's a small fallback that I would like to discuss: >> >> 1) vrp105.c - there's no longer catches threading opportunity in between default cases: >> adding Patrick who can probably help why is the opportunity skipped with expanded tree > Well, VRP knows how to analyze a switch statement and determine when > paths out of one switch imply a particular case will be taken in a later > switch. In this particular case we're trying to verify that the default > case in the first switch threads directly to the default case in the > second (though it seems we ought to be able to totally thread the cases). > > Obviously we don't have switches anymore after your lowering pass. But > ISTM we still ought to be able to evaluate how your lowering affects > jump threading on this case. In theory the lowered switch statements > should be easier to thread. > > Reality is sadly different. There's two cases to consider. One when i > < 0 (should go to first default, then directly to second default). The > other when i > 1 which should also go to the first default, then > directly to the second default). > > When i < 0 we do in fact thread through the second switch and go from > the first default case directly to the second default case. > > When i > 1 we still go through the test for the second switch. > > These are the key blocks: > > ;; basic block 2, loop depth 0, freq 10000, maybe hot > ;; prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED) > ;; pred: ENTRY [always] (FALLTHRU,EXECUTABLE) > i.0_1 = i; > if (i.0_1 > 0) > goto <bb 4>; [50.01%] [count: INV] > else > goto <bb 3>; [49.99%] [count: INV] > ;; succ: 3 [50.0% (guessed)] (FALSE_VALUE,EXECUTABLE) > ;; 4 [50.0% (guessed)] (TRUE_VALUE,EXECUTABLE) > > ;; basic block 3, loop depth 0, freq 10000, maybe hot > ;; Invalid sum of incoming frequencies 4999, should be 10000 > ;; prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED) > ;; pred: 2 [50.0% (guessed)] (FALSE_VALUE,EXECUTABLE) > if (i.0_1 == 0) > goto <bb 5>; [40.00%] [count: INV] > else > goto <bb 14>; [60.00%] [count: INV] > ;; succ: 14 [60.0% (guessed)] (FALSE_VALUE,EXECUTABLE) > ;; 5 [40.0% (guessed)] (TRUE_VALUE,EXECUTABLE) > > ;; basic block 4, loop depth 0, freq 10000, maybe hot > ;; Invalid sum of incoming frequencies 5001, should be 10000 > ;; prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED) > ;; pred: 2 [50.0% (guessed)] (TRUE_VALUE,EXECUTABLE) > _13 = i.0_1 != 1; > if (_13 != 0) > goto <bb 13>; [16.68%] [count: INV] > else > goto <bb 6>; [83.32%] [count: INV] > ;; succ: 6 [83.3% (guessed)] (FALSE_VALUE,EXECUTABLE) > ;; 13 [16.7% (guessed)] (TRUE_VALUE,EXECUTABLE) > > [ ... ] > > ;; basic block 9, loop depth 0, freq 9999, maybe hot > ;; Invalid sum of incoming frequencies 3999, should be 9999 > ;; prev block 8, next block 10, flags: (NEW, REACHABLE, VISITED) > ;; pred: 13 [always] (FALLTHRU,EXECUTABLE) > ;; 7 [always (guessed)] (TRUE_VALUE,EXECUTABLE) > # prephitmp_19 = PHI <prephitmp_15(13), prephitmp_12(7)> > _2 = prephitmp_19 != 1; > if (_2 != 0) > goto <bb 12>; [16.68%] [count: INV] > else > goto <bb 11>; [83.32%] [count: INV] > ;; succ: 11 [83.3% (guessed)] (FALSE_VALUE,EXECUTABLE) > ;; 12 [16.7% (guessed)] (TRUE_VALUE,EXECUTABLE) > > [ ... ] > ;; basic block 13, loop depth 0, freq 1668, maybe hot > ;; prev block 12, next block 14, flags: (NEW, REACHABLE, VISITED) > ;; pred: 4 [16.7% (guessed)] (TRUE_VALUE,EXECUTABLE) > # prephitmp_15 = PHI <i.0_1(4)> > goto <bb 9>; [100.00%] [count: INV] > ;; succ: 9 [always] (FALLTHRU,EXECUTABLE) > > > > WHen bb9 is reached by bb13 we know by back substitution that the assignemnt > > _2 = prehitmp_19 != 1 > _2 = prehitmp_15 != 1 > _2 = i.0_1 != 1 > > And we should know enough about the range of i.0_1 to determine that is > true which would allow us to thread the jump. But a few things get in > the way. > > First, the VRP thread doesn't try hard at all to simplify gimple > assignments. It really just tries to simplify gimple_cond and > gimple_switch. This could be trivially improved. > > So if I do the right things by hand we end up trying to evaluate i.0_1 > != 1. So that's good. But we don't get a useful result back from > vrp_evaluate_conditional. WHy? Because the recorded range for i.0_1 is > [1,INF] -- that's the global range for i.0_1 not the range on the path. > > Now it turns out this is precisely the problem that I've got code to > address in my queue which fixes a regression I was working on earlier > this year in the gcc-7 cycle. It's backed up behind some improvements > to tree-ssa-uninit.c that Richi and I still need to hash through. > > This would likely also be fixed by Andrew's work on ranges. It's > precisely the kind of query its designed to answer. > > In summary, your switch lowering does regress the code we get for this > test. But the regression is more a failing of the jump threader than > anything. > >> >> 2) uninit-18.c where we currently report: >> >> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/uninit-18.c:13:12: warning: ‘tmp’ may be used uninitialized in this function [-Wmaybe-uninitialized] >> tmp[5] = 7; /* { dg-bogus "may be used uninitialized" } */ >> >> Am I right that the pass uses VRP? > No, it doesn't use VRP. If you look at the history of uninit-18 it shows: > commit cac06b024306772e2be2d357064f7ca2aa82bde8 > Author: rguenth <rguenth@138bc75d-0d04-0410-961f-82ee72b054a4> > Date: Fri Jan 16 13:26:10 2015 +0000 > > 2015-01-16 Richard Biener <rguenther@suse.de> > > PR middle-end/64614 > * tree-ssa-uninit.c: Include tree-cfg.h. > (MAX_SWITCH_CASES): New define. > (convert_control_dep_chain_into_preds): Handle switch > statements. > (is_pred_expr_subset_of): Handle x == CST vs. (x & CST) != 0. > (normalize_one_pred_1): Do not split bit-manipulations. > Record (x & CST). > > * gcc.dg/uninit-18.c: New testcase. > > Essentially tree-ssa-uninit.c has some specialized code to handle switch > statements in predicates with BIT_AND_EXPR and uninit-18.c tests that > capability. > > Your new switch lowering breaks that relatively specialized code. > > You could further enhance uninit to detect this. From the .uninit dump: > > MEM[(char *)tmp_2 + 5B] = 7; > is guarded by : > > bar_5(D) > 0 (.AND.) (.NOT.) bar_5(D) > 1 > > > [CHECK]: Found unguarded use: MEM[(char *)tmp_2 + 5B] = 7; > > > So that check is really bar_5 == 1 and in theory if bar_5 == 1, then we > would have assigned a value to tmp. VRP is capable of making that > determination and I think when VRP does that, things simplify enough > that the general predicate code in tree-ssa-uninit.c will DTRT. > > I wonder if we could get the tighter test coming out of switch lowering, > or discover it in DOM, then there's a reasonable change > tree-ssa-uninit.c would just DTRT. > > Looking at the DOM dumps we have: > > Optimizing block #5 > > 1>>> STMT 1 = bar_5(D) le_expr 1 > 1>>> STMT 0 = bar_5(D) gt_expr 1 > Optimizing statement if (bar_5(D) >= 1) > LKUP STMT bar_5(D) ge_expr 1 > > > DOM doesn't usually try to do any kind of range optimization. But this > is a case where it might make sense. > > So the first two lines indicate that we know bar5 <= 1 is true and that > bar5 > 1 must be false. Those are expressions we can look up in the > hash table. > > So when the lookup bar5 >= 1 fails we could in theory lookup bar5 <= 1 > and if that's true, then we know the test could be simplified to bar5 == 1 > > I don't know how often this would hit in general and when it hits how > often the resulting code is actually better. I can try to do some > instrumentation on that... I wonder if that might help the first case > as well. > > >> >> Last question is whether the pass is properly moved in optimization pipeline? >> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. > Well, that's a *real* interesting question. THe two cases above argue > that it's too early. However, I could easily make an argument that the > spot you chose was reasonable -- namely it's after the first set of > threading/vrp/dom passes, but before the second set of threading/vrp/dom > passes. > > Let me poke a bit with DOM and see if it can reasonably clean up this > stuff and whether or not the transformation is generally useful. > > Jeff > Hello. Thank you Jeff for very verbose explanation what's happening. I'm planning to do follow-up of this patch that will include clustering for bit-tests and jump tables. Maybe that will make aforementioned issues even more difficult, but we'll see. Martin
On 09/20/2017 01:24 AM, Martin Liška wrote: > > Hello. > > Thank you Jeff for very verbose explanation what's happening. I'm planning to do > follow-up of this patch that will include clustering for bit-tests and jump tables. > Maybe that will make aforementioned issues even more difficult, but we'll see. FWIW, the DOM changes to simplify the conditionals seem to help both cases, trigger reasonably consistently in a bootstrap and for some subset of the triggers actually result in transformations that allow other passes to do a better job in the common (-O2) case. So my inclination is to polish them a bit further get them on the trunk. My recommendation is to ignore the two regressions for now and focus on the cleanups you're trying to do. jeff
On 20 September 2017 17:00:13 CEST, Jeff Law <law@redhat.com> wrote: >On 09/20/2017 01:24 AM, Martin Liška wrote: > >> >> Hello. >> >> Thank you Jeff for very verbose explanation what's happening. I'm >planning to do >> follow-up of this patch that will include clustering for bit-tests >and jump tables. >> Maybe that will make aforementioned issues even more difficult, but >we'll see. >FWIW, the DOM changes to simplify the conditionals seem to help both >cases, trigger reasonably consistently in a bootstrap and for some >subset of the triggers actually result in transformations that allow >other passes to do a better job in the common (-O2) case. So my >inclination is to polish them a bit further get them on the trunk. > >My recommendation is to ignore the two regressions for now and focus on >the cleanups you're trying to do. Can you please post CSiBE numbers? Ideally throwing in gcc-3.4.6 numbers too? Just curious since I stumbled across suboptimal handling of switch statements some time ago: gcc.gnu.org/ml/gcc-patches/2007-06/msg01648.html thanks,
On 09/20/2017 05:00 PM, Jeff Law wrote: > On 09/20/2017 01:24 AM, Martin Liška wrote: > >> >> Hello. >> >> Thank you Jeff for very verbose explanation what's happening. I'm planning to do >> follow-up of this patch that will include clustering for bit-tests and jump tables. >> Maybe that will make aforementioned issues even more difficult, but we'll see. > FWIW, the DOM changes to simplify the conditionals seem to help both > cases, trigger reasonably consistently in a bootstrap and for some > subset of the triggers actually result in transformations that allow > other passes to do a better job in the common (-O2) case. So my > inclination is to polish them a bit further get them on the trunk. > > My recommendation is to ignore the two regressions for now and focus on > the cleanups you're trying to do. > > jeff > Hello. Some time ago I've decided that I'll make patch submission of switch clustering in next stage1. However, this patch can be applied as is in this stage3. Would it be possible or is it too late? Thanks, Martin
On 01/09/2018 07:43 AM, Martin Liška wrote: > On 09/20/2017 05:00 PM, Jeff Law wrote: >> On 09/20/2017 01:24 AM, Martin Liška wrote: >> >>> >>> Hello. >>> >>> Thank you Jeff for very verbose explanation what's happening. I'm planning to do >>> follow-up of this patch that will include clustering for bit-tests and jump tables. >>> Maybe that will make aforementioned issues even more difficult, but we'll see. >> FWIW, the DOM changes to simplify the conditionals seem to help both >> cases, trigger reasonably consistently in a bootstrap and for some >> subset of the triggers actually result in transformations that allow >> other passes to do a better job in the common (-O2) case. So my >> inclination is to polish them a bit further get them on the trunk. >> >> My recommendation is to ignore the two regressions for now and focus on >> the cleanups you're trying to do. >> >> jeff >> > > Hello. > > Some time ago I've decided that I'll make patch submission of switch clustering > in next stage1. However, this patch can be applied as is in this stage3. Would > it be possible or is it too late? I'll let Richi make the call here. FWIW, the DOM changes to avoid the two missed-optimization regressions you ran into are on the trunk, so that's no longer a blocking issue. jeff
On Tue, Jan 9, 2018 at 7:29 PM, Jeff Law <law@redhat.com> wrote: > On 01/09/2018 07:43 AM, Martin Liška wrote: >> On 09/20/2017 05:00 PM, Jeff Law wrote: >>> On 09/20/2017 01:24 AM, Martin Liška wrote: >>> >>>> >>>> Hello. >>>> >>>> Thank you Jeff for very verbose explanation what's happening. I'm planning to do >>>> follow-up of this patch that will include clustering for bit-tests and jump tables. >>>> Maybe that will make aforementioned issues even more difficult, but we'll see. >>> FWIW, the DOM changes to simplify the conditionals seem to help both >>> cases, trigger reasonably consistently in a bootstrap and for some >>> subset of the triggers actually result in transformations that allow >>> other passes to do a better job in the common (-O2) case. So my >>> inclination is to polish them a bit further get them on the trunk. >>> >>> My recommendation is to ignore the two regressions for now and focus on >>> the cleanups you're trying to do. >>> >>> jeff >>> >> >> Hello. >> >> Some time ago I've decided that I'll make patch submission of switch clustering >> in next stage1. However, this patch can be applied as is in this stage3. Would >> it be possible or is it too late? > I'll let Richi make the call here. FWIW, the DOM changes to avoid the > two missed-optimization regressions you ran into are on the trunk, so > that's no longer a blocking issue. If you are fine with waiting then please wait ;) Richard. > jeff
On 01/10/2018 02:13 PM, Richard Biener wrote: > On Tue, Jan 9, 2018 at 7:29 PM, Jeff Law <law@redhat.com> wrote: >> On 01/09/2018 07:43 AM, Martin Liška wrote: >>> On 09/20/2017 05:00 PM, Jeff Law wrote: >>>> On 09/20/2017 01:24 AM, Martin Liška wrote: >>>> >>>>> >>>>> Hello. >>>>> >>>>> Thank you Jeff for very verbose explanation what's happening. I'm planning to do >>>>> follow-up of this patch that will include clustering for bit-tests and jump tables. >>>>> Maybe that will make aforementioned issues even more difficult, but we'll see. >>>> FWIW, the DOM changes to simplify the conditionals seem to help both >>>> cases, trigger reasonably consistently in a bootstrap and for some >>>> subset of the triggers actually result in transformations that allow >>>> other passes to do a better job in the common (-O2) case. So my >>>> inclination is to polish them a bit further get them on the trunk. >>>> >>>> My recommendation is to ignore the two regressions for now and focus on >>>> the cleanups you're trying to do. >>>> >>>> jeff >>>> >>> >>> Hello. >>> >>> Some time ago I've decided that I'll make patch submission of switch clustering >>> in next stage1. However, this patch can be applied as is in this stage3. Would >>> it be possible or is it too late? >> I'll let Richi make the call here. FWIW, the DOM changes to avoid the >> two missed-optimization regressions you ran into are on the trunk, so >> that's no longer a blocking issue. > > If you are fine with waiting then please wait ;) Yep, it's not urgent. Thanks. Martin > > Richard. > >> jeff
On 10 January 2018 15:59:28 CET, "Martin Liška" <mliska@suse.cz> wrote: >On 01/10/2018 02:13 PM, Richard Biener wrote: >> On Tue, Jan 9, 2018 at 7:29 PM, Jeff Law <law@redhat.com> wrote: >>> On 01/09/2018 07:43 AM, Martin Liška wrote: >>>> On 09/20/2017 05:00 PM, Jeff Law wrote: >>>>> On 09/20/2017 01:24 AM, Martin Liška wrote: >>>>> >>>>>> >>>>>> Hello. >>>>>> >>>>>> Thank you Jeff for very verbose explanation what's happening. I'm >planning to do >>>>>> follow-up of this patch that will include clustering for >bit-tests and jump tables. >>>>>> Maybe that will make aforementioned issues even more difficult, >but we'll see. >>>>> FWIW, the DOM changes to simplify the conditionals seem to help >both >>>>> cases, trigger reasonably consistently in a bootstrap and for some >>>>> subset of the triggers actually result in transformations that >allow >>>>> other passes to do a better job in the common (-O2) case. So my >>>>> inclination is to polish them a bit further get them on the trunk. >>>>> >>>>> My recommendation is to ignore the two regressions for now and >focus on >>>>> the cleanups you're trying to do. >>>>> >>>>> jeff >>>>> >>>> >>>> Hello. >>>> >>>> Some time ago I've decided that I'll make patch submission of >switch clustering >>>> in next stage1. However, this patch can be applied as is in this >stage3. Would >>>> it be possible or is it too late? >>> I'll let Richi make the call here. FWIW, the DOM changes to avoid >the >>> two missed-optimization regressions you ran into are on the trunk, >so >>> that's no longer a blocking issue. >> >> If you are fine with waiting then please wait ;) > >Yep, it's not urgent. Can you please post CSiBE numbers? Ideally throwing in gcc-3.4.6 numbers too? thanks,
On 01/15/2018 12:22 AM, Bernhard Reutner-Fischer wrote: > Can you please post CSiBE numbers? Ideally throwing in gcc-3.4.6 numbers too? > > thanks, Hi. I've just retested the patch and looks fine. There are numbers of CSiBE. I'm sorry I don't have such old version of GCC: +-------------------+--------+------------------+------------------+-----------------+--------------------+ | object | trunk | Trunk w/ patch | Gcc 7.3.1 | patch vs. trunk | Gcc 7.3.1 vs trunk | +-------------------+--------+------------------+------------------+-----------------+--------------------+ | buf.c.o | 2531 | 2531 | 2531 | 0 | 0 | | ccl.c.o | 2520 | 2520 | 2519 | 0 | -1 | | dfa.c.o | 9909 | 9909 | 9909 | 0 | 0 | | ecs.c.o | 1432 | 1432 | 1432 | 0 | 0 | | filter.c.o | 4810 | 4810 | 4810 | 0 | 0 | | gen.c.o | 28696 | 28696 | 28805 | 0 | 109 | | libmain.c.o | 88 | 88 | 88 | 0 | 0 | | libyywrap.c.o | 54 | 54 | 67 | 0 | 13 | | main.c.o | 22132 | 22132 | 22129 | 0 | -3 | | misc.c.o | 9765 | 9765 | 9811 | 0 | 46 | | nfa.c.o | 6449 | 6467 | 6449 | 18 | 0 | | options.c.o | 1240 | 1240 | 1240 | 0 | 0 | | parse.c.o | 15737 | 15737 | 15742 | 0 | 5 | | regex.c.o | 1374 | 1374 | 1374 | 0 | 0 | | scan.c.o | 66844 | 66896 | 66944 | 52 | 100 | | scanflags.c.o | 422 | 422 | 435 | 0 | 13 | | scanopt.c.o | 8170 | 8201 | 8198 | 31 | 28 | | skel.c.o | 91010 | 91010 | 91010 | 0 | 0 | | sym.c.o | 1796 | 1796 | 1796 | 0 | 0 | | tables.c.o | 5000 | 5070 | 4998 | 70 | -2 | | tables_shared.c.o | 122 | 122 | 122 | 0 | 0 | | tblcmp.c.o | 5587 | 5587 | 5578 | 0 | -9 | | yylex.c.o | 2166 | 2332 | 2122 | 166 | -44 | | blocksort.c.o | 13850 | 13850 | 13862 | 0 | 12 | | bzip2.c.o | 23540 | 23702 | 23535 | 162 | -5 | | bzip2recover.c.o | 4863 | 4863 | 4865 | 0 | 2 | | bzlib.c.o | 21359 | 21433 | 21393 | 74 | 34 | | compress.c.o | 24424 | 24424 | 24409 | 0 | -15 | | crctable.c.o | 0 | 0 | 0 | 0 | 0 | | decompress.c.o | 23467 | 23467 | 23464 | 0 | -3 | | dlltest.c.o | 1213 | 1213 | 1213 | 0 | 0 | | huffman.c.o | 2180 | 2180 | 2180 | 0 | 0 | | mk251.c.o | 103 | 103 | 103 | 0 | 0 | | randtable.c.o | 0 | 0 | 0 | 0 | 0 | | spewG.c.o | 477 | 477 | 480 | 0 | 3 | | unzcrash.c.o | 1284 | 1284 | 1284 | 0 | 0 | | SUM | 404614 | 405187 | 404897 | 573 | 283 | | ratio | | 1.00141616454201 | 0.99928428108503 | | | +-------------------+--------+------------------+------------------+-----------------+--------------------+ So the patch looks fine, only very very slightly binary is produced. I'm going to install the patch so that I can carry on more complex patches based on this one. Bernhard: I'm able to build only flex and bzip2. Do you know why? I noticed there are other projects in toplevel directory. ./csibe.py -- The C compiler identification is GNU 7.3.1 -- The CXX compiler identification is GNU 7.3.1 -- Check for working C compiler: /usr/bin/cc -- Check for working C compiler: /usr/bin/cc -- works -- Detecting C compiler ABI info -- Detecting C compiler ABI info - done -- Detecting C compile features -- Detecting C compile features - done -- Check for working CXX compiler: /usr/bin/c++ -- Check for working CXX compiler: /usr/bin/c++ -- works -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Detecting CXX compile features -- Detecting CXX compile features - done -- Configuring done -- Generating done -- Build files have been written to: /tmp/csibe/build/native make: Entering directory '/tmp/csibe/build/native' make[1]: Entering directory '/tmp/csibe/build/native' make[2]: Entering directory '/tmp/csibe/build/native' Scanning dependencies of target bzip2-1.0.6 make[2]: Leaving directory '/tmp/csibe/build/native' make[2]: Entering directory '/tmp/csibe/build/native' [ 2%] Building C object gen/bzip2-1.0.6/CMakeFiles/bzip2-1.0.6.dir/__/__/src/bzip2-1.0.6/blocksort.c.o [ 5%] Building C object gen/bzip2-1.0.6/CMakeFiles/bzip2-1.0.6.dir/__/__/src/bzip2-1.0.6/bzip2.c.o [ 8%] Building C object gen/bzip2-1.0.6/CMakeFiles/bzip2-1.0.6.dir/__/__/src/bzip2-1.0.6/bzip2recover.c.o [ 11%] Building C object gen/bzip2-1.0.6/CMakeFiles/bzip2-1.0.6.dir/__/__/src/bzip2-1.0.6/bzlib.c.o [ 13%] Building C object gen/bzip2-1.0.6/CMakeFiles/bzip2-1.0.6.dir/__/__/src/bzip2-1.0.6/compress.c.o [ 16%] Building C object gen/bzip2-1.0.6/CMakeFiles/bzip2-1.0.6.dir/__/__/src/bzip2-1.0.6/crctable.c.o [ 19%] Building C object gen/bzip2-1.0.6/CMakeFiles/bzip2-1.0.6.dir/__/__/src/bzip2-1.0.6/decompress.c.o [ 22%] Building C object gen/bzip2-1.0.6/CMakeFiles/bzip2-1.0.6.dir/__/__/src/bzip2-1.0.6/dlltest.c.o [ 25%] Building C object gen/bzip2-1.0.6/CMakeFiles/bzip2-1.0.6.dir/__/__/src/bzip2-1.0.6/huffman.c.o [ 27%] Building C object gen/bzip2-1.0.6/CMakeFiles/bzip2-1.0.6.dir/__/__/src/bzip2-1.0.6/mk251.c.o [ 30%] Building C object gen/bzip2-1.0.6/CMakeFiles/bzip2-1.0.6.dir/__/__/src/bzip2-1.0.6/randtable.c.o [ 33%] Building C object gen/bzip2-1.0.6/CMakeFiles/bzip2-1.0.6.dir/__/__/src/bzip2-1.0.6/spewG.c.o [ 36%] Building C object gen/bzip2-1.0.6/CMakeFiles/bzip2-1.0.6.dir/__/__/src/bzip2-1.0.6/unzcrash.c.o make[2]: Leaving directory '/tmp/csibe/build/native' [ 36%] Built target bzip2-1.0.6 make[2]: Entering directory '/tmp/csibe/build/native' Scanning dependencies of target flex-2.6.0 make[2]: Leaving directory '/tmp/csibe/build/native' make[2]: Entering directory '/tmp/csibe/build/native' [ 38%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/buf.c.o [ 41%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/ccl.c.o [ 44%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/dfa.c.o [ 47%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/ecs.c.o [ 50%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/filter.c.o [ 52%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/gen.c.o [ 55%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/libmain.c.o [ 58%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/libyywrap.c.o [ 61%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/main.c.o [ 63%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/misc.c.o [ 66%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/nfa.c.o [ 69%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/options.c.o [ 72%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/parse.c.o [ 75%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/regex.c.o [ 77%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/scan.c.o [ 80%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/scanflags.c.o [ 83%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/scanopt.c.o [ 86%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/skel.c.o [ 88%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/sym.c.o [ 91%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/tables.c.o [ 94%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/tables_shared.c.o [ 97%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/tblcmp.c.o [100%] Building C object gen/flex-2.6.0/CMakeFiles/flex-2.6.0.dir/__/__/src/flex-2.6.0/src/yylex.c.o make[2]: Leaving directory '/tmp/csibe/build/native' [100%] Built target flex-2.6.0 make[1]: Leaving directory '/tmp/csibe/build/native' make: Leaving directory '/tmp/csibe/build/native' make: Entering directory '/tmp/csibe/build/native' make[1]: Entering directory '/tmp/csibe/build/native' make[2]: Entering directory '/tmp/csibe/build/native' make[3]: Entering directory '/tmp/csibe/build/native' make[3]: Leaving directory '/tmp/csibe/build/native' [ 60%] Built target flex-2.6.0 make[3]: Entering directory '/tmp/csibe/build/native' Scanning dependencies of target flex-2.6.0_size make[3]: Leaving directory '/tmp/csibe/build/native' make[3]: Entering directory '/tmp/csibe/build/native' [ 63%] Generating result.csv make[3]: Leaving directory '/tmp/csibe/build/native' [ 63%] Built target flex-2.6.0_size make[3]: Entering directory '/tmp/csibe/build/native' make[3]: Leaving directory '/tmp/csibe/build/native' [ 97%] Built target bzip2-1.0.6 make[3]: Entering directory '/tmp/csibe/build/native' Scanning dependencies of target bzip2-1.0.6_size make[3]: Leaving directory '/tmp/csibe/build/native' make[3]: Entering directory '/tmp/csibe/build/native' [100%] Generating result.csv make[3]: Leaving directory '/tmp/csibe/build/native' [100%] Built target bzip2-1.0.6_size make[3]: Entering directory '/tmp/csibe/build/native' Scanning dependencies of target size make[3]: Leaving directory '/tmp/csibe/build/native' make[3]: Entering directory '/tmp/csibe/build/native' make[3]: Leaving directory '/tmp/csibe/build/native' [100%] Built target size make[2]: Leaving directory '/tmp/csibe/build/native' make[1]: Leaving directory '/tmp/csibe/build/native' make: Leaving directory '/tmp/csibe/build/native' Martin From 1ee3fd1cd03a47ad44b7f2b475c3ddcaf2f4c6a9 Mon Sep 17 00:00:00 2001 From: marxin <mliska@suse.cz> Date: Tue, 5 Sep 2017 12:11:13 +0200 Subject: [PATCH] Radically simplify emission of balanced tree for switch statements. gcc/ChangeLog: 2017-09-14 Martin Liska <mliska@suse.cz> * passes.def: Add pass_lower_switch and pass_lower_switch_O0. * tree-pass.h (make_pass_lower_switch_O0): New function. * tree-switch-conversion.c (node_has_low_bound): Remove. (node_has_high_bound): Likewise. (node_is_bounded): Likewise. (class pass_lower_switch): Make it a template type and create two instances. (pass_lower_switch::execute): Add template argument. (make_pass_lower_switch): New function. (make_pass_lower_switch_O0): New function. (do_jump_if_equal): Remove. (emit_case_nodes): Simplify to just handle all 3 cases and leave all the hard work to tree optimization passes. gcc/testsuite/ChangeLog: 2017-09-14 Martin Liska <mliska@suse.cz> * gcc.dg/tree-ssa/vrp104.c: Adjust dump file that is scanned. * gcc.dg/tree-prof/update-loopch.c: Likewise. --- gcc/passes.def | 3 + gcc/testsuite/gcc.dg/tree-prof/update-loopch.c | 2 +- gcc/testsuite/gcc.dg/tree-ssa/vrp104.c | 2 +- gcc/tree-pass.h | 1 + gcc/tree-switch-conversion.c | 604 +++---------------------- 5 files changed, 72 insertions(+), 540 deletions(-) diff --git a/gcc/passes.def b/gcc/passes.def index 3ebcfc30349..050009464ea 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -317,6 +317,7 @@ along with GCC; see the file COPYING3. If not see POP_INSERT_PASSES () NEXT_PASS (pass_simduid_cleanup); NEXT_PASS (pass_lower_vector_ssa); + NEXT_PASS (pass_lower_switch); NEXT_PASS (pass_cse_reciprocals); NEXT_PASS (pass_sprintf_length, true); NEXT_PASS (pass_reassoc, false /* insert_powi_p */); @@ -362,6 +363,7 @@ along with GCC; see the file COPYING3. If not see /* Lower remaining pieces of GIMPLE. */ NEXT_PASS (pass_lower_complex); NEXT_PASS (pass_lower_vector_ssa); + NEXT_PASS (pass_lower_switch); /* Perform simple scalar cleanup which is constant/copy propagation. */ NEXT_PASS (pass_ccp, true /* nonzero_p */); NEXT_PASS (pass_post_ipa_warn); @@ -397,6 +399,7 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_lower_vaarg); NEXT_PASS (pass_lower_vector); NEXT_PASS (pass_lower_complex_O0); + NEXT_PASS (pass_lower_switch_O0); NEXT_PASS (pass_sancov_O0); NEXT_PASS (pass_lower_switch); NEXT_PASS (pass_asan_O0); diff --git a/gcc/testsuite/gcc.dg/tree-prof/update-loopch.c b/gcc/testsuite/gcc.dg/tree-prof/update-loopch.c index 73efc878ec0..15baada1081 100644 --- a/gcc/testsuite/gcc.dg/tree-prof/update-loopch.c +++ b/gcc/testsuite/gcc.dg/tree-prof/update-loopch.c @@ -1,4 +1,4 @@ -/* { dg-options "-O2 -fdump-ipa-profile-blocks-details -fdump-tree-switchlower-blocks-details" } */ +/* { dg-options "-O2 -fdump-ipa-profile-blocks-details -fdump-tree-switchlower1-blocks-details" } */ int max = 33333; int a[8]; int diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp104.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp104.c index d4691fcf041..71fa3bfa2ca 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/vrp104.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp104.c @@ -2,7 +2,7 @@ /* { dg-options "-O2 -fdump-tree-switchlower" } */ /* We scan for 2 switches as the dump file reports a transformation, IL really contains just a single. */ -/* { dg-final { scan-tree-dump-times "switch \\(i_" 2 "switchlower" } } */ +/* { dg-final { scan-tree-dump-times "switch" 2 "switchlower1" } } */ void foo (void); void bar (void); diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 93a6a99eb7a..7625d1963ff 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -413,6 +413,7 @@ extern gimple_opt_pass *make_pass_strip_predict_hints (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_complex_O0 (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_complex (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_switch (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_lower_switch_O0 (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_vector (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_vector_ssa (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_omp (gcc::context *ctxt); diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c index b0470ef1b5e..dc60b34f506 100644 --- a/gcc/tree-switch-conversion.c +++ b/gcc/tree-switch-conversion.c @@ -1691,9 +1691,6 @@ typedef case_node *case_node_ptr; static basic_block emit_case_nodes (basic_block, tree, case_node_ptr, basic_block, tree, profile_probability, tree, hash_map<tree, tree> *); -static bool node_has_low_bound (case_node_ptr, tree); -static bool node_has_high_bound (case_node_ptr, tree); -static bool node_is_bounded (case_node_ptr, tree); /* Return the smallest number of different values for which it is best to use a jump-table instead of a tree of conditional branches. */ @@ -2169,10 +2166,31 @@ try_switch_expansion (gswitch *stmt) namespace { -const pass_data pass_data_lower_switch = +template <bool O0> class pass_lower_switch: public gimple_opt_pass { - GIMPLE_PASS, /* type */ - "switchlower", /* name */ +public: + pass_lower_switch (gcc::context *ctxt) : gimple_opt_pass (data, ctxt) {} + + static const pass_data data; + opt_pass * + clone () + { + return new pass_lower_switch<O0> (m_ctxt); + } + + virtual bool + gate (function *) + { + return !O0 || !optimize; + } + + virtual unsigned int execute (function *fun); +}; // class pass_lower_switch + +template <bool O0> +const pass_data pass_lower_switch<O0>::data = { + GIMPLE_PASS, /* type */ + O0 ? "switchlower_O0" : "switchlower", /* name */ OPTGROUP_NONE, /* optinfo_flags */ TV_TREE_SWITCH_LOWERING, /* tv_id */ ( PROP_cfg | PROP_ssa ), /* properties_required */ @@ -2182,21 +2200,9 @@ const pass_data pass_data_lower_switch = TODO_update_ssa | TODO_cleanup_cfg, /* todo_flags_finish */ }; -class pass_lower_switch : public gimple_opt_pass -{ -public: - pass_lower_switch (gcc::context *ctxt) - : gimple_opt_pass (pass_data_lower_switch, ctxt) - {} - - /* opt_pass methods: */ - virtual bool gate (function *) { return true; } - virtual unsigned int execute (function *); - -}; // class pass_lower_switch - +template <bool O0> unsigned int -pass_lower_switch::execute (function *fun) +pass_lower_switch<O0>::execute (function *fun) { basic_block bb; bool expanded = false; @@ -2234,33 +2240,14 @@ pass_lower_switch::execute (function *fun) } // anon namespace gimple_opt_pass * -make_pass_lower_switch (gcc::context *ctxt) +make_pass_lower_switch_O0 (gcc::context *ctxt) { - return new pass_lower_switch (ctxt); + return new pass_lower_switch<true> (ctxt); } - -/* Generate code to jump to LABEL if OP0 and OP1 are equal in mode MODE. - PROB is the probability of jumping to LABEL. */ -static basic_block -do_jump_if_equal (basic_block bb, tree op0, tree op1, basic_block label_bb, - profile_probability prob, hash_map<tree, tree> *phi_mapping) +gimple_opt_pass * +make_pass_lower_switch (gcc::context *ctxt) { - gcond *cond = gimple_build_cond (EQ_EXPR, op0, op1, NULL_TREE, NULL_TREE); - gimple_stmt_iterator gsi = gsi_last_bb (bb); - gsi_insert_before (&gsi, cond, GSI_SAME_STMT); - - gcc_assert (single_succ_p (bb)); - - /* Make a new basic block where false branch will take place. */ - edge false_edge = split_block (bb, cond); - false_edge->flags = EDGE_FALSE_VALUE; - false_edge->probability = prob.invert (); - - edge true_edge = make_edge (bb, label_bb, EDGE_TRUE_VALUE); - fix_phi_operands_for_edge (true_edge, phi_mapping); - true_edge->probability = prob; - - return false_edge->dest; + return new pass_lower_switch<false> (ctxt); } /* Generate code to compare X with Y so that the condition codes are @@ -2323,28 +2310,7 @@ conditional_probability (profile_probability target_prob, /* Emit step-by-step code to select a case for the value of INDEX. The thus generated decision tree follows the form of the case-node binary tree NODE, whose nodes represent test conditions. - INDEX_TYPE is the type of the index of the switch. - - Care is taken to prune redundant tests from the decision tree - by detecting any boundary conditions already checked by - emitted rtx. (See node_has_high_bound, node_has_low_bound - and node_is_bounded, above.) - - Where the test conditions can be shown to be redundant we emit - an unconditional jump to the target code. As a further - optimization, the subordinates of a tree node are examined to - check for bounded nodes. In this case conditional and/or - unconditional jumps as a result of the boundary check for the - current node are arranged to target the subordinates associated - code for out of bound conditions on the current node. - - We can assume that when control reaches the code generated here, - the index value has already been compared with the parents - of this node, and determined to be on the same side of each parent - as this node is. Thus, if this node tests for the value 51, - and a parent tested for 52, we don't need to consider - the possibility of a value greater than 51. If another parent - tests for the value 50, then this node need not test anything. */ + INDEX_TYPE is the type of the index of the switch. */ static basic_block emit_case_nodes (basic_block bb, tree index, case_node_ptr node, @@ -2352,482 +2318,44 @@ emit_case_nodes (basic_block bb, tree index, case_node_ptr node, profile_probability default_prob, tree index_type, hash_map<tree, tree> *phi_mapping) { - /* If INDEX has an unsigned type, we must make unsigned branches. */ - profile_probability probability; - profile_probability prob = node->prob, subtree_prob = node->subtree_prob; - - /* See if our parents have already tested everything for us. - If they have, emit an unconditional jump for this node. */ - if (node_is_bounded (node, index_type)) - { - emit_jump (bb, node->case_bb, phi_mapping); - return NULL; - } - - else if (tree_int_cst_equal (node->low, node->high)) - { - probability = conditional_probability (prob, subtree_prob + default_prob); - /* Node is single valued. First see if the index expression matches - this node and then check our children, if any. */ - bb = do_jump_if_equal (bb, index, node->low, node->case_bb, probability, - phi_mapping); - /* Since this case is taken at this point, reduce its weight from - subtree_weight. */ - subtree_prob -= prob; - if (node->right != 0 && node->left != 0) - { - /* This node has children on both sides. - Dispatch to one side or the other - by comparing the index value with this node's value. - If one subtree is bounded, check that one first, - so we can avoid real branches in the tree. */ - - if (node_is_bounded (node->right, index_type)) - { - probability - = conditional_probability (node->right->prob, - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, GT_EXPR, - node->right->case_bb, probability, - phi_mapping); - bb = emit_case_nodes (bb, index, node->left, default_bb, - default_label, default_prob, index_type, - phi_mapping); - } - - else if (node_is_bounded (node->left, index_type)) - { - probability - = conditional_probability (node->left->prob, - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, LT_EXPR, - node->left->case_bb, probability, - phi_mapping); - bb = emit_case_nodes (bb, index, node->right, default_bb, - default_label, default_prob, index_type, - phi_mapping); - } - - /* If both children are single-valued cases with no - children, finish up all the work. This way, we can save - one ordered comparison. */ - else if (tree_int_cst_equal (node->right->low, node->right->high) - && node->right->left == 0 && node->right->right == 0 - && tree_int_cst_equal (node->left->low, node->left->high) - && node->left->left == 0 && node->left->right == 0) - { - /* Neither node is bounded. First distinguish the two sides; - then emit the code for one side at a time. */ - - /* See if the value matches what the right hand side - wants. */ - probability - = conditional_probability (node->right->prob, - subtree_prob + default_prob); - bb = do_jump_if_equal (bb, index, node->right->low, - node->right->case_bb, probability, - phi_mapping); - - /* See if the value matches what the left hand side - wants. */ - probability - = conditional_probability (node->left->prob, - subtree_prob + default_prob); - bb = do_jump_if_equal (bb, index, node->left->low, - node->left->case_bb, probability, - phi_mapping); - } - - else - { - /* Neither node is bounded. First distinguish the two sides; - then emit the code for one side at a time. */ - - basic_block test_bb = split_edge (single_succ_edge (bb)); - redirect_edge_succ (single_pred_edge (test_bb), - single_succ_edge (bb)->dest); - - /* The default label could be reached either through the right - subtree or the left subtree. Divide the probability - equally. */ - probability - = conditional_probability (node->right->subtree_prob - + default_prob.apply_scale (1, 2), - subtree_prob + default_prob); - /* See if the value is on the right. */ - bb = emit_cmp_and_jump_insns (bb, index, node->high, GT_EXPR, - test_bb, probability, phi_mapping); - default_prob = default_prob.apply_scale (1, 2); - - /* Value must be on the left. - Handle the left-hand subtree. */ - bb = emit_case_nodes (bb, index, node->left, default_bb, - default_label, default_prob, index_type, - phi_mapping); - /* If left-hand subtree does nothing, - go to default. */ - - if (bb && default_bb) - emit_jump (bb, default_bb, phi_mapping); - - /* Code branches here for the right-hand subtree. */ - bb = emit_case_nodes (test_bb, index, node->right, default_bb, - default_label, default_prob, index_type, - phi_mapping); - } - } - else if (node->right != 0 && node->left == 0) - { - /* Here we have a right child but no left so we issue a conditional - branch to default and process the right child. - - Omit the conditional branch to default if the right child - does not have any children and is single valued; it would - cost too much space to save so little time. */ - - if (node->right->right || node->right->left - || !tree_int_cst_equal (node->right->low, node->right->high)) - { - if (!node_has_low_bound (node, index_type)) - { - probability - = conditional_probability (default_prob.apply_scale (1, 2), - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, LT_EXPR, - default_bb, probability, - phi_mapping); - default_prob = default_prob.apply_scale (1, 2); - } - - bb = emit_case_nodes (bb, index, node->right, default_bb, - default_label, default_prob, index_type, - phi_mapping); - } - else - { - probability - = conditional_probability (node->right->subtree_prob, - subtree_prob + default_prob); - /* We cannot process node->right normally - since we haven't ruled out the numbers less than - this node's value. So handle node->right explicitly. */ - bb = do_jump_if_equal (bb, index, node->right->low, - node->right->case_bb, probability, - phi_mapping); - } - } - - else if (node->right == 0 && node->left != 0) - { - /* Just one subtree, on the left. */ - if (node->left->left || node->left->right - || !tree_int_cst_equal (node->left->low, node->left->high)) - { - if (!node_has_high_bound (node, index_type)) - { - probability - = conditional_probability (default_prob.apply_scale (1, 2), - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, GT_EXPR, - default_bb, probability, - phi_mapping); - default_prob = default_prob.apply_scale (1, 2); - } - - bb = emit_case_nodes (bb, index, node->left, default_bb, - default_label, default_prob, index_type, - phi_mapping); - } - else - { - probability - = conditional_probability (node->left->subtree_prob, - subtree_prob + default_prob); - /* We cannot process node->left normally - since we haven't ruled out the numbers less than - this node's value. So handle node->left explicitly. */ - do_jump_if_equal (bb, index, node->left->low, node->left->case_bb, - probability, phi_mapping); - } - } - } - else - { - /* Node is a range. These cases are very similar to those for a single - value, except that we do not start by testing whether this node - is the one to branch to. */ - - if (node->right != 0 && node->left != 0) - { - /* Node has subtrees on both sides. - If the right-hand subtree is bounded, - test for it first, since we can go straight there. - Otherwise, we need to make a branch in the control structure, - then handle the two subtrees. */ - basic_block test_bb = NULL; - - if (node_is_bounded (node->right, index_type)) - { - /* Right hand node is fully bounded so we can eliminate any - testing and branch directly to the target code. */ - probability - = conditional_probability (node->right->subtree_prob, - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, GT_EXPR, - node->right->case_bb, probability, - phi_mapping); - } - else - { - /* Right hand node requires testing. - Branch to a label where we will handle it later. */ - - test_bb = split_edge (single_succ_edge (bb)); - redirect_edge_succ (single_pred_edge (test_bb), - single_succ_edge (bb)->dest); - - probability - = conditional_probability (node->right->subtree_prob - + default_prob.apply_scale (1, 2), - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, GT_EXPR, - test_bb, probability, phi_mapping); - default_prob = default_prob.apply_scale (1, 2); - } - - /* Value belongs to this node or to the left-hand subtree. */ - - probability - = conditional_probability (prob, subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->low, GE_EXPR, - node->case_bb, probability, - phi_mapping); - - /* Handle the left-hand subtree. */ - bb = emit_case_nodes (bb, index, node->left, default_bb, - default_label, default_prob, index_type, + /* If node is null, we are done. */ + if (node == NULL) + return bb; + + /* Branch to a label where we will handle it later. */ + basic_block test_bb = split_edge (single_succ_edge (bb)); + redirect_edge_succ (single_pred_edge (test_bb), + single_succ_edge (bb)->dest); + + profile_probability probability + = node->right ? node->right->subtree_prob : profile_probability::never (); + probability + = conditional_probability (probability + default_prob.apply_scale (1, 2), + node->subtree_prob + default_prob); + bb = emit_cmp_and_jump_insns (bb, index, node->high, GT_EXPR, + test_bb, probability, phi_mapping); + default_prob = default_prob.apply_scale (1, 2); + + /* Value belongs to this node or to the left-hand subtree. */ + probability + = conditional_probability (node->prob, node->subtree_prob + default_prob); + bb = emit_cmp_and_jump_insns (bb, index, node->low, GE_EXPR, + node->case_bb, probability, phi_mapping); - /* If right node had to be handled later, do that now. */ - if (test_bb) - { - /* If the left-hand subtree fell through, - don't let it fall into the right-hand subtree. */ - if (bb && default_bb) - emit_jump (bb, default_bb, phi_mapping); - - bb = emit_case_nodes (test_bb, index, node->right, default_bb, - default_label, default_prob, index_type, - phi_mapping); - } - } - - else if (node->right != 0 && node->left == 0) - { - /* Deal with values to the left of this node, - if they are possible. */ - if (!node_has_low_bound (node, index_type)) - { - probability - = conditional_probability (default_prob.apply_scale (1, 2), - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->low, LT_EXPR, - default_bb, probability, - phi_mapping); - default_prob = default_prob.apply_scale (1, 2); - } - - /* Value belongs to this node or to the right-hand subtree. */ - - probability - = conditional_probability (prob, subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, LE_EXPR, - node->case_bb, probability, - phi_mapping); - - bb = emit_case_nodes (bb, index, node->right, default_bb, - default_label, default_prob, index_type, - phi_mapping); - } - - else if (node->right == 0 && node->left != 0) - { - /* Deal with values to the right of this node, - if they are possible. */ - if (!node_has_high_bound (node, index_type)) - { - probability - = conditional_probability (default_prob.apply_scale (1, 2), - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, GT_EXPR, - default_bb, probability, - phi_mapping); - default_prob = default_prob.apply_scale (1, 2); - } - - /* Value belongs to this node or to the left-hand subtree. */ + /* Handle the left-hand subtree. */ + bb = emit_case_nodes (bb, index, node->left, default_bb, + default_label, default_prob, index_type, + phi_mapping); - probability - = conditional_probability (prob, subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->low, GE_EXPR, - node->case_bb, probability, - phi_mapping); - - bb = emit_case_nodes (bb, index, node->left, default_bb, - default_label, default_prob, index_type, - phi_mapping); - } - - else - { - /* Node has no children so we check low and high bounds to remove - redundant tests. Only one of the bounds can exist, - since otherwise this node is bounded--a case tested already. */ - bool high_bound = node_has_high_bound (node, index_type); - bool low_bound = node_has_low_bound (node, index_type); - - if (!high_bound && low_bound) - { - probability - = conditional_probability (default_prob, - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, GT_EXPR, - default_bb, probability, - phi_mapping); - } - - else if (!low_bound && high_bound) - { - probability - = conditional_probability (default_prob, - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->low, LT_EXPR, - default_bb, probability, - phi_mapping); - } - else if (!low_bound && !high_bound) - { - tree lhs, rhs; - generate_range_test (bb, index, node->low, node->high, - &lhs, &rhs); - probability - = conditional_probability (default_prob, - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, lhs, rhs, GT_EXPR, - default_bb, probability, - phi_mapping); - } + /* If the left-hand subtree fell through, + don't let it fall into the right-hand subtree. */ + if (default_bb) + emit_jump (bb, default_bb, phi_mapping); - emit_jump (bb, node->case_bb, phi_mapping); - return NULL; - } - } + bb = emit_case_nodes (test_bb, index, node->right, default_bb, + default_label, default_prob, index_type, + phi_mapping); return bb; } - -/* Search the parent sections of the case node tree - to see if a test for the lower bound of NODE would be redundant. - INDEX_TYPE is the type of the index expression. - - The instructions to generate the case decision tree are - output in the same order as nodes are processed so it is - known that if a parent node checks the range of the current - node minus one that the current node is bounded at its lower - span. Thus the test would be redundant. */ - -static bool -node_has_low_bound (case_node_ptr node, tree index_type) -{ - tree low_minus_one; - case_node_ptr pnode; - - /* If the lower bound of this node is the lowest value in the index type, - we need not test it. */ - - if (tree_int_cst_equal (node->low, TYPE_MIN_VALUE (index_type))) - return true; - - /* If this node has a left branch, the value at the left must be less - than that at this node, so it cannot be bounded at the bottom and - we need not bother testing any further. */ - - if (node->left) - return false; - - low_minus_one = fold_build2 (MINUS_EXPR, TREE_TYPE (node->low), node->low, - build_int_cst (TREE_TYPE (node->low), 1)); - - /* If the subtraction above overflowed, we can't verify anything. - Otherwise, look for a parent that tests our value - 1. */ - - if (!tree_int_cst_lt (low_minus_one, node->low)) - return false; - - for (pnode = node->parent; pnode; pnode = pnode->parent) - if (tree_int_cst_equal (low_minus_one, pnode->high)) - return true; - - return false; -} - -/* Search the parent sections of the case node tree - to see if a test for the upper bound of NODE would be redundant. - INDEX_TYPE is the type of the index expression. - - The instructions to generate the case decision tree are - output in the same order as nodes are processed so it is - known that if a parent node checks the range of the current - node plus one that the current node is bounded at its upper - span. Thus the test would be redundant. */ - -static bool -node_has_high_bound (case_node_ptr node, tree index_type) -{ - tree high_plus_one; - case_node_ptr pnode; - - /* If there is no upper bound, obviously no test is needed. */ - - if (TYPE_MAX_VALUE (index_type) == NULL) - return true; - - /* If the upper bound of this node is the highest value in the type - of the index expression, we need not test against it. */ - - if (tree_int_cst_equal (node->high, TYPE_MAX_VALUE (index_type))) - return true; - - /* If this node has a right branch, the value at the right must be greater - than that at this node, so it cannot be bounded at the top and - we need not bother testing any further. */ - - if (node->right) - return false; - - high_plus_one = fold_build2 (PLUS_EXPR, TREE_TYPE (node->high), node->high, - build_int_cst (TREE_TYPE (node->high), 1)); - - /* If the addition above overflowed, we can't verify anything. - Otherwise, look for a parent that tests our value + 1. */ - - if (!tree_int_cst_lt (node->high, high_plus_one)) - return false; - - for (pnode = node->parent; pnode; pnode = pnode->parent) - if (tree_int_cst_equal (high_plus_one, pnode->low)) - return true; - - return false; -} - -/* Search the parent sections of the - case node tree to see if both tests for the upper and lower - bounds of NODE would be redundant. */ - -static bool -node_is_bounded (case_node_ptr node, tree index_type) -{ - return (node_has_low_bound (node, index_type) - && node_has_high_bound (node, index_type)); -}
Hi Martin, > So the patch looks fine, only very very slightly binary is produced. I'm > going to install the patch so that > I can carry on more complex patches based on this one. it seems you didn't properly test the testsuite part: I see +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower "Removing basic block" +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower "loop depth 1, count 33333" +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump-not switchlower "Invalid sum" +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump-not switchlower "loop depth 1, count 33332" everywhere. The log has gcc.dg/tree-prof/update-loopch.c: dump file does not exist Obviously you forgot the adapt the dg-final* files for the dumpfile name. If I do, three of the failures go away, but FAIL: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower1 "Removing basic block" remains (on 32 and 64-bit Linux/x86_64). Please fix. Rainer
On 05/18/2018 03:55 PM, Rainer Orth wrote: > Hi Martin, > >> So the patch looks fine, only very very slightly binary is produced. I'm >> going to install the patch so that >> I can carry on more complex patches based on this one. > > it seems you didn't properly test the testsuite part: I see > > +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower "Removing basic block" > +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower "loop depth 1, count 33333" > +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump-not switchlower "Invalid sum" > +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump-not switchlower "loop depth 1, count 33332" > > everywhere. The log has > > gcc.dg/tree-prof/update-loopch.c: dump file does not exist > > Obviously you forgot the adapt the dg-final* files for the dumpfile > name. If I do, three of the failures go away, but > > FAIL: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower1 "Removing basic block" > > remains (on 32 and 64-bit Linux/x86_64). > > Please fix. > > Rainer > Thanks for opened eyes, following patch will fix that. It's quite obvious, I'll install it right after tests will finish. Martin From 7ae0c7d4a81166dbf5e9dff5d35e4c9d63b1c058 Mon Sep 17 00:00:00 2001 From: marxin <mliska@suse.cz> Date: Fri, 18 May 2018 16:17:57 +0200 Subject: [PATCH] Remove redundand pass pass_lower_switch. gcc/ChangeLog: 2018-05-18 Martin Liska <mliska@suse.cz> * passes.def: Remove a redundant pass. --- gcc/passes.def | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/gcc/passes.def b/gcc/passes.def index 050009464ea..055d354f959 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -399,9 +399,8 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_lower_vaarg); NEXT_PASS (pass_lower_vector); NEXT_PASS (pass_lower_complex_O0); - NEXT_PASS (pass_lower_switch_O0); NEXT_PASS (pass_sancov_O0); - NEXT_PASS (pass_lower_switch); + NEXT_PASS (pass_lower_switch_O0); NEXT_PASS (pass_asan_O0); NEXT_PASS (pass_tsan_O0); NEXT_PASS (pass_sanopt);
Hi Martin, > On 05/18/2018 03:55 PM, Rainer Orth wrote: >> Hi Martin, >> >>> So the patch looks fine, only very very slightly binary is produced. I'm >>> going to install the patch so that >>> I can carry on more complex patches based on this one. >> >> it seems you didn't properly test the testsuite part: I see >> >> +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower >> "Removing basic block" >> +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower >> "loop depth 1, count 33333" >> +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump-not >> switchlower "Invalid sum" >> +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump-not >> switchlower "loop depth 1, count 33332" >> >> everywhere. The log has >> >> gcc.dg/tree-prof/update-loopch.c: dump file does not exist >> >> Obviously you forgot the adapt the dg-final* files for the dumpfile >> name. If I do, three of the failures go away, but >> >> FAIL: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower1 >> "Removing basic block" >> >> remains (on 32 and 64-bit Linux/x86_64). >> >> Please fix. >> >> Rainer >> > > Thanks for opened eyes, following patch will fix that. > It's quite obvious, I'll install it right after tests will finish. unfortunately, it didn't fix either issue: * The switchlower -> switchlower1 renames in the dg-final* lines (attached) are still necessary to avoid the UNRESOLVED errors. Although obvious, I haven't installed them since ... * ... even so FAIL: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower1 "Removing basic block" remains. Rainer > From 7ae0c7d4a81166dbf5e9dff5d35e4c9d63b1c058 Mon Sep 17 00:00:00 2001 > From: marxin <mliska@suse.cz> > Date: Fri, 18 May 2018 16:17:57 +0200 > Subject: [PATCH] Remove redundand pass pass_lower_switch. > > gcc/ChangeLog: > > 2018-05-18 Martin Liska <mliska@suse.cz> > > * passes.def: Remove a redundant pass. > --- > gcc/passes.def | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/gcc/passes.def b/gcc/passes.def > index 050009464ea..055d354f959 100644 > --- a/gcc/passes.def > +++ b/gcc/passes.def > @@ -399,9 +399,8 @@ along with GCC; see the file COPYING3. If not see > NEXT_PASS (pass_lower_vaarg); > NEXT_PASS (pass_lower_vector); > NEXT_PASS (pass_lower_complex_O0); > - NEXT_PASS (pass_lower_switch_O0); > NEXT_PASS (pass_sancov_O0); > - NEXT_PASS (pass_lower_switch); > + NEXT_PASS (pass_lower_switch_O0); > NEXT_PASS (pass_asan_O0); > NEXT_PASS (pass_tsan_O0); > NEXT_PASS (pass_sanopt);
On 05/21/2018 01:18 PM, Rainer Orth wrote: > Hi Martin, > >> On 05/18/2018 03:55 PM, Rainer Orth wrote: >>> Hi Martin, >>> >>>> So the patch looks fine, only very very slightly binary is produced. I'm >>>> going to install the patch so that >>>> I can carry on more complex patches based on this one. >>> >>> it seems you didn't properly test the testsuite part: I see >>> >>> +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower >>> "Removing basic block" >>> +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower >>> "loop depth 1, count 33333" >>> +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump-not >>> switchlower "Invalid sum" >>> +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump-not >>> switchlower "loop depth 1, count 33332" >>> >>> everywhere. The log has >>> >>> gcc.dg/tree-prof/update-loopch.c: dump file does not exist >>> >>> Obviously you forgot the adapt the dg-final* files for the dumpfile >>> name. If I do, three of the failures go away, but >>> >>> FAIL: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower1 >>> "Removing basic block" >>> >>> remains (on 32 and 64-bit Linux/x86_64). >>> >>> Please fix. >>> >>> Rainer >>> >> >> Thanks for opened eyes, following patch will fix that. >> It's quite obvious, I'll install it right after tests will finish. > > unfortunately, it didn't fix either issue: > > * The switchlower -> switchlower1 renames in the dg-final* lines > (attached) are still necessary to avoid the UNRESOLVED errors. > Although obvious, I haven't installed them since ... > > * ... even so > > FAIL: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower1 "Removing basic block" > > remains. > > Rainer Hi. You are right, it's using -O2, thus your patch is right. Please install the patch after testing. It's obvious fix. Martin > > >> From 7ae0c7d4a81166dbf5e9dff5d35e4c9d63b1c058 Mon Sep 17 00:00:00 2001 >> From: marxin <mliska@suse.cz> >> Date: Fri, 18 May 2018 16:17:57 +0200 >> Subject: [PATCH] Remove redundand pass pass_lower_switch. >> >> gcc/ChangeLog: >> >> 2018-05-18 Martin Liska <mliska@suse.cz> >> >> * passes.def: Remove a redundant pass. >> --- >> gcc/passes.def | 3 +-- >> 1 file changed, 1 insertion(+), 2 deletions(-) >> >> diff --git a/gcc/passes.def b/gcc/passes.def >> index 050009464ea..055d354f959 100644 >> --- a/gcc/passes.def >> +++ b/gcc/passes.def >> @@ -399,9 +399,8 @@ along with GCC; see the file COPYING3. If not see >> NEXT_PASS (pass_lower_vaarg); >> NEXT_PASS (pass_lower_vector); >> NEXT_PASS (pass_lower_complex_O0); >> - NEXT_PASS (pass_lower_switch_O0); >> NEXT_PASS (pass_sancov_O0); >> - NEXT_PASS (pass_lower_switch); >> + NEXT_PASS (pass_lower_switch_O0); >> NEXT_PASS (pass_asan_O0); >> NEXT_PASS (pass_tsan_O0); >> NEXT_PASS (pass_sanopt); >
Hi Martin, >>> Thanks for opened eyes, following patch will fix that. >>> It's quite obvious, I'll install it right after tests will finish. >> >> unfortunately, it didn't fix either issue: >> >> * The switchlower -> switchlower1 renames in the dg-final* lines >> (attached) are still necessary to avoid the UNRESOLVED errors. >> Although obvious, I haven't installed them since ... >> >> * ... even so >> >> FAIL: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower1 "Removing basic block" >> >> remains. [...] > You are right, it's using -O2, thus your patch is right. Please install the > patch > after testing. It's obvious fix. But what about the remaining FAIL? Rainer
On 21/05/18 15:00, Rainer Orth wrote: > Hi Martin, > >>>> Thanks for opened eyes, following patch will fix that. >>>> It's quite obvious, I'll install it right after tests will finish. >>> >>> unfortunately, it didn't fix either issue: >>> >>> * The switchlower -> switchlower1 renames in the dg-final* lines >>> (attached) are still necessary to avoid the UNRESOLVED errors. >>> Although obvious, I haven't installed them since ... >>> >>> * ... even so >>> >>> FAIL: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower1 "Removing basic block" >>> >>> remains. > [...] >> You are right, it's using -O2, thus your patch is right. Please install the >> patch >> after testing. It's obvious fix. > > But what about the remaining FAIL? > Sorry to add to this, but I have also observed the following failures on aarch64-none-elf, aarch64-none-linux-gnu and aarch64_be-none-elf targets bisected to this commit: FAIL: gcc.dg/sancov/cmp0.c -O0 scan-tree-dump-times optimized "__builtin___sanitizer_cov_trace_const_cmp" 7 FAIL: gcc.dg/sancov/cmp0.c -O0 scan-tree-dump-times optimized "__builtin___sanitizer_cov_trace_switch \\(" 2 FAIL: gcc.dg/sancov/cmp0.c -O0 -g scan-tree-dump-times optimized "__builtin___sanitizer_cov_trace_const_cmp" 7 FAIL: gcc.dg/sancov/cmp0.c -O0 -g scan-tree-dump-times optimized "__builtin___sanitizer_cov_trace_switch \\(" 2 FAIL: gcc.dg/tree-ssa/pr77445-2.c scan-tree-dump-not thread3 "not considered" FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump-not vrp2 "Jumps threaded" Sudi > Rainer >
Hi Martin, > On 05/21/2018 01:18 PM, Rainer Orth wrote: >> Hi Martin, >> >>> On 05/18/2018 03:55 PM, Rainer Orth wrote: >>>> Hi Martin, >>>> >>>>> So the patch looks fine, only very very slightly binary is produced. I'm >>>>> going to install the patch so that >>>>> I can carry on more complex patches based on this one. >>>> >>>> it seems you didn't properly test the testsuite part: I see >>>> >>>> +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower >>>> "Removing basic block" >>>> +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower >>>> "loop depth 1, count 33333" >>>> +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump-not >>>> switchlower "Invalid sum" >>>> +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump-not >>>> switchlower "loop depth 1, count 33332" >>>> >>>> everywhere. The log has >>>> >>>> gcc.dg/tree-prof/update-loopch.c: dump file does not exist >>>> >>>> Obviously you forgot the adapt the dg-final* files for the dumpfile >>>> name. If I do, three of the failures go away, but >>>> >>>> FAIL: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower1 >>>> "Removing basic block" >>>> >>>> remains (on 32 and 64-bit Linux/x86_64). >>>> >>>> Please fix. >>>> >>>> Rainer >>>> >>> >>> Thanks for opened eyes, following patch will fix that. >>> It's quite obvious, I'll install it right after tests will finish. >> >> unfortunately, it didn't fix either issue: >> >> * The switchlower -> switchlower1 renames in the dg-final* lines >> (attached) are still necessary to avoid the UNRESOLVED errors. >> Although obvious, I haven't installed them since ... >> >> * ... even so >> >> FAIL: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower1 "Removing basic block" >> >> remains. >> >> Rainer > > Hi. > > You are right, it's using -O2, thus your patch is right. Please install the > patch > after testing. It's obvious fix. I've now installed the fix for the dumpfile renaming. However, you've still done nothing about the remaining failure. Rainer
On 05/21/2018 04:42 PM, Sudakshina Das wrote: > On 21/05/18 15:00, Rainer Orth wrote: >> Hi Martin, >> >>>>> Thanks for opened eyes, following patch will fix that. >>>>> It's quite obvious, I'll install it right after tests will finish. >>>> >>>> unfortunately, it didn't fix either issue: >>>> >>>> * The switchlower -> switchlower1 renames in the dg-final* lines >>>> (attached) are still necessary to avoid the UNRESOLVED errors. >>>> Although obvious, I haven't installed them since ... >>>> >>>> * ... even so >>>> >>>> FAIL: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower1 "Removing basic block" >>>> >>>> remains. >> [...] >>> You are right, it's using -O2, thus your patch is right. Please install the >>> patch >>> after testing. It's obvious fix. >> >> But what about the remaining FAIL? >> > > Sorry to add to this, but I have also observed the following failures on aarch64-none-elf, aarch64-none-linux-gnu and aarch64_be-none-elf targets bisected to this commit: > > FAIL: gcc.dg/sancov/cmp0.c -O0 scan-tree-dump-times optimized "__builtin___sanitizer_cov_trace_const_cmp" 7 > > FAIL: gcc.dg/sancov/cmp0.c -O0 scan-tree-dump-times optimized "__builtin___sanitizer_cov_trace_switch \\(" 2 > > FAIL: gcc.dg/sancov/cmp0.c -O0 -g scan-tree-dump-times optimized "__builtin___sanitizer_cov_trace_const_cmp" 7 > > FAIL: gcc.dg/sancov/cmp0.c -O0 -g scan-tree-dump-times optimized "__builtin___sanitizer_cov_trace_switch \\(" 2 > > FAIL: gcc.dg/tree-ssa/pr77445-2.c scan-tree-dump-not thread3 "not considered" > > FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump-not vrp2 "Jumps threaded" > > Sudi Hi. Sorry for the breakage, I'll fix it in couple of days. Martin > >> Rainer >> >
On 05/24/2018 02:28 PM, Rainer Orth wrote: > Hi Martin, > >> On 05/21/2018 01:18 PM, Rainer Orth wrote: >>> Hi Martin, >>> >>>> On 05/18/2018 03:55 PM, Rainer Orth wrote: >>>>> Hi Martin, >>>>> >>>>>> So the patch looks fine, only very very slightly binary is produced. I'm >>>>>> going to install the patch so that >>>>>> I can carry on more complex patches based on this one. >>>>> >>>>> it seems you didn't properly test the testsuite part: I see >>>>> >>>>> +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower >>>>> "Removing basic block" >>>>> +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower >>>>> "loop depth 1, count 33333" >>>>> +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump-not >>>>> switchlower "Invalid sum" >>>>> +UNRESOLVED: gcc.dg/tree-prof/update-loopch.c scan-tree-dump-not >>>>> switchlower "loop depth 1, count 33332" >>>>> >>>>> everywhere. The log has >>>>> >>>>> gcc.dg/tree-prof/update-loopch.c: dump file does not exist >>>>> >>>>> Obviously you forgot the adapt the dg-final* files for the dumpfile >>>>> name. If I do, three of the failures go away, but >>>>> >>>>> FAIL: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower1 >>>>> "Removing basic block" >>>>> >>>>> remains (on 32 and 64-bit Linux/x86_64). >>>>> >>>>> Please fix. >>>>> >>>>> Rainer >>>>> >>>> >>>> Thanks for opened eyes, following patch will fix that. >>>> It's quite obvious, I'll install it right after tests will finish. >>> >>> unfortunately, it didn't fix either issue: >>> >>> * The switchlower -> switchlower1 renames in the dg-final* lines >>> (attached) are still necessary to avoid the UNRESOLVED errors. >>> Although obvious, I haven't installed them since ... >>> >>> * ... even so >>> >>> FAIL: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower1 "Removing basic block" >>> >>> remains. >>> >>> Rainer >> >> Hi. >> >> You are right, it's using -O2, thus your patch is right. Please install the >> patch >> after testing. It's obvious fix. > > I've now installed the fix for the dumpfile renaming. However, you've > still done nothing about the remaining failure. Thanks. Is the last remaining one: gcc.dg/tree-prof/update-loopch.c? Thanks, Martin > > Rainer >
On 05/21/2018 04:42 PM, Sudakshina Das wrote: > On 21/05/18 15:00, Rainer Orth wrote: >> Hi Martin, >> >>>>> Thanks for opened eyes, following patch will fix that. >>>>> It's quite obvious, I'll install it right after tests will finish. >>>> >>>> unfortunately, it didn't fix either issue: >>>> >>>> * The switchlower -> switchlower1 renames in the dg-final* lines >>>> (attached) are still necessary to avoid the UNRESOLVED errors. >>>> Although obvious, I haven't installed them since ... >>>> >>>> * ... even so >>>> >>>> FAIL: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower1 "Removing basic block" >>>> >>>> remains. >> [...] >>> You are right, it's using -O2, thus your patch is right. Please install the >>> patch >>> after testing. It's obvious fix. >> >> But what about the remaining FAIL? >> > > Sorry to add to this, but I have also observed the following failures on aarch64-none-elf, aarch64-none-linux-gnu and aarch64_be-none-elf targets bisected to this commit: > > FAIL: gcc.dg/sancov/cmp0.c -O0 scan-tree-dump-times optimized "__builtin___sanitizer_cov_trace_const_cmp" 7 > > FAIL: gcc.dg/sancov/cmp0.c -O0 scan-tree-dump-times optimized "__builtin___sanitizer_cov_trace_switch \\(" 2 > > FAIL: gcc.dg/sancov/cmp0.c -O0 -g scan-tree-dump-times optimized "__builtin___sanitizer_cov_trace_const_cmp" 7 > > FAIL: gcc.dg/sancov/cmp0.c -O0 -g scan-tree-dump-times optimized "__builtin___sanitizer_cov_trace_switch \\(" 2 Hi. I've just tested sancov tests on my aarch64 and cmp0.c looks fine. Can you please tell me which -march, -mtune does your board have? > > FAIL: gcc.dg/tree-ssa/pr77445-2.c scan-tree-dump-not thread3 "not considered" > > FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump-not vrp2 "Jumps threaded" I can confirm these 2. It's kind of expected, I will clean it up before next release. Jeff is aware of that.. Martin > > Sudi > >> Rainer >> >
Hi Martin On 25/05/18 10:45, Martin Liška wrote: > On 05/21/2018 04:42 PM, Sudakshina Das wrote: >> On 21/05/18 15:00, Rainer Orth wrote: >>> Hi Martin, >>> >>>>>> Thanks for opened eyes, following patch will fix that. >>>>>> It's quite obvious, I'll install it right after tests will finish. >>>>> >>>>> unfortunately, it didn't fix either issue: >>>>> >>>>> * The switchlower -> switchlower1 renames in the dg-final* lines >>>>> (attached) are still necessary to avoid the UNRESOLVED errors. >>>>> Although obvious, I haven't installed them since ... >>>>> >>>>> * ... even so >>>>> >>>>> FAIL: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower1 "Removing basic block" >>>>> >>>>> remains. >>> [...] >>>> You are right, it's using -O2, thus your patch is right. Please install the >>>> patch >>>> after testing. It's obvious fix. >>> >>> But what about the remaining FAIL? >>> >> >> Sorry to add to this, but I have also observed the following failures on aarch64-none-elf, aarch64-none-linux-gnu and aarch64_be-none-elf targets bisected to this commit: >> >> FAIL: gcc.dg/sancov/cmp0.c -O0 scan-tree-dump-times optimized "__builtin___sanitizer_cov_trace_const_cmp" 7 >> >> FAIL: gcc.dg/sancov/cmp0.c -O0 scan-tree-dump-times optimized "__builtin___sanitizer_cov_trace_switch \\(" 2 >> >> FAIL: gcc.dg/sancov/cmp0.c -O0 -g scan-tree-dump-times optimized "__builtin___sanitizer_cov_trace_const_cmp" 7 >> >> FAIL: gcc.dg/sancov/cmp0.c -O0 -g scan-tree-dump-times optimized "__builtin___sanitizer_cov_trace_switch \\(" 2 > > Hi. > > I've just tested sancov tests on my aarch64 and cmp0.c looks fine. Can you please tell me which -march, -mtune does > your board have? > >> >> FAIL: gcc.dg/tree-ssa/pr77445-2.c scan-tree-dump-not thread3 "not considered" >> >> FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump-not vrp2 "Jumps threaded" > > I can confirm these 2. It's kind of expected, I will clean it up before next release. Jeff is aware > of that.. > > Martin > From my today's build, I only see the following remaining now: FAIL: gcc.dg/tree-prof/update-loopch.c scan-tree-dump switchlower1 "Removing basic block" FAIL: gcc.dg/tree-ssa/pr77445-2.c scan-tree-dump-not thread3 "not considered" FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump-not vrp2 "Jumps threaded" Sudi >> >> Sudi >> >>> Rainer >>> >> >
diff --git a/gcc/passes.def b/gcc/passes.def index 00e75d2b55a..bb371d9bde5 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -314,6 +314,7 @@ along with GCC; see the file COPYING3. If not see POP_INSERT_PASSES () NEXT_PASS (pass_simduid_cleanup); NEXT_PASS (pass_lower_vector_ssa); + NEXT_PASS (pass_lower_switch); NEXT_PASS (pass_cse_reciprocals); NEXT_PASS (pass_sprintf_length, true); NEXT_PASS (pass_reassoc, false /* insert_powi_p */); @@ -358,6 +359,7 @@ along with GCC; see the file COPYING3. If not see /* Lower remaining pieces of GIMPLE. */ NEXT_PASS (pass_lower_complex); NEXT_PASS (pass_lower_vector_ssa); + NEXT_PASS (pass_lower_switch); /* Perform simple scalar cleanup which is constant/copy propagation. */ NEXT_PASS (pass_ccp, true /* nonzero_p */); NEXT_PASS (pass_post_ipa_warn); @@ -393,7 +395,7 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_lower_vaarg); NEXT_PASS (pass_lower_vector); NEXT_PASS (pass_lower_complex_O0); - NEXT_PASS (pass_lower_switch); + NEXT_PASS (pass_lower_switch_O0); NEXT_PASS (pass_sancov_O0); NEXT_PASS (pass_asan_O0); NEXT_PASS (pass_tsan_O0); diff --git a/gcc/testsuite/gcc.dg/tree-prof/update-loopch.c b/gcc/testsuite/gcc.dg/tree-prof/update-loopch.c index 73efc878ec0..15baada1081 100644 --- a/gcc/testsuite/gcc.dg/tree-prof/update-loopch.c +++ b/gcc/testsuite/gcc.dg/tree-prof/update-loopch.c @@ -1,4 +1,4 @@ -/* { dg-options "-O2 -fdump-ipa-profile-blocks-details -fdump-tree-switchlower-blocks-details" } */ +/* { dg-options "-O2 -fdump-ipa-profile-blocks-details -fdump-tree-switchlower1-blocks-details" } */ int max = 33333; int a[8]; int diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp104.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp104.c index 0a952267b29..71fa3bfa2ca 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/vrp104.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp104.c @@ -2,7 +2,7 @@ /* { dg-options "-O2 -fdump-tree-switchlower" } */ /* We scan for 2 switches as the dump file reports a transformation, IL really contains just a single. */ -/* { dg-final { scan-tree-dump-times "switch" 2 "switchlower" } } */ +/* { dg-final { scan-tree-dump-times "switch" 2 "switchlower1" } } */ void foo (void); void bar (void); diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 9f76d822abc..6ae65765431 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -410,6 +410,7 @@ extern gimple_opt_pass *make_pass_strip_predict_hints (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_complex_O0 (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_complex (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_switch (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_lower_switch_O0 (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_vector (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_vector_ssa (gcc::context *ctxt); extern gimple_opt_pass *make_pass_lower_omp (gcc::context *ctxt); diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c index 6d7c2c4902f..f67531b2620 100644 --- a/gcc/tree-switch-conversion.c +++ b/gcc/tree-switch-conversion.c @@ -1685,9 +1685,6 @@ typedef case_node *case_node_ptr; static basic_block emit_case_nodes (basic_block, tree, case_node_ptr, basic_block, tree, profile_probability, tree, hash_map<tree, tree> *); -static bool node_has_low_bound (case_node_ptr, tree); -static bool node_has_high_bound (case_node_ptr, tree); -static bool node_is_bounded (case_node_ptr, tree); /* Return the smallest number of different values for which it is best to use a jump-table instead of a tree of conditional branches. */ @@ -2162,10 +2159,31 @@ try_switch_expansion (gswitch *stmt) namespace { -const pass_data pass_data_lower_switch = +template <bool O0> class pass_lower_switch: public gimple_opt_pass { - GIMPLE_PASS, /* type */ - "switchlower", /* name */ +public: + pass_lower_switch (gcc::context *ctxt) : gimple_opt_pass (data, ctxt) {} + + static const pass_data data; + opt_pass * + clone () + { + return new pass_lower_switch<O0> (m_ctxt); + } + + virtual bool + gate (function *) + { + return !O0 || !optimize; + } + + virtual unsigned int execute (function *fun); +}; // class pass_lower_switch + +template <bool O0> +const pass_data pass_lower_switch<O0>::data = { + GIMPLE_PASS, /* type */ + O0 ? "switchlower_O0" : "switchlower", /* name */ OPTGROUP_NONE, /* optinfo_flags */ TV_TREE_SWITCH_LOWERING, /* tv_id */ ( PROP_cfg | PROP_ssa ), /* properties_required */ @@ -2175,21 +2193,9 @@ const pass_data pass_data_lower_switch = TODO_update_ssa | TODO_cleanup_cfg, /* todo_flags_finish */ }; -class pass_lower_switch : public gimple_opt_pass -{ -public: - pass_lower_switch (gcc::context *ctxt) - : gimple_opt_pass (pass_data_lower_switch, ctxt) - {} - - /* opt_pass methods: */ - virtual bool gate (function *) { return true; } - virtual unsigned int execute (function *); - -}; // class pass_lower_switch - +template <bool O0> unsigned int -pass_lower_switch::execute (function *fun) +pass_lower_switch<O0>::execute (function *fun) { basic_block bb; bool expanded = false; @@ -2227,33 +2233,14 @@ pass_lower_switch::execute (function *fun) } // anon namespace gimple_opt_pass * -make_pass_lower_switch (gcc::context *ctxt) +make_pass_lower_switch_O0 (gcc::context *ctxt) { - return new pass_lower_switch (ctxt); + return new pass_lower_switch<true> (ctxt); } - -/* Generate code to jump to LABEL if OP0 and OP1 are equal in mode MODE. - PROB is the probability of jumping to LABEL. */ -static basic_block -do_jump_if_equal (basic_block bb, tree op0, tree op1, basic_block label_bb, - profile_probability prob, hash_map<tree, tree> *phi_mapping) +gimple_opt_pass * +make_pass_lower_switch (gcc::context *ctxt) { - gcond *cond = gimple_build_cond (EQ_EXPR, op0, op1, NULL_TREE, NULL_TREE); - gimple_stmt_iterator gsi = gsi_last_bb (bb); - gsi_insert_before (&gsi, cond, GSI_SAME_STMT); - - gcc_assert (single_succ_p (bb)); - - /* Make a new basic block where false branch will take place. */ - edge false_edge = split_block (bb, cond); - false_edge->flags = EDGE_FALSE_VALUE; - false_edge->probability = prob.invert (); - - edge true_edge = make_edge (bb, label_bb, EDGE_TRUE_VALUE); - fix_phi_operands_for_edge (true_edge, phi_mapping); - true_edge->probability = prob; - - return false_edge->dest; + return new pass_lower_switch<false> (ctxt); } /* Generate code to compare X with Y so that the condition codes are @@ -2316,28 +2303,7 @@ conditional_probability (profile_probability target_prob, /* Emit step-by-step code to select a case for the value of INDEX. The thus generated decision tree follows the form of the case-node binary tree NODE, whose nodes represent test conditions. - INDEX_TYPE is the type of the index of the switch. - - Care is taken to prune redundant tests from the decision tree - by detecting any boundary conditions already checked by - emitted rtx. (See node_has_high_bound, node_has_low_bound - and node_is_bounded, above.) - - Where the test conditions can be shown to be redundant we emit - an unconditional jump to the target code. As a further - optimization, the subordinates of a tree node are examined to - check for bounded nodes. In this case conditional and/or - unconditional jumps as a result of the boundary check for the - current node are arranged to target the subordinates associated - code for out of bound conditions on the current node. - - We can assume that when control reaches the code generated here, - the index value has already been compared with the parents - of this node, and determined to be on the same side of each parent - as this node is. Thus, if this node tests for the value 51, - and a parent tested for 52, we don't need to consider - the possibility of a value greater than 51. If another parent - tests for the value 50, then this node need not test anything. */ + INDEX_TYPE is the type of the index of the switch. */ static basic_block emit_case_nodes (basic_block bb, tree index, case_node_ptr node, @@ -2345,482 +2311,44 @@ emit_case_nodes (basic_block bb, tree index, case_node_ptr node, profile_probability default_prob, tree index_type, hash_map<tree, tree> *phi_mapping) { - /* If INDEX has an unsigned type, we must make unsigned branches. */ - profile_probability probability; - profile_probability prob = node->prob, subtree_prob = node->subtree_prob; - - /* See if our parents have already tested everything for us. - If they have, emit an unconditional jump for this node. */ - if (node_is_bounded (node, index_type)) - { - emit_jump (bb, node->case_bb, phi_mapping); - return NULL; - } - - else if (tree_int_cst_equal (node->low, node->high)) - { - probability = conditional_probability (prob, subtree_prob + default_prob); - /* Node is single valued. First see if the index expression matches - this node and then check our children, if any. */ - bb = do_jump_if_equal (bb, index, node->low, node->case_bb, probability, - phi_mapping); - /* Since this case is taken at this point, reduce its weight from - subtree_weight. */ - subtree_prob -= prob; - if (node->right != 0 && node->left != 0) - { - /* This node has children on both sides. - Dispatch to one side or the other - by comparing the index value with this node's value. - If one subtree is bounded, check that one first, - so we can avoid real branches in the tree. */ - - if (node_is_bounded (node->right, index_type)) - { - probability - = conditional_probability (node->right->prob, - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, GT_EXPR, - node->right->case_bb, probability, - phi_mapping); - bb = emit_case_nodes (bb, index, node->left, default_bb, - default_label, default_prob, index_type, - phi_mapping); - } - - else if (node_is_bounded (node->left, index_type)) - { - probability - = conditional_probability (node->left->prob, - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, LT_EXPR, - node->left->case_bb, probability, - phi_mapping); - bb = emit_case_nodes (bb, index, node->right, default_bb, - default_label, default_prob, index_type, - phi_mapping); - } - - /* If both children are single-valued cases with no - children, finish up all the work. This way, we can save - one ordered comparison. */ - else if (tree_int_cst_equal (node->right->low, node->right->high) - && node->right->left == 0 && node->right->right == 0 - && tree_int_cst_equal (node->left->low, node->left->high) - && node->left->left == 0 && node->left->right == 0) - { - /* Neither node is bounded. First distinguish the two sides; - then emit the code for one side at a time. */ - - /* See if the value matches what the right hand side - wants. */ - probability - = conditional_probability (node->right->prob, - subtree_prob + default_prob); - bb = do_jump_if_equal (bb, index, node->right->low, - node->right->case_bb, probability, - phi_mapping); - - /* See if the value matches what the left hand side - wants. */ - probability - = conditional_probability (node->left->prob, - subtree_prob + default_prob); - bb = do_jump_if_equal (bb, index, node->left->low, - node->left->case_bb, probability, - phi_mapping); - } - - else - { - /* Neither node is bounded. First distinguish the two sides; - then emit the code for one side at a time. */ - - basic_block test_bb = split_edge (single_succ_edge (bb)); - redirect_edge_succ (single_pred_edge (test_bb), - single_succ_edge (bb)->dest); - - /* The default label could be reached either through the right - subtree or the left subtree. Divide the probability - equally. */ - probability - = conditional_probability (node->right->subtree_prob - + default_prob.apply_scale (1, 2), - subtree_prob + default_prob); - /* See if the value is on the right. */ - bb = emit_cmp_and_jump_insns (bb, index, node->high, GT_EXPR, - test_bb, probability, phi_mapping); - default_prob = default_prob.apply_scale (1, 2); - - /* Value must be on the left. - Handle the left-hand subtree. */ - bb = emit_case_nodes (bb, index, node->left, default_bb, - default_label, default_prob, index_type, - phi_mapping); - /* If left-hand subtree does nothing, - go to default. */ - - if (bb && default_bb) - emit_jump (bb, default_bb, phi_mapping); - - /* Code branches here for the right-hand subtree. */ - bb = emit_case_nodes (test_bb, index, node->right, default_bb, - default_label, default_prob, index_type, - phi_mapping); - } - } - else if (node->right != 0 && node->left == 0) - { - /* Here we have a right child but no left so we issue a conditional - branch to default and process the right child. - - Omit the conditional branch to default if the right child - does not have any children and is single valued; it would - cost too much space to save so little time. */ - - if (node->right->right || node->right->left - || !tree_int_cst_equal (node->right->low, node->right->high)) - { - if (!node_has_low_bound (node, index_type)) - { - probability - = conditional_probability (default_prob.apply_scale (1, 2), - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, LT_EXPR, - default_bb, probability, - phi_mapping); - default_prob = default_prob.apply_scale (1, 2); - } - - bb = emit_case_nodes (bb, index, node->right, default_bb, - default_label, default_prob, index_type, - phi_mapping); - } - else - { - probability - = conditional_probability (node->right->subtree_prob, - subtree_prob + default_prob); - /* We cannot process node->right normally - since we haven't ruled out the numbers less than - this node's value. So handle node->right explicitly. */ - bb = do_jump_if_equal (bb, index, node->right->low, - node->right->case_bb, probability, - phi_mapping); - } - } - - else if (node->right == 0 && node->left != 0) - { - /* Just one subtree, on the left. */ - if (node->left->left || node->left->right - || !tree_int_cst_equal (node->left->low, node->left->high)) - { - if (!node_has_high_bound (node, index_type)) - { - probability - = conditional_probability (default_prob.apply_scale (1, 2), - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, GT_EXPR, - default_bb, probability, - phi_mapping); - default_prob = default_prob.apply_scale (1, 2); - } - - bb = emit_case_nodes (bb, index, node->left, default_bb, - default_label, default_prob, index_type, - phi_mapping); - } - else - { - probability - = conditional_probability (node->left->subtree_prob, - subtree_prob + default_prob); - /* We cannot process node->left normally - since we haven't ruled out the numbers less than - this node's value. So handle node->left explicitly. */ - do_jump_if_equal (bb, index, node->left->low, node->left->case_bb, - probability, phi_mapping); - } - } - } - else - { - /* Node is a range. These cases are very similar to those for a single - value, except that we do not start by testing whether this node - is the one to branch to. */ - - if (node->right != 0 && node->left != 0) - { - /* Node has subtrees on both sides. - If the right-hand subtree is bounded, - test for it first, since we can go straight there. - Otherwise, we need to make a branch in the control structure, - then handle the two subtrees. */ - basic_block test_bb = NULL; - - if (node_is_bounded (node->right, index_type)) - { - /* Right hand node is fully bounded so we can eliminate any - testing and branch directly to the target code. */ - probability - = conditional_probability (node->right->subtree_prob, - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, GT_EXPR, - node->right->case_bb, probability, - phi_mapping); - } - else - { - /* Right hand node requires testing. - Branch to a label where we will handle it later. */ - - test_bb = split_edge (single_succ_edge (bb)); - redirect_edge_succ (single_pred_edge (test_bb), - single_succ_edge (bb)->dest); - - probability - = conditional_probability (node->right->subtree_prob - + default_prob.apply_scale (1, 2), - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, GT_EXPR, - test_bb, probability, phi_mapping); - default_prob = default_prob.apply_scale (1, 2); - } - - /* Value belongs to this node or to the left-hand subtree. */ - - probability - = conditional_probability (prob, subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->low, GE_EXPR, - node->case_bb, probability, - phi_mapping); - - /* Handle the left-hand subtree. */ - bb = emit_case_nodes (bb, index, node->left, default_bb, - default_label, default_prob, index_type, + /* If node is null, we are done. */ + if (node == NULL) + return bb; + + /* Branch to a label where we will handle it later. */ + basic_block test_bb = split_edge (single_succ_edge (bb)); + redirect_edge_succ (single_pred_edge (test_bb), + single_succ_edge (bb)->dest); + + profile_probability probability + = node->right ? node->right->subtree_prob : profile_probability::never (); + probability + = conditional_probability (probability + default_prob.apply_scale (1, 2), + node->subtree_prob + default_prob); + bb = emit_cmp_and_jump_insns (bb, index, node->high, GT_EXPR, + test_bb, probability, phi_mapping); + default_prob = default_prob.apply_scale (1, 2); + + /* Value belongs to this node or to the left-hand subtree. */ + probability + = conditional_probability (node->prob, node->subtree_prob + default_prob); + bb = emit_cmp_and_jump_insns (bb, index, node->low, GE_EXPR, + node->case_bb, probability, phi_mapping); - /* If right node had to be handled later, do that now. */ - if (test_bb) - { - /* If the left-hand subtree fell through, - don't let it fall into the right-hand subtree. */ - if (bb && default_bb) - emit_jump (bb, default_bb, phi_mapping); - - bb = emit_case_nodes (test_bb, index, node->right, default_bb, - default_label, default_prob, index_type, - phi_mapping); - } - } - - else if (node->right != 0 && node->left == 0) - { - /* Deal with values to the left of this node, - if they are possible. */ - if (!node_has_low_bound (node, index_type)) - { - probability - = conditional_probability (default_prob.apply_scale (1, 2), - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->low, LT_EXPR, - default_bb, probability, - phi_mapping); - default_prob = default_prob.apply_scale (1, 2); - } - - /* Value belongs to this node or to the right-hand subtree. */ - - probability - = conditional_probability (prob, subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, LE_EXPR, - node->case_bb, probability, - phi_mapping); - - bb = emit_case_nodes (bb, index, node->right, default_bb, - default_label, default_prob, index_type, - phi_mapping); - } - - else if (node->right == 0 && node->left != 0) - { - /* Deal with values to the right of this node, - if they are possible. */ - if (!node_has_high_bound (node, index_type)) - { - probability - = conditional_probability (default_prob.apply_scale (1, 2), - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, GT_EXPR, - default_bb, probability, - phi_mapping); - default_prob = default_prob.apply_scale (1, 2); - } - - /* Value belongs to this node or to the left-hand subtree. */ + /* Handle the left-hand subtree. */ + bb = emit_case_nodes (bb, index, node->left, default_bb, + default_label, default_prob, index_type, + phi_mapping); - probability - = conditional_probability (prob, subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->low, GE_EXPR, - node->case_bb, probability, - phi_mapping); - - bb = emit_case_nodes (bb, index, node->left, default_bb, - default_label, default_prob, index_type, - phi_mapping); - } - - else - { - /* Node has no children so we check low and high bounds to remove - redundant tests. Only one of the bounds can exist, - since otherwise this node is bounded--a case tested already. */ - bool high_bound = node_has_high_bound (node, index_type); - bool low_bound = node_has_low_bound (node, index_type); - - if (!high_bound && low_bound) - { - probability - = conditional_probability (default_prob, - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->high, GT_EXPR, - default_bb, probability, - phi_mapping); - } - - else if (!low_bound && high_bound) - { - probability - = conditional_probability (default_prob, - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, index, node->low, LT_EXPR, - default_bb, probability, - phi_mapping); - } - else if (!low_bound && !high_bound) - { - tree lhs, rhs; - generate_range_test (bb, index, node->low, node->high, - &lhs, &rhs); - probability - = conditional_probability (default_prob, - subtree_prob + default_prob); - bb = emit_cmp_and_jump_insns (bb, lhs, rhs, GT_EXPR, - default_bb, probability, - phi_mapping); - } + /* If the left-hand subtree fell through, + don't let it fall into the right-hand subtree. */ + if (default_bb) + emit_jump (bb, default_bb, phi_mapping); - emit_jump (bb, node->case_bb, phi_mapping); - return NULL; - } - } + bb = emit_case_nodes (test_bb, index, node->right, default_bb, + default_label, default_prob, index_type, + phi_mapping); return bb; } - -/* Search the parent sections of the case node tree - to see if a test for the lower bound of NODE would be redundant. - INDEX_TYPE is the type of the index expression. - - The instructions to generate the case decision tree are - output in the same order as nodes are processed so it is - known that if a parent node checks the range of the current - node minus one that the current node is bounded at its lower - span. Thus the test would be redundant. */ - -static bool -node_has_low_bound (case_node_ptr node, tree index_type) -{ - tree low_minus_one; - case_node_ptr pnode; - - /* If the lower bound of this node is the lowest value in the index type, - we need not test it. */ - - if (tree_int_cst_equal (node->low, TYPE_MIN_VALUE (index_type))) - return true; - - /* If this node has a left branch, the value at the left must be less - than that at this node, so it cannot be bounded at the bottom and - we need not bother testing any further. */ - - if (node->left) - return false; - - low_minus_one = fold_build2 (MINUS_EXPR, TREE_TYPE (node->low), node->low, - build_int_cst (TREE_TYPE (node->low), 1)); - - /* If the subtraction above overflowed, we can't verify anything. - Otherwise, look for a parent that tests our value - 1. */ - - if (!tree_int_cst_lt (low_minus_one, node->low)) - return false; - - for (pnode = node->parent; pnode; pnode = pnode->parent) - if (tree_int_cst_equal (low_minus_one, pnode->high)) - return true; - - return false; -} - -/* Search the parent sections of the case node tree - to see if a test for the upper bound of NODE would be redundant. - INDEX_TYPE is the type of the index expression. - - The instructions to generate the case decision tree are - output in the same order as nodes are processed so it is - known that if a parent node checks the range of the current - node plus one that the current node is bounded at its upper - span. Thus the test would be redundant. */ - -static bool -node_has_high_bound (case_node_ptr node, tree index_type) -{ - tree high_plus_one; - case_node_ptr pnode; - - /* If there is no upper bound, obviously no test is needed. */ - - if (TYPE_MAX_VALUE (index_type) == NULL) - return true; - - /* If the upper bound of this node is the highest value in the type - of the index expression, we need not test against it. */ - - if (tree_int_cst_equal (node->high, TYPE_MAX_VALUE (index_type))) - return true; - - /* If this node has a right branch, the value at the right must be greater - than that at this node, so it cannot be bounded at the top and - we need not bother testing any further. */ - - if (node->right) - return false; - - high_plus_one = fold_build2 (PLUS_EXPR, TREE_TYPE (node->high), node->high, - build_int_cst (TREE_TYPE (node->high), 1)); - - /* If the addition above overflowed, we can't verify anything. - Otherwise, look for a parent that tests our value + 1. */ - - if (!tree_int_cst_lt (node->high, high_plus_one)) - return false; - - for (pnode = node->parent; pnode; pnode = pnode->parent) - if (tree_int_cst_equal (high_plus_one, pnode->low)) - return true; - - return false; -} - -/* Search the parent sections of the - case node tree to see if both tests for the upper and lower - bounds of NODE would be redundant. */ - -static bool -node_is_bounded (case_node_ptr node, tree index_type) -{ - return (node_has_low_bound (node, index_type) - && node_has_high_bound (node, index_type)); -}