Message ID | AM4PR0701MB2162F00D769B15645676D3ACE4020@AM4PR0701MB2162.eurprd07.prod.outlook.com |
---|---|
State | New |
Headers | show |
On Sat, Jul 30, 2016 at 08:17:59AM +0000, Bernd Edlinger wrote: > Ok, I the incorrect REG_POINTER is done here: > > cse_main -> reg_scan -> reg_scan_mark_refs -> set_reg_attrs_from_value > > and here I see a bug, because if POINTERS_EXTEND_UNSIGNED > can_be_reg_pointer is only set to false if x is SIGN_EXTEND but not > if x is a SUBREG as in this case. > > So I think that should be fixed this way: > > Index: emit-rtl.c > =================================================================== > --- emit-rtl.c (revision 238891) > +++ emit-rtl.c (working copy) > @@ -1155,7 +1155,7 @@ set_reg_attrs_from_value (rtx reg, rtx x) > || (GET_CODE (x) == SUBREG && subreg_lowpart_p (x))) > { > #if defined(POINTERS_EXTEND_UNSIGNED) > - if (((GET_CODE (x) == SIGN_EXTEND && POINTERS_EXTEND_UNSIGNED) > + if (((GET_CODE (x) != ZERO_EXTEND && POINTERS_EXTEND_UNSIGNED) > || (GET_CODE (x) != SIGN_EXTEND && ! POINTERS_EXTEND_UNSIGNED)) > && !targetm.have_ptr_extend ()) > can_be_reg_pointer = false; > > > What do you think does this look like the right fix? There also is (from rtl.texi), for paradoxical subregs: """ @item @code{subreg} of @code{reg}s The upper bits are defined when @code{SUBREG_PROMOTED_VAR_P} is true. @code{SUBREG_PROMOTED_UNSIGNED_P} describes what the upper bits hold. Such subregs usually represent local variables, register variables and parameter pseudo variables that have been promoted to a wider mode. """ so you might want to test for these as well. > With this patch the code the reg/f:DI 481 does no longer appear, > and also the invalid combine does no longer happen. Good :-) > However the test case from pr70903 does not get fixed by this. > > But when I look at the dumps, I see again the first incorrect > transformation in cse2 (again cse!): > > (insn 20 19 21 2 (set (reg:DI 94) > (subreg:DI (reg:QI 93) 0)) pr.c:8 50 {*movdi_aarch64} > (expr_list:REG_EQUAL (const_int 255 [0xff]) > (expr_list:REG_DEAD (reg:QI 93) > (nil)))) > > but that is simply wrong, because later optimization passes > expect reg 94 to be 0x000000ff but the upper bits are unspecified, > so that REG_EQUAL should better not exist. Agreed. So where does that come from? > When I looked at cse.c I saw a comment in #if 0, which exactly > describes the problem that we have with paradoxical subreg here: <snip> > I am pretty sure, that this has to be reverted, and that it can > generate less efficient code. > > What do you think? I think this pessimises the generated code too much; there must be a better solution. Segher
Index: emit-rtl.c =================================================================== --- emit-rtl.c (revision 238891) +++ emit-rtl.c (working copy) @@ -1155,7 +1155,7 @@ set_reg_attrs_from_value (rtx reg, rtx x) || (GET_CODE (x) == SUBREG && subreg_lowpart_p (x))) { #if defined(POINTERS_EXTEND_UNSIGNED) - if (((GET_CODE (x) == SIGN_EXTEND && POINTERS_EXTEND_UNSIGNED) + if (((GET_CODE (x) != ZERO_EXTEND && POINTERS_EXTEND_UNSIGNED) || (GET_CODE (x) != SIGN_EXTEND && ! POINTERS_EXTEND_UNSIGNED)) && !targetm.have_ptr_extend ()) can_be_reg_pointer = false; What do you think does this look like the right fix? With this patch the code the reg/f:DI 481 does no longer appear, and also the invalid combine does no longer happen. However the test case from pr70903 does not get fixed by this. But when I look at the dumps, I see again the first incorrect transformation in cse2 (again cse!): (insn 20 19 21 2 (set (reg:DI 94) (subreg:DI (reg:QI 93) 0)) pr.c:8 50 {*movdi_aarch64} (expr_list:REG_EQUAL (const_int 255 [0xff]) (expr_list:REG_DEAD (reg:QI 93) (nil)))) but that is simply wrong, because later optimization passes expect reg 94 to be 0x000000ff but the upper bits are unspecified, so that REG_EQUAL should better not exist. When I looked at cse.c I saw a comment in #if 0, which exactly describes the problem that we have with paradoxical subreg here: Index: cse.c =================================================================== --- cse.c (revision 238891) +++ cse.c (working copy) @@ -4716,10 +4716,6 @@ cse_insn (rtx_insn *insn) } } -#if 0 - /* It is no longer clear why we used to do this, but it doesn't - appear to still be needed. So let's try without it since this - code hurts cse'ing widened ops. */ /* If source is a paradoxical subreg (such as QI treated as an SI), treat it as volatile. It may do the work of an SI in one context where the extra bits are not being used, but cannot replace an SI @@ -4726,7 +4722,6 @@ cse_insn (rtx_insn *insn) in general. */ if (paradoxical_subreg_p (src)) sets[i].src_volatile = 1; -#endif /* Locate all possible equivalent forms for SRC. Try to replace SRC in the insn with each cheaper equivalent.