Message ID | 201006281205.11210.ebotcazou@adacore.com |
---|---|
State | New |
Headers | show |
On Mon, Jun 28, 2010 at 3:05 AM, Eric Botcazou <ebotcazou@adacore.com> wrote: >> However, combiner never exposes this to x86 backend. I added >> a TARGET_EXPAND_COMPOUND_OPERATION hook to allow x86 backend to >> optimize it. For >> >> --- >> typedef struct >> { >> unsigned char c1; >> unsigned char c2; >> unsigned char c3; >> unsigned char c4; >> } foo_t; >> >> int >> foo (foo_t x) >> { >> return x.c2 > 4; >> } > > I think that disabling the canonicalization done by expand_compound_operation > can disable certain forms of combining that are beneficial to x86 even for > this kind of patterns. > > The combiner already knows that it needs to re-create the compound forms when > it is trying to simplify a comparison, see simplify_comparison. The problem > with your testcase is that make_compound_operation fails to do so. > > Lightly tested patch attached. > > > * combine.c (make_compound_operation) <SUBREG>: Do not return the > result of force_to_mode if it partially re-expanded the compound. > It works on my testcases. I will do a full bootstrap and test on Linux/x86-64. Thanks.
On Mon, Jun 28, 2010 at 7:28 AM, H.J. Lu <hjl.tools@gmail.com> wrote: > On Mon, Jun 28, 2010 at 3:05 AM, Eric Botcazou <ebotcazou@adacore.com> wrote: >>> However, combiner never exposes this to x86 backend. I added >>> a TARGET_EXPAND_COMPOUND_OPERATION hook to allow x86 backend to >>> optimize it. For >>> >>> --- >>> typedef struct >>> { >>> unsigned char c1; >>> unsigned char c2; >>> unsigned char c3; >>> unsigned char c4; >>> } foo_t; >>> >>> int >>> foo (foo_t x) >>> { >>> return x.c2 > 4; >>> } >> >> I think that disabling the canonicalization done by expand_compound_operation >> can disable certain forms of combining that are beneficial to x86 even for >> this kind of patterns. >> >> The combiner already knows that it needs to re-create the compound forms when >> it is trying to simplify a comparison, see simplify_comparison. The problem >> with your testcase is that make_compound_operation fails to do so. >> >> Lightly tested patch attached. >> >> >> * combine.c (make_compound_operation) <SUBREG>: Do not return the >> result of force_to_mode if it partially re-expanded the compound. >> > > It works on my testcases. I will do a full bootstrap and test on > Linux/x86-64. > There are no regressions. Can you install it? Thanks.
Index: combine.c =================================================================== --- combine.c (revision 161470) +++ combine.c (working copy) @@ -7277,22 +7277,21 @@ make_compound_operation (rtx x, enum rtx /* Call ourselves recursively on the inner expression. If we are narrowing the object and it has a different RTL code from what it originally did, do this SUBREG as a force_to_mode. */ - - tem = make_compound_operation (SUBREG_REG (x), in_code); - { - rtx simplified = simplify_subreg (mode, tem, GET_MODE (SUBREG_REG (x)), - SUBREG_BYTE (x)); + rtx inner = SUBREG_REG (x), simplified; + + tem = make_compound_operation (inner, in_code); + simplified + = simplify_subreg (mode, tem, GET_MODE (inner), SUBREG_BYTE (x)); if (simplified) tem = simplified; - if (GET_CODE (tem) != GET_CODE (SUBREG_REG (x)) - && GET_MODE_SIZE (mode) < GET_MODE_SIZE (GET_MODE (SUBREG_REG (x))) + if (GET_CODE (tem) != GET_CODE (inner) + && GET_MODE_SIZE (mode) < GET_MODE_SIZE (GET_MODE (inner)) && subreg_lowpart_p (x)) { - rtx newer = force_to_mode (tem, mode, ~(HOST_WIDE_INT) 0, - 0); + rtx newer = force_to_mode (tem, mode, ~(HOST_WIDE_INT) 0, 0); /* If we have something other than a SUBREG, we might have done an expansion, so rerun ourselves. */ @@ -7300,9 +7299,16 @@ make_compound_operation (rtx x, enum rtx newer = make_compound_operation (newer, in_code); /* force_to_mode can expand compounds. If it just re-expanded the - compound use gen_lowpart instead to convert to the desired - mode. */ - if (rtx_equal_p (newer, x)) + compound, use gen_lowpart to convert to the desired mode. */ + if (rtx_equal_p (newer, x) + /* Likewise if it re-expanded the compound only partially. + This happens for SUBREG of ZERO_EXTRACT if they extract + the same number of bits. */ + || (GET_CODE (newer) == SUBREG + && (GET_CODE (SUBREG_REG (newer)) == LSHIFTRT + || GET_CODE (SUBREG_REG (newer)) == ASHIFTRT) + && GET_CODE (inner) == AND + && rtx_equal_p (SUBREG_REG (newer), XEXP (inner, 0)))) return gen_lowpart (GET_MODE (x), tem); return newer;