Message ID | YovTsD8M9F6+iIoM@gmail.com |
---|---|
State | New |
Headers | show |
Series | [v2] DSE: Use the constant store source if possible | expand |
On Mon, May 23, 2022 at 8:34 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote: > > On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches > > <gcc-patches@gcc.gnu.org> wrote: > > > > > > When recording store for RTL dead store elimination, check if the source > > > register is set only once to a constant. If yes, record the constant > > > as the store source. It eliminates unrolled zero stores after memset 0 > > > in a loop where a vector register is used as the zero store source. > > > > > > gcc/ > > > > > > PR rtl-optimization/105638 > > > * dse.cc (record_store): Use the constant source if the source > > > register is set only once. > > > > > > gcc/testsuite/ > > > > > > PR rtl-optimization/105638 > > > * g++.target/i386/pr105638.C: New test. > > > --- > > > gcc/dse.cc | 19 ++++++++++ > > > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ > > > 2 files changed, 63 insertions(+) > > > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C > > > > > > diff --git a/gcc/dse.cc b/gcc/dse.cc > > > index 30c11cee034..0433dd3d846 100644 > > > --- a/gcc/dse.cc > > > +++ b/gcc/dse.cc > > > @@ -1508,6 +1508,25 @@ record_store (rtx body, bb_info_t bb_info) > > > > > > if (tem && CONSTANT_P (tem)) > > > const_rhs = tem; > > > + else > > > + { > > > + /* If RHS is set only once to a constant, set CONST_RHS > > > + to the constant. */ > > > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > > > + if (def != nullptr > > > + && !DF_REF_IS_ARTIFICIAL (def) > > > + && !DF_REF_NEXT_REG (def)) > > > + { > > > + rtx_insn *def_insn = DF_REF_INSN (def); > > > + rtx def_body = PATTERN (def_insn); > > > + if (GET_CODE (def_body) == SET) > > > + { > > > + rtx def_src = SET_SRC (def_body); > > > + if (CONSTANT_P (def_src)) > > > + const_rhs = def_src; > > > > doesn't DSE have its own tracking of stored values? Shouldn't we > > It tracks stored values only within the basic block. When RTL loop > invariant motion hoists a constant initialization out of the loop into > a separate basic block, the constant store value becomes unknown > within the original basic block. > > > improve _that_ if it is not enough? I also wonder if you need to > > My patch extends DSE stored value tracking to include the constant which > is set only once in another basic block. > > > verify the SET isn't partial? > > > > Here is the v2 patch to check that the constant is set by a non-partial > unconditional load. > > OK for master? > > Thanks. > > H.J. > --- > RTL DSE tracks redundant constant stores within a basic block. When RTL > loop invariant motion hoists a constant initialization out of the loop > into a separate basic block, the constant store value becomes unknown > within the original basic block. When recording store for RTL DSE, check > if the source register is set only once to a constant by a non-partial > unconditional load. If yes, record the constant as the constant store > source. It eliminates unrolled zero stores after memset 0 in a loop > where a vector register is used as the zero store source. > > gcc/ > > PR rtl-optimization/105638 > * dse.cc (record_store): Use the constant source if the source > register is set only once. > > gcc/testsuite/ > > PR rtl-optimization/105638 > * g++.target/i386/pr105638.C: New test. > --- > gcc/dse.cc | 22 ++++++++++++ > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ > 2 files changed, 66 insertions(+) > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C > > diff --git a/gcc/dse.cc b/gcc/dse.cc > index 30c11cee034..af8e88dac32 100644 > --- a/gcc/dse.cc > +++ b/gcc/dse.cc > @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info) > > if (tem && CONSTANT_P (tem)) > const_rhs = tem; > + else > + { > + /* If RHS is set only once to a constant, set CONST_RHS > + to the constant. */ > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > + if (def != nullptr > + && !DF_REF_IS_ARTIFICIAL (def) > + && !(DF_REF_FLAGS (def) > + & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)) > + && !DF_REF_NEXT_REG (def)) Can we really use df-chain here and rely that a single definition is the only one? If rhs is a hardreg does df-chain include implicit sets of function argument registers for example? Don't we need RD here or at least verify the single df-chain definition dominates the use here (if we can rely on the reg otherwise be uninitialized and thus the use invoking undefined behavior we could use the constant even in non-dominating context, but WRT the function args & hardregs I'm not sure we can tell). > + { > + rtx_insn *def_insn = DF_REF_INSN (def); > + rtx def_body = PATTERN (def_insn); > + if (GET_CODE (def_body) == SET) I think you should be able to use rtx def_body = single_set (def_insn); if (def_body) here to get at some more cases. > + { > + rtx def_src = SET_SRC (def_body); > + if (CONSTANT_P (def_src) > + && GET_MODE (def_src) == GET_MODE (rhs)) possibly verify SET_DEST (def_body) == rhs to rule out subregs and mode punning instead? Then def_src should be automatically OK. I wonder whether it's worth tracking (or if DSE even can) loads from the constant pool with constant REG_EQUAL/EQUIV notes? > + const_rhs = def_src; > + } > + } > + } > } > } > > diff --git a/gcc/testsuite/g++.target/i386/pr105638.C b/gcc/testsuite/g++.target/i386/pr105638.C > new file mode 100644 > index 00000000000..ff40a459de1 > --- /dev/null > +++ b/gcc/testsuite/g++.target/i386/pr105638.C > @@ -0,0 +1,44 @@ > +/* { dg-do compile { target { ! ia32 } } } */ > +/* { dg-options "-std=gnu++20 -O2 -march=skylake" } */ > +/* { dg-final { scan-assembler-not "vpxor" } } */ > + > +#include <stdint.h> > +#include <vector> > +#include <tr1/array> > + > +class FastBoard { > +public: > + typedef std::pair<int, int> movescore_t; > + typedef std::tr1::array<movescore_t, 24> scoredlist_t; > + > +protected: > + std::vector<int> m_critical; > + > + int m_boardsize; > +}; > + > +class FastState { > +public: > + FastBoard board; > + > + int movenum; > +protected: > + FastBoard::scoredlist_t scoredmoves; > +}; > + > +class KoState : public FastState { > +private: > + std::vector<uint64_t> ko_hash_history; > + std::vector<uint64_t> hash_history; > +}; > + > +class GameState : public KoState { > +public: > + void foo (); > +private: > + std::vector<KoState> game_history; > +}; > + > +void GameState::foo() { > + game_history.resize(movenum); > +} > -- > 2.36.1 >
On Mon, May 23, 2022 at 11:42 PM Richard Biener <richard.guenther@gmail.com> wrote: > > On Mon, May 23, 2022 at 8:34 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote: > > > On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches > > > <gcc-patches@gcc.gnu.org> wrote: > > > > > > > > When recording store for RTL dead store elimination, check if the source > > > > register is set only once to a constant. If yes, record the constant > > > > as the store source. It eliminates unrolled zero stores after memset 0 > > > > in a loop where a vector register is used as the zero store source. > > > > > > > > gcc/ > > > > > > > > PR rtl-optimization/105638 > > > > * dse.cc (record_store): Use the constant source if the source > > > > register is set only once. > > > > > > > > gcc/testsuite/ > > > > > > > > PR rtl-optimization/105638 > > > > * g++.target/i386/pr105638.C: New test. > > > > --- > > > > gcc/dse.cc | 19 ++++++++++ > > > > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ > > > > 2 files changed, 63 insertions(+) > > > > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C > > > > > > > > diff --git a/gcc/dse.cc b/gcc/dse.cc > > > > index 30c11cee034..0433dd3d846 100644 > > > > --- a/gcc/dse.cc > > > > +++ b/gcc/dse.cc > > > > @@ -1508,6 +1508,25 @@ record_store (rtx body, bb_info_t bb_info) > > > > > > > > if (tem && CONSTANT_P (tem)) > > > > const_rhs = tem; > > > > + else > > > > + { > > > > + /* If RHS is set only once to a constant, set CONST_RHS > > > > + to the constant. */ > > > > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > > > > + if (def != nullptr > > > > + && !DF_REF_IS_ARTIFICIAL (def) > > > > + && !DF_REF_NEXT_REG (def)) > > > > + { > > > > + rtx_insn *def_insn = DF_REF_INSN (def); > > > > + rtx def_body = PATTERN (def_insn); > > > > + if (GET_CODE (def_body) == SET) > > > > + { > > > > + rtx def_src = SET_SRC (def_body); > > > > + if (CONSTANT_P (def_src)) > > > > + const_rhs = def_src; > > > > > > doesn't DSE have its own tracking of stored values? Shouldn't we > > > > It tracks stored values only within the basic block. When RTL loop > > invariant motion hoists a constant initialization out of the loop into > > a separate basic block, the constant store value becomes unknown > > within the original basic block. > > > > > improve _that_ if it is not enough? I also wonder if you need to > > > > My patch extends DSE stored value tracking to include the constant which > > is set only once in another basic block. > > > > > verify the SET isn't partial? > > > > > > > Here is the v2 patch to check that the constant is set by a non-partial > > unconditional load. > > > > OK for master? > > > > Thanks. > > > > H.J. > > --- > > RTL DSE tracks redundant constant stores within a basic block. When RTL > > loop invariant motion hoists a constant initialization out of the loop > > into a separate basic block, the constant store value becomes unknown > > within the original basic block. When recording store for RTL DSE, check > > if the source register is set only once to a constant by a non-partial > > unconditional load. If yes, record the constant as the constant store > > source. It eliminates unrolled zero stores after memset 0 in a loop > > where a vector register is used as the zero store source. > > > > gcc/ > > > > PR rtl-optimization/105638 > > * dse.cc (record_store): Use the constant source if the source > > register is set only once. > > > > gcc/testsuite/ > > > > PR rtl-optimization/105638 > > * g++.target/i386/pr105638.C: New test. > > --- > > gcc/dse.cc | 22 ++++++++++++ > > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ > > 2 files changed, 66 insertions(+) > > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C > > > > diff --git a/gcc/dse.cc b/gcc/dse.cc > > index 30c11cee034..af8e88dac32 100644 > > --- a/gcc/dse.cc > > +++ b/gcc/dse.cc > > @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info) > > > > if (tem && CONSTANT_P (tem)) > > const_rhs = tem; > > + else > > + { > > + /* If RHS is set only once to a constant, set CONST_RHS > > + to the constant. */ > > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > > + if (def != nullptr > > + && !DF_REF_IS_ARTIFICIAL (def) > > + && !(DF_REF_FLAGS (def) > > + & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)) > > + && !DF_REF_NEXT_REG (def)) > > Can we really use df-chain here and rely that a single definition is > the only one? If rhs is a hardreg does df-chain include implicit > sets of function argument registers for example? Don't we need RD > here or at least verify the single df-chain definition dominates the > use here (if we can rely on the reg otherwise be uninitialized and thus > the use invoking undefined behavior we could use the constant even > in non-dominating context, but WRT the function args & hardregs I'm not > sure we can tell). Does the hard register have a regular DEF? Won't the function args & hardregs have DF_REF_ARTIFICIAL? > > > + { > > + rtx_insn *def_insn = DF_REF_INSN (def); > > + rtx def_body = PATTERN (def_insn); > > + if (GET_CODE (def_body) == SET) > > I think you should be able to use > > rtx def_body = single_set (def_insn); > if (def_body) > > here to get at some more cases. Will change it. > > + { > > + rtx def_src = SET_SRC (def_body); > > + if (CONSTANT_P (def_src) > > + && GET_MODE (def_src) == GET_MODE (rhs)) > > possibly verify SET_DEST (def_body) == rhs to rule out subregs and mode punning > instead? Then def_src should be automatically OK. I wonder whether it's worth Will change it. > tracking (or if DSE even can) loads from the constant pool with constant.> REG_EQUAL/EQUIV notes? CONSTANT_P returns true for a constant pool. I am not sure if DSE can handle a constant pool. Thanks. > > > + const_rhs = def_src; > > + } > > + } > > + } > > } > > } > > > > diff --git a/gcc/testsuite/g++.target/i386/pr105638.C b/gcc/testsuite/g++.target/i386/pr105638.C > > new file mode 100644 > > index 00000000000..ff40a459de1 > > --- /dev/null > > +++ b/gcc/testsuite/g++.target/i386/pr105638.C > > @@ -0,0 +1,44 @@ > > +/* { dg-do compile { target { ! ia32 } } } */ > > +/* { dg-options "-std=gnu++20 -O2 -march=skylake" } */ > > +/* { dg-final { scan-assembler-not "vpxor" } } */ > > + > > +#include <stdint.h> > > +#include <vector> > > +#include <tr1/array> > > + > > +class FastBoard { > > +public: > > + typedef std::pair<int, int> movescore_t; > > + typedef std::tr1::array<movescore_t, 24> scoredlist_t; > > + > > +protected: > > + std::vector<int> m_critical; > > + > > + int m_boardsize; > > +}; > > + > > +class FastState { > > +public: > > + FastBoard board; > > + > > + int movenum; > > +protected: > > + FastBoard::scoredlist_t scoredmoves; > > +}; > > + > > +class KoState : public FastState { > > +private: > > + std::vector<uint64_t> ko_hash_history; > > + std::vector<uint64_t> hash_history; > > +}; > > + > > +class GameState : public KoState { > > +public: > > + void foo (); > > +private: > > + std::vector<KoState> game_history; > > +}; > > + > > +void GameState::foo() { > > + game_history.resize(movenum); > > +} > > -- > > 2.36.1 > >
"H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes: > On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote: >> On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches >> <gcc-patches@gcc.gnu.org> wrote: >> > >> > When recording store for RTL dead store elimination, check if the source >> > register is set only once to a constant. If yes, record the constant >> > as the store source. It eliminates unrolled zero stores after memset 0 >> > in a loop where a vector register is used as the zero store source. >> > >> > gcc/ >> > >> > PR rtl-optimization/105638 >> > * dse.cc (record_store): Use the constant source if the source >> > register is set only once. >> > >> > gcc/testsuite/ >> > >> > PR rtl-optimization/105638 >> > * g++.target/i386/pr105638.C: New test. >> > --- >> > gcc/dse.cc | 19 ++++++++++ >> > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ >> > 2 files changed, 63 insertions(+) >> > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C >> > >> > diff --git a/gcc/dse.cc b/gcc/dse.cc >> > index 30c11cee034..0433dd3d846 100644 >> > --- a/gcc/dse.cc >> > +++ b/gcc/dse.cc >> > @@ -1508,6 +1508,25 @@ record_store (rtx body, bb_info_t bb_info) >> > >> > if (tem && CONSTANT_P (tem)) >> > const_rhs = tem; >> > + else >> > + { >> > + /* If RHS is set only once to a constant, set CONST_RHS >> > + to the constant. */ >> > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); >> > + if (def != nullptr >> > + && !DF_REF_IS_ARTIFICIAL (def) >> > + && !DF_REF_NEXT_REG (def)) >> > + { >> > + rtx_insn *def_insn = DF_REF_INSN (def); >> > + rtx def_body = PATTERN (def_insn); >> > + if (GET_CODE (def_body) == SET) >> > + { >> > + rtx def_src = SET_SRC (def_body); >> > + if (CONSTANT_P (def_src)) >> > + const_rhs = def_src; >> >> doesn't DSE have its own tracking of stored values? Shouldn't we > > It tracks stored values only within the basic block. When RTL loop > invariant motion hoists a constant initialization out of the loop into > a separate basic block, the constant store value becomes unknown > within the original basic block. > >> improve _that_ if it is not enough? I also wonder if you need to > > My patch extends DSE stored value tracking to include the constant which > is set only once in another basic block. > >> verify the SET isn't partial? >> > > Here is the v2 patch to check that the constant is set by a non-partial > unconditional load. > > OK for master? > > Thanks. > > H.J. > --- > RTL DSE tracks redundant constant stores within a basic block. When RTL > loop invariant motion hoists a constant initialization out of the loop > into a separate basic block, the constant store value becomes unknown > within the original basic block. When recording store for RTL DSE, check > if the source register is set only once to a constant by a non-partial > unconditional load. If yes, record the constant as the constant store > source. It eliminates unrolled zero stores after memset 0 in a loop > where a vector register is used as the zero store source. > > gcc/ > > PR rtl-optimization/105638 > * dse.cc (record_store): Use the constant source if the source > register is set only once. > > gcc/testsuite/ > > PR rtl-optimization/105638 > * g++.target/i386/pr105638.C: New test. > --- > gcc/dse.cc | 22 ++++++++++++ > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ > 2 files changed, 66 insertions(+) > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C > > diff --git a/gcc/dse.cc b/gcc/dse.cc > index 30c11cee034..af8e88dac32 100644 > --- a/gcc/dse.cc > +++ b/gcc/dse.cc > @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info) > > if (tem && CONSTANT_P (tem)) > const_rhs = tem; > + else > + { > + /* If RHS is set only once to a constant, set CONST_RHS > + to the constant. */ > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > + if (def != nullptr > + && !DF_REF_IS_ARTIFICIAL (def) > + && !(DF_REF_FLAGS (def) > + & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)) > + && !DF_REF_NEXT_REG (def)) Can we introduce a helper for this? There are already similar tests in ira and loop-iv, and it seems a bit too complex to have to open-code each time. Thanks, Richard > + { > + rtx_insn *def_insn = DF_REF_INSN (def); > + rtx def_body = PATTERN (def_insn); > + if (GET_CODE (def_body) == SET) > + { > + rtx def_src = SET_SRC (def_body); > + if (CONSTANT_P (def_src) > + && GET_MODE (def_src) == GET_MODE (rhs)) > + const_rhs = def_src; > + } > + } > + } > } > } > > diff --git a/gcc/testsuite/g++.target/i386/pr105638.C b/gcc/testsuite/g++.target/i386/pr105638.C > new file mode 100644 > index 00000000000..ff40a459de1 > --- /dev/null > +++ b/gcc/testsuite/g++.target/i386/pr105638.C > @@ -0,0 +1,44 @@ > +/* { dg-do compile { target { ! ia32 } } } */ > +/* { dg-options "-std=gnu++20 -O2 -march=skylake" } */ > +/* { dg-final { scan-assembler-not "vpxor" } } */ > + > +#include <stdint.h> > +#include <vector> > +#include <tr1/array> > + > +class FastBoard { > +public: > + typedef std::pair<int, int> movescore_t; > + typedef std::tr1::array<movescore_t, 24> scoredlist_t; > + > +protected: > + std::vector<int> m_critical; > + > + int m_boardsize; > +}; > + > +class FastState { > +public: > + FastBoard board; > + > + int movenum; > +protected: > + FastBoard::scoredlist_t scoredmoves; > +}; > + > +class KoState : public FastState { > +private: > + std::vector<uint64_t> ko_hash_history; > + std::vector<uint64_t> hash_history; > +}; > + > +class GameState : public KoState { > +public: > + void foo (); > +private: > + std::vector<KoState> game_history; > +}; > + > +void GameState::foo() { > + game_history.resize(movenum); > +}
On Tue, May 24, 2022 at 10:11 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > On Mon, May 23, 2022 at 11:42 PM Richard Biener > <richard.guenther@gmail.com> wrote: > > > > On Mon, May 23, 2022 at 8:34 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > > > On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote: > > > > On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches > > > > <gcc-patches@gcc.gnu.org> wrote: > > > > > > > > > > When recording store for RTL dead store elimination, check if the source > > > > > register is set only once to a constant. If yes, record the constant > > > > > as the store source. It eliminates unrolled zero stores after memset 0 > > > > > in a loop where a vector register is used as the zero store source. > > > > > > > > > > gcc/ > > > > > > > > > > PR rtl-optimization/105638 > > > > > * dse.cc (record_store): Use the constant source if the source > > > > > register is set only once. > > > > > > > > > > gcc/testsuite/ > > > > > > > > > > PR rtl-optimization/105638 > > > > > * g++.target/i386/pr105638.C: New test. > > > > > --- > > > > > gcc/dse.cc | 19 ++++++++++ > > > > > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ > > > > > 2 files changed, 63 insertions(+) > > > > > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C > > > > > > > > > > diff --git a/gcc/dse.cc b/gcc/dse.cc > > > > > index 30c11cee034..0433dd3d846 100644 > > > > > --- a/gcc/dse.cc > > > > > +++ b/gcc/dse.cc > > > > > @@ -1508,6 +1508,25 @@ record_store (rtx body, bb_info_t bb_info) > > > > > > > > > > if (tem && CONSTANT_P (tem)) > > > > > const_rhs = tem; > > > > > + else > > > > > + { > > > > > + /* If RHS is set only once to a constant, set CONST_RHS > > > > > + to the constant. */ > > > > > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > > > > > + if (def != nullptr > > > > > + && !DF_REF_IS_ARTIFICIAL (def) > > > > > + && !DF_REF_NEXT_REG (def)) > > > > > + { > > > > > + rtx_insn *def_insn = DF_REF_INSN (def); > > > > > + rtx def_body = PATTERN (def_insn); > > > > > + if (GET_CODE (def_body) == SET) > > > > > + { > > > > > + rtx def_src = SET_SRC (def_body); > > > > > + if (CONSTANT_P (def_src)) > > > > > + const_rhs = def_src; > > > > > > > > doesn't DSE have its own tracking of stored values? Shouldn't we > > > > > > It tracks stored values only within the basic block. When RTL loop > > > invariant motion hoists a constant initialization out of the loop into > > > a separate basic block, the constant store value becomes unknown > > > within the original basic block. > > > > > > > improve _that_ if it is not enough? I also wonder if you need to > > > > > > My patch extends DSE stored value tracking to include the constant which > > > is set only once in another basic block. > > > > > > > verify the SET isn't partial? > > > > > > > > > > Here is the v2 patch to check that the constant is set by a non-partial > > > unconditional load. > > > > > > OK for master? > > > > > > Thanks. > > > > > > H.J. > > > --- > > > RTL DSE tracks redundant constant stores within a basic block. When RTL > > > loop invariant motion hoists a constant initialization out of the loop > > > into a separate basic block, the constant store value becomes unknown > > > within the original basic block. When recording store for RTL DSE, check > > > if the source register is set only once to a constant by a non-partial > > > unconditional load. If yes, record the constant as the constant store > > > source. It eliminates unrolled zero stores after memset 0 in a loop > > > where a vector register is used as the zero store source. > > > > > > gcc/ > > > > > > PR rtl-optimization/105638 > > > * dse.cc (record_store): Use the constant source if the source > > > register is set only once. > > > > > > gcc/testsuite/ > > > > > > PR rtl-optimization/105638 > > > * g++.target/i386/pr105638.C: New test. > > > --- > > > gcc/dse.cc | 22 ++++++++++++ > > > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ > > > 2 files changed, 66 insertions(+) > > > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C > > > > > > diff --git a/gcc/dse.cc b/gcc/dse.cc > > > index 30c11cee034..af8e88dac32 100644 > > > --- a/gcc/dse.cc > > > +++ b/gcc/dse.cc > > > @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info) > > > > > > if (tem && CONSTANT_P (tem)) > > > const_rhs = tem; > > > + else > > > + { > > > + /* If RHS is set only once to a constant, set CONST_RHS > > > + to the constant. */ > > > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > > > + if (def != nullptr > > > + && !DF_REF_IS_ARTIFICIAL (def) > > > + && !(DF_REF_FLAGS (def) > > > + & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)) > > > + && !DF_REF_NEXT_REG (def)) > > > > Can we really use df-chain here and rely that a single definition is > > the only one? If rhs is a hardreg does df-chain include implicit > > sets of function argument registers for example? Don't we need RD > > here or at least verify the single df-chain definition dominates the > > use here (if we can rely on the reg otherwise be uninitialized and thus > > the use invoking undefined behavior we could use the constant even > > in non-dominating context, but WRT the function args & hardregs I'm not > > sure we can tell). > > Does the hard register have a regular DEF? Won't the function args & > hardregs have DF_REF_ARTIFICIAL? I don't know - do they? > > > > > + { > > > + rtx_insn *def_insn = DF_REF_INSN (def); > > > + rtx def_body = PATTERN (def_insn); > > > + if (GET_CODE (def_body) == SET) > > > > I think you should be able to use > > > > rtx def_body = single_set (def_insn); > > if (def_body) > > > > here to get at some more cases. > > Will change it. > > > > + { > > > + rtx def_src = SET_SRC (def_body); > > > + if (CONSTANT_P (def_src) > > > + && GET_MODE (def_src) == GET_MODE (rhs)) > > > > possibly verify SET_DEST (def_body) == rhs to rule out subregs and mode punning > > instead? Then def_src should be automatically OK. I wonder whether it's worth > > Will change it. > > > tracking (or if DSE even can) loads from the constant pool with constant.> REG_EQUAL/EQUIV notes? > > CONSTANT_P returns true for a constant pool. I am not > sure if DSE can handle a constant pool. > > Thanks. > > > > > > + const_rhs = def_src; > > > + } > > > + } > > > + } > > > } > > > } > > > > > > diff --git a/gcc/testsuite/g++.target/i386/pr105638.C b/gcc/testsuite/g++.target/i386/pr105638.C > > > new file mode 100644 > > > index 00000000000..ff40a459de1 > > > --- /dev/null > > > +++ b/gcc/testsuite/g++.target/i386/pr105638.C > > > @@ -0,0 +1,44 @@ > > > +/* { dg-do compile { target { ! ia32 } } } */ > > > +/* { dg-options "-std=gnu++20 -O2 -march=skylake" } */ > > > +/* { dg-final { scan-assembler-not "vpxor" } } */ > > > + > > > +#include <stdint.h> > > > +#include <vector> > > > +#include <tr1/array> > > > + > > > +class FastBoard { > > > +public: > > > + typedef std::pair<int, int> movescore_t; > > > + typedef std::tr1::array<movescore_t, 24> scoredlist_t; > > > + > > > +protected: > > > + std::vector<int> m_critical; > > > + > > > + int m_boardsize; > > > +}; > > > + > > > +class FastState { > > > +public: > > > + FastBoard board; > > > + > > > + int movenum; > > > +protected: > > > + FastBoard::scoredlist_t scoredmoves; > > > +}; > > > + > > > +class KoState : public FastState { > > > +private: > > > + std::vector<uint64_t> ko_hash_history; > > > + std::vector<uint64_t> hash_history; > > > +}; > > > + > > > +class GameState : public KoState { > > > +public: > > > + void foo (); > > > +private: > > > + std::vector<KoState> game_history; > > > +}; > > > + > > > +void GameState::foo() { > > > + game_history.resize(movenum); > > > +} > > > -- > > > 2.36.1 > > > > > > > -- > H.J.
Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > On Tue, May 24, 2022 at 10:11 PM H.J. Lu <hjl.tools@gmail.com> wrote: >> >> On Mon, May 23, 2022 at 11:42 PM Richard Biener >> <richard.guenther@gmail.com> wrote: >> > >> > On Mon, May 23, 2022 at 8:34 PM H.J. Lu <hjl.tools@gmail.com> wrote: >> > > >> > > On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote: >> > > > On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches >> > > > <gcc-patches@gcc.gnu.org> wrote: >> > > > > >> > > > > When recording store for RTL dead store elimination, check if the source >> > > > > register is set only once to a constant. If yes, record the constant >> > > > > as the store source. It eliminates unrolled zero stores after memset 0 >> > > > > in a loop where a vector register is used as the zero store source. >> > > > > >> > > > > gcc/ >> > > > > >> > > > > PR rtl-optimization/105638 >> > > > > * dse.cc (record_store): Use the constant source if the source >> > > > > register is set only once. >> > > > > >> > > > > gcc/testsuite/ >> > > > > >> > > > > PR rtl-optimization/105638 >> > > > > * g++.target/i386/pr105638.C: New test. >> > > > > --- >> > > > > gcc/dse.cc | 19 ++++++++++ >> > > > > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ >> > > > > 2 files changed, 63 insertions(+) >> > > > > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C >> > > > > >> > > > > diff --git a/gcc/dse.cc b/gcc/dse.cc >> > > > > index 30c11cee034..0433dd3d846 100644 >> > > > > --- a/gcc/dse.cc >> > > > > +++ b/gcc/dse.cc >> > > > > @@ -1508,6 +1508,25 @@ record_store (rtx body, bb_info_t bb_info) >> > > > > >> > > > > if (tem && CONSTANT_P (tem)) >> > > > > const_rhs = tem; >> > > > > + else >> > > > > + { >> > > > > + /* If RHS is set only once to a constant, set CONST_RHS >> > > > > + to the constant. */ >> > > > > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); >> > > > > + if (def != nullptr >> > > > > + && !DF_REF_IS_ARTIFICIAL (def) >> > > > > + && !DF_REF_NEXT_REG (def)) >> > > > > + { >> > > > > + rtx_insn *def_insn = DF_REF_INSN (def); >> > > > > + rtx def_body = PATTERN (def_insn); >> > > > > + if (GET_CODE (def_body) == SET) >> > > > > + { >> > > > > + rtx def_src = SET_SRC (def_body); >> > > > > + if (CONSTANT_P (def_src)) >> > > > > + const_rhs = def_src; >> > > > >> > > > doesn't DSE have its own tracking of stored values? Shouldn't we >> > > >> > > It tracks stored values only within the basic block. When RTL loop >> > > invariant motion hoists a constant initialization out of the loop into >> > > a separate basic block, the constant store value becomes unknown >> > > within the original basic block. >> > > >> > > > improve _that_ if it is not enough? I also wonder if you need to >> > > >> > > My patch extends DSE stored value tracking to include the constant which >> > > is set only once in another basic block. >> > > >> > > > verify the SET isn't partial? >> > > > >> > > >> > > Here is the v2 patch to check that the constant is set by a non-partial >> > > unconditional load. >> > > >> > > OK for master? >> > > >> > > Thanks. >> > > >> > > H.J. >> > > --- >> > > RTL DSE tracks redundant constant stores within a basic block. When RTL >> > > loop invariant motion hoists a constant initialization out of the loop >> > > into a separate basic block, the constant store value becomes unknown >> > > within the original basic block. When recording store for RTL DSE, check >> > > if the source register is set only once to a constant by a non-partial >> > > unconditional load. If yes, record the constant as the constant store >> > > source. It eliminates unrolled zero stores after memset 0 in a loop >> > > where a vector register is used as the zero store source. >> > > >> > > gcc/ >> > > >> > > PR rtl-optimization/105638 >> > > * dse.cc (record_store): Use the constant source if the source >> > > register is set only once. >> > > >> > > gcc/testsuite/ >> > > >> > > PR rtl-optimization/105638 >> > > * g++.target/i386/pr105638.C: New test. >> > > --- >> > > gcc/dse.cc | 22 ++++++++++++ >> > > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ >> > > 2 files changed, 66 insertions(+) >> > > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C >> > > >> > > diff --git a/gcc/dse.cc b/gcc/dse.cc >> > > index 30c11cee034..af8e88dac32 100644 >> > > --- a/gcc/dse.cc >> > > +++ b/gcc/dse.cc >> > > @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info) >> > > >> > > if (tem && CONSTANT_P (tem)) >> > > const_rhs = tem; >> > > + else >> > > + { >> > > + /* If RHS is set only once to a constant, set CONST_RHS >> > > + to the constant. */ >> > > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); >> > > + if (def != nullptr >> > > + && !DF_REF_IS_ARTIFICIAL (def) >> > > + && !(DF_REF_FLAGS (def) >> > > + & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)) >> > > + && !DF_REF_NEXT_REG (def)) >> > >> > Can we really use df-chain here and rely that a single definition is >> > the only one? If rhs is a hardreg does df-chain include implicit >> > sets of function argument registers for example? Don't we need RD >> > here or at least verify the single df-chain definition dominates the >> > use here (if we can rely on the reg otherwise be uninitialized and thus >> > the use invoking undefined behavior we could use the constant even >> > in non-dominating context, but WRT the function args & hardregs I'm not >> > sure we can tell). >> >> Does the hard register have a regular DEF? Won't the function args & >> hardregs have DF_REF_ARTIFICIAL? > > I don't know - do they? They won't be DF_REF_ARTIFICIAL. I think the code should work ok for hard registers though. All sets of argument registers, and all clobbers of registers by function calls, are recorded as df defs. (This is one of the things that bloats the representation compared to rtx insns, where most call clobbers are implicit. But it's also one of the things that makes df easier to use & more reliable than ad hoc liveness tracking.) Thanks, Richard
On Wed, May 25, 2022 at 12:30 AM Richard Sandiford <richard.sandiford@arm.com> wrote: > > "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes: > > On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote: > >> On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches > >> <gcc-patches@gcc.gnu.org> wrote: > >> > > >> > When recording store for RTL dead store elimination, check if the source > >> > register is set only once to a constant. If yes, record the constant > >> > as the store source. It eliminates unrolled zero stores after memset 0 > >> > in a loop where a vector register is used as the zero store source. > >> > > >> > gcc/ > >> > > >> > PR rtl-optimization/105638 > >> > * dse.cc (record_store): Use the constant source if the source > >> > register is set only once. > >> > > >> > gcc/testsuite/ > >> > > >> > PR rtl-optimization/105638 > >> > * g++.target/i386/pr105638.C: New test. > >> > --- > >> > gcc/dse.cc | 19 ++++++++++ > >> > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ > >> > 2 files changed, 63 insertions(+) > >> > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C > >> > > >> > diff --git a/gcc/dse.cc b/gcc/dse.cc > >> > index 30c11cee034..0433dd3d846 100644 > >> > --- a/gcc/dse.cc > >> > +++ b/gcc/dse.cc > >> > @@ -1508,6 +1508,25 @@ record_store (rtx body, bb_info_t bb_info) > >> > > >> > if (tem && CONSTANT_P (tem)) > >> > const_rhs = tem; > >> > + else > >> > + { > >> > + /* If RHS is set only once to a constant, set CONST_RHS > >> > + to the constant. */ > >> > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > >> > + if (def != nullptr > >> > + && !DF_REF_IS_ARTIFICIAL (def) > >> > + && !DF_REF_NEXT_REG (def)) > >> > + { > >> > + rtx_insn *def_insn = DF_REF_INSN (def); > >> > + rtx def_body = PATTERN (def_insn); > >> > + if (GET_CODE (def_body) == SET) > >> > + { > >> > + rtx def_src = SET_SRC (def_body); > >> > + if (CONSTANT_P (def_src)) > >> > + const_rhs = def_src; > >> > >> doesn't DSE have its own tracking of stored values? Shouldn't we > > > > It tracks stored values only within the basic block. When RTL loop > > invariant motion hoists a constant initialization out of the loop into > > a separate basic block, the constant store value becomes unknown > > within the original basic block. > > > >> improve _that_ if it is not enough? I also wonder if you need to > > > > My patch extends DSE stored value tracking to include the constant which > > is set only once in another basic block. > > > >> verify the SET isn't partial? > >> > > > > Here is the v2 patch to check that the constant is set by a non-partial > > unconditional load. > > > > OK for master? > > > > Thanks. > > > > H.J. > > --- > > RTL DSE tracks redundant constant stores within a basic block. When RTL > > loop invariant motion hoists a constant initialization out of the loop > > into a separate basic block, the constant store value becomes unknown > > within the original basic block. When recording store for RTL DSE, check > > if the source register is set only once to a constant by a non-partial > > unconditional load. If yes, record the constant as the constant store > > source. It eliminates unrolled zero stores after memset 0 in a loop > > where a vector register is used as the zero store source. > > > > gcc/ > > > > PR rtl-optimization/105638 > > * dse.cc (record_store): Use the constant source if the source > > register is set only once. > > > > gcc/testsuite/ > > > > PR rtl-optimization/105638 > > * g++.target/i386/pr105638.C: New test. > > --- > > gcc/dse.cc | 22 ++++++++++++ > > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ > > 2 files changed, 66 insertions(+) > > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C > > > > diff --git a/gcc/dse.cc b/gcc/dse.cc > > index 30c11cee034..af8e88dac32 100644 > > --- a/gcc/dse.cc > > +++ b/gcc/dse.cc > > @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info) > > > > if (tem && CONSTANT_P (tem)) > > const_rhs = tem; > > + else > > + { > > + /* If RHS is set only once to a constant, set CONST_RHS > > + to the constant. */ > > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > > + if (def != nullptr > > + && !DF_REF_IS_ARTIFICIAL (def) > > + && !(DF_REF_FLAGS (def) > > + & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)) > > + && !DF_REF_NEXT_REG (def)) > > Can we introduce a helper for this? There are already similar tests > in ira and loop-iv, and it seems a bit too complex to have to open-code > each time. I can use find_single_def_src in loop-iv.cc: /* If REGNO has a single definition, return its known value, otherwise return null. */ rtx find_single_def_src (unsigned int regno) and do /* If RHS is set only once to a constant, set CONST_RHS to the constant. */ rtx def_src = find_single_def_src (REGNO (rhs)); if (def_src != nullptr && CONSTANT_P (def_src)) { df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); if (!(DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))) { rtx_insn *def_insn = DF_REF_INSN (def); rtx def_body = single_set (def_insn); if (rhs == SET_DEST (def_body)) const_rhs = def_src; } } } Will it work? Thanks. > Thanks, > Richard > > > + { > > + rtx_insn *def_insn = DF_REF_INSN (def); > > + rtx def_body = PATTERN (def_insn); > > + if (GET_CODE (def_body) == SET) > > + { > > + rtx def_src = SET_SRC (def_body); > > + if (CONSTANT_P (def_src) > > + && GET_MODE (def_src) == GET_MODE (rhs)) > > + const_rhs = def_src; > > + } > > + } > > + } > > } > > } > > > > diff --git a/gcc/testsuite/g++.target/i386/pr105638.C b/gcc/testsuite/g++.target/i386/pr105638.C > > new file mode 100644 > > index 00000000000..ff40a459de1 > > --- /dev/null > > +++ b/gcc/testsuite/g++.target/i386/pr105638.C > > @@ -0,0 +1,44 @@ > > +/* { dg-do compile { target { ! ia32 } } } */ > > +/* { dg-options "-std=gnu++20 -O2 -march=skylake" } */ > > +/* { dg-final { scan-assembler-not "vpxor" } } */ > > + > > +#include <stdint.h> > > +#include <vector> > > +#include <tr1/array> > > + > > +class FastBoard { > > +public: > > + typedef std::pair<int, int> movescore_t; > > + typedef std::tr1::array<movescore_t, 24> scoredlist_t; > > + > > +protected: > > + std::vector<int> m_critical; > > + > > + int m_boardsize; > > +}; > > + > > +class FastState { > > +public: > > + FastBoard board; > > + > > + int movenum; > > +protected: > > + FastBoard::scoredlist_t scoredmoves; > > +}; > > + > > +class KoState : public FastState { > > +private: > > + std::vector<uint64_t> ko_hash_history; > > + std::vector<uint64_t> hash_history; > > +}; > > + > > +class GameState : public KoState { > > +public: > > + void foo (); > > +private: > > + std::vector<KoState> game_history; > > +}; > > + > > +void GameState::foo() { > > + game_history.resize(movenum); > > +}
On Wed, May 25, 2022 at 2:30 AM Richard Sandiford <richard.sandiford@arm.com> wrote: > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > > On Tue, May 24, 2022 at 10:11 PM H.J. Lu <hjl.tools@gmail.com> wrote: > >> > >> On Mon, May 23, 2022 at 11:42 PM Richard Biener > >> <richard.guenther@gmail.com> wrote: > >> > > >> > On Mon, May 23, 2022 at 8:34 PM H.J. Lu <hjl.tools@gmail.com> wrote: > >> > > > >> > > On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote: > >> > > > On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches > >> > > > <gcc-patches@gcc.gnu.org> wrote: > >> > > > > > >> > > > > When recording store for RTL dead store elimination, check if the source > >> > > > > register is set only once to a constant. If yes, record the constant > >> > > > > as the store source. It eliminates unrolled zero stores after memset 0 > >> > > > > in a loop where a vector register is used as the zero store source. > >> > > > > > >> > > > > gcc/ > >> > > > > > >> > > > > PR rtl-optimization/105638 > >> > > > > * dse.cc (record_store): Use the constant source if the source > >> > > > > register is set only once. > >> > > > > > >> > > > > gcc/testsuite/ > >> > > > > > >> > > > > PR rtl-optimization/105638 > >> > > > > * g++.target/i386/pr105638.C: New test. > >> > > > > --- > >> > > > > gcc/dse.cc | 19 ++++++++++ > >> > > > > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ > >> > > > > 2 files changed, 63 insertions(+) > >> > > > > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C > >> > > > > > >> > > > > diff --git a/gcc/dse.cc b/gcc/dse.cc > >> > > > > index 30c11cee034..0433dd3d846 100644 > >> > > > > --- a/gcc/dse.cc > >> > > > > +++ b/gcc/dse.cc > >> > > > > @@ -1508,6 +1508,25 @@ record_store (rtx body, bb_info_t bb_info) > >> > > > > > >> > > > > if (tem && CONSTANT_P (tem)) > >> > > > > const_rhs = tem; > >> > > > > + else > >> > > > > + { > >> > > > > + /* If RHS is set only once to a constant, set CONST_RHS > >> > > > > + to the constant. */ > >> > > > > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > >> > > > > + if (def != nullptr > >> > > > > + && !DF_REF_IS_ARTIFICIAL (def) > >> > > > > + && !DF_REF_NEXT_REG (def)) > >> > > > > + { > >> > > > > + rtx_insn *def_insn = DF_REF_INSN (def); > >> > > > > + rtx def_body = PATTERN (def_insn); > >> > > > > + if (GET_CODE (def_body) == SET) > >> > > > > + { > >> > > > > + rtx def_src = SET_SRC (def_body); > >> > > > > + if (CONSTANT_P (def_src)) > >> > > > > + const_rhs = def_src; > >> > > > > >> > > > doesn't DSE have its own tracking of stored values? Shouldn't we > >> > > > >> > > It tracks stored values only within the basic block. When RTL loop > >> > > invariant motion hoists a constant initialization out of the loop into > >> > > a separate basic block, the constant store value becomes unknown > >> > > within the original basic block. > >> > > > >> > > > improve _that_ if it is not enough? I also wonder if you need to > >> > > > >> > > My patch extends DSE stored value tracking to include the constant which > >> > > is set only once in another basic block. > >> > > > >> > > > verify the SET isn't partial? > >> > > > > >> > > > >> > > Here is the v2 patch to check that the constant is set by a non-partial > >> > > unconditional load. > >> > > > >> > > OK for master? > >> > > > >> > > Thanks. > >> > > > >> > > H.J. > >> > > --- > >> > > RTL DSE tracks redundant constant stores within a basic block. When RTL > >> > > loop invariant motion hoists a constant initialization out of the loop > >> > > into a separate basic block, the constant store value becomes unknown > >> > > within the original basic block. When recording store for RTL DSE, check > >> > > if the source register is set only once to a constant by a non-partial > >> > > unconditional load. If yes, record the constant as the constant store > >> > > source. It eliminates unrolled zero stores after memset 0 in a loop > >> > > where a vector register is used as the zero store source. > >> > > > >> > > gcc/ > >> > > > >> > > PR rtl-optimization/105638 > >> > > * dse.cc (record_store): Use the constant source if the source > >> > > register is set only once. > >> > > > >> > > gcc/testsuite/ > >> > > > >> > > PR rtl-optimization/105638 > >> > > * g++.target/i386/pr105638.C: New test. > >> > > --- > >> > > gcc/dse.cc | 22 ++++++++++++ > >> > > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ > >> > > 2 files changed, 66 insertions(+) > >> > > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C > >> > > > >> > > diff --git a/gcc/dse.cc b/gcc/dse.cc > >> > > index 30c11cee034..af8e88dac32 100644 > >> > > --- a/gcc/dse.cc > >> > > +++ b/gcc/dse.cc > >> > > @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info) > >> > > > >> > > if (tem && CONSTANT_P (tem)) > >> > > const_rhs = tem; > >> > > + else > >> > > + { > >> > > + /* If RHS is set only once to a constant, set CONST_RHS > >> > > + to the constant. */ > >> > > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > >> > > + if (def != nullptr > >> > > + && !DF_REF_IS_ARTIFICIAL (def) > >> > > + && !(DF_REF_FLAGS (def) > >> > > + & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)) > >> > > + && !DF_REF_NEXT_REG (def)) > >> > > >> > Can we really use df-chain here and rely that a single definition is > >> > the only one? If rhs is a hardreg does df-chain include implicit > >> > sets of function argument registers for example? Don't we need RD > >> > here or at least verify the single df-chain definition dominates the > >> > use here (if we can rely on the reg otherwise be uninitialized and thus > >> > the use invoking undefined behavior we could use the constant even > >> > in non-dominating context, but WRT the function args & hardregs I'm not > >> > sure we can tell). > >> > >> Does the hard register have a regular DEF? Won't the function args & > >> hardregs have DF_REF_ARTIFICIAL? > > > > I don't know - do they? > > They won't be DF_REF_ARTIFICIAL. I think the code should work ok for > hard registers though. All sets of argument registers, and all clobbers > of registers by function calls, are recorded as df defs. (This is one In these cases, will SET_SRC be constant? > of the things that bloats the representation compared to rtx insns, > where most call clobbers are implicit. But it's also one of the things > that makes df easier to use & more reliable than ad hoc liveness tracking.) > > Thanks, > Richard
"H.J. Lu" <hjl.tools@gmail.com> writes: > On Wed, May 25, 2022 at 12:30 AM Richard Sandiford > <richard.sandiford@arm.com> wrote: >> >> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes: >> > On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote: >> >> On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches >> >> <gcc-patches@gcc.gnu.org> wrote: >> >> > >> >> > When recording store for RTL dead store elimination, check if the source >> >> > register is set only once to a constant. If yes, record the constant >> >> > as the store source. It eliminates unrolled zero stores after memset 0 >> >> > in a loop where a vector register is used as the zero store source. >> >> > >> >> > gcc/ >> >> > >> >> > PR rtl-optimization/105638 >> >> > * dse.cc (record_store): Use the constant source if the source >> >> > register is set only once. >> >> > >> >> > gcc/testsuite/ >> >> > >> >> > PR rtl-optimization/105638 >> >> > * g++.target/i386/pr105638.C: New test. >> >> > --- >> >> > gcc/dse.cc | 19 ++++++++++ >> >> > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ >> >> > 2 files changed, 63 insertions(+) >> >> > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C >> >> > >> >> > diff --git a/gcc/dse.cc b/gcc/dse.cc >> >> > index 30c11cee034..0433dd3d846 100644 >> >> > --- a/gcc/dse.cc >> >> > +++ b/gcc/dse.cc >> >> > @@ -1508,6 +1508,25 @@ record_store (rtx body, bb_info_t bb_info) >> >> > >> >> > if (tem && CONSTANT_P (tem)) >> >> > const_rhs = tem; >> >> > + else >> >> > + { >> >> > + /* If RHS is set only once to a constant, set CONST_RHS >> >> > + to the constant. */ >> >> > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); >> >> > + if (def != nullptr >> >> > + && !DF_REF_IS_ARTIFICIAL (def) >> >> > + && !DF_REF_NEXT_REG (def)) >> >> > + { >> >> > + rtx_insn *def_insn = DF_REF_INSN (def); >> >> > + rtx def_body = PATTERN (def_insn); >> >> > + if (GET_CODE (def_body) == SET) >> >> > + { >> >> > + rtx def_src = SET_SRC (def_body); >> >> > + if (CONSTANT_P (def_src)) >> >> > + const_rhs = def_src; >> >> >> >> doesn't DSE have its own tracking of stored values? Shouldn't we >> > >> > It tracks stored values only within the basic block. When RTL loop >> > invariant motion hoists a constant initialization out of the loop into >> > a separate basic block, the constant store value becomes unknown >> > within the original basic block. >> > >> >> improve _that_ if it is not enough? I also wonder if you need to >> > >> > My patch extends DSE stored value tracking to include the constant which >> > is set only once in another basic block. >> > >> >> verify the SET isn't partial? >> >> >> > >> > Here is the v2 patch to check that the constant is set by a non-partial >> > unconditional load. >> > >> > OK for master? >> > >> > Thanks. >> > >> > H.J. >> > --- >> > RTL DSE tracks redundant constant stores within a basic block. When RTL >> > loop invariant motion hoists a constant initialization out of the loop >> > into a separate basic block, the constant store value becomes unknown >> > within the original basic block. When recording store for RTL DSE, check >> > if the source register is set only once to a constant by a non-partial >> > unconditional load. If yes, record the constant as the constant store >> > source. It eliminates unrolled zero stores after memset 0 in a loop >> > where a vector register is used as the zero store source. >> > >> > gcc/ >> > >> > PR rtl-optimization/105638 >> > * dse.cc (record_store): Use the constant source if the source >> > register is set only once. >> > >> > gcc/testsuite/ >> > >> > PR rtl-optimization/105638 >> > * g++.target/i386/pr105638.C: New test. >> > --- >> > gcc/dse.cc | 22 ++++++++++++ >> > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ >> > 2 files changed, 66 insertions(+) >> > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C >> > >> > diff --git a/gcc/dse.cc b/gcc/dse.cc >> > index 30c11cee034..af8e88dac32 100644 >> > --- a/gcc/dse.cc >> > +++ b/gcc/dse.cc >> > @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info) >> > >> > if (tem && CONSTANT_P (tem)) >> > const_rhs = tem; >> > + else >> > + { >> > + /* If RHS is set only once to a constant, set CONST_RHS >> > + to the constant. */ >> > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); >> > + if (def != nullptr >> > + && !DF_REF_IS_ARTIFICIAL (def) >> > + && !(DF_REF_FLAGS (def) >> > + & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)) >> > + && !DF_REF_NEXT_REG (def)) >> >> Can we introduce a helper for this? There are already similar tests >> in ira and loop-iv, and it seems a bit too complex to have to open-code >> each time. > > I can use find_single_def_src in loop-iv.cc: > > /* If REGNO has a single definition, return its known value, otherwise return > null. */ > > rtx > find_single_def_src (unsigned int regno) Yeah, reusing that sounds good. Perhaps we should move it into df-core.cc, alongside the df_reg_used group of functions. I think the mode check in your original patch should be kept though, so how about we change the parameter to an rtx reg and use rtx_equal in: rtx set = single_set (DF_REF_INSN (adef)); if (set == NULL || !REG_P (SET_DEST (set)) || REGNO (SET_DEST (set)) != regno) return NULL_RTX; rather than the existing !REG_P and REGNO checks. We should also add: > > and do > > /* If RHS is set only once to a constant, set CONST_RHS > to the constant. */ > rtx def_src = find_single_def_src (REGNO (rhs)); > if (def_src != nullptr && CONSTANT_P (def_src)) > { > df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > if (!(DF_REF_FLAGS (def) > & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))) …this check to the function, since it's needed for correctness even in the loop-iv.cc use. Thanks, Richard > { > rtx_insn *def_insn = DF_REF_INSN (def); > rtx def_body = single_set (def_insn); > if (rhs == SET_DEST (def_body)) > const_rhs = def_src; > } > } > } > > Will it work? > > Thanks. > >> Thanks, >> Richard >> >> > + { >> > + rtx_insn *def_insn = DF_REF_INSN (def); >> > + rtx def_body = PATTERN (def_insn); >> > + if (GET_CODE (def_body) == SET) >> > + { >> > + rtx def_src = SET_SRC (def_body); >> > + if (CONSTANT_P (def_src) >> > + && GET_MODE (def_src) == GET_MODE (rhs)) >> > + const_rhs = def_src; >> > + } >> > + } >> > + } >> > } >> > } >> > >> > diff --git a/gcc/testsuite/g++.target/i386/pr105638.C b/gcc/testsuite/g++.target/i386/pr105638.C >> > new file mode 100644 >> > index 00000000000..ff40a459de1 >> > --- /dev/null >> > +++ b/gcc/testsuite/g++.target/i386/pr105638.C >> > @@ -0,0 +1,44 @@ >> > +/* { dg-do compile { target { ! ia32 } } } */ >> > +/* { dg-options "-std=gnu++20 -O2 -march=skylake" } */ >> > +/* { dg-final { scan-assembler-not "vpxor" } } */ >> > + >> > +#include <stdint.h> >> > +#include <vector> >> > +#include <tr1/array> >> > + >> > +class FastBoard { >> > +public: >> > + typedef std::pair<int, int> movescore_t; >> > + typedef std::tr1::array<movescore_t, 24> scoredlist_t; >> > + >> > +protected: >> > + std::vector<int> m_critical; >> > + >> > + int m_boardsize; >> > +}; >> > + >> > +class FastState { >> > +public: >> > + FastBoard board; >> > + >> > + int movenum; >> > +protected: >> > + FastBoard::scoredlist_t scoredmoves; >> > +}; >> > + >> > +class KoState : public FastState { >> > +private: >> > + std::vector<uint64_t> ko_hash_history; >> > + std::vector<uint64_t> hash_history; >> > +}; >> > + >> > +class GameState : public KoState { >> > +public: >> > + void foo (); >> > +private: >> > + std::vector<KoState> game_history; >> > +}; >> > + >> > +void GameState::foo() { >> > + game_history.resize(movenum); >> > +}
diff --git a/gcc/dse.cc b/gcc/dse.cc index 30c11cee034..af8e88dac32 100644 --- a/gcc/dse.cc +++ b/gcc/dse.cc @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info) if (tem && CONSTANT_P (tem)) const_rhs = tem; + else + { + /* If RHS is set only once to a constant, set CONST_RHS + to the constant. */ + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); + if (def != nullptr + && !DF_REF_IS_ARTIFICIAL (def) + && !(DF_REF_FLAGS (def) + & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)) + && !DF_REF_NEXT_REG (def)) + { + rtx_insn *def_insn = DF_REF_INSN (def); + rtx def_body = PATTERN (def_insn); + if (GET_CODE (def_body) == SET) + { + rtx def_src = SET_SRC (def_body); + if (CONSTANT_P (def_src) + && GET_MODE (def_src) == GET_MODE (rhs)) + const_rhs = def_src; + } + } + } } } diff --git a/gcc/testsuite/g++.target/i386/pr105638.C b/gcc/testsuite/g++.target/i386/pr105638.C new file mode 100644 index 00000000000..ff40a459de1 --- /dev/null +++ b/gcc/testsuite/g++.target/i386/pr105638.C @@ -0,0 +1,44 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-std=gnu++20 -O2 -march=skylake" } */ +/* { dg-final { scan-assembler-not "vpxor" } } */ + +#include <stdint.h> +#include <vector> +#include <tr1/array> + +class FastBoard { +public: + typedef std::pair<int, int> movescore_t; + typedef std::tr1::array<movescore_t, 24> scoredlist_t; + +protected: + std::vector<int> m_critical; + + int m_boardsize; +}; + +class FastState { +public: + FastBoard board; + + int movenum; +protected: + FastBoard::scoredlist_t scoredmoves; +}; + +class KoState : public FastState { +private: + std::vector<uint64_t> ko_hash_history; + std::vector<uint64_t> hash_history; +}; + +class GameState : public KoState { +public: + void foo (); +private: + std::vector<KoState> game_history; +}; + +void GameState::foo() { + game_history.resize(movenum); +}