
[AArch64] Fix shrinkwrapping interactions with atomics (PR92692)

Message ID AM5PR0801MB2035E87ABB99D9B2721877E483360@AM5PR0801MB2035.eurprd08.prod.outlook.com
State New

Commit Message

Wilco Dijkstra Jan. 16, 2020, 12:50 p.m. UTC
The separate shrinkwrapping pass may insert stores in the middle
of atomics loops which can cause issues on some implementations.
Avoid this by delaying splitting of atomic patterns until after
prolog/epilog generation.

Bootstrap completed, no test regressions on AArch64. 

Andrew, can you verify this fixes the failure you were getting?

ChangeLog:
2020-01-16  Wilco Dijkstra  <wdijkstr@arm.com>

	PR target/92692
	* config/aarch64/aarch64.c (aarch64_split_compare_and_swap):
	Add assert to ensure prolog has been emitted.
	(aarch64_split_atomic_op): Likewise.
	* config/aarch64/atomics.md (aarch64_compare_and_swap<mode>):
	Use epilogue_completed rather than reload_completed.
	(aarch64_atomic_exchange<mode>): Likewise.
	(aarch64_atomic_<atomic_optab><mode>): Likewise.
	(atomic_nand<mode>): Likewise.
	(aarch64_atomic_fetch_<atomic_optab><mode>): Likewise.
	(atomic_fetch_nand<mode>): Likewise.
	(aarch64_atomic_<atomic_optab>_fetch<mode>): Likewise.
	(atomic_nand_fetch<mode>): Likewise.
---

Comments

Richard Sandiford Jan. 16, 2020, 1:14 p.m. UTC | #1
Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> The separate shrinkwrapping pass may insert stores in the middle
> of atomics loops which can cause issues on some implementations.
> Avoid this by delaying splitting of atomic patterns until after
> prolog/epilog generation.
>
> Bootstrap completed, no test regressions on AArch64.
>
> Andrew, can you verify this fixes the failure you were getting?
>
> ChangeLog:
> 2020-01-16  Wilco Dijkstra  <wdijkstr@arm.com>
>
> PR target/92692
> * config/aarch64/aarch64.c (aarch64_split_compare_and_swap)
> Add assert to ensure prolog has been emitted.
> (aarch64_split_atomic_op): Likewise.
> * config/aarch64/atomics.md (aarch64_compare_and_swap<mode>)
> Use epilogue_completed rather than reload_completed.
> (aarch64_atomic_exchange<mode>): Likewise.
> (aarch64_atomic_<atomic_optab><mode>): Likewise.
> (atomic_nand<mode>): Likewise.
> (aarch64_atomic_fetch_<atomic_optab><mode>): Likewise.
> (atomic_fetch_nand<mode>): Likewise.
> (aarch64_atomic_<atomic_optab>_fetch<mode>): Likewise.
> (atomic_nand_fetch<mode>): Likewise.

OK if Andrew confirms it works, thanks.

Richard

Andrew Pinski Jan. 17, 2020, 1:51 a.m. UTC | #2
On Thu, Jan 16, 2020 at 5:14 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
> OK if Andrew confirms it works, thanks.

Yes this fixes the issue for me.

Thanks,
Andrew

Andrew Pinski Jan. 17, 2020, 2:03 a.m. UTC | #3
On Thu, Jan 16, 2020 at 5:51 PM Andrew Pinski <pinskia@gmail.com> wrote:
> Yes this fixes the issue for me.
Here is the new assembly showing it worked:

    d390:       f9000bf3        str     x19, [sp, #16]
    d394:       885ffdc8        ldaxr   w8, [x14]
    d398:       6b01011f        cmp     w8, w1
    d39c:       54000061        b.ne    d3a8 <pthread_rwlock_clockwrlock+0x168>  // b.any
    d3a0:       88137dc5        stxr    w19, w5, [x14]

Thanks,
Andrew Pinski


Segher Boessenkool Jan. 27, 2020, 4:08 p.m. UTC | #4
Hi!

On Thu, Jan 16, 2020 at 12:50:14PM +0000, Wilco Dijkstra wrote:
> The separate shrinkwrapping pass may insert stores in the middle
> of atomics loops which can cause issues on some implementations.
> Avoid this by delaying splitting of atomic patterns until after
> prolog/epilog generation.

Note that this isn't specific to sws at all: there isn't anything
stopping later passes from doing this either.  Is there anything that
protects us from sched2 doing similar here, for example?


Segher
Wilco Dijkstra Jan. 27, 2020, 4:28 p.m. UTC | #5
Hi Segher,

> On Thu, Jan 16, 2020 at 12:50:14PM +0000, Wilco Dijkstra wrote:
>> The separate shrinkwrapping pass may insert stores in the middle
>> of atomics loops which can cause issues on some implementations.
>> Avoid this by delaying splitting of atomic patterns until after
>> prolog/epilog generation.
>
> Note that this isn't specific to sws at all: there isn't anything
> stopping later passes from doing this either.  Is there anything that
> protects us from sched2 doing similar here, for example?

The expansions create extra basic blocks and insert various barriers
that would stop any reasonable scheduler from doing it. And the
current scheduler is basic block based.

Wilco
Segher Boessenkool Jan. 28, 2020, 12:13 p.m. UTC | #6
Hi!

On Mon, Jan 27, 2020 at 04:28:29PM +0000, Wilco Dijkstra wrote:
> > On Thu, Jan 16, 2020 at 12:50:14PM +0000, Wilco Dijkstra wrote:
> >> The separate shrinkwrapping pass may insert stores in the middle
> >> of atomics loops which can cause issues on some implementations.
> >> Avoid this by delaying splitting of atomic patterns until after
> >> prolog/epilog generation.
> >
> > Note that this isn't specific to sws at all: there isn't anything
> > stopping later passes from doing this either.  Is there anything that
> > protects us from sched2 doing similar here, for example?
> 
> The expansions create extra basic blocks and insert various barriers
> that would stop any reasonable scheduler from doing it. And the
> current scheduler is basic block based.

Hrm, and probably no other passes after *logue can create new store
insns (ignoring peepholes and splitters, that would be almost *asking*
for trouble ;-) ).  I see.  Thanks for explaining.


Segher

Patch

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ac89cc1f9c938455d33d8850d9ebfc0473cb73dc..cd9d813f2ac4990971f6435fdb28b0f94ae10309 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -18375,6 +18375,9 @@  aarch64_emit_post_barrier (enum memmodel model)
 void
 aarch64_split_compare_and_swap (rtx operands[])
 {
+  /* Split after prolog/epilog to avoid interactions with shrinkwrapping.  */
+  gcc_assert (epilogue_completed);
+
   rtx rval, mem, oldval, newval, scratch, x, model_rtx;
   machine_mode mode;
   bool is_weak;
@@ -18469,6 +18472,9 @@  void
 aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
 			 rtx value, rtx model_rtx, rtx cond)
 {
+  /* Split after prolog/epilog to avoid interactions with shrinkwrapping.  */
+  gcc_assert (epilogue_completed);
+
   machine_mode mode = GET_MODE (mem);
   machine_mode wmode = (mode == DImode ? DImode : SImode);
   const enum memmodel model = memmodel_from_int (INTVAL (model_rtx));
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index c2bcabd0c3c2627b7222dcbc1af9c2e6b7ce6a76..996947799b5ef8445e9786b94e1ce62fd16e5b5c 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -56,7 +56,7 @@  (define_insn_and_split "@aarch64_compare_and_swap<mode>"
    (clobber (match_scratch:SI 7 "=&r"))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
     aarch64_split_compare_and_swap (operands);
@@ -80,7 +80,7 @@  (define_insn_and_split "@aarch64_compare_and_swap<mode>"
    (clobber (match_scratch:SI 7 "=&r"))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
     aarch64_split_compare_and_swap (operands);
@@ -104,7 +104,7 @@  (define_insn_and_split "@aarch64_compare_and_swap<mode>"
    (clobber (match_scratch:SI 7 "=&r"))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
     aarch64_split_compare_and_swap (operands);
@@ -223,7 +223,7 @@  (define_insn_and_split "aarch64_atomic_exchange<mode>"
    (clobber (match_scratch:SI 4 "=&r"))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
     aarch64_split_atomic_op (SET, operands[0], NULL, operands[1],
@@ -344,7 +344,7 @@  (define_insn_and_split "aarch64_atomic_<atomic_optab><mode>"
   (clobber (match_scratch:SI 4 "=&r"))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
     aarch64_split_atomic_op (<CODE>, NULL, operands[3], operands[0],
@@ -400,7 +400,7 @@  (define_insn_and_split "atomic_nand<mode>"
    (clobber (match_scratch:SI 4 "=&r"))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
      aarch64_split_atomic_op (NOT, NULL, operands[3], operands[0],
@@ -504,7 +504,7 @@  (define_insn_and_split "aarch64_atomic_fetch_<atomic_optab><mode>"
    (clobber (match_scratch:SI 5 "=&r"))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
     aarch64_split_atomic_op (<CODE>, operands[0], operands[4], operands[1],
@@ -551,7 +551,7 @@  (define_insn_and_split "atomic_fetch_nand<mode>"
    (clobber (match_scratch:SI 5 "=&r"))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
     aarch64_split_atomic_op (NOT, operands[0], operands[4], operands[1],
@@ -604,7 +604,7 @@  (define_insn_and_split "aarch64_atomic_<atomic_optab>_fetch<mode>"
    (clobber (match_scratch:SI 4 "=&r"))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
     aarch64_split_atomic_op (<CODE>, NULL, operands[0], operands[1],
@@ -628,7 +628,7 @@  (define_insn_and_split "atomic_nand_fetch<mode>"
    (clobber (match_scratch:SI 4 "=&r"))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
     aarch64_split_atomic_op (NOT, NULL, operands[0], operands[1],