diff mbox series

[v1] Widening-Mul: Fix one ICE for SAT_SUB matching operand promotion

Message ID 20240923145602.3872286-1-pan2.li@intel.com
State New
Headers show
Series [v1] Widening-Mul: Fix one ICE for SAT_SUB matching operand promotion | expand

Commit Message

Li, Pan2 Sept. 23, 2024, 2:56 p.m. UTC
From: Pan Li <pan2.li@intel.com>

This patch would like to fix the following ICE for -O2 -m32 of x86_64.

during RTL pass: expand
JackMidiAsyncWaitQueue.cpp.cpp: In function 'void DequeueEvent(unsigned
int)':
JackMidiAsyncWaitQueue.cpp.cpp:3:6: internal compiler error: in
expand_fn_using_insn, at internal-fn.cc:263
    3 | void DequeueEvent(unsigned frame) {
      |      ^~~~~~~~~~~~
0x27b580d diagnostic_context::diagnostic_impl(rich_location*,
diagnostic_metadata const*, diagnostic_option_id, char const*,
__va_list_tag (*) [1], diagnostic_t)
        ???:0
0x27c4a3f internal_error(char const*, ...)
        ???:0
0x27b3994 fancy_abort(char const*, int, char const*)
        ???:0
0xf25ae5 expand_fn_using_insn(gcall*, insn_code, unsigned int, unsigned int)
        ???:0
0xf2a124 expand_direct_optab_fn(internal_fn, gcall*, optab_tag, unsigned int)
        ???:0
0xf2c87c expand_SAT_SUB(internal_fn, gcall*)
        ???:0

We allowed the operand convert when matching SAT_SUB in match.pd, to support
the zip benchmark SAT_SUB pattern.  Aka,

(convert? (minus (convert1? @0) (convert1? @1))) for below sample code.

void test (uint16_t *x, unsigned b, unsigned n)
{
  unsigned a = 0;
  register uint16_t *p = x;

  do {
    a = *--p;
    *p = (uint16_t)(a >= b ? a - b : 0); // Truncate after .SAT_SUB
  } while (--n);
}

The pattern match for SAT_SUB itself may also act on below scalar sample
code too.

unsigned long long GetTimeFromFrames(int);
unsigned long long GetMicroSeconds();

void DequeueEvent(unsigned frame) {
  long long frame_time = GetTimeFromFrames(frame);
  unsigned long long current_time = GetMicroSeconds();
  DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time);
}

Aka:

uint32_t a = (uint32_t)SAT_SUB(uint64_t, uint64_t);

Then there will be a problem when ia32 or -m32 is given when compiling.
Because we only check the lhs (aka uint32_t) type is supported by ifn
and missed the operand (aka uint64_t).  Mostly DImode is disabled for
32 bits target like ia32 or rv32gcv, and then trigger ICE when expanding.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.
* The x86 bootstrap test.
* The x86 fully regression test.

	PR target/116814

gcc/ChangeLog:

	* tree-ssa-math-opts.cc (build_saturation_binary_arith_call): Add
	ifn is_supported check for operand TREE type.

gcc/testsuite/ChangeLog:

	* g++.dg/torture/pr116814-1.C: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/testsuite/g++.dg/torture/pr116814-1.C | 12 ++++++++++++
 gcc/tree-ssa-math-opts.cc                 | 23 +++++++++++++++--------
 2 files changed, 27 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/pr116814-1.C

Comments

Uros Bizjak Sept. 24, 2024, 6:17 a.m. UTC | #1
On Mon, Sep 23, 2024 at 4:58 PM <pan2.li@intel.com> wrote:
>
> From: Pan Li <pan2.li@intel.com>
>
> This patch would like to fix the following ICE for -O2 -m32 of x86_64.
>
> during RTL pass: expand
> JackMidiAsyncWaitQueue.cpp.cpp: In function 'void DequeueEvent(unsigned
> int)':
> JackMidiAsyncWaitQueue.cpp.cpp:3:6: internal compiler error: in
> expand_fn_using_insn, at internal-fn.cc:263
>     3 | void DequeueEvent(unsigned frame) {
>       |      ^~~~~~~~~~~~
> 0x27b580d diagnostic_context::diagnostic_impl(rich_location*,
> diagnostic_metadata const*, diagnostic_option_id, char const*,
> __va_list_tag (*) [1], diagnostic_t)
>         ???:0
> 0x27c4a3f internal_error(char const*, ...)
>         ???:0
> 0x27b3994 fancy_abort(char const*, int, char const*)
>         ???:0
> 0xf25ae5 expand_fn_using_insn(gcall*, insn_code, unsigned int, unsigned int)
>         ???:0
> 0xf2a124 expand_direct_optab_fn(internal_fn, gcall*, optab_tag, unsigned int)
>         ???:0
> 0xf2c87c expand_SAT_SUB(internal_fn, gcall*)
>         ???:0
>
> We allowed the operand convert when matching SAT_SUB in match.pd, to support
> the zip benchmark SAT_SUB pattern.  Aka,
>
> (convert? (minus (convert1? @0) (convert1? @1))) for below sample code.
>
> void test (uint16_t *x, unsigned b, unsigned n)
> {
>   unsigned a = 0;
>   register uint16_t *p = x;
>
>   do {
>     a = *--p;
>     *p = (uint16_t)(a >= b ? a - b : 0); // Truncate after .SAT_SUB
>   } while (--n);
> }
>
> The pattern match for SAT_SUB itself may also act on below scalar sample
> code too.
>
> unsigned long long GetTimeFromFrames(int);
> unsigned long long GetMicroSeconds();
>
> void DequeueEvent(unsigned frame) {
>   long long frame_time = GetTimeFromFrames(frame);
>   unsigned long long current_time = GetMicroSeconds();
>   DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time);
> }
>
> Aka:
>
> uint32_t a = (uint32_t)SAT_SUB(uint64_t, uint64_t);
>
> Then there will be a problem when ia32 or -m32 is given when compiling.
> Because we only check the lhs (aka uint32_t) type is supported by ifn
> and missed the operand (aka uint64_t).  Mostly DImode is disabled for
> 32 bits target like ia32 or rv32gcv, and then trigger ICE when expanding.
>
> The below test suites are passed for this patch.
> * The rv64gcv fully regression test.
> * The x86 bootstrap test.
> * The x86 fully regression test.
>
>         PR target/116814

This is not "target", but "middle-end" component. Even though the bug
is exposed on x86_64 target, the fix is in the middle-end code, not in
the target code.

> gcc/ChangeLog:
>
>         * tree-ssa-math-opts.cc (build_saturation_binary_arith_call): Add
>         ifn is_supported check for operand TREE type.
>
> gcc/testsuite/ChangeLog:
>
>         * g++.dg/torture/pr116814-1.C: New test.
>
> Signed-off-by: Pan Li <pan2.li@intel.com>
> ---
>  gcc/testsuite/g++.dg/torture/pr116814-1.C | 12 ++++++++++++
>  gcc/tree-ssa-math-opts.cc                 | 23 +++++++++++++++--------
>  2 files changed, 27 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/torture/pr116814-1.C
>
> diff --git a/gcc/testsuite/g++.dg/torture/pr116814-1.C b/gcc/testsuite/g++.dg/torture/pr116814-1.C
> new file mode 100644
> index 00000000000..8db5b020cfd
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/torture/pr116814-1.C
> @@ -0,0 +1,12 @@
> +/* { dg-do compile { target { i?86-*-* x86_64-*-* } } } */
> +/* { dg-options "-O2 -m32" } */

Please remove -m32 and use "{ dg-do compile { target ia32 } }" instead.

Uros,

> +
> +unsigned long long GetTimeFromFrames(int);
> +unsigned long long GetMicroSeconds();
> +
> +void DequeueEvent(unsigned frame) {
> +  long long frame_time = GetTimeFromFrames(frame);
> +  unsigned long long current_time = GetMicroSeconds();
> +
> +  DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time);
> +}
> diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> index d61668aacfc..361761cedef 100644
> --- a/gcc/tree-ssa-math-opts.cc
> +++ b/gcc/tree-ssa-math-opts.cc
> @@ -4042,15 +4042,22 @@ build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, gphi *phi,
>                                     internal_fn fn, tree lhs, tree op_0,
>                                     tree op_1)
>  {
> -  if (direct_internal_fn_supported_p (fn, TREE_TYPE (lhs), OPTIMIZE_FOR_BOTH))
> -    {
> -      gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
> -      gimple_call_set_lhs (call, lhs);
> -      gsi_insert_before (gsi, call, GSI_SAME_STMT);
> +  tree lhs_type = TREE_TYPE (lhs);
> +  tree op_type = TREE_TYPE (op_0);
>
> -      gimple_stmt_iterator psi = gsi_for_stmt (phi);
> -      remove_phi_node (&psi, /* release_lhs_p */ false);
> -    }
> +  if (!direct_internal_fn_supported_p (fn, lhs_type, OPTIMIZE_FOR_BOTH))
> +    return;
> +
> +  if (lhs_type != op_type
> +      && !direct_internal_fn_supported_p (fn, op_type, OPTIMIZE_FOR_BOTH))
> +    return;
> +
> +  gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
> +  gimple_call_set_lhs (call, lhs);
> +  gsi_insert_before (gsi, call, GSI_SAME_STMT);
> +
> +  gimple_stmt_iterator psi = gsi_for_stmt (phi);
> +  remove_phi_node (&psi, /* release_lhs_p */ false);
>  }
>
>  /*
> --
> 2.43.0
>
Li, Pan2 Sept. 24, 2024, 6:24 a.m. UTC | #2
Thanks Uros for comments.

> This is not "target", but "middle-end" component. Even though the bug
> is exposed on x86_64 target, the fix is in the middle-end code, not in
> the target code.

Sure, will rename to middle-end.

> Please remove -m32 and use "{ dg-do compile { target ia32 } }" instead.

Is there any suggestion to run the "ia32" test when configure gcc build?
I first leverage ia32 but complain UNSUPPORTED for this case.

Pan

-----Original Message-----
From: Uros Bizjak <ubizjak@gmail.com> 
Sent: Tuesday, September 24, 2024 2:17 PM
To: Li, Pan2 <pan2.li@intel.com>
Cc: gcc-patches@gcc.gnu.org; richard.guenther@gmail.com; Tamar.Christina@arm.com; juzhe.zhong@rivai.ai; kito.cheng@gmail.com; jeffreyalaw@gmail.com; rdapp.gcc@gmail.com
Subject: Re: [PATCH v1] Widening-Mul: Fix one ICE for SAT_SUB matching operand promotion

On Mon, Sep 23, 2024 at 4:58 PM <pan2.li@intel.com> wrote:
>
> From: Pan Li <pan2.li@intel.com>
>
> This patch would like to fix the following ICE for -O2 -m32 of x86_64.
>
> during RTL pass: expand
> JackMidiAsyncWaitQueue.cpp.cpp: In function 'void DequeueEvent(unsigned
> int)':
> JackMidiAsyncWaitQueue.cpp.cpp:3:6: internal compiler error: in
> expand_fn_using_insn, at internal-fn.cc:263
>     3 | void DequeueEvent(unsigned frame) {
>       |      ^~~~~~~~~~~~
> 0x27b580d diagnostic_context::diagnostic_impl(rich_location*,
> diagnostic_metadata const*, diagnostic_option_id, char const*,
> __va_list_tag (*) [1], diagnostic_t)
>         ???:0
> 0x27c4a3f internal_error(char const*, ...)
>         ???:0
> 0x27b3994 fancy_abort(char const*, int, char const*)
>         ???:0
> 0xf25ae5 expand_fn_using_insn(gcall*, insn_code, unsigned int, unsigned int)
>         ???:0
> 0xf2a124 expand_direct_optab_fn(internal_fn, gcall*, optab_tag, unsigned int)
>         ???:0
> 0xf2c87c expand_SAT_SUB(internal_fn, gcall*)
>         ???:0
>
> We allowed the operand convert when matching SAT_SUB in match.pd, to support
> the zip benchmark SAT_SUB pattern.  Aka,
>
> (convert? (minus (convert1? @0) (convert1? @1))) for below sample code.
>
> void test (uint16_t *x, unsigned b, unsigned n)
> {
>   unsigned a = 0;
>   register uint16_t *p = x;
>
>   do {
>     a = *--p;
>     *p = (uint16_t)(a >= b ? a - b : 0); // Truncate after .SAT_SUB
>   } while (--n);
> }
>
> The pattern match for SAT_SUB itself may also act on below scalar sample
> code too.
>
> unsigned long long GetTimeFromFrames(int);
> unsigned long long GetMicroSeconds();
>
> void DequeueEvent(unsigned frame) {
>   long long frame_time = GetTimeFromFrames(frame);
>   unsigned long long current_time = GetMicroSeconds();
>   DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time);
> }
>
> Aka:
>
> uint32_t a = (uint32_t)SAT_SUB(uint64_t, uint64_t);
>
> Then there will be a problem when ia32 or -m32 is given when compiling.
> Because we only check the lhs (aka uint32_t) type is supported by ifn
> and missed the operand (aka uint64_t).  Mostly DImode is disabled for
> 32 bits target like ia32 or rv32gcv, and then trigger ICE when expanding.
>
> The below test suites are passed for this patch.
> * The rv64gcv fully regression test.
> * The x86 bootstrap test.
> * The x86 fully regression test.
>
>         PR target/116814

This is not "target", but "middle-end" component. Even though the bug
is exposed on x86_64 target, the fix is in the middle-end code, not in
the target code.

> gcc/ChangeLog:
>
>         * tree-ssa-math-opts.cc (build_saturation_binary_arith_call): Add
>         ifn is_supported check for operand TREE type.
>
> gcc/testsuite/ChangeLog:
>
>         * g++.dg/torture/pr116814-1.C: New test.
>
> Signed-off-by: Pan Li <pan2.li@intel.com>
> ---
>  gcc/testsuite/g++.dg/torture/pr116814-1.C | 12 ++++++++++++
>  gcc/tree-ssa-math-opts.cc                 | 23 +++++++++++++++--------
>  2 files changed, 27 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/torture/pr116814-1.C
>
> diff --git a/gcc/testsuite/g++.dg/torture/pr116814-1.C b/gcc/testsuite/g++.dg/torture/pr116814-1.C
> new file mode 100644
> index 00000000000..8db5b020cfd
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/torture/pr116814-1.C
> @@ -0,0 +1,12 @@
> +/* { dg-do compile { target { i?86-*-* x86_64-*-* } } } */
> +/* { dg-options "-O2 -m32" } */

Please remove -m32 and use "{ dg-do compile { target ia32 } }" instead.

Uros,

> +
> +unsigned long long GetTimeFromFrames(int);
> +unsigned long long GetMicroSeconds();
> +
> +void DequeueEvent(unsigned frame) {
> +  long long frame_time = GetTimeFromFrames(frame);
> +  unsigned long long current_time = GetMicroSeconds();
> +
> +  DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time);
> +}
> diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> index d61668aacfc..361761cedef 100644
> --- a/gcc/tree-ssa-math-opts.cc
> +++ b/gcc/tree-ssa-math-opts.cc
> @@ -4042,15 +4042,22 @@ build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, gphi *phi,
>                                     internal_fn fn, tree lhs, tree op_0,
>                                     tree op_1)
>  {
> -  if (direct_internal_fn_supported_p (fn, TREE_TYPE (lhs), OPTIMIZE_FOR_BOTH))
> -    {
> -      gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
> -      gimple_call_set_lhs (call, lhs);
> -      gsi_insert_before (gsi, call, GSI_SAME_STMT);
> +  tree lhs_type = TREE_TYPE (lhs);
> +  tree op_type = TREE_TYPE (op_0);
>
> -      gimple_stmt_iterator psi = gsi_for_stmt (phi);
> -      remove_phi_node (&psi, /* release_lhs_p */ false);
> -    }
> +  if (!direct_internal_fn_supported_p (fn, lhs_type, OPTIMIZE_FOR_BOTH))
> +    return;
> +
> +  if (lhs_type != op_type
> +      && !direct_internal_fn_supported_p (fn, op_type, OPTIMIZE_FOR_BOTH))
> +    return;
> +
> +  gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
> +  gimple_call_set_lhs (call, lhs);
> +  gsi_insert_before (gsi, call, GSI_SAME_STMT);
> +
> +  gimple_stmt_iterator psi = gsi_for_stmt (phi);
> +  remove_phi_node (&psi, /* release_lhs_p */ false);
>  }
>
>  /*
> --
> 2.43.0
>
Uros Bizjak Sept. 24, 2024, 6:33 a.m. UTC | #3
On Tue, Sep 24, 2024 at 8:24 AM Li, Pan2 <pan2.li@intel.com> wrote:
>
> Thanks Uros for comments.
>
> > This is not "target", but "middle-end" component. Even though the bug
> > is exposed on x86_64 target, the fix is in the middle-end code, not in
> > the target code.
>
> Sure, will rename to middle-end.
>
> > Please remove -m32 and use "{ dg-do compile { target ia32 } }" instead.
>
> Is there any suggestion to run the "ia32" test when configure gcc build?
> I first leverage ia32 but complain UNSUPPORTED for this case.

You can add the following to your testsuite run:

RUNTESTFLAGS="--target-board=unix\{,-m32\}"

e.g:

make -j N -k check RUNTESTFLAGS=...

(where N is the number of make threads)

You can also add "dg.exp" or "dg.exp=pr12345.c" (or any other exp file
or testcase name) to RUNTESTFLAGS to run only one exp file or a single
test.

Uros.

> Pan
>
> -----Original Message-----
> From: Uros Bizjak <ubizjak@gmail.com>
> Sent: Tuesday, September 24, 2024 2:17 PM
> To: Li, Pan2 <pan2.li@intel.com>
> Cc: gcc-patches@gcc.gnu.org; richard.guenther@gmail.com; Tamar.Christina@arm.com; juzhe.zhong@rivai.ai; kito.cheng@gmail.com; jeffreyalaw@gmail.com; rdapp.gcc@gmail.com
> Subject: Re: [PATCH v1] Widening-Mul: Fix one ICE for SAT_SUB matching operand promotion
>
> On Mon, Sep 23, 2024 at 4:58 PM <pan2.li@intel.com> wrote:
> >
> > From: Pan Li <pan2.li@intel.com>
> >
> > This patch would like to fix the following ICE for -O2 -m32 of x86_64.
> >
> > during RTL pass: expand
> > JackMidiAsyncWaitQueue.cpp.cpp: In function 'void DequeueEvent(unsigned
> > int)':
> > JackMidiAsyncWaitQueue.cpp.cpp:3:6: internal compiler error: in
> > expand_fn_using_insn, at internal-fn.cc:263
> >     3 | void DequeueEvent(unsigned frame) {
> >       |      ^~~~~~~~~~~~
> > 0x27b580d diagnostic_context::diagnostic_impl(rich_location*,
> > diagnostic_metadata const*, diagnostic_option_id, char const*,
> > __va_list_tag (*) [1], diagnostic_t)
> >         ???:0
> > 0x27c4a3f internal_error(char const*, ...)
> >         ???:0
> > 0x27b3994 fancy_abort(char const*, int, char const*)
> >         ???:0
> > 0xf25ae5 expand_fn_using_insn(gcall*, insn_code, unsigned int, unsigned int)
> >         ???:0
> > 0xf2a124 expand_direct_optab_fn(internal_fn, gcall*, optab_tag, unsigned int)
> >         ???:0
> > 0xf2c87c expand_SAT_SUB(internal_fn, gcall*)
> >         ???:0
> >
> > We allowed the operand convert when matching SAT_SUB in match.pd, to support
> > the zip benchmark SAT_SUB pattern.  Aka,
> >
> > (convert? (minus (convert1? @0) (convert1? @1))) for below sample code.
> >
> > void test (uint16_t *x, unsigned b, unsigned n)
> > {
> >   unsigned a = 0;
> >   register uint16_t *p = x;
> >
> >   do {
> >     a = *--p;
> >     *p = (uint16_t)(a >= b ? a - b : 0); // Truncate after .SAT_SUB
> >   } while (--n);
> > }
> >
> > The pattern match for SAT_SUB itself may also act on below scalar sample
> > code too.
> >
> > unsigned long long GetTimeFromFrames(int);
> > unsigned long long GetMicroSeconds();
> >
> > void DequeueEvent(unsigned frame) {
> >   long long frame_time = GetTimeFromFrames(frame);
> >   unsigned long long current_time = GetMicroSeconds();
> >   DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time);
> > }
> >
> > Aka:
> >
> > uint32_t a = (uint32_t)SAT_SUB(uint64_t, uint64_t);
> >
> > Then there will be a problem when ia32 or -m32 is given when compiling.
> > Because we only check the lhs (aka uint32_t) type is supported by ifn
> > and missed the operand (aka uint64_t).  Mostly DImode is disabled for
> > 32 bits target like ia32 or rv32gcv, and then trigger ICE when expanding.
> >
> > The below test suites are passed for this patch.
> > * The rv64gcv fully regression test.
> > * The x86 bootstrap test.
> > * The x86 fully regression test.
> >
> >         PR target/116814
>
> This is not "target", but "middle-end" component. Even though the bug
> is exposed on x86_64 target, the fix is in the middle-end code, not in
> the target code.
>
> > gcc/ChangeLog:
> >
> >         * tree-ssa-math-opts.cc (build_saturation_binary_arith_call): Add
> >         ifn is_supported check for operand TREE type.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * g++.dg/torture/pr116814-1.C: New test.
> >
> > Signed-off-by: Pan Li <pan2.li@intel.com>
> > ---
> >  gcc/testsuite/g++.dg/torture/pr116814-1.C | 12 ++++++++++++
> >  gcc/tree-ssa-math-opts.cc                 | 23 +++++++++++++++--------
> >  2 files changed, 27 insertions(+), 8 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/torture/pr116814-1.C
> >
> > diff --git a/gcc/testsuite/g++.dg/torture/pr116814-1.C b/gcc/testsuite/g++.dg/torture/pr116814-1.C
> > new file mode 100644
> > index 00000000000..8db5b020cfd
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/torture/pr116814-1.C
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile { target { i?86-*-* x86_64-*-* } } } */
> > +/* { dg-options "-O2 -m32" } */
>
> Please remove -m32 and use "{ dg-do compile { target ia32 } }" instead.
>
> Uros,
>
> > +
> > +unsigned long long GetTimeFromFrames(int);
> > +unsigned long long GetMicroSeconds();
> > +
> > +void DequeueEvent(unsigned frame) {
> > +  long long frame_time = GetTimeFromFrames(frame);
> > +  unsigned long long current_time = GetMicroSeconds();
> > +
> > +  DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time);
> > +}
> > diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> > index d61668aacfc..361761cedef 100644
> > --- a/gcc/tree-ssa-math-opts.cc
> > +++ b/gcc/tree-ssa-math-opts.cc
> > @@ -4042,15 +4042,22 @@ build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, gphi *phi,
> >                                     internal_fn fn, tree lhs, tree op_0,
> >                                     tree op_1)
> >  {
> > -  if (direct_internal_fn_supported_p (fn, TREE_TYPE (lhs), OPTIMIZE_FOR_BOTH))
> > -    {
> > -      gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
> > -      gimple_call_set_lhs (call, lhs);
> > -      gsi_insert_before (gsi, call, GSI_SAME_STMT);
> > +  tree lhs_type = TREE_TYPE (lhs);
> > +  tree op_type = TREE_TYPE (op_0);
> >
> > -      gimple_stmt_iterator psi = gsi_for_stmt (phi);
> > -      remove_phi_node (&psi, /* release_lhs_p */ false);
> > -    }
> > +  if (!direct_internal_fn_supported_p (fn, lhs_type, OPTIMIZE_FOR_BOTH))
> > +    return;
> > +
> > +  if (lhs_type != op_type
> > +      && !direct_internal_fn_supported_p (fn, op_type, OPTIMIZE_FOR_BOTH))
> > +    return;
> > +
> > +  gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
> > +  gimple_call_set_lhs (call, lhs);
> > +  gsi_insert_before (gsi, call, GSI_SAME_STMT);
> > +
> > +  gimple_stmt_iterator psi = gsi_for_stmt (phi);
> > +  remove_phi_node (&psi, /* release_lhs_p */ false);
> >  }
> >
> >  /*
> > --
> > 2.43.0
> >
Li, Pan2 Sept. 24, 2024, 6:53 a.m. UTC | #4
Got it and thanks, let me rerun to make sure it works well as expected.

Pan

-----Original Message-----
From: Uros Bizjak <ubizjak@gmail.com> 
Sent: Tuesday, September 24, 2024 2:33 PM
To: Li, Pan2 <pan2.li@intel.com>
Cc: gcc-patches@gcc.gnu.org; richard.guenther@gmail.com; Tamar.Christina@arm.com; juzhe.zhong@rivai.ai; kito.cheng@gmail.com; jeffreyalaw@gmail.com; rdapp.gcc@gmail.com
Subject: Re: [PATCH v1] Widening-Mul: Fix one ICE for SAT_SUB matching operand promotion

On Tue, Sep 24, 2024 at 8:24 AM Li, Pan2 <pan2.li@intel.com> wrote:
>
> Thanks Uros for comments.
>
> > This is not "target", but "middle-end" component. Even though the bug
> > is exposed on x86_64 target, the fix is in the middle-end code, not in
> > the target code.
>
> Sure, will rename to middle-end.
>
> > Please remove -m32 and use "{ dg-do compile { target ia32 } }" instead.
>
> Is there any suggestion to run the "ia32" test when configure gcc build?
> I first leverage ia32 but complain UNSUPPORTED for this case.

You can add the following to your testsuite run:

RUNTESTFLAGS="--target-board=unix\{,-m32\}"

e.g:

make -j N -k check RUNTESTFLAGS=...

(where N is the number of make threads)

You can also add "dg.exp" or "dg.exp=pr12345.c" (or any other exp file
or testcase name) to RUNTESTFLAGS to run only one exp file or a single
test.

Uros.

> Pan
>
> -----Original Message-----
> From: Uros Bizjak <ubizjak@gmail.com>
> Sent: Tuesday, September 24, 2024 2:17 PM
> To: Li, Pan2 <pan2.li@intel.com>
> Cc: gcc-patches@gcc.gnu.org; richard.guenther@gmail.com; Tamar.Christina@arm.com; juzhe.zhong@rivai.ai; kito.cheng@gmail.com; jeffreyalaw@gmail.com; rdapp.gcc@gmail.com
> Subject: Re: [PATCH v1] Widening-Mul: Fix one ICE for SAT_SUB matching operand promotion
>
> On Mon, Sep 23, 2024 at 4:58 PM <pan2.li@intel.com> wrote:
> >
> > From: Pan Li <pan2.li@intel.com>
> >
> > This patch would like to fix the following ICE for -O2 -m32 of x86_64.
> >
> > during RTL pass: expand
> > JackMidiAsyncWaitQueue.cpp.cpp: In function 'void DequeueEvent(unsigned
> > int)':
> > JackMidiAsyncWaitQueue.cpp.cpp:3:6: internal compiler error: in
> > expand_fn_using_insn, at internal-fn.cc:263
> >     3 | void DequeueEvent(unsigned frame) {
> >       |      ^~~~~~~~~~~~
> > 0x27b580d diagnostic_context::diagnostic_impl(rich_location*,
> > diagnostic_metadata const*, diagnostic_option_id, char const*,
> > __va_list_tag (*) [1], diagnostic_t)
> >         ???:0
> > 0x27c4a3f internal_error(char const*, ...)
> >         ???:0
> > 0x27b3994 fancy_abort(char const*, int, char const*)
> >         ???:0
> > 0xf25ae5 expand_fn_using_insn(gcall*, insn_code, unsigned int, unsigned int)
> >         ???:0
> > 0xf2a124 expand_direct_optab_fn(internal_fn, gcall*, optab_tag, unsigned int)
> >         ???:0
> > 0xf2c87c expand_SAT_SUB(internal_fn, gcall*)
> >         ???:0
> >
> > We allowed the operand convert when matching SAT_SUB in match.pd, to support
> > the zip benchmark SAT_SUB pattern.  Aka,
> >
> > (convert? (minus (convert1? @0) (convert1? @1))) for below sample code.
> >
> > void test (uint16_t *x, unsigned b, unsigned n)
> > {
> >   unsigned a = 0;
> >   register uint16_t *p = x;
> >
> >   do {
> >     a = *--p;
> >     *p = (uint16_t)(a >= b ? a - b : 0); // Truncate after .SAT_SUB
> >   } while (--n);
> > }
> >
> > The pattern match for SAT_SUB itself may also act on below scalar sample
> > code too.
> >
> > unsigned long long GetTimeFromFrames(int);
> > unsigned long long GetMicroSeconds();
> >
> > void DequeueEvent(unsigned frame) {
> >   long long frame_time = GetTimeFromFrames(frame);
> >   unsigned long long current_time = GetMicroSeconds();
> >   DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time);
> > }
> >
> > Aka:
> >
> > uint32_t a = (uint32_t)SAT_SUB(uint64_t, uint64_t);
> >
> > Then there will be a problem when ia32 or -m32 is given when compiling.
> > Because we only check the lhs (aka uint32_t) type is supported by ifn
> > and missed the operand (aka uint64_t).  Mostly DImode is disabled for
> > 32 bits target like ia32 or rv32gcv, and then trigger ICE when expanding.
> >
> > The below test suites are passed for this patch.
> > * The rv64gcv fully regression test.
> > * The x86 bootstrap test.
> > * The x86 fully regression test.
> >
> >         PR target/116814
>
> This is not "target", but "middle-end" component. Even though the bug
> is exposed on x86_64 target, the fix is in the middle-end code, not in
> the target code.
>
> > gcc/ChangeLog:
> >
> >         * tree-ssa-math-opts.cc (build_saturation_binary_arith_call): Add
> >         ifn is_supported check for operand TREE type.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * g++.dg/torture/pr116814-1.C: New test.
> >
> > Signed-off-by: Pan Li <pan2.li@intel.com>
> > ---
> >  gcc/testsuite/g++.dg/torture/pr116814-1.C | 12 ++++++++++++
> >  gcc/tree-ssa-math-opts.cc                 | 23 +++++++++++++++--------
> >  2 files changed, 27 insertions(+), 8 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/torture/pr116814-1.C
> >
> > diff --git a/gcc/testsuite/g++.dg/torture/pr116814-1.C b/gcc/testsuite/g++.dg/torture/pr116814-1.C
> > new file mode 100644
> > index 00000000000..8db5b020cfd
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/torture/pr116814-1.C
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile { target { i?86-*-* x86_64-*-* } } } */
> > +/* { dg-options "-O2 -m32" } */
>
> Please remove -m32 and use "{ dg-do compile { target ia32 } }" instead.
>
> Uros,
>
> > +
> > +unsigned long long GetTimeFromFrames(int);
> > +unsigned long long GetMicroSeconds();
> > +
> > +void DequeueEvent(unsigned frame) {
> > +  long long frame_time = GetTimeFromFrames(frame);
> > +  unsigned long long current_time = GetMicroSeconds();
> > +
> > +  DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time);
> > +}
> > diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> > index d61668aacfc..361761cedef 100644
> > --- a/gcc/tree-ssa-math-opts.cc
> > +++ b/gcc/tree-ssa-math-opts.cc
> > @@ -4042,15 +4042,22 @@ build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, gphi *phi,
> >                                     internal_fn fn, tree lhs, tree op_0,
> >                                     tree op_1)
> >  {
> > -  if (direct_internal_fn_supported_p (fn, TREE_TYPE (lhs), OPTIMIZE_FOR_BOTH))
> > -    {
> > -      gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
> > -      gimple_call_set_lhs (call, lhs);
> > -      gsi_insert_before (gsi, call, GSI_SAME_STMT);
> > +  tree lhs_type = TREE_TYPE (lhs);
> > +  tree op_type = TREE_TYPE (op_0);
> >
> > -      gimple_stmt_iterator psi = gsi_for_stmt (phi);
> > -      remove_phi_node (&psi, /* release_lhs_p */ false);
> > -    }
> > +  if (!direct_internal_fn_supported_p (fn, lhs_type, OPTIMIZE_FOR_BOTH))
> > +    return;
> > +
> > +  if (lhs_type != op_type
> > +      && !direct_internal_fn_supported_p (fn, op_type, OPTIMIZE_FOR_BOTH))
> > +    return;
> > +
> > +  gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
> > +  gimple_call_set_lhs (call, lhs);
> > +  gsi_insert_before (gsi, call, GSI_SAME_STMT);
> > +
> > +  gimple_stmt_iterator psi = gsi_for_stmt (phi);
> > +  remove_phi_node (&psi, /* release_lhs_p */ false);
> >  }
> >
> >  /*
> > --
> > 2.43.0
> >
Uros Bizjak Sept. 24, 2024, 7:28 a.m. UTC | #5
On Tue, Sep 24, 2024 at 8:53 AM Li, Pan2 <pan2.li@intel.com> wrote:
>
> Got it and thanks, let me rerun to make sure it works well as expected.

For reference, this is documented in:

https://gcc.gnu.org/wiki/Testing_GCC
https://gcc-newbies-guide.readthedocs.io/en/latest/working-with-the-testsuite.html
https://gcc.gnu.org/install/test.html

Uros.
Li, Pan2 Sept. 24, 2024, 7:30 a.m. UTC | #6
Got it, thanks a lot.

Pan

-----Original Message-----
From: Uros Bizjak <ubizjak@gmail.com> 
Sent: Tuesday, September 24, 2024 3:29 PM
To: Li, Pan2 <pan2.li@intel.com>
Cc: gcc-patches@gcc.gnu.org; richard.guenther@gmail.com; Tamar.Christina@arm.com; juzhe.zhong@rivai.ai; kito.cheng@gmail.com; jeffreyalaw@gmail.com; rdapp.gcc@gmail.com
Subject: Re: [PATCH v1] Widening-Mul: Fix one ICE for SAT_SUB matching operand promotion

On Tue, Sep 24, 2024 at 8:53 AM Li, Pan2 <pan2.li@intel.com> wrote:
>
> Got it and thanks, let me rerun to make sure it works well as expected.

For reference, this is documented in:

https://gcc.gnu.org/wiki/Testing_GCC
https://gcc-newbies-guide.readthedocs.io/en/latest/working-with-the-testsuite.html
https://gcc.gnu.org/install/test.html

Uros.
diff mbox series

Patch

diff --git a/gcc/testsuite/g++.dg/torture/pr116814-1.C b/gcc/testsuite/g++.dg/torture/pr116814-1.C
new file mode 100644
index 00000000000..8db5b020cfd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr116814-1.C
@@ -0,0 +1,12 @@ 
+/* { dg-do compile { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-O2 -m32" } */
+
+unsigned long long GetTimeFromFrames(int);
+unsigned long long GetMicroSeconds();
+
+void DequeueEvent(unsigned frame) {
+  long long frame_time = GetTimeFromFrames(frame);
+  unsigned long long current_time = GetMicroSeconds();
+
+  DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time);
+}
diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
index d61668aacfc..361761cedef 100644
--- a/gcc/tree-ssa-math-opts.cc
+++ b/gcc/tree-ssa-math-opts.cc
@@ -4042,15 +4042,22 @@  build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, gphi *phi,
 				    internal_fn fn, tree lhs, tree op_0,
 				    tree op_1)
 {
-  if (direct_internal_fn_supported_p (fn, TREE_TYPE (lhs), OPTIMIZE_FOR_BOTH))
-    {
-      gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
-      gimple_call_set_lhs (call, lhs);
-      gsi_insert_before (gsi, call, GSI_SAME_STMT);
+  tree lhs_type = TREE_TYPE (lhs);
+  tree op_type = TREE_TYPE (op_0);
 
-      gimple_stmt_iterator psi = gsi_for_stmt (phi);
-      remove_phi_node (&psi, /* release_lhs_p */ false);
-    }
+  if (!direct_internal_fn_supported_p (fn, lhs_type, OPTIMIZE_FOR_BOTH))
+    return;
+
+  if (lhs_type != op_type
+      && !direct_internal_fn_supported_p (fn, op_type, OPTIMIZE_FOR_BOTH))
+    return;
+
+  gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
+  gimple_call_set_lhs (call, lhs);
+  gsi_insert_before (gsi, call, GSI_SAME_STMT);
+
+  gimple_stmt_iterator psi = gsi_for_stmt (phi);
+  remove_phi_node (&psi, /* release_lhs_p */ false);
 }
 
 /*