diff mbox series

testsuite: Adjust pr96789.c to exclude vect_load_lanes

Message ID 2ea418e8-7e7d-3477-9026-f850a046d463@linux.ibm.com
State New
Headers show
Series testsuite: Adjust pr96789.c to exclude vect_load_lanes | expand

Commit Message

Kewen.Lin Nov. 10, 2020, 2:18 a.m. UTC
Hi,

As Lyon pointed out, the newly introduced test case
gcc.dg/tree-ssa/pr96789.c fails on arm-none-linux-gnueabihf.
Loop vectorizer is able to vectorize the two loops which
operate on array tmp with load_lanes feature support.  It
makes dse3 get unexpected inputs and do nothing.

This patch is to teach the case to respect vect_load_lanes,
meanwhile to guard the check only under vect_int.

Is it ok for trunk?

BR,
Kewen
-----
gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr96789.c: Adjusted by excluding vect_load_lanes.

Comments

Richard Sandiford Nov. 10, 2020, 11:31 a.m. UTC | #1
"Kewen.Lin" <linkw@linux.ibm.com> writes:
> Hi,
>
> As Lyon pointed out, the newly introduced test case
> gcc.dg/tree-ssa/pr96789.c fails on arm-none-linux-gnueabihf.
> Loop vectorizer is able to vectorize the two loops which
> operate on array tmp with load_lanes feature support.  It
> makes dse3 get unexpected inputs and do nothing.
>
> This patch is to teach the case to respect vect_load_lanes,
> meanwhile to guard the check only under vect_int.

I'm not sure this is the right check.  The test passes on aarch64,
which also has load lanes, but apparently doesn't use them for this
test.  I think the way the loop vectoriser handles the loops will
depend a lot on target costs, which can vary in unpredictable ways.

Does it work if you instead change -ftree-vectorize to -ftree-slp-vectorize?
Or does that defeat the purpose of the test?

Thanks,
Richard

>
> Is it ok for trunk?
>
> BR,
> Kewen
> -----
> gcc/testsuite/ChangeLog:
>
> 	* gcc.dg/tree-ssa/pr96789.c: Adjusted by excluding vect_load_lanes.
>
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
> index d6139a014d8..1b89f8b7a6a 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
> @@ -55,4 +55,7 @@ bar (int16_t res[16], uint8_t *val1, uint8_t *val2)
>      }
>  }
>
> -/* { dg-final { scan-tree-dump {Deleted dead store:.*tmp} "dse3" } } */
> +/* Exclude targets which support load_lanes since loop vectorizer
> +   can vectorize those two loops that operate tmp array so that
> +   subsequent dse3 will not eliminate tmp stores.  */
> +/* { dg-final { scan-tree-dump {Deleted dead store:.*tmp} "dse3" { target { vect_int && { ! vect_load_lanes } } } } } */
Kewen.Lin Nov. 11, 2020, 2:42 a.m. UTC | #2
Hi Richard,

Thanks for the review!

on 2020/11/10 下午7:31, Richard Sandiford wrote:
> "Kewen.Lin" <linkw@linux.ibm.com> writes:
>> Hi,
>>
>> As Lyon pointed out, the newly introduced test case
>> gcc.dg/tree-ssa/pr96789.c fails on arm-none-linux-gnueabihf.
>> Loop vectorizer is able to vectorize the two loops which
>> operate on array tmp with load_lanes feature support.  It
>> makes dse3 get unexpected inputs and do nothing.
>>
>> This patch is to teach the case to respect vect_load_lanes,
>> meanwhile to guard the check only under vect_int.
> 
> I'm not sure this is the right check.  The test passes on aarch64,
> which also has load lanes, but apparently doesn't use them for this
> test.  I think the way the loop vectoriser handles the loops will
> depend a lot on target costs, which can vary in unpredictable ways.
> 

You are right, although aarch64 doesn't have this failure, it can fail
with explicit -march=armv8-a+sve.  It can vary as target features/costs
change.  The check is still fragile.

Your suggestion with -ftree-slp-vectorize below is better!

> Does it work if you instead change -ftree-vectorize to -ftree-slp-vectorize?
> Or does that defeat the purpose of the test?

It works, nice, thanks for the suggestion!

I appended one explicit -fno-tree-loop-vectorize to avoid it to fail
in case someone kicks off the testing with explicit -ftree-loop-vectorize.

The updated version is pasted below, is it ok for trunk?

BR,
Kewen
-----
gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr96789.c: Adjusted by disabling loop vectorization.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
index d6139a014d8..5704952309b 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
@@ -1,5 +1,8 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -funroll-loops -ftree-vectorize -fdump-tree-dse-details" } */
+/* Disable loop vectorization to avoid that loop vectorizer
+   optimizes those two loops that operate tmp array so that
+   subsequent dse3 won't eliminate expected tmp stores.  */
+/* { dg-options "-O2 -funroll-loops -ftree-slp-vectorize -fno-tree-loop-vectorize -fdump-tree-dse-details" } */

 /* Test if scalar cleanup pass takes effects, mainly check
    its secondary pass DSE can remove dead stores on array
Richard Sandiford Nov. 11, 2020, 9:49 a.m. UTC | #3
"Kewen.Lin" <linkw@linux.ibm.com> writes:
> Hi Richard,
>
> Thanks for the review!
>
> on 2020/11/10 锟斤拷锟斤拷7:31, Richard Sandiford wrote:
>> "Kewen.Lin" <linkw@linux.ibm.com> writes:
>>> Hi,
>>>
>>> As Lyon pointed out, the newly introduced test case
>>> gcc.dg/tree-ssa/pr96789.c fails on arm-none-linux-gnueabihf.
>>> Loop vectorizer is able to vectorize the two loops which
>>> operate on array tmp with load_lanes feature support.  It
>>> makes dse3 get unexpected inputs and do nothing.
>>>
>>> This patch is to teach the case to respect vect_load_lanes,
>>> meanwhile to guard the check only under vect_int.
>> 
>> I'm not sure this is the right check.  The test passes on aarch64,
>> which also has load lanes, but apparently doesn't use them for this
>> test.  I think the way the loop vectoriser handles the loops will
>> depend a lot on target costs, which can vary in unpredictable ways.
>> 
>
> You are right, although aarch64 doesn't have this failure, it can fail
> with explicit -march=armv8-a+sve.  It can vary as target features/costs
> change.  The check is still fragile.
>
> Your suggestion with -ftree-slp-vectorize below is better!
>
>> Does it work if you instead change -ftree-vectorize to -ftree-slp-vectorize?
>> Or does that defeat the purpose of the test?
>
> It works, nice, thanks for the suggestion!
>
> I appended one explicit -fno-tree-loop-vectorize to avoid it to fail
> in case someone kicks off the testing with explicit -ftree-loop-vectorize.
>
> The updated version is pasted below, is it ok for trunk?

OK, thanks.

Richard

>
> BR,
> Kewen
> -----
> gcc/testsuite/ChangeLog:
>
> 	* gcc.dg/tree-ssa/pr96789.c: Adjusted by disabling loop vectorization.
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
> index d6139a014d8..5704952309b 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
> @@ -1,5 +1,8 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -funroll-loops -ftree-vectorize -fdump-tree-dse-details" } */
> +/* Disable loop vectorization to avoid that loop vectorizer
> +   optimizes those two loops that operate tmp array so that
> +   subsequent dse3 won't eliminate expected tmp stores.  */
> +/* { dg-options "-O2 -funroll-loops -ftree-slp-vectorize -fno-tree-loop-vectorize -fdump-tree-dse-details" } */
>
>  /* Test if scalar cleanup pass takes effects, mainly check
>     its secondary pass DSE can remove dead stores on array
Jeff Law Nov. 13, 2020, 10:11 p.m. UTC | #4
On 11/10/20 7:42 PM, Kewen.Lin via Gcc-patches wrote:
> Hi Richard,
>
> Thanks for the review!
>
> on 2020/11/10 涓嬪崍7:31, Richard Sandiford wrote:
>> "Kewen.Lin" <linkw@linux.ibm.com> writes:
>>> Hi,
>>>
>>> As Lyon pointed out, the newly introduced test case
>>> gcc.dg/tree-ssa/pr96789.c fails on arm-none-linux-gnueabihf.
>>> Loop vectorizer is able to vectorize the two loops which
>>> operate on array tmp with load_lanes feature support.  It
>>> makes dse3 get unexpected inputs and do nothing.
>>>
>>> This patch is to teach the case to respect vect_load_lanes,
>>> meanwhile to guard the check only under vect_int.
>> I'm not sure this is the right check.  The test passes on aarch64,
>> which also has load lanes, but apparently doesn't use them for this
>> test.  I think the way the loop vectoriser handles the loops will
>> depend a lot on target costs, which can vary in unpredictable ways.
>>
> You are right, although aarch64 doesn't have this failure, it can fail
> with explicit -march=armv8-a+sve.  It can vary as target features/costs
> change.  The check is still fragile.
>
> Your suggestion with -ftree-slp-vectorize below is better!
>
>> Does it work if you instead change -ftree-vectorize to -ftree-slp-vectorize?
>> Or does that defeat the purpose of the test?
> It works, nice, thanks for the suggestion!
>
> I appended one explicit -fno-tree-loop-vectorize to avoid it to fail
> in case someone kicks off the testing with explicit -ftree-loop-vectorize.
>
> The updated version is pasted below, is it ok for trunk?
>
> BR,
> Kewen
> -----
> gcc/testsuite/ChangeLog:
>
> 	* gcc.dg/tree-ssa/pr96789.c: Adjusted by disabling loop vectorization.

OK

jeff
diff mbox series

Patch

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
index d6139a014d8..1b89f8b7a6a 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
@@ -55,4 +55,7 @@  bar (int16_t res[16], uint8_t *val1, uint8_t *val2)
     }
 }

-/* { dg-final { scan-tree-dump {Deleted dead store:.*tmp} "dse3" } } */
+/* Exclude targets which support load_lanes since loop vectorizer
+   can vectorize those two loops that operate tmp array so that
+   subsequent dse3 will not eliminate tmp stores.  */
+/* { dg-final { scan-tree-dump {Deleted dead store:.*tmp} "dse3" { target { vect_int && { ! vect_load_lanes } } } } } */