diff mbox series

GCN: '--param=gcn-preferred-vector-lane-width=[default,32,64]' (was: [committed] amdgcn: Prefer V32 on RDNA devices)

Message ID 87zfu4m8ue.fsf@euler.schwinge.ddns.net
State New
Headers show
Series GCN: '--param=gcn-preferred-vector-lane-width=[default,32,64]' (was: [committed] amdgcn: Prefer V32 on RDNA devices) | expand

Commit Message

Thomas Schwinge April 8, 2024, 10:45 a.m. UTC
Hi!

On 2024-03-28T08:00:50+0100, I wrote:
> On 2024-03-22T15:54:48+0000, Andrew Stubbs <ams@baylibre.com> wrote:
>> This patch alters the default (preferred) vector size to 32 on RDNA devices to
>> better match the actual hardware.  64-lane vectors will continue to be
>> used where they are hard-coded (such as function prologues).
>>
>> We run these devices in wavefrontsize64 for compatibility, but they actually
>> only have 32-lane vectors, natively.  If the upper part of a V64 is masked
>> off (as it is in V32) then RDNA devices will skip execution of the upper part
>> for most operations, so this adjustment shouldn't leave too much performance on
>> the table.  One exception is memory instructions, so full wavefrontsize32
>> support would be better.
>>
>> The advantage is that we avoid the missing V64 operations (such as permute and
>> vec_extract).
>>
>> Committed to mainline.
>
> In my GCN target '-march=gfx1100' testing, this commit
> "amdgcn: Prefer V32 on RDNA devices" does resolve (or, make latent?) a
> number of execution test FAILs (that is, regressions compared to earlier
> '-march=gfx90a' etc. testing).
>
> This commit also resolves (for my '-march=gfx1100' testing) one
> pre-existing FAIL (that is, already seen in '-march=gfx90a' earlier
> etc. testing):
>
>     PASS: gcc.dg/tree-ssa/scev-14.c (test for excess errors)
>     [-FAIL:-]{+PASS:+} gcc.dg/tree-ssa/scev-14.c scan-tree-dump ivopts "Overflowness wrto loop niter:\tNo-overflow"
>
> That means, this test case specifically (or, just its 'scan-tree-dump'?)
> needs to be adjusted for GCN V64 testing?
>
> This commit, as you'd also mentioned elsewhere, however also causes a
> number of regressions in 'gcc.target/gcn/gcn.exp', see list below.
>
> Those can be "fixed" with 'dg-additional-options -march=gfx90a' (or
> similar) in the affected test cases (let me know if you'd like me to
> 'git push' that), but I suppose something more elaborate may be in order?
> (Conditionalize those on 'target { ! gcn_rdna }', and add respective
> scanning for 'target gcn_rdna'?  I can help with effective-target
> 'gcn_rdna' (or similar), if you'd like me to.)
>
> And/or, have a '-mpreferred-simd-mode=v64' (or similar) to be used for
> such test cases, to override 'if (TARGET_RDNA2_PLUS)' etc. in
> 'gcn_vectorize_preferred_simd_mode'?

The latter I have quickly implemented, see attached
"GCN: '--param=gcn-preferred-vector-lane-width=[default,32,64]'".  OK to
push to trunk branch?

(This '--param' will also be useful for another bug/regression I'm about
to file.)

> Best, probably, both these things, to properly test both V32 and V64?

That part remains to be done, but is best done by someone who actually
knowns "GCN" assembly/GCC back end -- that is, not me.


Grüße
 Thomas


>     PASS: gcc.target/gcn/cond_fmaxnm_1.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-times smaxv64df3_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-times smaxv64sf3_exec 3
>     PASS: gcc.target/gcn/cond_fmaxnm_1_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_fmaxnm_1_run.c execution test
>
>     PASS: gcc.target/gcn/cond_fmaxnm_2.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-times smaxv64df3_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-times smaxv64sf3_exec 3
>     PASS: gcc.target/gcn/cond_fmaxnm_2_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_fmaxnm_2_run.c execution test
>
>     PASS: gcc.target/gcn/cond_fmaxnm_3.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times movv64df_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times movv64sf_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times smaxv64sf3 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times smaxv64sf3 3
>     PASS: gcc.target/gcn/cond_fmaxnm_3_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_fmaxnm_3_run.c execution test
>
>     PASS: gcc.target/gcn/cond_fmaxnm_4.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times movv64df_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times movv64sf_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times smaxv64sf3 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times smaxv64sf3 3
>     PASS: gcc.target/gcn/cond_fmaxnm_4_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_fmaxnm_4_run.c execution test
>
>     PASS: gcc.target/gcn/cond_fmaxnm_5.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_5.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_5.c scan-assembler-times smaxv64df3_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_5.c scan-assembler-times smaxv64sf3_exec 3
>     PASS: gcc.target/gcn/cond_fmaxnm_5_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_fmaxnm_5_run.c execution test
>
>     PASS: gcc.target/gcn/cond_fmaxnm_6.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_6.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_6.c scan-assembler-times smaxv64df3_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_6.c scan-assembler-times smaxv64sf3_exec 3
>     PASS: gcc.target/gcn/cond_fmaxnm_6_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_fmaxnm_6_run.c execution test
>
>     PASS: gcc.target/gcn/cond_fmaxnm_7.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_7.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_7.c scan-assembler-times smaxv64df3_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_7.c scan-assembler-times smaxv64sf3_exec 3
>     PASS: gcc.target/gcn/cond_fmaxnm_7_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_fmaxnm_7_run.c execution test
>
>     PASS: gcc.target/gcn/cond_fmaxnm_8.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_8.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_8.c scan-assembler-times smaxv64df3_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_8.c scan-assembler-times smaxv64sf3_exec 3
>     PASS: gcc.target/gcn/cond_fmaxnm_8_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_fmaxnm_8_run.c execution test
>
>     PASS: gcc.target/gcn/cond_fminnm_1.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_1.c scan-assembler-times sminv64df3_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_1.c scan-assembler-times sminv64sf3_exec 3
>     PASS: gcc.target/gcn/cond_fminnm_1_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_fminnm_1_run.c execution test
>
>     PASS: gcc.target/gcn/cond_fminnm_2.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_2.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_2.c scan-assembler-times sminv64df3_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_2.c scan-assembler-times sminv64sf3_exec 3
>     PASS: gcc.target/gcn/cond_fminnm_2_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_fminnm_2_run.c execution test
>
>     PASS: gcc.target/gcn/cond_fminnm_3.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times movv64df_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times movv64sf_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times sminv64sf3 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times sminv64sf3 3
>     PASS: gcc.target/gcn/cond_fminnm_3_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_fminnm_3_run.c execution test
>
>     PASS: gcc.target/gcn/cond_fminnm_4.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times movv64df_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times movv64sf_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times sminv64sf3 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times sminv64sf3 3
>     PASS: gcc.target/gcn/cond_fminnm_4_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_fminnm_4_run.c execution test
>
>     PASS: gcc.target/gcn/cond_fminnm_5.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_5.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_5.c scan-assembler-times sminv64df3_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_5.c scan-assembler-times sminv64sf3_exec 3
>     PASS: gcc.target/gcn/cond_fminnm_5_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_fminnm_5_run.c execution test
>
>     PASS: gcc.target/gcn/cond_fminnm_6.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_6.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_6.c scan-assembler-times sminv64df3_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_6.c scan-assembler-times sminv64sf3_exec 3
>     PASS: gcc.target/gcn/cond_fminnm_6_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_fminnm_6_run.c execution test
>
>     PASS: gcc.target/gcn/cond_fminnm_7.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_7.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_7.c scan-assembler-times sminv64df3_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_7.c scan-assembler-times sminv64sf3_exec 3
>     PASS: gcc.target/gcn/cond_fminnm_7_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_fminnm_7_run.c execution test
>
>     PASS: gcc.target/gcn/cond_fminnm_8.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_8.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_8.c scan-assembler-times sminv64df3_exec 3
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_8.c scan-assembler-times sminv64sf3_exec 3
>     PASS: gcc.target/gcn/cond_fminnm_8_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_fminnm_8_run.c execution test
>
>     @@ -124634,12 +124634,12 @@ PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-not movv64di_exec/2
>     PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-not v_cndmask_b32
>     PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-times \\tv_ashrrev_i32\\tv[0-9]+, 3, v[0-9]+ 1
>     PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-times \\tv_lshlrev_b32\\tv[0-9]+, 3, v[0-9]+ 10
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vashlv64di3_exec 2
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vashlv64si3_exec 18
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vashrv64di3_exec 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vashrv64si3_exec 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vlshrv64di3_exec 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vlshrv64si3_exec 1
>     PASS: gcc.target/gcn/cond_shift_3_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_shift_3_run.c execution test
>
>     PASS: gcc.target/gcn/cond_shift_4.c (test for excess errors)
>     @@ -124647,77 +124647,77 @@ PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-not movv64di_exec/2
>     PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-not v_cndmask_b32
>     PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-times \\tv_ashrrev_i32\\tv[0-9]+, 3, v[0-9]+ 1
>     PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-times \\tv_lshlrev_b32\\tv[0-9]+, 3, v[0-9]+ 10
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vashlv64di3_exec 2
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vashlv64si3_exec 18
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vashrv64di3_exec 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vashrv64si3_exec 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vlshrv64di3_exec 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vlshrv64si3_exec 1
>     PASS: gcc.target/gcn/cond_shift_4_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_shift_4_run.c execution test
>
>     PASS: gcc.target/gcn/cond_shift_8.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_shift_8.c scan-assembler-not movv64di_exec/0
>     PASS: gcc.target/gcn/cond_shift_8.c scan-assembler-not movv64si_exec/0
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vashlv64di3_exec 2
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vashlv64si3_exec 18
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vashrv64di3_exec 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vashrv64si3_exec 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vlshrv64di3_exec 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vlshrv64si3_exec 1
>     PASS: gcc.target/gcn/cond_shift_8_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_shift_8_run.c execution test
>
>     PASS: gcc.target/gcn/cond_shift_9.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_shift_9.c scan-assembler-not movv64di_exec/1
>     PASS: gcc.target/gcn/cond_shift_9.c scan-assembler-not movv64si_exec/2
>     PASS: gcc.target/gcn/cond_shift_9.c scan-assembler-not v_cndmask_b32
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vashlv64di3_exec 2
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vashlv64si3_exec 18
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vashrv64di3_exec 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vashrv64si3_exec 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vlshrv64di3_exec 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vlshrv64si3_exec 1
>     PASS: gcc.target/gcn/cond_shift_9_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_shift_9_run.c execution test
>
>     PASS: gcc.target/gcn/cond_smax_1.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smax_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
>     PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+
>     PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
>     PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not smaxv64si3/0
>     PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
>     PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_gt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
>     PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], -1 10
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smax_1.c scan-assembler-times smaxv64si3_exec 30
>     PASS: gcc.target/gcn/cond_smax_1_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_smax_1_run.c execution test
>
>     PASS: gcc.target/gcn/cond_smin_1.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smin_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
>     PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+
>     PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
>     PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not sminv64si3/0
>     PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
>     PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_lt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
>     PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], -1 10
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smin_1.c scan-assembler-times sminv64si3_exec 30
>     PASS: gcc.target/gcn/cond_smin_1_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_smin_1_run.c execution test
>
>     PASS: gcc.target/gcn/cond_umax_1.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
>     PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
>     PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-not umaxv64si3/0
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
>     PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_gt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
>     PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], 1 8
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-times umaxv64si3_exec 20
>     PASS: gcc.target/gcn/cond_umax_1_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_umax_1_run.c execution test
>
>     PASS: gcc.target/gcn/cond_umin_1.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
>     PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
>     PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-not uminv64si3/0
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
>     PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_lt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
>     PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], 1 8
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-times uminv64si3_exec 20
>     PASS: gcc.target/gcn/cond_umin_1_run.c (test for excess errors)
>     PASS: gcc.target/gcn/cond_umin_1_run.c execution test
>
>     PASS: gcc.target/gcn/simd-math-1.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_acos"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_acosh"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_asin"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_asinh"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_atan"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_atan2"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_atanh"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_copysign"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_cos"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_cosh"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_erf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_exp"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_exp2"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_fmod"
>     XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_gamma"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_hypot"
>     XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_lgamma"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_log"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_log10"
>     XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_log2"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_pow"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_remainder"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_rint"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_scalb"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_significand"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_sin"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_sinh"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_sqrt"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_tan"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_tanh"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_tgamma"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_acosf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_acoshf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_asinf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_asinhf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_atan2f"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_atanf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_atanhf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_copysignf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_cosf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_coshf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_erff"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_exp2f"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_expf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_fmodf"
>     XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_gammaf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_hypotf"
>     XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_lgammaf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_log10f"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_log2f"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_logf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_powf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_remainderf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_rintf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_scalbf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_significandf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_sinf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_sinhf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_sqrtf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_tanf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_tanhf"
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_tgammaf"
>
>     @@ -125130,7 +125130,7 @@ PASS: gcc.target/gcn/simd-math-5-char-run.c (test for excess errors)
>     PASS: gcc.target/gcn/simd-math-5-char-run.c execution test
>     PASS: gcc.target/gcn/simd-math-5-char.c (test for excess errors)
>     XFAIL: gcc.target/gcn/simd-math-5-char.c scan-assembler-times __divmodv64si4@rel32@lo 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-char.c scan-assembler-times __divv64hi3@rel32@lo 1
>     PASS: gcc.target/gcn/simd-math-5-char.c scan-assembler-times __divv64qi3@rel32@lo 0
>     FAIL: gcc.target/gcn/simd-math-5-char.c scan-assembler-times __modv64qi3@rel32@lo 1
>     PASS: gcc.target/gcn/simd-math-5-char.c scan-assembler-times __udivv64qi3@rel32@lo 0
>
>     @@ -125171,8 +125171,8 @@ PASS: gcc.target/gcn/simd-math-5-long-run.c (test for excess errors)
>     PASS: gcc.target/gcn/simd-math-5-long-run.c execution test
>     PASS: gcc.target/gcn/simd-math-5-long.c (test for excess errors)
>     XFAIL: gcc.target/gcn/simd-math-5-long.c scan-assembler-times __divmodv64di4@rel32@lo 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-long.c scan-assembler-times __divv64di3@rel32@lo 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-long.c scan-assembler-times __modv64di3@rel32@lo 1
>     PASS: gcc.target/gcn/simd-math-5-long.c scan-assembler-times __udivv64di3@rel32@lo 0
>     PASS: gcc.target/gcn/simd-math-5-long.c scan-assembler-times __umodv64di3@rel32@lo 0
>
>     PASS: gcc.target/gcn/simd-math-5-short.c (test for excess errors)
>     XFAIL: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __divmodv64si4@rel32@lo 1
>     PASS: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __divv64hi3@rel32@lo 0
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-short.c scan-assembler-times __divv64si3@rel32@lo 1
>     FAIL: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __modv64hi3@rel32@lo 1
>     PASS: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __udivv64hi3@rel32@lo 0
>     PASS: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __umodv64hi3@rel32@lo 0
>
>     PASS: gcc.target/gcn/simd-math-5.c (test for excess errors)
>     XFAIL: gcc.target/gcn/simd-math-5.c scan-assembler-times __divmodv64si4@rel32@lo 1
>     PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __divsi3@rel32@lo 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5.c scan-assembler-times __divv64si3@rel32@lo 1
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5.c scan-assembler-times __modv64si3@rel32@lo 1
>     PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __udivmodv64si4@rel32@lo 0
>     PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __udivsi3@rel32@lo 0
>     PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __udivv64si3@rel32@lo 0
>     @@ -125242,13 +125242,13 @@ PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __umodv64si3@rel32@lo 0
>
>     PASS: gcc.target/gcn/smax_1.c (test for excess errors)
>     PASS: gcc.target/gcn/smax_1.c scan-assembler-times \\tv_cmp_gt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
>     FAIL: gcc.target/gcn/smax_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/smax_1.c scan-assembler-times vec_cmpv64didi 10
>     PASS: gcc.target/gcn/smax_1_run.c (test for excess errors)
>     PASS: gcc.target/gcn/smax_1_run.c execution test
>
>     PASS: gcc.target/gcn/smin_1.c (test for excess errors)
>     PASS: gcc.target/gcn/smin_1.c scan-assembler-times \\tv_cmp_lt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
>     FAIL: gcc.target/gcn/smin_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/smin_1.c scan-assembler-times vec_cmpv64didi 10
>     PASS: gcc.target/gcn/smin_1_run.c (test for excess errors)
>     PASS: gcc.target/gcn/smin_1_run.c execution test
>
>     PASS: gcc.target/gcn/sram-ecc-3.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-3.c scan-assembler (\\*zero_extendv64qiv64si_sdwa|\\*zero_extendv64qiv64si_shift)
>
>     PASS: gcc.target/gcn/sram-ecc-4.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-4.c scan-assembler (\\*zero_extendv64hiv64si_sdwa|\\*zero_extendv64hiv64si_shift)
>
>     PASS: gcc.target/gcn/sram-ecc-7.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-7.c scan-assembler (\\*zero_extendv64qiv64si_sdwa|\\*zero_extendv64qiv64si_shift)
>
>     PASS: gcc.target/gcn/sram-ecc-8.c (test for excess errors)
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-8.c scan-assembler (\\*zero_extendv64hiv64si_sdwa|\\*zero_extendv64hiv64si_shift)
>
>     PASS: gcc.target/gcn/umax_1.c (test for excess errors)
>     PASS: gcc.target/gcn/umax_1.c scan-assembler-times \\tv_cmp_gt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
>     FAIL: gcc.target/gcn/umax_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/umax_1.c scan-assembler-times vec_cmpv64didi 8
>     PASS: gcc.target/gcn/umax_1_run.c (test for excess errors)
>     PASS: gcc.target/gcn/umax_1_run.c execution test
>
>     PASS: gcc.target/gcn/umin_1.c (test for excess errors)
>     PASS: gcc.target/gcn/umin_1.c scan-assembler-times \\tv_cmp_lt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
>     FAIL: gcc.target/gcn/umin_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
>     [-PASS:-]{+FAIL:+} gcc.target/gcn/umin_1.c scan-assembler-times vec_cmpv64didi 8
>     PASS: gcc.target/gcn/umin_1_run.c (test for excess errors)
>     PASS: gcc.target/gcn/umin_1_run.c execution test
>
>
> Grüße
>  Thomas
>
>
>> gcc/ChangeLog:
>>
>> 	* config/gcn/gcn.cc (gcn_vectorize_preferred_simd_mode): Prefer V32 on
>> 	RDNA devices.
>> ---
>>  gcc/config/gcn/gcn.cc | 26 ++++++++++++++++++++++++++
>>  1 file changed, 26 insertions(+)
>>
>> diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
>> index 498146dcde9..efb73af50c4 100644
>> --- a/gcc/config/gcn/gcn.cc
>> +++ b/gcc/config/gcn/gcn.cc
>> @@ -5226,6 +5226,32 @@ gcn_vector_mode_supported_p (machine_mode mode)
>>  static machine_mode
>>  gcn_vectorize_preferred_simd_mode (scalar_mode mode)
>>  {
>> +  /* RDNA devices have 32-lane vectors with limited support for 64-bit vectors
>> +     (in particular, permute operations are only available for cases that don't
>> +     span the 32-lane boundary).
>> +
>> +     From the RDNA3 manual: "Hardware may choose to skip either half if the
>> +     EXEC mask for that half is all zeros...". This means that preferring
>> +     32-lanes is a good stop-gap until we have proper wave32 support.  */
>> +  if (TARGET_RDNA2_PLUS)
>> +    switch (mode)
>> +      {
>> +      case E_QImode:
>> +	return V32QImode;
>> +      case E_HImode:
>> +	return V32HImode;
>> +      case E_SImode:
>> +	return V32SImode;
>> +      case E_DImode:
>> +	return V32DImode;
>> +      case E_SFmode:
>> +	return V32SFmode;
>> +      case E_DFmode:
>> +	return V32DFmode;
>> +      default:
>> +	return word_mode;
>> +      }
>> +
>>    switch (mode)
>>      {
>>      case E_QImode:
>> -- 
>> 2.41.0

Comments

Andrew Stubbs April 8, 2024, 12:24 p.m. UTC | #1
On 08/04/2024 11:45, Thomas Schwinge wrote:
> Hi!
> 
> On 2024-03-28T08:00:50+0100, I wrote:
>> On 2024-03-22T15:54:48+0000, Andrew Stubbs <ams@baylibre.com> wrote:
>>> This patch alters the default (preferred) vector size to 32 on RDNA devices to
>>> better match the actual hardware.  64-lane vectors will continue to be
>>> used where they are hard-coded (such as function prologues).
>>>
>>> We run these devices in wavefrontsize64 for compatibility, but they actually
>>> only have 32-lane vectors, natively.  If the upper part of a V64 is masked
>>> off (as it is in V32) then RDNA devices will skip execution of the upper part
>>> for most operations, so this adjustment shouldn't leave too much performance on
>>> the table.  One exception is memory instructions, so full wavefrontsize32
>>> support would be better.
>>>
>>> The advantage is that we avoid the missing V64 operations (such as permute and
>>> vec_extract).
>>>
>>> Committed to mainline.
>>
>> In my GCN target '-march=gfx1100' testing, this commit
>> "amdgcn: Prefer V32 on RDNA devices" does resolve (or, make latent?) a
>> number of execution test FAILs (that is, regressions compared to earlier
>> '-march=gfx90a' etc. testing).
>>
>> This commit also resolves (for my '-march=gfx1100' testing) one
>> pre-existing FAIL (that is, already seen in '-march=gfx90a' earlier
>> etc. testing):
>>
>>      PASS: gcc.dg/tree-ssa/scev-14.c (test for excess errors)
>>      [-FAIL:-]{+PASS:+} gcc.dg/tree-ssa/scev-14.c scan-tree-dump ivopts "Overflowness wrto loop niter:\tNo-overflow"
>>
>> That means, this test case specifically (or, just its 'scan-tree-dump'?)
>> needs to be adjusted for GCN V64 testing?
>>
>> This commit, as you'd also mentioned elsewhere, however also causes a
>> number of regressions in 'gcc.target/gcn/gcn.exp', see list below.
>>
>> Those can be "fixed" with 'dg-additional-options -march=gfx90a' (or
>> similar) in the affected test cases (let me know if you'd like me to
>> 'git push' that), but I suppose something more elaborate may be in order?
>> (Conditionalize those on 'target { ! gcn_rdna }', and add respective
>> scanning for 'target gcn_rdna'?  I can help with effective-target
>> 'gcn_rdna' (or similar), if you'd like me to.)
>>
>> And/or, have a '-mpreferred-simd-mode=v64' (or similar) to be used for
>> such test cases, to override 'if (TARGET_RDNA2_PLUS)' etc. in
>> 'gcn_vectorize_preferred_simd_mode'?
> 
> The latter I have quickly implemented, see attached
> "GCN: '--param=gcn-preferred-vector-lane-width=[default,32,64]'".  OK to
> push to trunk branch?
> 
> (This '--param' will also be useful for another bug/regression I'm about
> to file.)
> 
>> Best, probably, both these things, to properly test both V32 and V64?
> 
> That part remains to be done, but is best done by someone who actually
> knowns "GCN" assembly/GCC back end -- that is, not me.

I'm not sure that this is *best* solution to the problem (in general, 
it's probably best to test the actual code that will be generated in 
practice), but I think this option will be useful for testing 
performance in each configuration and other correctness issues, and 
these tests are not testing that feature.

However, "vector lane width" sounds like it's configuring the number of 
bits in each lane. I think "vectorization factor" is unambigous.

OK to commit, with the name change.

Andrew


> 
> Grüße
>   Thomas
> 
> 
>>      PASS: gcc.target/gcn/cond_fmaxnm_1.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-times smaxv64df3_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-times smaxv64sf3_exec 3
>>      PASS: gcc.target/gcn/cond_fmaxnm_1_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_fmaxnm_1_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_fmaxnm_2.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-times smaxv64df3_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-times smaxv64sf3_exec 3
>>      PASS: gcc.target/gcn/cond_fmaxnm_2_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_fmaxnm_2_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_fmaxnm_3.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times movv64df_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times movv64sf_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times smaxv64sf3 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times smaxv64sf3 3
>>      PASS: gcc.target/gcn/cond_fmaxnm_3_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_fmaxnm_3_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_fmaxnm_4.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times movv64df_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times movv64sf_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times smaxv64sf3 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times smaxv64sf3 3
>>      PASS: gcc.target/gcn/cond_fmaxnm_4_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_fmaxnm_4_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_fmaxnm_5.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_5.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_5.c scan-assembler-times smaxv64df3_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_5.c scan-assembler-times smaxv64sf3_exec 3
>>      PASS: gcc.target/gcn/cond_fmaxnm_5_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_fmaxnm_5_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_fmaxnm_6.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_6.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_6.c scan-assembler-times smaxv64df3_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_6.c scan-assembler-times smaxv64sf3_exec 3
>>      PASS: gcc.target/gcn/cond_fmaxnm_6_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_fmaxnm_6_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_fmaxnm_7.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_7.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_7.c scan-assembler-times smaxv64df3_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_7.c scan-assembler-times smaxv64sf3_exec 3
>>      PASS: gcc.target/gcn/cond_fmaxnm_7_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_fmaxnm_7_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_fmaxnm_8.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_8.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_8.c scan-assembler-times smaxv64df3_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_8.c scan-assembler-times smaxv64sf3_exec 3
>>      PASS: gcc.target/gcn/cond_fmaxnm_8_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_fmaxnm_8_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_fminnm_1.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_1.c scan-assembler-times sminv64df3_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_1.c scan-assembler-times sminv64sf3_exec 3
>>      PASS: gcc.target/gcn/cond_fminnm_1_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_fminnm_1_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_fminnm_2.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_2.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_2.c scan-assembler-times sminv64df3_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_2.c scan-assembler-times sminv64sf3_exec 3
>>      PASS: gcc.target/gcn/cond_fminnm_2_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_fminnm_2_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_fminnm_3.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times movv64df_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times movv64sf_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times sminv64sf3 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times sminv64sf3 3
>>      PASS: gcc.target/gcn/cond_fminnm_3_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_fminnm_3_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_fminnm_4.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times movv64df_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times movv64sf_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times sminv64sf3 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times sminv64sf3 3
>>      PASS: gcc.target/gcn/cond_fminnm_4_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_fminnm_4_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_fminnm_5.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_5.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_5.c scan-assembler-times sminv64df3_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_5.c scan-assembler-times sminv64sf3_exec 3
>>      PASS: gcc.target/gcn/cond_fminnm_5_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_fminnm_5_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_fminnm_6.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_6.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_6.c scan-assembler-times sminv64df3_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_6.c scan-assembler-times sminv64sf3_exec 3
>>      PASS: gcc.target/gcn/cond_fminnm_6_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_fminnm_6_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_fminnm_7.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_7.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_7.c scan-assembler-times sminv64df3_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_7.c scan-assembler-times sminv64sf3_exec 3
>>      PASS: gcc.target/gcn/cond_fminnm_7_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_fminnm_7_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_fminnm_8.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_8.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_8.c scan-assembler-times sminv64df3_exec 3
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_8.c scan-assembler-times sminv64sf3_exec 3
>>      PASS: gcc.target/gcn/cond_fminnm_8_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_fminnm_8_run.c execution test
>>
>>      @@ -124634,12 +124634,12 @@ PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-not movv64di_exec/2
>>      PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-not v_cndmask_b32
>>      PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-times \\tv_ashrrev_i32\\tv[0-9]+, 3, v[0-9]+ 1
>>      PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-times \\tv_lshlrev_b32\\tv[0-9]+, 3, v[0-9]+ 10
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vashlv64di3_exec 2
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vashlv64si3_exec 18
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vashrv64di3_exec 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vashrv64si3_exec 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vlshrv64di3_exec 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vlshrv64si3_exec 1
>>      PASS: gcc.target/gcn/cond_shift_3_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_shift_3_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_shift_4.c (test for excess errors)
>>      @@ -124647,77 +124647,77 @@ PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-not movv64di_exec/2
>>      PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-not v_cndmask_b32
>>      PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-times \\tv_ashrrev_i32\\tv[0-9]+, 3, v[0-9]+ 1
>>      PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-times \\tv_lshlrev_b32\\tv[0-9]+, 3, v[0-9]+ 10
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vashlv64di3_exec 2
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vashlv64si3_exec 18
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vashrv64di3_exec 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vashrv64si3_exec 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vlshrv64di3_exec 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vlshrv64si3_exec 1
>>      PASS: gcc.target/gcn/cond_shift_4_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_shift_4_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_shift_8.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_shift_8.c scan-assembler-not movv64di_exec/0
>>      PASS: gcc.target/gcn/cond_shift_8.c scan-assembler-not movv64si_exec/0
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vashlv64di3_exec 2
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vashlv64si3_exec 18
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vashrv64di3_exec 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vashrv64si3_exec 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vlshrv64di3_exec 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vlshrv64si3_exec 1
>>      PASS: gcc.target/gcn/cond_shift_8_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_shift_8_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_shift_9.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_shift_9.c scan-assembler-not movv64di_exec/1
>>      PASS: gcc.target/gcn/cond_shift_9.c scan-assembler-not movv64si_exec/2
>>      PASS: gcc.target/gcn/cond_shift_9.c scan-assembler-not v_cndmask_b32
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vashlv64di3_exec 2
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vashlv64si3_exec 18
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vashrv64di3_exec 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vashrv64si3_exec 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vlshrv64di3_exec 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vlshrv64si3_exec 1
>>      PASS: gcc.target/gcn/cond_shift_9_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_shift_9_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_smax_1.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smax_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
>>      PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+
>>      PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
>>      PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not smaxv64si3/0
>>      PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
>>      PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_gt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
>>      PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], -1 10
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smax_1.c scan-assembler-times smaxv64si3_exec 30
>>      PASS: gcc.target/gcn/cond_smax_1_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_smax_1_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_smin_1.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smin_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
>>      PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+
>>      PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
>>      PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not sminv64si3/0
>>      PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
>>      PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_lt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
>>      PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], -1 10
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smin_1.c scan-assembler-times sminv64si3_exec 30
>>      PASS: gcc.target/gcn/cond_smin_1_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_smin_1_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_umax_1.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
>>      PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
>>      PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-not umaxv64si3/0
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
>>      PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_gt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
>>      PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], 1 8
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-times umaxv64si3_exec 20
>>      PASS: gcc.target/gcn/cond_umax_1_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_umax_1_run.c execution test
>>
>>      PASS: gcc.target/gcn/cond_umin_1.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
>>      PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
>>      PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-not uminv64si3/0
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
>>      PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_lt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
>>      PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], 1 8
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-times uminv64si3_exec 20
>>      PASS: gcc.target/gcn/cond_umin_1_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/cond_umin_1_run.c execution test
>>
>>      PASS: gcc.target/gcn/simd-math-1.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_acos"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_acosh"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_asin"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_asinh"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_atan"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_atan2"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_atanh"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_copysign"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_cos"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_cosh"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_erf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_exp"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_exp2"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_fmod"
>>      XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_gamma"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_hypot"
>>      XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_lgamma"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_log"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_log10"
>>      XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_log2"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_pow"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_remainder"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_rint"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_scalb"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_significand"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_sin"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_sinh"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_sqrt"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_tan"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_tanh"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_tgamma"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_acosf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_acoshf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_asinf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_asinhf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_atan2f"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_atanf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_atanhf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_copysignf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_cosf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_coshf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_erff"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_exp2f"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_expf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_fmodf"
>>      XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_gammaf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_hypotf"
>>      XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_lgammaf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_log10f"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_log2f"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_logf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_powf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_remainderf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_rintf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_scalbf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_significandf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_sinf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_sinhf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_sqrtf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_tanf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_tanhf"
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_tgammaf"
>>
>>      @@ -125130,7 +125130,7 @@ PASS: gcc.target/gcn/simd-math-5-char-run.c (test for excess errors)
>>      PASS: gcc.target/gcn/simd-math-5-char-run.c execution test
>>      PASS: gcc.target/gcn/simd-math-5-char.c (test for excess errors)
>>      XFAIL: gcc.target/gcn/simd-math-5-char.c scan-assembler-times __divmodv64si4@rel32@lo 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-char.c scan-assembler-times __divv64hi3@rel32@lo 1
>>      PASS: gcc.target/gcn/simd-math-5-char.c scan-assembler-times __divv64qi3@rel32@lo 0
>>      FAIL: gcc.target/gcn/simd-math-5-char.c scan-assembler-times __modv64qi3@rel32@lo 1
>>      PASS: gcc.target/gcn/simd-math-5-char.c scan-assembler-times __udivv64qi3@rel32@lo 0
>>
>>      @@ -125171,8 +125171,8 @@ PASS: gcc.target/gcn/simd-math-5-long-run.c (test for excess errors)
>>      PASS: gcc.target/gcn/simd-math-5-long-run.c execution test
>>      PASS: gcc.target/gcn/simd-math-5-long.c (test for excess errors)
>>      XFAIL: gcc.target/gcn/simd-math-5-long.c scan-assembler-times __divmodv64di4@rel32@lo 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-long.c scan-assembler-times __divv64di3@rel32@lo 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-long.c scan-assembler-times __modv64di3@rel32@lo 1
>>      PASS: gcc.target/gcn/simd-math-5-long.c scan-assembler-times __udivv64di3@rel32@lo 0
>>      PASS: gcc.target/gcn/simd-math-5-long.c scan-assembler-times __umodv64di3@rel32@lo 0
>>
>>      PASS: gcc.target/gcn/simd-math-5-short.c (test for excess errors)
>>      XFAIL: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __divmodv64si4@rel32@lo 1
>>      PASS: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __divv64hi3@rel32@lo 0
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-short.c scan-assembler-times __divv64si3@rel32@lo 1
>>      FAIL: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __modv64hi3@rel32@lo 1
>>      PASS: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __udivv64hi3@rel32@lo 0
>>      PASS: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __umodv64hi3@rel32@lo 0
>>
>>      PASS: gcc.target/gcn/simd-math-5.c (test for excess errors)
>>      XFAIL: gcc.target/gcn/simd-math-5.c scan-assembler-times __divmodv64si4@rel32@lo 1
>>      PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __divsi3@rel32@lo 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5.c scan-assembler-times __divv64si3@rel32@lo 1
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5.c scan-assembler-times __modv64si3@rel32@lo 1
>>      PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __udivmodv64si4@rel32@lo 0
>>      PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __udivsi3@rel32@lo 0
>>      PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __udivv64si3@rel32@lo 0
>>      @@ -125242,13 +125242,13 @@ PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __umodv64si3@rel32@lo 0
>>
>>      PASS: gcc.target/gcn/smax_1.c (test for excess errors)
>>      PASS: gcc.target/gcn/smax_1.c scan-assembler-times \\tv_cmp_gt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
>>      FAIL: gcc.target/gcn/smax_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/smax_1.c scan-assembler-times vec_cmpv64didi 10
>>      PASS: gcc.target/gcn/smax_1_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/smax_1_run.c execution test
>>
>>      PASS: gcc.target/gcn/smin_1.c (test for excess errors)
>>      PASS: gcc.target/gcn/smin_1.c scan-assembler-times \\tv_cmp_lt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
>>      FAIL: gcc.target/gcn/smin_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/smin_1.c scan-assembler-times vec_cmpv64didi 10
>>      PASS: gcc.target/gcn/smin_1_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/smin_1_run.c execution test
>>
>>      PASS: gcc.target/gcn/sram-ecc-3.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-3.c scan-assembler (\\*zero_extendv64qiv64si_sdwa|\\*zero_extendv64qiv64si_shift)
>>
>>      PASS: gcc.target/gcn/sram-ecc-4.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-4.c scan-assembler (\\*zero_extendv64hiv64si_sdwa|\\*zero_extendv64hiv64si_shift)
>>
>>      PASS: gcc.target/gcn/sram-ecc-7.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-7.c scan-assembler (\\*zero_extendv64qiv64si_sdwa|\\*zero_extendv64qiv64si_shift)
>>
>>      PASS: gcc.target/gcn/sram-ecc-8.c (test for excess errors)
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-8.c scan-assembler (\\*zero_extendv64hiv64si_sdwa|\\*zero_extendv64hiv64si_shift)
>>
>>      PASS: gcc.target/gcn/umax_1.c (test for excess errors)
>>      PASS: gcc.target/gcn/umax_1.c scan-assembler-times \\tv_cmp_gt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
>>      FAIL: gcc.target/gcn/umax_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/umax_1.c scan-assembler-times vec_cmpv64didi 8
>>      PASS: gcc.target/gcn/umax_1_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/umax_1_run.c execution test
>>
>>      PASS: gcc.target/gcn/umin_1.c (test for excess errors)
>>      PASS: gcc.target/gcn/umin_1.c scan-assembler-times \\tv_cmp_lt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
>>      FAIL: gcc.target/gcn/umin_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
>>      [-PASS:-]{+FAIL:+} gcc.target/gcn/umin_1.c scan-assembler-times vec_cmpv64didi 8
>>      PASS: gcc.target/gcn/umin_1_run.c (test for excess errors)
>>      PASS: gcc.target/gcn/umin_1_run.c execution test
>>
>>
>> Grüße
>>   Thomas
>>
>>
>>> gcc/ChangeLog:
>>>
>>> 	* config/gcn/gcn.cc (gcn_vectorize_preferred_simd_mode): Prefer V32 on
>>> 	RDNA devices.
>>> ---
>>>   gcc/config/gcn/gcn.cc | 26 ++++++++++++++++++++++++++
>>>   1 file changed, 26 insertions(+)
>>>
>>> diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
>>> index 498146dcde9..efb73af50c4 100644
>>> --- a/gcc/config/gcn/gcn.cc
>>> +++ b/gcc/config/gcn/gcn.cc
>>> @@ -5226,6 +5226,32 @@ gcn_vector_mode_supported_p (machine_mode mode)
>>>   static machine_mode
>>>   gcn_vectorize_preferred_simd_mode (scalar_mode mode)
>>>   {
>>> +  /* RDNA devices have 32-lane vectors with limited support for 64-bit vectors
>>> +     (in particular, permute operations are only available for cases that don't
>>> +     span the 32-lane boundary).
>>> +
>>> +     From the RDNA3 manual: "Hardware may choose to skip either half if the
>>> +     EXEC mask for that half is all zeros...". This means that preferring
>>> +     32-lanes is a good stop-gap until we have proper wave32 support.  */
>>> +  if (TARGET_RDNA2_PLUS)
>>> +    switch (mode)
>>> +      {
>>> +      case E_QImode:
>>> +	return V32QImode;
>>> +      case E_HImode:
>>> +	return V32HImode;
>>> +      case E_SImode:
>>> +	return V32SImode;
>>> +      case E_DImode:
>>> +	return V32DImode;
>>> +      case E_SFmode:
>>> +	return V32SFmode;
>>> +      case E_DFmode:
>>> +	return V32DFmode;
>>> +      default:
>>> +	return word_mode;
>>> +      }
>>> +
>>>     switch (mode)
>>>       {
>>>       case E_QImode:
>>> -- 
>>> 2.41.0
> 
>
diff mbox series

Patch

From 9282ea8b064bc22866edb11fc422a85d3298a6b3 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <tschwinge@baylibre.com>
Date: Sat, 24 Feb 2024 00:29:14 +0100
Subject: [PATCH] GCN:
 '--param=gcn-preferred-vector-lane-width=[default,32,64]'

..., and specify '--param=gcn-preferred-vector-lane-width=64' for
'gcc.target/gcn/[...]' test cases with 'scan-assembler' directives that
are specific to 64-lane vectors.  This resolves regressions introduced
in commit 6dedafe166cc02ae87b6a0699ad61ce3ffc46803
"amdgcn: Prefer V32 on RDNA devices".

	gcc/
	* config/gcn/gcn.opt (--param=gcn-preferred-vector-lane-width):
	New.
	* config/gcn/gcn.cc (gcn_vectorize_preferred_simd_mode) Use it.
	* doc/invoke.texi (Optimize Options): Document it.
	gcc/testsuite/
	* gcc.target/gcn/cond_fmaxnm_1.c: Specify
	'--param=gcn-preferred-vector-lane-width=64'.
	* gcc.target/gcn/cond_fmaxnm_2.c: Likewise.
	* gcc.target/gcn/cond_fmaxnm_3.c: Likewise.
	* gcc.target/gcn/cond_fmaxnm_4.c: Likewise.
	* gcc.target/gcn/cond_fmaxnm_5.c: Likewise.
	* gcc.target/gcn/cond_fmaxnm_6.c: Likewise.
	* gcc.target/gcn/cond_fmaxnm_7.c: Likewise.
	* gcc.target/gcn/cond_fmaxnm_8.c: Likewise.
	* gcc.target/gcn/cond_fminnm_1.c: Likewise.
	* gcc.target/gcn/cond_fminnm_2.c: Likewise.
	* gcc.target/gcn/cond_fminnm_3.c: Likewise.
	* gcc.target/gcn/cond_fminnm_4.c: Likewise.
	* gcc.target/gcn/cond_fminnm_5.c: Likewise.
	* gcc.target/gcn/cond_fminnm_6.c: Likewise.
	* gcc.target/gcn/cond_fminnm_7.c: Likewise.
	* gcc.target/gcn/cond_fminnm_8.c: Likewise.
	* gcc.target/gcn/cond_shift_3.c: Likewise.
	* gcc.target/gcn/cond_shift_4.c: Likewise.
	* gcc.target/gcn/cond_shift_8.c: Likewise.
	* gcc.target/gcn/cond_shift_9.c: Likewise.
	* gcc.target/gcn/cond_smax_1.c: Likewise.
	* gcc.target/gcn/cond_smin_1.c: Likewise.
	* gcc.target/gcn/cond_umax_1.c: Likewise.
	* gcc.target/gcn/cond_umin_1.c: Likewise.
	* gcc.target/gcn/simd-math-1.c: Likewise.
	* gcc.target/gcn/simd-math-5-char.c: Likewise.
	* gcc.target/gcn/simd-math-5-long.c: Likewise.
	* gcc.target/gcn/simd-math-5-short.c: Likewise.
	* gcc.target/gcn/simd-math-5.c: Likewise.
	* gcc.target/gcn/smax_1.c: Likewise.
	* gcc.target/gcn/smin_1.c: Likewise.
	* gcc.target/gcn/umax_1.c: Likewise.
	* gcc.target/gcn/umin_1.c: Likewise.
---
 gcc/config/gcn/gcn.cc                            | 14 +++++++++++++-
 gcc/config/gcn/gcn.opt                           | 16 ++++++++++++++++
 gcc/doc/invoke.texi                              |  8 ++++++++
 gcc/testsuite/gcc.target/gcn/cond_fmaxnm_1.c     |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fmaxnm_2.c     |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fmaxnm_3.c     |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fmaxnm_4.c     |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fmaxnm_5.c     |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fmaxnm_6.c     |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fmaxnm_7.c     |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fmaxnm_8.c     |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fminnm_1.c     |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fminnm_2.c     |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fminnm_3.c     |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fminnm_4.c     |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fminnm_5.c     |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fminnm_6.c     |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fminnm_7.c     |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fminnm_8.c     |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_shift_3.c      |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_shift_4.c      |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_shift_8.c      |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_shift_9.c      |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_smax_1.c       |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_smin_1.c       |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_umax_1.c       |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_umin_1.c       |  2 ++
 gcc/testsuite/gcc.target/gcn/simd-math-1.c       |  3 ++-
 gcc/testsuite/gcc.target/gcn/simd-math-5-char.c  |  3 +++
 gcc/testsuite/gcc.target/gcn/simd-math-5-long.c  |  3 +++
 gcc/testsuite/gcc.target/gcn/simd-math-5-short.c |  3 +++
 gcc/testsuite/gcc.target/gcn/simd-math-5.c       |  3 +++
 gcc/testsuite/gcc.target/gcn/smax_1.c            |  2 ++
 gcc/testsuite/gcc.target/gcn/smin_1.c            |  2 ++
 gcc/testsuite/gcc.target/gcn/umax_1.c            |  2 ++
 gcc/testsuite/gcc.target/gcn/umin_1.c            |  2 ++
 36 files changed, 107 insertions(+), 2 deletions(-)

diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index 700e554855e..666f9fdebb2 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -5231,6 +5231,14 @@  gcn_vector_mode_supported_p (machine_mode mode)
 static machine_mode
 gcn_vectorize_preferred_simd_mode (scalar_mode mode)
 {
+  bool v32;
+  if (gcn_preferred_vector_lane_width == 32)
+    v32 = true;
+  else if (gcn_preferred_vector_lane_width == 64)
+    v32 = false;
+  else if (gcn_preferred_vector_lane_width != -1)
+    gcc_unreachable ();
+  else if (TARGET_RDNA2_PLUS)
   /* RDNA devices have 32-lane vectors with limited support for 64-bit vectors
      (in particular, permute operations are only available for cases that don't
      span the 32-lane boundary).
@@ -5238,7 +5246,11 @@  gcn_vectorize_preferred_simd_mode (scalar_mode mode)
      From the RDNA3 manual: "Hardware may choose to skip either half if the
      EXEC mask for that half is all zeros...". This means that preferring
      32-lanes is a good stop-gap until we have proper wave32 support.  */
-  if (TARGET_RDNA2_PLUS)
+    v32 = true;
+  else
+    v32 = false;
+
+  if (v32)
     switch (mode)
       {
       case E_QImode:
diff --git a/gcc/config/gcn/gcn.opt b/gcc/config/gcn/gcn.opt
index 1067b45f294..4c0ab2b19ee 100644
--- a/gcc/config/gcn/gcn.opt
+++ b/gcc/config/gcn/gcn.opt
@@ -116,3 +116,19 @@  Compile for devices requiring XNACK enabled. Default \"any\" if USM is supported
 msram-ecc=
 Target RejectNegative Joined ToLower Enum(hsaco_attr_type) Var(flag_sram_ecc) Init(HSACO_ATTR_ANY)
 Compile for devices with the SRAM ECC feature enabled, or not. Default \"any\".
+
+-param=gcn-preferred-vector-lane-width=
+Target Joined Enum(gcn_preferred_vector_lane_width) Var(gcn_preferred_vector_lane_width) Init(-1) Param
+--param=gcn-preferred-vector-lane-width=[default,32,64]	Preferred vector lane width.
+
+Enum
+Name(gcn_preferred_vector_lane_width) Type(int)
+
+EnumValue
+Enum(gcn_preferred_vector_lane_width) String(default) Value(-1)
+
+EnumValue
+Enum(gcn_preferred_vector_lane_width) String(32) Value(32)
+
+EnumValue
+Enum(gcn_preferred_vector_lane_width) String(64) Value(64)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e5acab24cae..9b9085f7167 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -17016,6 +17016,14 @@  loop.  The default value is four.
 
 @end table
 
+The following choices of @var{name} are available on GCN targets:
+
+@table @gcctabopt
+@item gcn-preferred-vector-lane-width
+Preferred vector lane width: @samp{default}, @samp{32}, @samp{64}.
+
+@end table
+
 The following choices of @var{name} are available on i386 and x86_64 targets:
 
 @table @gcctabopt
diff --git a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_1.c b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_1.c
index 17c49bdc518..c36967015e2 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_1.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_1.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -ffast-math -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include <stdint.h>
 
diff --git a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_2.c b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_2.c
index 406df48962a..21afb77a785 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_2.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_2.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -ffast-math -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include <stdint.h>
 
diff --git a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_3.c b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_3.c
index 45b8b7883ba..ee36f28be35 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_3.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_3.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -ffast-math -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include <stdint.h>
 
diff --git a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_4.c b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_4.c
index 416aea89e6e..a73e6e7c017 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_4.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_4.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -ffast-math -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include <stdint.h>
 
diff --git a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_5.c b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_5.c
index a4d7ab991de..73a1a736b87 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_5.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_5.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include "cond_fmaxnm_1.c"
 
diff --git a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_6.c b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_6.c
index 6c64a01bcbb..9ba5f5b318f 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_6.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_6.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include "cond_fmaxnm_2.c"
 
diff --git a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_7.c b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_7.c
index bdb3f2f99ef..68646edeb58 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_7.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_7.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include "cond_fmaxnm_3.c"
 
diff --git a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_8.c b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_8.c
index c11633b5236..f3c9e5fc097 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_8.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_fmaxnm_8.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include "cond_fmaxnm_4.c"
 
diff --git a/gcc/testsuite/gcc.target/gcn/cond_fminnm_1.c b/gcc/testsuite/gcc.target/gcn/cond_fminnm_1.c
index bb456887568..66f45a33852 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_fminnm_1.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_fminnm_1.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -ffast-math -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #define FN(X) __builtin_fmin##X
 #include "cond_fmaxnm_1.c"
diff --git a/gcc/testsuite/gcc.target/gcn/cond_fminnm_2.c b/gcc/testsuite/gcc.target/gcn/cond_fminnm_2.c
index 502f8987494..87c788a59ec 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_fminnm_2.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_fminnm_2.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -ffast-math -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #define FN(X) __builtin_fmin##X
 #include "cond_fmaxnm_2.c"
diff --git a/gcc/testsuite/gcc.target/gcn/cond_fminnm_3.c b/gcc/testsuite/gcc.target/gcn/cond_fminnm_3.c
index 2ea1eb2ec2c..d24a2ab1a80 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_fminnm_3.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_fminnm_3.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -ffast-math -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #define FN(X) __builtin_fmin##X
 #include "cond_fmaxnm_3.c"
diff --git a/gcc/testsuite/gcc.target/gcn/cond_fminnm_4.c b/gcc/testsuite/gcc.target/gcn/cond_fminnm_4.c
index 3673ecafc2d..a4fa6bfaf07 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_fminnm_4.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_fminnm_4.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -ffast-math -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #define FN(X) __builtin_fmin##X
 #include "cond_fmaxnm_4.c"
diff --git a/gcc/testsuite/gcc.target/gcn/cond_fminnm_5.c b/gcc/testsuite/gcc.target/gcn/cond_fminnm_5.c
index ac98941a373..a55d241a1cb 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_fminnm_5.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_fminnm_5.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #define FN(X) __builtin_fmin##X
 #include "cond_fmaxnm_1.c"
diff --git a/gcc/testsuite/gcc.target/gcn/cond_fminnm_6.c b/gcc/testsuite/gcc.target/gcn/cond_fminnm_6.c
index 7f4dba0d314..b34ed424914 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_fminnm_6.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_fminnm_6.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #define FN(X) __builtin_fmin##X
 #include "cond_fmaxnm_2.c"
diff --git a/gcc/testsuite/gcc.target/gcn/cond_fminnm_7.c b/gcc/testsuite/gcc.target/gcn/cond_fminnm_7.c
index 5faf0c5cc59..7c8bd8ce14a 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_fminnm_7.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_fminnm_7.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #define FN(X) __builtin_fmin##X
 #include "cond_fmaxnm_3.c"
diff --git a/gcc/testsuite/gcc.target/gcn/cond_fminnm_8.c b/gcc/testsuite/gcc.target/gcn/cond_fminnm_8.c
index 89d93ac596a..ecd18812caa 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_fminnm_8.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_fminnm_8.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #define FN(X) __builtin_fmin##X
 #include "cond_fmaxnm_4.c"
diff --git a/gcc/testsuite/gcc.target/gcn/cond_shift_3.c b/gcc/testsuite/gcc.target/gcn/cond_shift_3.c
index 983386c1464..d41796ece9c 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_shift_3.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_shift_3.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include <stdint.h>
 
diff --git a/gcc/testsuite/gcc.target/gcn/cond_shift_4.c b/gcc/testsuite/gcc.target/gcn/cond_shift_4.c
index c610363d9df..bb22375b6b8 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_shift_4.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_shift_4.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include <stdint.h>
 
diff --git a/gcc/testsuite/gcc.target/gcn/cond_shift_8.c b/gcc/testsuite/gcc.target/gcn/cond_shift_8.c
index 0749e2e5e53..634b3c0a7e5 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_shift_8.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_shift_8.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include <stdint.h>
 
diff --git a/gcc/testsuite/gcc.target/gcn/cond_shift_9.c b/gcc/testsuite/gcc.target/gcn/cond_shift_9.c
index 61aba27504e..77a43fecf08 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_shift_9.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_shift_9.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include <stdint.h>
 
diff --git a/gcc/testsuite/gcc.target/gcn/cond_smax_1.c b/gcc/testsuite/gcc.target/gcn/cond_smax_1.c
index 342b5e827d2..bc80edf2a79 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_smax_1.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_smax_1.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include <stdint.h>
 
diff --git a/gcc/testsuite/gcc.target/gcn/cond_smin_1.c b/gcc/testsuite/gcc.target/gcn/cond_smin_1.c
index ad8b583448b..3df0d8beac5 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_smin_1.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_smin_1.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include <stdint.h>
 
diff --git a/gcc/testsuite/gcc.target/gcn/cond_umax_1.c b/gcc/testsuite/gcc.target/gcn/cond_umax_1.c
index 389228f9e4a..f573bbe27ae 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_umax_1.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_umax_1.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include <stdint.h>
 
diff --git a/gcc/testsuite/gcc.target/gcn/cond_umin_1.c b/gcc/testsuite/gcc.target/gcn/cond_umin_1.c
index 65759d695ad..73b4089adcd 100644
--- a/gcc/testsuite/gcc.target/gcn/cond_umin_1.c
+++ b/gcc/testsuite/gcc.target/gcn/cond_umin_1.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include <stdint.h>
 
diff --git a/gcc/testsuite/gcc.target/gcn/simd-math-1.c b/gcc/testsuite/gcc.target/gcn/simd-math-1.c
index 6868ccb2c54..a4346f866b7 100644
--- a/gcc/testsuite/gcc.target/gcn/simd-math-1.c
+++ b/gcc/testsuite/gcc.target/gcn/simd-math-1.c
@@ -2,7 +2,8 @@ 
 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -fno-math-errno -mstack-size=3000000 -fdump-tree-vect" } */
-
+/* The 'scan-tree-dump' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #undef PRINT_RESULT
 #define VERBOSE 0
diff --git a/gcc/testsuite/gcc.target/gcn/simd-math-5-char.c b/gcc/testsuite/gcc.target/gcn/simd-math-5-char.c
index 2321c8390c6..5011839e29f 100644
--- a/gcc/testsuite/gcc.target/gcn/simd-math-5-char.c
+++ b/gcc/testsuite/gcc.target/gcn/simd-math-5-char.c
@@ -1,3 +1,6 @@ 
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
+
 #define TYPE char
 #include "simd-math-5.c"
 
diff --git a/gcc/testsuite/gcc.target/gcn/simd-math-5-long.c b/gcc/testsuite/gcc.target/gcn/simd-math-5-long.c
index 37b6cef691e..baf76171ded 100644
--- a/gcc/testsuite/gcc.target/gcn/simd-math-5-long.c
+++ b/gcc/testsuite/gcc.target/gcn/simd-math-5-long.c
@@ -1,3 +1,6 @@ 
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
+
 #define TYPE long
 #include "simd-math-5.c"
 
diff --git a/gcc/testsuite/gcc.target/gcn/simd-math-5-short.c b/gcc/testsuite/gcc.target/gcn/simd-math-5-short.c
index 84cdc9b5fdd..fd87bf51ffc 100644
--- a/gcc/testsuite/gcc.target/gcn/simd-math-5-short.c
+++ b/gcc/testsuite/gcc.target/gcn/simd-math-5-short.c
@@ -1,3 +1,6 @@ 
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
+
 #define TYPE short
 #include "simd-math-5.c"
 
diff --git a/gcc/testsuite/gcc.target/gcn/simd-math-5.c b/gcc/testsuite/gcc.target/gcn/simd-math-5.c
index bc181b45e1b..65916ddfdda 100644
--- a/gcc/testsuite/gcc.target/gcn/simd-math-5.c
+++ b/gcc/testsuite/gcc.target/gcn/simd-math-5.c
@@ -1,6 +1,9 @@ 
 /* Test that the auto-vectorizer uses the libgcc vectorized division and
    modulus functions.  */
 
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
+
 /* Setting it this way ensures the run tests use the same flag as the
    compile tests.  */
 #pragma GCC optimize("O2")
diff --git a/gcc/testsuite/gcc.target/gcn/smax_1.c b/gcc/testsuite/gcc.target/gcn/smax_1.c
index 46c21f73132..9ce2064b7d2 100644
--- a/gcc/testsuite/gcc.target/gcn/smax_1.c
+++ b/gcc/testsuite/gcc.target/gcn/smax_1.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include <stdint.h>
 
diff --git a/gcc/testsuite/gcc.target/gcn/smin_1.c b/gcc/testsuite/gcc.target/gcn/smin_1.c
index 8d6edfaa3d1..2df1490381d 100644
--- a/gcc/testsuite/gcc.target/gcn/smin_1.c
+++ b/gcc/testsuite/gcc.target/gcn/smin_1.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include <stdint.h>
 
diff --git a/gcc/testsuite/gcc.target/gcn/umax_1.c b/gcc/testsuite/gcc.target/gcn/umax_1.c
index dc4b9842d9a..4f6217dd6e4 100644
--- a/gcc/testsuite/gcc.target/gcn/umax_1.c
+++ b/gcc/testsuite/gcc.target/gcn/umax_1.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include <stdint.h>
 
diff --git a/gcc/testsuite/gcc.target/gcn/umin_1.c b/gcc/testsuite/gcc.target/gcn/umin_1.c
index d07f7ec083b..01b9ba1a225 100644
--- a/gcc/testsuite/gcc.target/gcn/umin_1.c
+++ b/gcc/testsuite/gcc.target/gcn/umin_1.c
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -dp" } */
+/* The 'scan-assembler' directives are specific to 64-lane vectors.
+   { dg-additional-options --param=gcn-preferred-vector-lane-width=64 } */
 
 #include <stdint.h>
 
-- 
2.34.1