Message ID | 8acdba25-576b-19d6-71b8-d6ca84ec675c@codesourcery.com |
---|---|
State | New |
Headers | show |
Series | [GCN] Fix handling of VCC_CONDITIONAL_REG | expand |
On 14/11/2019 12:43, Kwok Cheung Yeung wrote: > Hello > > This patch fixes an issue seen in the following test cases on AMD GCN: > > libgomp.oacc-fortran/gemm.f90 > libgomp.oacc-fortran/gemm-2.f90 > libgomp.c/for-5-test_ttdpfs_ds128_auto.c > libgomp.c/for-5-test_ttdpfs_ds128_guided32.c > libgomp.c/for-5-test_ttdpfs_ds128_runtime.c > libgomp.c/for-5-test_ttdpfs_ds128_static.c > libgomp.c/for-5-test_ttdpfs_ds128_static32.c > libgomp.c/for-6-test_tdpfs_ds128_auto.c > libgomp.c/for-6-test_tdpfs_ds128_guided32.c > libgomp.c/for-6-test_tdpfs_ds128_runtime.c > libgomp.c/for-6-test_tdpfs_ds128_static.c > libgomp.c/for-6-test_tdpfs_ds128_static32.c > libgomp.c-c++-common/for-5.c > libgomp.c-c++-common/for-6.c > > The compiler is generating code like this: > > v_cmp_gt_i32 vcc, s3, v1 > v_writelane_b32 v3, vcc_lo, 0 ; Move VCC to v[3:4] > v_writelane_b32 v4, vcc_hi, 0 > ... > v_cmp_eq_f32 vcc, v20, v0 ; Clobber VCC > ... > ; Move old VCC_LO into s62 - this is okay as v3 has not > ; been clobbered. > v_readlane_b32 s62, v3, 0 > > ; Move old VCC_HI into s63 - this is _not_ okay as VCC_HI > ; has been clobbered by the intervening v_cmp_eq_f32 instruction. > s_mov_b32 s63, vcc_hi > > GCC is failing to notice that the v_cmp_eq_f32 instruction clobbers both > vcc_lo and vcc_hi. This is because gcn_hard_regno_nregs uses > REGNO_REG_CLASS to determine the register class of the queried hard reg, > but vcc_lo and vcc_hi are classified as GENERAL_REGS by > gcn_regno_reg_class. gcn_hard_regno_nregs therefore returns 1 for the > BImode store into vcc, so vcc_hi is regarded as untouched by the operation. > > This is fixed by making gcn_regno_reg_class classify vcc_lo and vcc_hi > into VCC_CONDITIONAL_REG (REGNO_REG_CLASS is supposed to return the > smallest class anyway). I have also added a spill class for > VCC_CONDITIONAL_REG (into SGPR_REGS) to avoid expensive spills into memory. > > Built for and tested on the AMD GCN target with no regressions. > > Okay for trunk? > > Kwok > > > 2019-11-14 Kwok Cheung Yeung <kcy@codesourcery.com> > > gcc/ > * config/gcn/gcn.c (gcn_regno_reg_class): Return VCC_CONDITIONAL_REG > register class for VCC_LO and VCC_HI. > (gcn_spill_class): Use SGPR_REGS to spill registers in > VCC_CONDITIONAL_REG. > --- > gcc/config/gcn/gcn.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c > index 976843b..dd89c26 100644 > --- a/gcc/config/gcn/gcn.c > +++ b/gcc/config/gcn/gcn.c > @@ -470,6 +470,9 @@ gcn_regno_reg_class (int regno) > { > case SCC_REG: > return SCC_CONDITIONAL_REG; > + case VCC_LO_REG: > + case VCC_HI_REG: > + return VCC_CONDITIONAL_REG; > case VCCZ_REG: > return VCCZ_CONDITIONAL_REG; > case EXECZ_REG: > @@ -637,7 +640,8 @@ gcn_can_split_p (machine_mode, rtx op) > static reg_class_t > gcn_spill_class (reg_class_t c, machine_mode /*mode */ ) > { > - if (reg_classes_intersect_p (ALL_CONDITIONAL_REGS, c)) > + if (reg_classes_intersect_p (ALL_CONDITIONAL_REGS, c) > + || c == VCC_CONDITIONAL_REG) > return SGPR_REGS; > else > return NO_REGS; OK. Andrew
diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c index 976843b..dd89c26 100644 --- a/gcc/config/gcn/gcn.c +++ b/gcc/config/gcn/gcn.c @@ -470,6 +470,9 @@ gcn_regno_reg_class (int regno) { case SCC_REG: return SCC_CONDITIONAL_REG; + case VCC_LO_REG: + case VCC_HI_REG: + return VCC_CONDITIONAL_REG; case VCCZ_REG: return VCCZ_CONDITIONAL_REG; case EXECZ_REG: @@ -637,7 +640,8 @@ gcn_can_split_p (machine_mode, rtx op) static reg_class_t gcn_spill_class (reg_class_t c, machine_mode /*mode */ ) { - if (reg_classes_intersect_p (ALL_CONDITIONAL_REGS, c)) + if (reg_classes_intersect_p (ALL_CONDITIONAL_REGS, c) + || c == VCC_CONDITIONAL_REG) return SGPR_REGS; else return NO_REGS;