Message ID | 20110921074523.GT2687@tyan-ft48-01.lab.bos.redhat.com |
---|---|
State | New |
On Wed, Sep 21, 2011 at 9:45 AM, Jakub Jelinek <jakub@redhat.com> wrote:

>>       * config/i386/i386.md (maxmin): New code iterator.
>>       * config/i386/sse.md (<maxmin:code><mode>3): Macroize expander
>>       from <umaxmin:code><mode>3 and <smaxmin:code><mode>3 using maxmin
>>       code iterator.
>>       (*avx2_<maxmin:code><mode>3): Macroize insn from
>>       *avx2_<umaxmin:code><mode>3 and *avx2_<smaxmin:code><mode>3 using
>>       maxmin code iterator.
>>       (<smaxmin:code><VI124_128:mode>3): Merge with <smaxmin:code>v8hi3.
>>       (<umaxmin:code><VI124_128:mode>3): Merge with umaxv4si3 and
>>       <umaxmin:code>v16qi3.
>>
>> Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.
>
> This regressed gcc.dg/vect/vect-reduc-10.c code quality with -msse2,
> -       psubusw (%rax), %xmm0
> -       paddw   (%rax), %xmm0
> +       movdqa  (%rax), %xmm1
>         addq    $16, %rax
>         cmpq    $aus+2048, %rax
> +       psubusw %xmm1, %xmm0
> +       paddw   %xmm1, %xmm0
>
> the problem is that the two expanders force arguments into registers
> unconditionally, while previously in some cases they were
> nonimmediate_operand instead of register_operand predicated.
> This patch fixes that by always using nonimmediate_operand and adding
> force_reg where needed.
>
> 2011-09-21  Jakub Jelinek  <jakub@redhat.com>
>
>       * config/i386/sse.md (<code><mode>3 smaxmin:VI124_128 expander): Use
>       nonimmediate_operand instead of register_operand predicate for operands
>       1 and 2, force them into registers if expanding them as comparison.
>       (<code><mode>3 umaxmin:VI124_128 expander): Similarly.  For UMAX
>       V8HImode force into register just operand 1.

OK.

Thanks,
Uros.
--- gcc/config/i386/sse.md.jj	2011-09-21 09:04:21.000000000 +0200
+++ gcc/config/i386/sse.md	2011-09-21 09:37:43.000000000 +0200
@@ -5997,8 +5997,8 @@ (define_expand "<code><mode>3"
 
 (define_expand "<code><mode>3"
   [(set (match_operand:VI124_128 0 "register_operand" "")
-        (smaxmin:VI124_128 (match_operand:VI124_128 1 "register_operand" "")
-                           (match_operand:VI124_128 2 "register_operand" "")))]
+        (smaxmin:VI124_128 (match_operand:VI124_128 1 "nonimmediate_operand" "")
+                           (match_operand:VI124_128 2 "nonimmediate_operand" "")))]
   "TARGET_SSE2"
 {
   if (TARGET_SSE4_1 || <MODE>mode == V8HImode)
@@ -6009,6 +6009,8 @@ (define_expand "<code><mode>3"
       bool ok;
 
       xops[0] = operands[0];
+      operands[1] = force_reg (<MODE>mode, operands[1]);
+      operands[2] = force_reg (<MODE>mode, operands[2]);
 
       if (<CODE> == SMAX)
         {
@@ -6064,8 +6066,8 @@ (define_insn "*<code>v8hi3"
 
 (define_expand "<code><mode>3"
   [(set (match_operand:VI124_128 0 "register_operand" "")
-        (umaxmin:VI124_128 (match_operand:VI124_128 1 "register_operand" "")
-                           (match_operand:VI124_128 2 "register_operand" "")))]
+        (umaxmin:VI124_128 (match_operand:VI124_128 1 "nonimmediate_operand" "")
+                           (match_operand:VI124_128 2 "nonimmediate_operand" "")))]
   "TARGET_SSE2"
 {
   if (TARGET_SSE4_1 || <MODE>mode == V16QImode)
@@ -6073,6 +6075,7 @@ (define_expand "<code><mode>3"
   else if (<CODE> == UMAX && <MODE>mode == V8HImode)
     {
       rtx op0 = operands[0], op2 = operands[2], op3 = op0;
+      operands[1] = force_reg (<MODE>mode, operands[1]);
       if (rtx_equal_p (op3, op2))
         op3 = gen_reg_rtx (V8HImode);
       emit_insn (gen_sse2_ussubv8hi3 (op3, operands[1], op2));
@@ -6084,6 +6087,9 @@ (define_expand "<code><mode>3"
       rtx xops[6];
       bool ok;
 
+      operands[1] = force_reg (<MODE>mode, operands[1]);
+      operands[2] = force_reg (<MODE>mode, operands[2]);
+
       xops[0] = operands[0];
 
       if (<CODE> == UMAX)
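
For readers wondering why operand 2 can stay in memory in the UMAX V8HImode path: the quoted assembly pairs psubusw with paddw, and the expander emits gen_sse2_ussubv8hi3, which suggests the identity umax(a, b) = (a -sat b) + b, where operand 2 (b) is only ever read as a source. The following is a minimal scalar sketch of that identity for a single 16-bit lane, not GCC code; the helper names sat_sub_u16, wrap_add_u16, umax_u16 and check_umax_identity are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one 16-bit lane of psubusw:
   unsigned subtraction that saturates at 0 instead of wrapping. */
static uint16_t sat_sub_u16(uint16_t a, uint16_t b)
{
    return a > b ? (uint16_t)(a - b) : 0;
}

/* Hypothetical scalar model of one 16-bit lane of paddw:
   addition that wraps modulo 2^16. */
static uint16_t wrap_add_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}

/* Unsigned max built from the two operations above.
   If a > b the result is (a - b) + b == a; otherwise it is 0 + b == b. */
static uint16_t umax_u16(uint16_t a, uint16_t b)
{
    return wrap_add_u16(sat_sub_u16(a, b), b);
}

/* Compare the identity against a plain comparison on a sampled grid
   of values, including the extremes 0 and 0xFFFF. */
static void check_umax_identity(void)
{
    for (uint32_t a = 0; a <= 0xFFFFu; a += 255u) {
        for (uint32_t b = 0; b <= 0xFFFFu; b += 255u) {
            uint16_t expected = (uint16_t)(a > b ? a : b);
            assert(umax_u16((uint16_t)a, (uint16_t)b) == expected);
        }
    }
}
```

Calling check_umax_identity() from a test driver exercises the sampled grid. In this model b appears as the second source of both operations, which matches the pre-regression code where both instructions read (%rax) directly.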