Message ID | EB4625145972F94C9680D8CADD651615773094A9@SATLEXDAG02.amd.com |
---|---|
State | New |
Headers | show |
On Tue, Jun 10, 2014 at 12:30 PM, Gopalasubramanian, Ganesh <Ganesh.Gopalasubramanian@amd.com> wrote: > Hi, > > The below patch fixes the issue with 64-bit multiplication. > The instruction "vpmacsdql" does signed 32-bit multiplication. > For V2DImode, we require widened unsigned multiplication. > So, replacing the "vpmacsdql" instruction with "vpmuludq" and "vpaddq". > > This patch had been already discussed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52908 > > With required change in the test xop-imul64-vector.c, make check passes. Is it OK for upstream? > > Regards > Ganesh > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index d0a1253..c158612 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,9 @@ > +2014-06-10 Ganesh Gopalasubramanian <Ganesh.Gopalasubramanian@amd.com> > + > + * config/i386/i386.c (ix86_expand_sse2_mulvxdi3): Issue instructions > +"vpmuludq" and "vpaddq" instead of "vpmacsdql" for handling 32-bit > +multiplication. > OK for mainline and release branches. Thanks, Uros.
Hi Uros! > > +2014-06-10 Ganesh Gopalasubramanian > > +<Ganesh.Gopalasubramanian@amd.com> > > + > > + * config/i386/i386.c (ix86_expand_sse2_mulvxdi3): Issue > > +instructions "vpmuludq" and "vpaddq" instead of "vpmacsdql" for > > +handling 32-bit multiplication. > > > OK for mainline and release branches. I would like to backport the above patch for 4.9. Is it OK? Regards Ganesh
On Tue, Aug 12, 2014 at 7:23 AM, Gopalasubramanian, Ganesh <Ganesh.Gopalasubramanian@amd.com> wrote: >> > +2014-06-10 Ganesh Gopalasubramanian >> > +<Ganesh.Gopalasubramanian@amd.com> >> > + >> > + * config/i386/i386.c (ix86_expand_sse2_mulvxdi3): Issue >> > +instructions "vpmuludq" and "vpaddq" instead of "vpmacsdql" for >> > +handling 32-bit multiplication. >> > > >> OK for mainline and release branches. > > I would like to backport the above patch for 4.9. Is it OK? OK. It is a correctness issue. Uros.
diff --git a/gcc/ChangeLog b/gcc/ChangeLog index d0a1253..c158612 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,9 @@ +2014-06-10 Ganesh Gopalasubramanian <Ganesh.Gopalasubramanian@amd.com> + + * config/i386/i386.c (ix86_expand_sse2_mulvxdi3): Issue instructions +"vpmuludq" and "vpaddq" instead of "vpmacsdql" for handling 32-bit +multiplication. + 2014-06-07 Jan Hubicka <hubicka@ucw.cz> * cgraphunit.c (assemble_thunks_and_aliases): Expand thunks before diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 9105132..184d82d 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -45205,8 +45205,10 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2) /* t4: ((B*E)+(A*F))<<32, ((D*G)+(C*H))<<32 */ emit_insn (gen_ashlv2di3 (t4, t3, GEN_INT (32))); - /* op0: (((B*E)+(A*F))<<32)+(B*F), (((D*G)+(C*H))<<32)+(D*H) */ - emit_insn (gen_xop_pmacsdql (op0, op1, op2, t4)); + /* Multiply lower parts and add all */ + t5 = gen_reg_rtx (V2DImode); + emit_insn (gen_vec_widen_umult_even_v4si (t5, gen_lowpart (V4SImode, op1), gen_lowpart (V4SImode, op2))); + op0 = expand_binop (mode, add_optab, t5, t4, op0, 1, OPTAB_DIRECT); } else { diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index a6913af..757d3e3 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2014-06-10 Ganesh Gopalasubramanian <Ganesh.Gopalasubramanian@amd.com> + + * gcc.target/i386/xop-imul64-vector.c: Remove the check for + vpmacsdql instruction. + 2014-06-07 Eric Botcazou <ebotcazou@adacore.com> * gnat.dg/opt38.adb: New test. diff --git a/gcc/testsuite/gcc.target/i386/xop-imul64-vector.c b/gcc/testsuite/gcc.target/i386/xop-imul64-vector.c index fbf605f..fc8c880 100644 --- a/gcc/testsuite/gcc.target/i386/xop-imul64-vector.c +++ b/gcc/testsuite/gcc.target/i386/xop-imul64-vector.c @@ -33,4 +33,3 @@ int main () /* { dg-final { scan-assembler "vpmulld" } } */ /* { dg-final { scan-assembler "vphadddq" } } */ -/* { dg-final { scan-assembler "vpmacsdql" } } */