From patchwork Thu Feb 3 05:47:07 2011
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 81618
Date: Thu, 3 Feb 2011 00:47:07 -0500
From: Michael Meissner
To: David Edelsohn
Cc: Michael Meissner, Mark Mitchell, gcc-patches@gcc.gnu.org, rth@redhat.com,
 rguenther@suse.de, jakub@redhat.com, berner@vnet.ibm.com, geoffk@geoffk.org,
 joseph@codesourcery.com, pinskia@gmail.com, dominiq@lps.ens.fr
Subject: Re: [PATCH] Fix PR 47272 to restore Altivec vec_ld/vec_st
Message-ID: <20110203054707.GA24840@hungry-tiger.westford.ibm.com>
References: <20110124213133.GA21518@hungry-tiger.westford.ibm.com>
 <4D3DF07A.3030101@codesourcery.com>
 <20110124215225.GA22498@hungry-tiger.westford.ibm.com>
 <20110131201404.GA13933@hungry-tiger.westford.ibm.com>

On Wed, Feb 02, 2011 at 04:08:44PM -0500, David Edelsohn wrote:
> Okay, without the libcpp/lex.c change, as discussed offline.
>
> Some of the XXX_type_node lines are too long (replacing lines that
> were too long).
>
> Thanks, David

Here is the patch I committed, fixing the long lines in the patch and
dropping the lex.c change.

[gcc]
2011-02-02  Michael Meissner

	PR target/47272
	* doc/extend.texi (PowerPC AltiVec/VSX Built-in Functions):
	Document using vector double with the load/store builtins, and
	that the load/store builtins always use Altivec instructions.
	* config/rs6000/vector.md (vector_altivec_load_<mode>): New insns
	to use altivec memory instructions, even on VSX.
	(vector_altivec_store_<mode>): Ditto.
	* config/rs6000/rs6000-protos.h (rs6000_address_for_altivec):
	New function.
	* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add
	V2DF, V2DI support to load/store overloaded builtins.
	* config/rs6000/rs6000-builtin.def (ALTIVEC_BUILTIN_*): Add
	altivec load/store builtins for V2DF/V2DI types.
	* config/rs6000/rs6000.c (rs6000_option_override_internal): Don't
	set avoid indexed addresses on power6 if -maltivec.
	(altivec_expand_ld_builtin): Add V2DF, V2DI support, use
	vector_altivec_load/vector_altivec_store builtins.
	(altivec_expand_st_builtin): Ditto.
	(altivec_expand_builtin): Add VSX memory builtins.
	(rs6000_init_builtins): Add V2DI types to internal types.
	(altivec_init_builtins): Add support for V2DF/V2DI altivec
	load/store builtins.
	(rs6000_address_for_altivec): Ensure the memory address is
	appropriate for Altivec.
	* config/rs6000/vsx.md (vsx_load_<mode>): New expanders for
	vec_vsx_ld and vec_vsx_st.
	(vsx_store_<mode>): Ditto.
	* config/rs6000/rs6000.h (RS6000_BTI_long_long): New type
	variables to hold long long types for VSX vector memory builtins.
	(RS6000_BTI_unsigned_long_long): Ditto.
	(long_long_integer_type_internal_node): Ditto.
	(long_long_unsigned_type_internal_node): Ditto.
	* config/rs6000/altivec.md (UNSPEC_LVX): New UNSPEC.
	(altivec_lvx_<mode>): Make altivec_lvx use a mode iterator.
	(altivec_stvx_<mode>): Make altivec_stvx use a mode iterator.
	* config/rs6000/altivec.h (vec_vsx_ld): Define VSX memory builtin
	shortcuts.
	(vec_vsx_st): Ditto.

[gcc/testsuite]
2011-02-02  Michael Meissner

	PR target/47272
	* gcc.target/powerpc/vsx-builtin-8.c: New file, test vec_vsx_ld
	and vec_vsx_st.
	* gcc.target/powerpc/avoid-indexed-addresses.c: Disable altivec
	and vsx so a default --with-cpu=power7 doesn't give an error
	when -mavoid-indexed-addresses is used.
	* gcc.target/powerpc/ppc32-abi-dfp-1.c: Rewrite to use an asm
	wrapper function to save the arguments and then jump to the real
	function, rather than depending on the compiler not to move stuff
	before an asm.
	* gcc.target/powerpc/ppc64-abi-dfp-2.c: Ditto.

Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 169775)
+++ gcc/doc/extend.texi	(working copy)
@@ -12359,6 +12359,12 @@ vector bool long vec_cmplt (vector doubl
 vector float vec_div (vector float, vector float);
 vector double vec_div (vector double, vector double);
 vector double vec_floor (vector double);
+vector double vec_ld (int, const vector double *);
+vector double vec_ld (int, const double *);
+vector double vec_ldl (int, const vector double *);
+vector double vec_ldl (int, const double *);
+vector unsigned char vec_lvsl (int, const volatile double *);
+vector unsigned char vec_lvsr (int, const volatile double *);
 vector double vec_madd (vector double, vector double, vector double);
 vector double vec_max (vector double, vector double);
 vector double vec_min (vector double, vector double);
@@ -12387,6 +12393,8 @@ vector double vec_sel (vector double, ve
 vector double vec_sub (vector double, vector double);
 vector float vec_sqrt (vector float);
 vector double vec_sqrt (vector double);
+void vec_st (vector double, int, vector double *);
+void vec_st (vector double, int, double *);
 vector double vec_trunc (vector double);
 vector double vec_xor (vector double, vector double);
 vector double vec_xor (vector double, vector bool long);
@@ -12415,7 +12423,65 @@ int vec_any_ngt (vector double, vector d
 int vec_any_nle (vector double, vector double);
 int vec_any_nlt (vector double, vector double);
 int vec_any_numeric (vector double);
-@end smallexample
+
+vector double vec_vsx_ld (int, const vector double *);
+vector double vec_vsx_ld (int, const double *);
+vector float vec_vsx_ld (int, const vector float *);
+vector float vec_vsx_ld (int, const float *);
+vector bool int vec_vsx_ld (int, const vector bool int *);
+vector signed int vec_vsx_ld (int, const vector signed int *);
+vector signed int vec_vsx_ld (int, const int *);
+vector signed int vec_vsx_ld (int, const long *);
+vector unsigned int vec_vsx_ld (int, const vector unsigned int *);
+vector unsigned int vec_vsx_ld (int, const unsigned int *);
+vector unsigned int vec_vsx_ld (int, const unsigned long *);
+vector bool short vec_vsx_ld (int, const vector bool short *);
+vector pixel vec_vsx_ld (int, const vector pixel *);
+vector signed short vec_vsx_ld (int, const vector signed short *);
+vector signed short vec_vsx_ld (int, const short *);
+vector unsigned short vec_vsx_ld (int, const vector unsigned short *);
+vector unsigned short vec_vsx_ld (int, const unsigned short *);
+vector bool char vec_vsx_ld (int, const vector bool char *);
+vector signed char vec_vsx_ld (int, const vector signed char *);
+vector signed char vec_vsx_ld (int, const signed char *);
+vector unsigned char vec_vsx_ld (int, const vector unsigned char *);
+vector unsigned char vec_vsx_ld (int, const unsigned char *);
+
+void vec_vsx_st (vector double, int, vector double *);
+void vec_vsx_st (vector double, int, double *);
+void vec_vsx_st (vector float, int, vector float *);
+void vec_vsx_st (vector float, int, float *);
+void vec_vsx_st (vector signed int, int, vector signed int *);
+void vec_vsx_st (vector signed int, int, int *);
+void vec_vsx_st (vector unsigned int, int, vector unsigned int *);
+void vec_vsx_st (vector unsigned int, int, unsigned int *);
+void vec_vsx_st (vector bool int, int, vector bool int *);
+void vec_vsx_st (vector bool int, int, unsigned int *);
+void vec_vsx_st (vector bool int, int, int *);
+void vec_vsx_st (vector signed short, int, vector signed short *);
+void vec_vsx_st (vector signed short, int, short *);
+void vec_vsx_st (vector unsigned short, int, vector unsigned short *);
+void vec_vsx_st (vector unsigned short, int, unsigned short *);
+void vec_vsx_st (vector bool short, int, vector bool short *);
+void vec_vsx_st (vector bool short, int, unsigned short *);
+void vec_vsx_st (vector pixel, int, vector pixel *);
+void vec_vsx_st (vector pixel, int, unsigned short *);
+void vec_vsx_st (vector pixel, int, short *);
+void vec_vsx_st (vector bool short, int, short *);
+void vec_vsx_st (vector signed char, int, vector signed char *);
+void vec_vsx_st (vector signed char, int, signed char *);
+void vec_vsx_st (vector unsigned char, int, vector unsigned char *);
+void vec_vsx_st (vector unsigned char, int, unsigned char *);
+void vec_vsx_st (vector bool char, int, vector bool char *);
+void vec_vsx_st (vector bool char, int, unsigned char *);
+void vec_vsx_st (vector bool char, int, signed char *);
+@end smallexample
+
+Note that the @samp{vec_ld} and @samp{vec_st} builtins will always
+generate the Altivec @samp{LVX} and @samp{STVX} instructions even
+if the VSX instruction set is available.  The @samp{vec_vsx_ld} and
+@samp{vec_vsx_st} builtins will always generate the VSX @samp{LXVD2X},
+@samp{LXVW4X}, @samp{STXVD2X}, and @samp{STXVW4X} instructions.
 
 GCC provides a few other builtins on Powerpc to access certain
 instructions:
 @smallexample
Index: gcc/testsuite/gcc.target/powerpc/vsx-builtin-8.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vsx-builtin-8.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vsx-builtin-8.c	(revision 0)
@@ -0,0 +1,97 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O3 -mcpu=power7" } */
+
+/* Test the various load/store variants.
   */
+
+#include <altivec.h>
+
+#define TEST_COPY(NAME, TYPE)					\
+void NAME ## _copy_native (vector TYPE *a, vector TYPE *b)	\
+{								\
+  *a = *b;							\
+}								\
+								\
+void NAME ## _copy_vec (vector TYPE *a, vector TYPE *b)		\
+{								\
+  vector TYPE x = vec_ld (0, b);				\
+  vec_st (x, 0, a);						\
+}								\
+
+#define TEST_COPYL(NAME, TYPE)					\
+void NAME ## _lvxl (vector TYPE *a, vector TYPE *b)		\
+{								\
+  vector TYPE x = vec_ldl (0, b);				\
+  vec_stl (x, 0, a);						\
+}								\
+
+#define TEST_VSX_COPY(NAME, TYPE)				\
+void NAME ## _copy_vsx (vector TYPE *a, vector TYPE *b)		\
+{								\
+  vector TYPE x = vec_vsx_ld (0, b);				\
+  vec_vsx_st (x, 0, a);						\
+}								\
+
+#define TEST_ALIGN(NAME, TYPE)					\
+void NAME ## _align (vector unsigned char *a, TYPE *b)		\
+{								\
+  vector unsigned char x = vec_lvsl (0, b);			\
+  vector unsigned char y = vec_lvsr (0, b);			\
+  vec_st (x, 0, a);						\
+  vec_st (y, 8, a);						\
+}
+
+#ifndef NO_COPY
+TEST_COPY(uchar, unsigned char)
+TEST_COPY(schar, signed char)
+TEST_COPY(bchar, bool char)
+TEST_COPY(ushort, unsigned short)
+TEST_COPY(sshort, signed short)
+TEST_COPY(bshort, bool short)
+TEST_COPY(uint, unsigned int)
+TEST_COPY(sint, signed int)
+TEST_COPY(bint, bool int)
+TEST_COPY(float, float)
+TEST_COPY(double, double)
+#endif	/* NO_COPY */
+
+#ifndef NO_COPYL
+TEST_COPYL(uchar, unsigned char)
+TEST_COPYL(schar, signed char)
+TEST_COPYL(bchar, bool char)
+TEST_COPYL(ushort, unsigned short)
+TEST_COPYL(sshort, signed short)
+TEST_COPYL(bshort, bool short)
+TEST_COPYL(uint, unsigned int)
+TEST_COPYL(sint, signed int)
+TEST_COPYL(bint, bool int)
+TEST_COPYL(float, float)
+TEST_COPYL(double, double)
+#endif	/* NO_COPYL */
+
+#ifndef NO_ALIGN
+TEST_ALIGN(uchar, unsigned char)
+TEST_ALIGN(schar, signed char)
+TEST_ALIGN(ushort, unsigned short)
+TEST_ALIGN(sshort, signed short)
+TEST_ALIGN(uint, unsigned int)
+TEST_ALIGN(sint, signed int)
+TEST_ALIGN(float, float)
+TEST_ALIGN(double, double)
+#endif	/* NO_ALIGN */
+
+
+#ifndef NO_VSX_COPY
+TEST_VSX_COPY(uchar, unsigned char)
+TEST_VSX_COPY(schar, signed char)
+TEST_VSX_COPY(bchar, bool char)
+TEST_VSX_COPY(ushort, unsigned short)
+TEST_VSX_COPY(sshort, signed short)
+TEST_VSX_COPY(bshort, bool short)
+TEST_VSX_COPY(uint, unsigned int)
+TEST_VSX_COPY(sint, signed int)
+TEST_VSX_COPY(bint, bool int)
+TEST_VSX_COPY(float, float)
+TEST_VSX_COPY(double, double)
+#endif	/* NO_VSX_COPY */
Index: gcc/testsuite/gcc.target/powerpc/ppc64-abi-dfp-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ppc64-abi-dfp-1.c	(revision 169775)
+++ gcc/testsuite/gcc.target/powerpc/ppc64-abi-dfp-1.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-do run { target { powerpc64-*-* && { lp64 && dfprt } } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
 /* { dg-options "-std=gnu99 -O2 -fno-strict-aliasing" } */
 
 /* Testcase to check for ABI compliance of parameter passing
@@ -31,60 +32,42 @@ typedef struct
 reg_parms_t gparms;
 
-/* Testcase could break on future gcc's, if parameter regs
-   are changed before this asm.
*/ - -#ifndef __MACH__ -#define save_parms(lparms) \ - asm volatile ("ld 11,gparms@got(2)\n\t" \ - "std 3,0(11)\n\t" \ - "std 4,8(11)\n\t" \ - "std 5,16(11)\n\t" \ - "std 6,24(11)\n\t" \ - "std 7,32(11)\n\t" \ - "std 8,40(11)\n\t" \ - "std 9,48(11)\n\t" \ - "std 10,56(11)\n\t" \ - "stfd 1,64(11)\n\t" \ - "stfd 2,72(11)\n\t" \ - "stfd 3,80(11)\n\t" \ - "stfd 4,88(11)\n\t" \ - "stfd 5,96(11)\n\t" \ - "stfd 6,104(11)\n\t" \ - "stfd 7,112(11)\n\t" \ - "stfd 8,120(11)\n\t" \ - "stfd 9,128(11)\n\t" \ - "stfd 10,136(11)\n\t" \ - "stfd 11,144(11)\n\t" \ - "stfd 12,152(11)\n\t" \ - "stfd 13,160(11)\n\t":::"11", "memory"); \ - lparms = gparms; -#else -#define save_parms(lparms) \ - asm volatile ("ld r11,gparms@got(r2)\n\t" \ - "std r3,0(r11)\n\t" \ - "std r4,8(r11)\n\t" \ - "std r5,16(r11)\n\t" \ - "std r6,24(r11)\n\t" \ - "std r7,32(r11)\n\t" \ - "std r8,40(r11)\n\t" \ - "std r9,48(r11)\n\t" \ - "std r10,56(r11)\n\t" \ - "stfd f1,64(r11)\n\t" \ - "stfd f2,72(r11)\n\t" \ - "stfd f3,80(r11)\n\t" \ - "stfd f4,88(r11)\n\t" \ - "stfd f5,96(r11)\n\t" \ - "stfd f6,104(r11)\n\t" \ - "stfd f7,112(r11)\n\t" \ - "stfd f8,120(r11)\n\t" \ - "stfd f9,128(r11)\n\t" \ - "stfd f10,136(r11)\n\t" \ - "stfd f11,144(r11)\n\t" \ - "stfd f12,152(r11)\n\t" \ - "stfd f13,160(r11)\n\t":::"r11", "memory"); \ - lparms = gparms; -#endif +/* Wrapper to save the GPRs and FPRs and then jump to the real function. */ +#define WRAPPER(NAME) \ +__asm__ ("\t.globl\t" #NAME "_asm\n\t" \ + ".section \".opd\",\"aw\"\n\t" \ + ".align 3\n" \ + #NAME "_asm:\n\t" \ + ".quad .L." #NAME "_asm,.TOC.@tocbase,0\n\t" \ + ".text\n\t" \ + ".type " #NAME "_asm, @function\n" \ + ".L." 
#NAME "_asm:\n\t" \ + "ld 11,gparms@got(2)\n\t" \ + "std 3,0(11)\n\t" \ + "std 4,8(11)\n\t" \ + "std 5,16(11)\n\t" \ + "std 6,24(11)\n\t" \ + "std 7,32(11)\n\t" \ + "std 8,40(11)\n\t" \ + "std 9,48(11)\n\t" \ + "std 10,56(11)\n\t" \ + "stfd 1,64(11)\n\t" \ + "stfd 2,72(11)\n\t" \ + "stfd 3,80(11)\n\t" \ + "stfd 4,88(11)\n\t" \ + "stfd 5,96(11)\n\t" \ + "stfd 6,104(11)\n\t" \ + "stfd 7,112(11)\n\t" \ + "stfd 8,120(11)\n\t" \ + "stfd 9,128(11)\n\t" \ + "stfd 10,136(11)\n\t" \ + "stfd 11,144(11)\n\t" \ + "stfd 12,152(11)\n\t" \ + "stfd 13,160(11)\n\t" \ + "b " #NAME "\n\t" \ + ".long 0\n\t" \ + ".byte 0,0,0,0,0,0,0,0\n\t" \ + ".size " #NAME ",.-" #NAME "\n") typedef struct sf { @@ -97,6 +80,13 @@ typedef struct sf unsigned long slot[100]; } stack_frame_t; +extern void func0_asm (double, double, double, double, double, double, + double, double, double, double, double, double, + double, double, + _Decimal64, _Decimal128, _Decimal64); + +WRAPPER(func0); + /* Fill up floating point registers with double arguments, forcing decimal float arguments into the parameter save area. 
*/ void __attribute__ ((noinline)) @@ -105,186 +95,209 @@ func0 (double a1, double a2, double a3, double a13, double a14, _Decimal64 a15, _Decimal128 a16, _Decimal64 a17) { - reg_parms_t lparms; stack_frame_t *sp; - save_parms (lparms); sp = __builtin_frame_address (0); sp = sp->backchain; - if (a1 != lparms.fprs[0]) FAILURE - if (a2 != lparms.fprs[1]) FAILURE - if (a3 != lparms.fprs[2]) FAILURE - if (a4 != lparms.fprs[3]) FAILURE - if (a5 != lparms.fprs[4]) FAILURE - if (a6 != lparms.fprs[5]) FAILURE - if (a7 != lparms.fprs[6]) FAILURE - if (a8 != lparms.fprs[7]) FAILURE - if (a9 != lparms.fprs[8]) FAILURE - if (a10 != lparms.fprs[9]) FAILURE - if (a11 != lparms.fprs[10]) FAILURE - if (a12 != lparms.fprs[11]) FAILURE - if (a13 != lparms.fprs[12]) FAILURE + if (a1 != gparms.fprs[0]) FAILURE + if (a2 != gparms.fprs[1]) FAILURE + if (a3 != gparms.fprs[2]) FAILURE + if (a4 != gparms.fprs[3]) FAILURE + if (a5 != gparms.fprs[4]) FAILURE + if (a6 != gparms.fprs[5]) FAILURE + if (a7 != gparms.fprs[6]) FAILURE + if (a8 != gparms.fprs[7]) FAILURE + if (a9 != gparms.fprs[8]) FAILURE + if (a10 != gparms.fprs[9]) FAILURE + if (a11 != gparms.fprs[10]) FAILURE + if (a12 != gparms.fprs[11]) FAILURE + if (a13 != gparms.fprs[12]) FAILURE if (a14 != *(double *)&sp->slot[13]) FAILURE if (a15 != *(_Decimal64 *)&sp->slot[14]) FAILURE if (a16 != *(_Decimal128 *)&sp->slot[15]) FAILURE if (a17 != *(_Decimal64 *)&sp->slot[17]) FAILURE } +extern void func1_asm (double, double, double, double, double, double, + double, double, double, double, double, double, + double, _Decimal128 ); + +WRAPPER(func1); + void __attribute__ ((noinline)) func1 (double a1, double a2, double a3, double a4, double a5, double a6, double a7, double a8, double a9, double a10, double a11, double a12, double a13, _Decimal128 a14) { - reg_parms_t lparms; stack_frame_t *sp; - save_parms (lparms); sp = __builtin_frame_address (0); sp = sp->backchain; - if (a1 != lparms.fprs[0]) FAILURE - if (a2 != lparms.fprs[1]) FAILURE 
- if (a3 != lparms.fprs[2]) FAILURE - if (a4 != lparms.fprs[3]) FAILURE - if (a5 != lparms.fprs[4]) FAILURE - if (a6 != lparms.fprs[5]) FAILURE - if (a7 != lparms.fprs[6]) FAILURE - if (a8 != lparms.fprs[7]) FAILURE - if (a9 != lparms.fprs[8]) FAILURE - if (a10 != lparms.fprs[9]) FAILURE - if (a11 != lparms.fprs[10]) FAILURE - if (a12 != lparms.fprs[11]) FAILURE - if (a13 != lparms.fprs[12]) FAILURE + if (a1 != gparms.fprs[0]) FAILURE + if (a2 != gparms.fprs[1]) FAILURE + if (a3 != gparms.fprs[2]) FAILURE + if (a4 != gparms.fprs[3]) FAILURE + if (a5 != gparms.fprs[4]) FAILURE + if (a6 != gparms.fprs[5]) FAILURE + if (a7 != gparms.fprs[6]) FAILURE + if (a8 != gparms.fprs[7]) FAILURE + if (a9 != gparms.fprs[8]) FAILURE + if (a10 != gparms.fprs[9]) FAILURE + if (a11 != gparms.fprs[10]) FAILURE + if (a12 != gparms.fprs[11]) FAILURE + if (a13 != gparms.fprs[12]) FAILURE if (a14 != *(_Decimal128 *)&sp->slot[13]) FAILURE } +extern void func2_asm (double, double, double, double, double, double, + double, double, double, double, double, double, + _Decimal128); + +WRAPPER(func2); + void __attribute__ ((noinline)) func2 (double a1, double a2, double a3, double a4, double a5, double a6, double a7, double a8, double a9, double a10, double a11, double a12, _Decimal128 a13) { - reg_parms_t lparms; stack_frame_t *sp; - save_parms (lparms); sp = __builtin_frame_address (0); sp = sp->backchain; - if (a1 != lparms.fprs[0]) FAILURE - if (a2 != lparms.fprs[1]) FAILURE - if (a3 != lparms.fprs[2]) FAILURE - if (a4 != lparms.fprs[3]) FAILURE - if (a5 != lparms.fprs[4]) FAILURE - if (a6 != lparms.fprs[5]) FAILURE - if (a7 != lparms.fprs[6]) FAILURE - if (a8 != lparms.fprs[7]) FAILURE - if (a9 != lparms.fprs[8]) FAILURE - if (a10 != lparms.fprs[9]) FAILURE - if (a11 != lparms.fprs[10]) FAILURE - if (a12 != lparms.fprs[11]) FAILURE + if (a1 != gparms.fprs[0]) FAILURE + if (a2 != gparms.fprs[1]) FAILURE + if (a3 != gparms.fprs[2]) FAILURE + if (a4 != gparms.fprs[3]) FAILURE + if (a5 != 
gparms.fprs[4]) FAILURE + if (a6 != gparms.fprs[5]) FAILURE + if (a7 != gparms.fprs[6]) FAILURE + if (a8 != gparms.fprs[7]) FAILURE + if (a9 != gparms.fprs[8]) FAILURE + if (a10 != gparms.fprs[9]) FAILURE + if (a11 != gparms.fprs[10]) FAILURE + if (a12 != gparms.fprs[11]) FAILURE if (a13 != *(_Decimal128 *)&sp->slot[12]) FAILURE } +extern void func3_asm (_Decimal64, _Decimal128, _Decimal64, _Decimal128, + _Decimal64, _Decimal128, _Decimal64, _Decimal128, + _Decimal64, _Decimal128); + +WRAPPER(func3); + void __attribute__ ((noinline)) func3 (_Decimal64 a1, _Decimal128 a2, _Decimal64 a3, _Decimal128 a4, _Decimal64 a5, _Decimal128 a6, _Decimal64 a7, _Decimal128 a8, _Decimal64 a9, _Decimal128 a10) { - reg_parms_t lparms; stack_frame_t *sp; - save_parms (lparms); sp = __builtin_frame_address (0); sp = sp->backchain; - if (a1 != *(_Decimal64 *)&lparms.fprs[0]) FAILURE /* f1 */ - if (a2 != *(_Decimal128 *)&lparms.fprs[1]) FAILURE /* f2 & f3 */ - if (a3 != *(_Decimal64 *)&lparms.fprs[3]) FAILURE /* f4 */ - if (a4 != *(_Decimal128 *)&lparms.fprs[5]) FAILURE /* f6 & f7 */ - if (a5 != *(_Decimal64 *)&lparms.fprs[7]) FAILURE /* f8 */ - if (a6 != *(_Decimal128 *)&lparms.fprs[9]) FAILURE /* f10 & f11 */ - if (a7 != *(_Decimal64 *)&lparms.fprs[11]) FAILURE /* f12 */ + if (a1 != *(_Decimal64 *)&gparms.fprs[0]) FAILURE /* f1 */ + if (a2 != *(_Decimal128 *)&gparms.fprs[1]) FAILURE /* f2 & f3 */ + if (a3 != *(_Decimal64 *)&gparms.fprs[3]) FAILURE /* f4 */ + if (a4 != *(_Decimal128 *)&gparms.fprs[5]) FAILURE /* f6 & f7 */ + if (a5 != *(_Decimal64 *)&gparms.fprs[7]) FAILURE /* f8 */ + if (a6 != *(_Decimal128 *)&gparms.fprs[9]) FAILURE /* f10 & f11 */ + if (a7 != *(_Decimal64 *)&gparms.fprs[11]) FAILURE /* f12 */ if (a8 != *(_Decimal128 *)&sp->slot[10]) FAILURE if (a9 != *(_Decimal64 *)&sp->slot[12]) FAILURE if (a10 != *(_Decimal128 *)&sp->slot[13]) FAILURE } +extern void func4_asm (_Decimal128, _Decimal64, _Decimal128, _Decimal64, + _Decimal128, _Decimal64, _Decimal128, _Decimal64); + 
+WRAPPER(func4); + void __attribute__ ((noinline)) func4 (_Decimal128 a1, _Decimal64 a2, _Decimal128 a3, _Decimal64 a4, _Decimal128 a5, _Decimal64 a6, _Decimal128 a7, _Decimal64 a8) { - reg_parms_t lparms; stack_frame_t *sp; - save_parms (lparms); sp = __builtin_frame_address (0); sp = sp->backchain; - if (a1 != *(_Decimal128 *)&lparms.fprs[1]) FAILURE /* f2 & f3 */ - if (a2 != *(_Decimal64 *)&lparms.fprs[3]) FAILURE /* f4 */ - if (a3 != *(_Decimal128 *)&lparms.fprs[5]) FAILURE /* f6 & f7 */ - if (a4 != *(_Decimal64 *)&lparms.fprs[7]) FAILURE /* f8 */ - if (a5 != *(_Decimal128 *)&lparms.fprs[9]) FAILURE /* f10 & f11 */ - if (a6 != *(_Decimal64 *)&lparms.fprs[11]) FAILURE /* f12 */ + if (a1 != *(_Decimal128 *)&gparms.fprs[1]) FAILURE /* f2 & f3 */ + if (a2 != *(_Decimal64 *)&gparms.fprs[3]) FAILURE /* f4 */ + if (a3 != *(_Decimal128 *)&gparms.fprs[5]) FAILURE /* f6 & f7 */ + if (a4 != *(_Decimal64 *)&gparms.fprs[7]) FAILURE /* f8 */ + if (a5 != *(_Decimal128 *)&gparms.fprs[9]) FAILURE /* f10 & f11 */ + if (a6 != *(_Decimal64 *)&gparms.fprs[11]) FAILURE /* f12 */ if (a7 != *(_Decimal128 *)&sp->slot[9]) FAILURE if (a8 != *(_Decimal64 *)&sp->slot[11]) FAILURE } +extern void func5_asm (_Decimal32, _Decimal32, _Decimal32, _Decimal32, + _Decimal32, _Decimal32, _Decimal32, _Decimal32, + _Decimal32, _Decimal32, _Decimal32, _Decimal32, + _Decimal32, _Decimal32, _Decimal32, _Decimal32); + +WRAPPER(func5); + void __attribute__ ((noinline)) func5 (_Decimal32 a1, _Decimal32 a2, _Decimal32 a3, _Decimal32 a4, _Decimal32 a5, _Decimal32 a6, _Decimal32 a7, _Decimal32 a8, _Decimal32 a9, _Decimal32 a10, _Decimal32 a11, _Decimal32 a12, _Decimal32 a13, _Decimal32 a14, _Decimal32 a15, _Decimal32 a16) { - reg_parms_t lparms; stack_frame_t *sp; - save_parms (lparms); sp = __builtin_frame_address (0); sp = sp->backchain; /* _Decimal32 is passed in the lower half of an FPR or parameter slot. 
*/ - if (a1 != ((d32parm_t *)&lparms.fprs[0])->d) FAILURE /* f1 */ - if (a2 != ((d32parm_t *)&lparms.fprs[1])->d) FAILURE /* f2 */ - if (a3 != ((d32parm_t *)&lparms.fprs[2])->d) FAILURE /* f3 */ - if (a4 != ((d32parm_t *)&lparms.fprs[3])->d) FAILURE /* f4 */ - if (a5 != ((d32parm_t *)&lparms.fprs[4])->d) FAILURE /* f5 */ - if (a6 != ((d32parm_t *)&lparms.fprs[5])->d) FAILURE /* f6 */ - if (a7 != ((d32parm_t *)&lparms.fprs[6])->d) FAILURE /* f7 */ - if (a8 != ((d32parm_t *)&lparms.fprs[7])->d) FAILURE /* f8 */ - if (a9 != ((d32parm_t *)&lparms.fprs[8])->d) FAILURE /* f9 */ - if (a10 != ((d32parm_t *)&lparms.fprs[9])->d) FAILURE /* f10 */ - if (a11 != ((d32parm_t *)&lparms.fprs[10])->d) FAILURE /* f11 */ - if (a12 != ((d32parm_t *)&lparms.fprs[11])->d) FAILURE /* f12 */ - if (a13 != ((d32parm_t *)&lparms.fprs[12])->d) FAILURE /* f13 */ + if (a1 != ((d32parm_t *)&gparms.fprs[0])->d) FAILURE /* f1 */ + if (a2 != ((d32parm_t *)&gparms.fprs[1])->d) FAILURE /* f2 */ + if (a3 != ((d32parm_t *)&gparms.fprs[2])->d) FAILURE /* f3 */ + if (a4 != ((d32parm_t *)&gparms.fprs[3])->d) FAILURE /* f4 */ + if (a5 != ((d32parm_t *)&gparms.fprs[4])->d) FAILURE /* f5 */ + if (a6 != ((d32parm_t *)&gparms.fprs[5])->d) FAILURE /* f6 */ + if (a7 != ((d32parm_t *)&gparms.fprs[6])->d) FAILURE /* f7 */ + if (a8 != ((d32parm_t *)&gparms.fprs[7])->d) FAILURE /* f8 */ + if (a9 != ((d32parm_t *)&gparms.fprs[8])->d) FAILURE /* f9 */ + if (a10 != ((d32parm_t *)&gparms.fprs[9])->d) FAILURE /* f10 */ + if (a11 != ((d32parm_t *)&gparms.fprs[10])->d) FAILURE /* f11 */ + if (a12 != ((d32parm_t *)&gparms.fprs[11])->d) FAILURE /* f12 */ + if (a13 != ((d32parm_t *)&gparms.fprs[12])->d) FAILURE /* f13 */ if (a14 != ((d32parm_t *)&sp->slot[13])->d) FAILURE if (a15 != ((d32parm_t *)&sp->slot[14])->d) FAILURE if (a16 != ((d32parm_t *)&sp->slot[15])->d) FAILURE } +extern void func6_asm (_Decimal32, _Decimal64, _Decimal128, + _Decimal32, _Decimal64, _Decimal128, + _Decimal32, _Decimal64, _Decimal128, + _Decimal32, 
_Decimal64, _Decimal128); + +WRAPPER(func6); + void __attribute__ ((noinline)) func6 (_Decimal32 a1, _Decimal64 a2, _Decimal128 a3, _Decimal32 a4, _Decimal64 a5, _Decimal128 a6, _Decimal32 a7, _Decimal64 a8, _Decimal128 a9, _Decimal32 a10, _Decimal64 a11, _Decimal128 a12) { - reg_parms_t lparms; stack_frame_t *sp; - save_parms (lparms); sp = __builtin_frame_address (0); sp = sp->backchain; - if (a1 != ((d32parm_t *)&lparms.fprs[0])->d) FAILURE /* f1 */ - if (a2 != *(_Decimal64 *)&lparms.fprs[1]) FAILURE /* f2 */ - if (a3 != *(_Decimal128 *)&lparms.fprs[3]) FAILURE /* f4 & f5 */ - if (a4 != ((d32parm_t *)&lparms.fprs[5])->d) FAILURE /* f6 */ - if (a5 != *(_Decimal64 *)&lparms.fprs[6]) FAILURE /* f7 */ - if (a6 != *(_Decimal128 *)&lparms.fprs[7]) FAILURE /* f8 & f9 */ - if (a7 != ((d32parm_t *)&lparms.fprs[9])->d) FAILURE /* f10 */ - if (a8 != *(_Decimal64 *)&lparms.fprs[10]) FAILURE /* f11 */ - if (a9 != *(_Decimal128 *)&lparms.fprs[11]) FAILURE /* f12 & f13 */ + if (a1 != ((d32parm_t *)&gparms.fprs[0])->d) FAILURE /* f1 */ + if (a2 != *(_Decimal64 *)&gparms.fprs[1]) FAILURE /* f2 */ + if (a3 != *(_Decimal128 *)&gparms.fprs[3]) FAILURE /* f4 & f5 */ + if (a4 != ((d32parm_t *)&gparms.fprs[5])->d) FAILURE /* f6 */ + if (a5 != *(_Decimal64 *)&gparms.fprs[6]) FAILURE /* f7 */ + if (a6 != *(_Decimal128 *)&gparms.fprs[7]) FAILURE /* f8 & f9 */ + if (a7 != ((d32parm_t *)&gparms.fprs[9])->d) FAILURE /* f10 */ + if (a8 != *(_Decimal64 *)&gparms.fprs[10]) FAILURE /* f11 */ + if (a9 != *(_Decimal128 *)&gparms.fprs[11]) FAILURE /* f12 & f13 */ if (a10 != ((d32parm_t *)&sp->slot[12])->d) FAILURE if (a11 != *(_Decimal64 *)&sp->slot[13]) FAILURE } @@ -292,23 +305,23 @@ func6 (_Decimal32 a1, _Decimal64 a2, _De int main (void) { - func0 (1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5, 11.5, 12.5, 13.5, - 14.5, 15.2dd, 16.2dl, 17.2dd); - func1 (101.5, 102.5, 103.5, 104.5, 105.5, 106.5, 107.5, 108.5, 109.5, - 110.5, 111.5, 112.5, 113.5, 114.2dd); - func2 (201.5, 202.5, 203.5, 
204.5, 205.5, 206.5, 207.5, 208.5, 209.5, - 210.5, 211.5, 212.5, 213.2dd); - func3 (301.2dd, 302.2dl, 303.2dd, 304.2dl, 305.2dd, 306.2dl, 307.2dd, - 308.2dl, 309.2dd, 310.2dl); - func4 (401.2dl, 402.2dd, 403.2dl, 404.2dd, 405.2dl, 406.2dd, 407.2dl, - 408.2dd); + func0_asm (1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5, 11.5, 12.5, 13.5, + 14.5, 15.2dd, 16.2dl, 17.2dd); + func1_asm (101.5, 102.5, 103.5, 104.5, 105.5, 106.5, 107.5, 108.5, 109.5, + 110.5, 111.5, 112.5, 113.5, 114.2dd); + func2_asm (201.5, 202.5, 203.5, 204.5, 205.5, 206.5, 207.5, 208.5, 209.5, + 210.5, 211.5, 212.5, 213.2dd); + func3_asm (301.2dd, 302.2dl, 303.2dd, 304.2dl, 305.2dd, 306.2dl, 307.2dd, + 308.2dl, 309.2dd, 310.2dl); + func4_asm (401.2dl, 402.2dd, 403.2dl, 404.2dd, 405.2dl, 406.2dd, 407.2dl, + 408.2dd); #if 0 /* _Decimal32 doesn't yet follow the ABI; enable this when it does. */ - func5 (501.2df, 502.2df, 503.2df, 504.2df, 505.2df, 506.2df, 507.2df, - 508.2df, 509.2df, 510.2df, 511.2df, 512.2df, 513.2df, 514.2df, - 515.2df, 516.2df); - func6 (601.2df, 602.2dd, 603.2dl, 604.2df, 605.2dd, 606.2dl, - 607.2df, 608.2dd, 609.2dl, 610.2df, 611.2dd, 612.2dl); + func5_asm (501.2df, 502.2df, 503.2df, 504.2df, 505.2df, 506.2df, 507.2df, + 508.2df, 509.2df, 510.2df, 511.2df, 512.2df, 513.2df, 514.2df, + 515.2df, 516.2df); + func6_asm (601.2df, 602.2dd, 603.2dl, 604.2df, 605.2dd, 606.2dl, + 607.2df, 608.2dd, 609.2dl, 610.2df, 611.2dd, 612.2dl); #endif if (failcnt != 0) Index: gcc/testsuite/gcc.target/powerpc/ppc32-abi-dfp-1.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/ppc32-abi-dfp-1.c (revision 169775) +++ gcc/testsuite/gcc.target/powerpc/ppc32-abi-dfp-1.c (working copy) @@ -30,31 +30,6 @@ typedef struct reg_parms_t gparms; - -/* Testcase could break on future gcc's, if parameter regs - are changed before this asm. 
*/ - -#define save_parms(lparms) \ - asm volatile ("lis 11,gparms@ha\n\t" \ - "la 11,gparms@l(11)\n\t" \ - "st 3,0(11)\n\t" \ - "st 4,4(11)\n\t" \ - "st 5,8(11)\n\t" \ - "st 6,12(11)\n\t" \ - "st 7,16(11)\n\t" \ - "st 8,20(11)\n\t" \ - "st 9,24(11)\n\t" \ - "st 10,28(11)\n\t" \ - "stfd 1,32(11)\n\t" \ - "stfd 2,40(11)\n\t" \ - "stfd 3,48(11)\n\t" \ - "stfd 4,56(11)\n\t" \ - "stfd 5,64(11)\n\t" \ - "stfd 6,72(11)\n\t" \ - "stfd 7,80(11)\n\t" \ - "stfd 8,88(11)\n\t":::"11", "memory"); \ - lparms = gparms; - typedef struct sf { struct sf *backchain; @@ -62,115 +37,159 @@ typedef struct sf unsigned int slot[200]; } stack_frame_t; +/* Wrapper to save the GPRs and FPRs and then jump to the real function. */ +#define WRAPPER(NAME) \ +__asm__ ("\t.globl\t" #NAME "_asm\n\t" \ + ".text\n\t" \ + ".type " #NAME "_asm, @function\n" \ + #NAME "_asm:\n\t" \ + "lis 11,gparms@ha\n\t" \ + "la 11,gparms@l(11)\n\t" \ + "st 3,0(11)\n\t" \ + "st 4,4(11)\n\t" \ + "st 5,8(11)\n\t" \ + "st 6,12(11)\n\t" \ + "st 7,16(11)\n\t" \ + "st 8,20(11)\n\t" \ + "st 9,24(11)\n\t" \ + "st 10,28(11)\n\t" \ + "stfd 1,32(11)\n\t" \ + "stfd 2,40(11)\n\t" \ + "stfd 3,48(11)\n\t" \ + "stfd 4,56(11)\n\t" \ + "stfd 5,64(11)\n\t" \ + "stfd 6,72(11)\n\t" \ + "stfd 7,80(11)\n\t" \ + "stfd 8,88(11)\n\t" \ + "b " #NAME "\n\t" \ + ".size " #NAME ",.-" #NAME "\n") + /* Fill up floating point registers with double arguments, forcing decimal float arguments into the parameter save area. 
*/ +extern void func0_asm (double, double, double, double, double, + double, double, double, _Decimal64, _Decimal128); + +WRAPPER(func0); + void __attribute__ ((noinline)) func0 (double a1, double a2, double a3, double a4, double a5, double a6, double a7, double a8, _Decimal64 a9, _Decimal128 a10) { - reg_parms_t lparms; stack_frame_t *sp; - save_parms (lparms); sp = __builtin_frame_address (0); sp = sp->backchain; - if (a1 != lparms.fprs[0]) FAILURE - if (a2 != lparms.fprs[1]) FAILURE - if (a3 != lparms.fprs[2]) FAILURE - if (a4 != lparms.fprs[3]) FAILURE - if (a5 != lparms.fprs[4]) FAILURE - if (a6 != lparms.fprs[5]) FAILURE - if (a7 != lparms.fprs[6]) FAILURE - if (a8 != lparms.fprs[7]) FAILURE + if (a1 != gparms.fprs[0]) FAILURE + if (a2 != gparms.fprs[1]) FAILURE + if (a3 != gparms.fprs[2]) FAILURE + if (a4 != gparms.fprs[3]) FAILURE + if (a5 != gparms.fprs[4]) FAILURE + if (a6 != gparms.fprs[5]) FAILURE + if (a7 != gparms.fprs[6]) FAILURE + if (a8 != gparms.fprs[7]) FAILURE if (a9 != *(_Decimal64 *)&sp->slot[0]) FAILURE if (a10 != *(_Decimal128 *)&sp->slot[2]) FAILURE } /* Alternate 64-bit and 128-bit decimal float arguments, checking that _Decimal128 is always passed in even/odd register pairs. 
*/ +extern void func1_asm (_Decimal64, _Decimal128, _Decimal64, _Decimal128, + _Decimal64, _Decimal128, _Decimal64, _Decimal128); + +WRAPPER(func1); + void __attribute__ ((noinline)) func1 (_Decimal64 a1, _Decimal128 a2, _Decimal64 a3, _Decimal128 a4, _Decimal64 a5, _Decimal128 a6, _Decimal64 a7, _Decimal128 a8) { - reg_parms_t lparms; stack_frame_t *sp; - save_parms (lparms); sp = __builtin_frame_address (0); sp = sp->backchain; - if (a1 != *(_Decimal64 *)&lparms.fprs[0]) FAILURE /* f1 */ - if (a2 != *(_Decimal128 *)&lparms.fprs[1]) FAILURE /* f2 & f3 */ - if (a3 != *(_Decimal64 *)&lparms.fprs[3]) FAILURE /* f4 */ - if (a4 != *(_Decimal128 *)&lparms.fprs[5]) FAILURE /* f6 & f7 */ - if (a5 != *(_Decimal64 *)&lparms.fprs[7]) FAILURE /* f8 */ + if (a1 != *(_Decimal64 *)&gparms.fprs[0]) FAILURE /* f1 */ + if (a2 != *(_Decimal128 *)&gparms.fprs[1]) FAILURE /* f2 & f3 */ + if (a3 != *(_Decimal64 *)&gparms.fprs[3]) FAILURE /* f4 */ + if (a4 != *(_Decimal128 *)&gparms.fprs[5]) FAILURE /* f6 & f7 */ + if (a5 != *(_Decimal64 *)&gparms.fprs[7]) FAILURE /* f8 */ if (a6 != *(_Decimal128 *)&sp->slot[0]) FAILURE if (a7 != *(_Decimal64 *)&sp->slot[4]) FAILURE if (a8 != *(_Decimal128 *)&sp->slot[6]) FAILURE } +extern void func2_asm (_Decimal128, _Decimal64, _Decimal128, _Decimal64, + _Decimal128, _Decimal64, _Decimal128, _Decimal64); + +WRAPPER(func2); + void __attribute__ ((noinline)) func2 (_Decimal128 a1, _Decimal64 a2, _Decimal128 a3, _Decimal64 a4, _Decimal128 a5, _Decimal64 a6, _Decimal128 a7, _Decimal64 a8) { - reg_parms_t lparms; stack_frame_t *sp; - save_parms (lparms); sp = __builtin_frame_address (0); sp = sp->backchain; - if (a1 != *(_Decimal128 *)&lparms.fprs[1]) FAILURE /* f2 & f3 */ - if (a2 != *(_Decimal64 *)&lparms.fprs[3]) FAILURE /* f4 */ - if (a3 != *(_Decimal128 *)&lparms.fprs[5]) FAILURE /* f6 & f7 */ - if (a4 != *(_Decimal64 *)&lparms.fprs[7]) FAILURE /* f8 */ + if (a1 != *(_Decimal128 *)&gparms.fprs[1]) FAILURE /* f2 & f3 */ + if (a2 != *(_Decimal64 
*)&gparms.fprs[3]) FAILURE /* f4 */ + if (a3 != *(_Decimal128 *)&gparms.fprs[5]) FAILURE /* f6 & f7 */ + if (a4 != *(_Decimal64 *)&gparms.fprs[7]) FAILURE /* f8 */ if (a5 != *(_Decimal128 *)&sp->slot[0]) FAILURE if (a6 != *(_Decimal64 *)&sp->slot[4]) FAILURE if (a7 != *(_Decimal128 *)&sp->slot[6]) FAILURE if (a8 != *(_Decimal64 *)&sp->slot[10]) FAILURE } +extern void func3_asm (_Decimal64, _Decimal128, _Decimal64, _Decimal128, + _Decimal64); + +WRAPPER(func3); + void __attribute__ ((noinline)) func3 (_Decimal64 a1, _Decimal128 a2, _Decimal64 a3, _Decimal128 a4, _Decimal64 a5) { - reg_parms_t lparms; stack_frame_t *sp; - save_parms (lparms); sp = __builtin_frame_address (0); sp = sp->backchain; - if (a1 != *(_Decimal64 *)&lparms.fprs[0]) FAILURE /* f1 */ - if (a2 != *(_Decimal128 *)&lparms.fprs[1]) FAILURE /* f2 & f3 */ - if (a3 != *(_Decimal64 *)&lparms.fprs[3]) FAILURE /* f4 */ - if (a4 != *(_Decimal128 *)&lparms.fprs[5]) FAILURE /* f6 & f7 */ + if (a1 != *(_Decimal64 *)&gparms.fprs[0]) FAILURE /* f1 */ + if (a2 != *(_Decimal128 *)&gparms.fprs[1]) FAILURE /* f2 & f3 */ + if (a3 != *(_Decimal64 *)&gparms.fprs[3]) FAILURE /* f4 */ + if (a4 != *(_Decimal128 *)&gparms.fprs[5]) FAILURE /* f6 & f7 */ if (a5 != *(_Decimal128 *)&sp->slot[0]) FAILURE } +extern void func4_asm (_Decimal32, _Decimal32, _Decimal32, _Decimal32, + _Decimal32, _Decimal32, _Decimal32, _Decimal32, + _Decimal32, _Decimal32, _Decimal32, _Decimal32, + _Decimal32, _Decimal32, _Decimal32, _Decimal32); + +WRAPPER(func4); + void __attribute__ ((noinline)) func4 (_Decimal32 a1, _Decimal32 a2, _Decimal32 a3, _Decimal32 a4, _Decimal32 a5, _Decimal32 a6, _Decimal32 a7, _Decimal32 a8, _Decimal32 a9, _Decimal32 a10, _Decimal32 a11, _Decimal32 a12, _Decimal32 a13, _Decimal32 a14, _Decimal32 a15, _Decimal32 a16) { - reg_parms_t lparms; stack_frame_t *sp; - save_parms (lparms); sp = __builtin_frame_address (0); sp = sp->backchain; /* _Decimal32 is passed in the lower half of an FPR, or in parameter slot. 
*/ - if (a1 != ((d32parm_t *)&lparms.fprs[0])->d) FAILURE /* f1 */ - if (a2 != ((d32parm_t *)&lparms.fprs[1])->d) FAILURE /* f2 */ - if (a3 != ((d32parm_t *)&lparms.fprs[2])->d) FAILURE /* f3 */ - if (a4 != ((d32parm_t *)&lparms.fprs[3])->d) FAILURE /* f4 */ - if (a5 != ((d32parm_t *)&lparms.fprs[4])->d) FAILURE /* f5 */ - if (a6 != ((d32parm_t *)&lparms.fprs[5])->d) FAILURE /* f6 */ - if (a7 != ((d32parm_t *)&lparms.fprs[6])->d) FAILURE /* f7 */ - if (a8 != ((d32parm_t *)&lparms.fprs[7])->d) FAILURE /* f8 */ + if (a1 != ((d32parm_t *)&gparms.fprs[0])->d) FAILURE /* f1 */ + if (a2 != ((d32parm_t *)&gparms.fprs[1])->d) FAILURE /* f2 */ + if (a3 != ((d32parm_t *)&gparms.fprs[2])->d) FAILURE /* f3 */ + if (a4 != ((d32parm_t *)&gparms.fprs[3])->d) FAILURE /* f4 */ + if (a5 != ((d32parm_t *)&gparms.fprs[4])->d) FAILURE /* f5 */ + if (a6 != ((d32parm_t *)&gparms.fprs[5])->d) FAILURE /* f6 */ + if (a7 != ((d32parm_t *)&gparms.fprs[6])->d) FAILURE /* f7 */ + if (a8 != ((d32parm_t *)&gparms.fprs[7])->d) FAILURE /* f8 */ if (a9 != *(_Decimal32 *)&sp->slot[0]) FAILURE if (a10 != *(_Decimal32 *)&sp->slot[1]) FAILURE if (a11 != *(_Decimal32 *)&sp->slot[2]) FAILURE @@ -181,24 +200,29 @@ func4 (_Decimal32 a1, _Decimal32 a2, _De if (a16 != *(_Decimal32 *)&sp->slot[7]) FAILURE } +extern void func5_asm (_Decimal32, _Decimal64, _Decimal128, + _Decimal32, _Decimal64, _Decimal128, + _Decimal32, _Decimal64, _Decimal128, + _Decimal32, _Decimal64, _Decimal128); + +WRAPPER(func5); + void __attribute__ ((noinline)) func5 (_Decimal32 a1, _Decimal64 a2, _Decimal128 a3, _Decimal32 a4, _Decimal64 a5, _Decimal128 a6, _Decimal32 a7, _Decimal64 a8, _Decimal128 a9, _Decimal32 a10, _Decimal64 a11, _Decimal128 a12) { - reg_parms_t lparms; stack_frame_t *sp; - save_parms (lparms); sp = __builtin_frame_address (0); sp = sp->backchain; - if (a1 != ((d32parm_t *)&lparms.fprs[0])->d) FAILURE /* f1 */ - if (a2 != *(_Decimal64 *)&lparms.fprs[1]) FAILURE /* f2 */ - if (a3 != *(_Decimal128 *)&lparms.fprs[3]) 
FAILURE /* f4 & f5 */ - if (a4 != ((d32parm_t *)&lparms.fprs[5])->d) FAILURE /* f6 */ - if (a5 != *(_Decimal64 *)&lparms.fprs[6]) FAILURE /* f7 */ + if (a1 != ((d32parm_t *)&gparms.fprs[0])->d) FAILURE /* f1 */ + if (a2 != *(_Decimal64 *)&gparms.fprs[1]) FAILURE /* f2 */ + if (a3 != *(_Decimal128 *)&gparms.fprs[3]) FAILURE /* f4 & f5 */ + if (a4 != ((d32parm_t *)&gparms.fprs[5])->d) FAILURE /* f6 */ + if (a5 != *(_Decimal64 *)&gparms.fprs[6]) FAILURE /* f7 */ if (a6 != *(_Decimal128 *)&sp->slot[0]) FAILURE if (a7 != *(_Decimal32 *)&sp->slot[4]) FAILURE @@ -212,15 +236,15 @@ func5 (_Decimal32 a1, _Decimal64 a2, _De int main () { - func0 (1., 2., 3., 4., 5., 6., 7., 8., 9.dd, 10.dl); - func1 (1.dd, 2.dl, 3.dd, 4.dl, 5.dd, 6.dl, 7.dd, 8.dl); - func2 (1.dl, 2.dd, 3.dl, 4.dd, 5.dl, 6.dd, 7.dl, 8.dd); - func3 (1.dd, 2.dl, 3.dd, 4.dl, 5.dl); - func4 (501.2df, 502.2df, 503.2df, 504.2df, 505.2df, 506.2df, 507.2df, - 508.2df, 509.2df, 510.2df, 511.2df, 512.2df, 513.2df, 514.2df, - 515.2df, 516.2df); - func5 (601.2df, 602.2dd, 603.2dl, 604.2df, 605.2dd, 606.2dl, - 607.2df, 608.2dd, 609.2dl, 610.2df, 611.2dd, 612.2dl); + func0_asm (1., 2., 3., 4., 5., 6., 7., 8., 9.dd, 10.dl); + func1_asm (1.dd, 2.dl, 3.dd, 4.dl, 5.dd, 6.dl, 7.dd, 8.dl); + func2_asm (1.dl, 2.dd, 3.dl, 4.dd, 5.dl, 6.dd, 7.dl, 8.dd); + func3_asm (1.dd, 2.dl, 3.dd, 4.dl, 5.dl); + func4_asm (501.2df, 502.2df, 503.2df, 504.2df, 505.2df, 506.2df, 507.2df, + 508.2df, 509.2df, 510.2df, 511.2df, 512.2df, 513.2df, 514.2df, + 515.2df, 516.2df); + func5_asm (601.2df, 602.2dd, 603.2dl, 604.2df, 605.2dd, 606.2dl, + 607.2df, 608.2dd, 609.2dl, 610.2df, 611.2dd, 612.2dl); if (failcnt != 0) abort (); Index: gcc/testsuite/gcc.target/powerpc/avoid-indexed-addresses.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/avoid-indexed-addresses.c (revision 169775) +++ gcc/testsuite/gcc.target/powerpc/avoid-indexed-addresses.c (working copy) @@ -1,5 +1,5 @@ /* { dg-do compile { 
target { powerpc*-*-* } } } */
-/* { dg-options "-O2 -mavoid-indexed-addresses" } */
+/* { dg-options "-O2 -mavoid-indexed-addresses -mno-altivec -mno-vsx" } */
 
 /* { dg-final { scan-assembler-not "lbzx" } }
Index: gcc/config/rs6000/vector.md
===================================================================
--- gcc/config/rs6000/vector.md	(revision 169775)
+++ gcc/config/rs6000/vector.md	(working copy)
@@ -3,7 +3,7 @@
 ;; expander, and the actual vector instructions will be in altivec.md and
 ;; vsx.md
 
-;; Copyright (C) 2009, 2010
+;; Copyright (C) 2009, 2010, 2011
 ;; Free Software Foundation, Inc.
 ;; Contributed by Michael Meissner 
 
@@ -123,6 +123,43 @@ (define_split
   DONE;
 })
 
+;; Vector floating point load/store instructions that use the Altivec
+;; instructions even if we are compiling for VSX, since the Altivec
+;; instructions silently ignore the bottom 4 bits of the address, and VSX
+;; does not.
+(define_expand "vector_altivec_load_<mode>"
+  [(set (match_operand:VEC_M 0 "vfloat_operand" "")
+	(match_operand:VEC_M 1 "memory_operand" ""))]
+  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "
+{
+  gcc_assert (VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode));
+
+  if (VECTOR_MEM_VSX_P (<MODE>mode))
+    {
+      operands[1] = rs6000_address_for_altivec (operands[1]);
+      emit_insn (gen_altivec_lvx_<mode> (operands[0], operands[1]));
+      DONE;
+    }
+}")
+
+(define_expand "vector_altivec_store_<mode>"
+  [(set (match_operand:VEC_M 0 "memory_operand" "")
+	(match_operand:VEC_M 1 "vfloat_operand" ""))]
+  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "
+{
+  gcc_assert (VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode));
+
+  if (VECTOR_MEM_VSX_P (<MODE>mode))
+    {
+      operands[0] = rs6000_address_for_altivec (operands[0]);
+      emit_insn (gen_altivec_stvx_<mode> (operands[0], operands[1]));
+      DONE;
+    }
+}")
+
+
 ;; Reload patterns for vector operations.  We may need an additional base
 ;; register to convert the reg+offset addressing to reg+reg for vector
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 169775)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -1,5 +1,6 @@
 /* Definitions of target machine for GNU compiler, for IBM RS/6000.
-   Copyright (C) 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
+   Copyright (C) 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009,
+   2010, 2011
    Free Software Foundation, Inc.
    Contributed by Richard Kenner (kenner@vlsi1.ultra.nyu.edu)
 
@@ -129,6 +130,7 @@ extern void rs6000_emit_parity (rtx, rtx
 extern rtx rs6000_machopic_legitimize_pic_address (rtx, enum machine_mode, rtx);
 extern rtx rs6000_address_for_fpconvert (rtx);
+extern rtx rs6000_address_for_altivec (rtx);
 extern rtx rs6000_allocate_stack_temp (enum machine_mode, bool, bool);
 extern int rs6000_loop_align (rtx);
 #endif /* RTX_CODE */
Index: gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc/config/rs6000/rs6000-builtin.def	(revision 169775)
+++ gcc/config/rs6000/rs6000-builtin.def	(working copy)
@@ -1,5 +1,5 @@
 /* Builtin functions for rs6000/powerpc.
-   Copyright (C) 2009, 2010
+   Copyright (C) 2009, 2010, 2011
    Free Software Foundation, Inc.
Contributed by Michael Meissner (meissner@linux.vnet.ibm.com) @@ -37,6 +37,10 @@ RS6000_BUILTIN(ALTIVEC_BUILTIN_ST_INTERN RS6000_BUILTIN(ALTIVEC_BUILTIN_LD_INTERNAL_16qi, RS6000_BTC_MEM) RS6000_BUILTIN(ALTIVEC_BUILTIN_ST_INTERNAL_4sf, RS6000_BTC_MEM) RS6000_BUILTIN(ALTIVEC_BUILTIN_LD_INTERNAL_4sf, RS6000_BTC_MEM) +RS6000_BUILTIN(ALTIVEC_BUILTIN_ST_INTERNAL_2df, RS6000_BTC_MEM) +RS6000_BUILTIN(ALTIVEC_BUILTIN_LD_INTERNAL_2df, RS6000_BTC_MEM) +RS6000_BUILTIN(ALTIVEC_BUILTIN_ST_INTERNAL_2di, RS6000_BTC_MEM) +RS6000_BUILTIN(ALTIVEC_BUILTIN_LD_INTERNAL_2di, RS6000_BTC_MEM) RS6000_BUILTIN(ALTIVEC_BUILTIN_VADDUBM, RS6000_BTC_CONST) RS6000_BUILTIN(ALTIVEC_BUILTIN_VADDUHM, RS6000_BTC_CONST) RS6000_BUILTIN(ALTIVEC_BUILTIN_VADDUWM, RS6000_BTC_CONST) @@ -778,12 +782,20 @@ RS6000_BUILTIN(PAIRED_BUILTIN_CMPU1, R /* VSX builtins. */ RS6000_BUILTIN(VSX_BUILTIN_LXSDX, RS6000_BTC_MEM) -RS6000_BUILTIN(VSX_BUILTIN_LXVD2X, RS6000_BTC_MEM) +RS6000_BUILTIN(VSX_BUILTIN_LXVD2X_V2DF, RS6000_BTC_MEM) +RS6000_BUILTIN(VSX_BUILTIN_LXVD2X_V2DI, RS6000_BTC_MEM) RS6000_BUILTIN(VSX_BUILTIN_LXVDSX, RS6000_BTC_MEM) -RS6000_BUILTIN(VSX_BUILTIN_LXVW4X, RS6000_BTC_MEM) +RS6000_BUILTIN(VSX_BUILTIN_LXVW4X_V4SF, RS6000_BTC_MEM) +RS6000_BUILTIN(VSX_BUILTIN_LXVW4X_V4SI, RS6000_BTC_MEM) +RS6000_BUILTIN(VSX_BUILTIN_LXVW4X_V8HI, RS6000_BTC_MEM) +RS6000_BUILTIN(VSX_BUILTIN_LXVW4X_V16QI, RS6000_BTC_MEM) RS6000_BUILTIN(VSX_BUILTIN_STXSDX, RS6000_BTC_MEM) -RS6000_BUILTIN(VSX_BUILTIN_STXVD2X, RS6000_BTC_MEM) -RS6000_BUILTIN(VSX_BUILTIN_STXVW4X, RS6000_BTC_MEM) +RS6000_BUILTIN(VSX_BUILTIN_STXVD2X_V2DF, RS6000_BTC_MEM) +RS6000_BUILTIN(VSX_BUILTIN_STXVD2X_V2DI, RS6000_BTC_MEM) +RS6000_BUILTIN(VSX_BUILTIN_STXVW4X_V4SF, RS6000_BTC_MEM) +RS6000_BUILTIN(VSX_BUILTIN_STXVW4X_V4SI, RS6000_BTC_MEM) +RS6000_BUILTIN(VSX_BUILTIN_STXVW4X_V8HI, RS6000_BTC_MEM) +RS6000_BUILTIN(VSX_BUILTIN_STXVW4X_V16QI, RS6000_BTC_MEM) RS6000_BUILTIN(VSX_BUILTIN_XSABSDP, RS6000_BTC_CONST) RS6000_BUILTIN(VSX_BUILTIN_XSADDDP, RS6000_BTC_FP_PURE) 
RS6000_BUILTIN(VSX_BUILTIN_XSCMPODP, RS6000_BTC_FP_PURE) @@ -983,8 +995,10 @@ RS6000_BUILTIN(VSX_BUILTIN_VEC_XXPERMDI, RS6000_BUILTIN(VSX_BUILTIN_VEC_XXSLDWI, RS6000_BTC_MISC) RS6000_BUILTIN(VSX_BUILTIN_VEC_XXSPLTD, RS6000_BTC_MISC) RS6000_BUILTIN(VSX_BUILTIN_VEC_XXSPLTW, RS6000_BTC_MISC) +RS6000_BUILTIN(VSX_BUILTIN_VEC_LD, RS6000_BTC_MISC) +RS6000_BUILTIN(VSX_BUILTIN_VEC_ST, RS6000_BTC_MISC) RS6000_BUILTIN_EQUATE(VSX_BUILTIN_OVERLOADED_LAST, - VSX_BUILTIN_VEC_XXSPLTW) + VSX_BUILTIN_VEC_ST) /* Combined VSX/Altivec builtins. */ RS6000_BUILTIN(VECTOR_BUILTIN_FLOAT_V4SI_V4SF, RS6000_BTC_FP_PURE) Index: gcc/config/rs6000/rs6000-c.c =================================================================== --- gcc/config/rs6000/rs6000-c.c (revision 169775) +++ gcc/config/rs6000/rs6000-c.c (working copy) @@ -1000,6 +1000,15 @@ const struct altivec_builtin_types altiv { VSX_BUILTIN_VEC_DIV, VSX_BUILTIN_XVDIVDP, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 }, { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX, + RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX, + RS6000_BTI_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_INTSI, + ~RS6000_BTI_unsigned_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX, + RS6000_BTI_bool_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX, RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX, RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_float, 0 }, @@ -1112,9 +1121,19 @@ const struct altivec_builtin_types altiv { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL, RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTQI, 0 }, { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL, - RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V16QI, 0 }, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, + 
~RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI, 0 }, + { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL, + RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF, 0 }, + { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL, + RS6000_BTI_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_INTSI, + ~RS6000_BTI_unsigned_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL, + RS6000_BTI_bool_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_LVSL, ALTIVEC_BUILTIN_LVSL, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI, 0 }, { ALTIVEC_BUILTIN_VEC_LVSL, ALTIVEC_BUILTIN_LVSL, @@ -1133,6 +1152,17 @@ const struct altivec_builtin_types altiv RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_long, 0 }, { ALTIVEC_BUILTIN_VEC_LVSL, ALTIVEC_BUILTIN_LVSL, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_float, 0 }, + { ALTIVEC_BUILTIN_VEC_LVSL, ALTIVEC_BUILTIN_LVSL, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_double, 0 }, + { ALTIVEC_BUILTIN_VEC_LVSL, ALTIVEC_BUILTIN_LVSL, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTDI, 0 }, + { ALTIVEC_BUILTIN_VEC_LVSL, ALTIVEC_BUILTIN_LVSL, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTDI, 0 }, + { ALTIVEC_BUILTIN_VEC_LVSL, ALTIVEC_BUILTIN_LVSL, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_long_long, 0 }, + { ALTIVEC_BUILTIN_VEC_LVSL, ALTIVEC_BUILTIN_LVSL, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, + ~RS6000_BTI_unsigned_long_long, 0 }, { ALTIVEC_BUILTIN_VEC_LVSR, ALTIVEC_BUILTIN_LVSR, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI, 0 }, { ALTIVEC_BUILTIN_VEC_LVSR, ALTIVEC_BUILTIN_LVSR, @@ -1151,6 +1181,17 @@ const struct altivec_builtin_types altiv RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_long, 0 }, { 
ALTIVEC_BUILTIN_VEC_LVSR, ALTIVEC_BUILTIN_LVSR, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_float, 0 }, + { ALTIVEC_BUILTIN_VEC_LVSR, ALTIVEC_BUILTIN_LVSR, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_double, 0 }, + { ALTIVEC_BUILTIN_VEC_LVSR, ALTIVEC_BUILTIN_LVSR, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTDI, 0 }, + { ALTIVEC_BUILTIN_VEC_LVSR, ALTIVEC_BUILTIN_LVSR, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTDI, 0 }, + { ALTIVEC_BUILTIN_VEC_LVSR, ALTIVEC_BUILTIN_LVSR, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_long_long, 0 }, + { ALTIVEC_BUILTIN_VEC_LVSR, ALTIVEC_BUILTIN_LVSR, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, + ~RS6000_BTI_unsigned_long_long, 0 }, { ALTIVEC_BUILTIN_VEC_LVLX, ALTIVEC_BUILTIN_LVLX, RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_LVLX, ALTIVEC_BUILTIN_LVLX, @@ -2644,6 +2685,16 @@ const struct altivec_builtin_types altiv { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_NOT_OPAQUE }, { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX, + RS6000_BTI_void, RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF }, + { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX, + RS6000_BTI_void, RS6000_BTI_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_V2DI }, + { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX, + RS6000_BTI_void, RS6000_BTI_unsigned_V2DI, RS6000_BTI_INTSI, + ~RS6000_BTI_unsigned_V2DI }, + { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX, + RS6000_BTI_void, RS6000_BTI_bool_V2DI, RS6000_BTI_INTSI, + ~RS6000_BTI_bool_V2DI }, + { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX, RS6000_BTI_void, RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_V4SF }, { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX, RS6000_BTI_void, RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_float }, @@ -2809,6 +2860,18 @@ const struct altivec_builtin_types altiv RS6000_BTI_void, RS6000_BTI_bool_V16QI, 
RS6000_BTI_INTSI, ~RS6000_BTI_INTQI }, { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL, RS6000_BTI_void, RS6000_BTI_pixel_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_pixel_V8HI }, + { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL, + RS6000_BTI_void, RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF }, + { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL, + RS6000_BTI_void, RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_double }, + { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL, + RS6000_BTI_void, RS6000_BTI_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_V2DI }, + { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL, + RS6000_BTI_void, RS6000_BTI_unsigned_V2DI, RS6000_BTI_INTSI, + ~RS6000_BTI_unsigned_V2DI }, + { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL, + RS6000_BTI_void, RS6000_BTI_bool_V2DI, RS6000_BTI_INTSI, + ~RS6000_BTI_bool_V2DI }, { ALTIVEC_BUILTIN_VEC_STVLX, ALTIVEC_BUILTIN_STVLX, RS6000_BTI_void, RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_V4SF }, { ALTIVEC_BUILTIN_VEC_STVLX, ALTIVEC_BUILTIN_STVLX, @@ -3002,6 +3065,135 @@ const struct altivec_builtin_types altiv RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_NOT_OPAQUE }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVD2X_V2DF, + RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVD2X_V2DI, + RS6000_BTI_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_V2DI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVD2X_V2DI, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_INTSI, + ~RS6000_BTI_unsigned_V2DI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVD2X_V2DI, + RS6000_BTI_bool_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V2DI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V4SF, + RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_V4SF, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V4SF, + RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_float, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V4SI, + RS6000_BTI_bool_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V4SI, 0 }, + { 
VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V4SI, + RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_V4SI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V4SI, + RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_INTSI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V4SI, + RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_long, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, + ~RS6000_BTI_unsigned_V4SI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTSI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, + ~RS6000_BTI_unsigned_long, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V8HI, + RS6000_BTI_bool_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V8HI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V8HI, + RS6000_BTI_pixel_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_pixel_V8HI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V8HI, + RS6000_BTI_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_V8HI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V8HI, + RS6000_BTI_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_INTHI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V8HI, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI, + ~RS6000_BTI_unsigned_V8HI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V8HI, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTHI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V16QI, + RS6000_BTI_bool_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V16QI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V16QI, + RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_V16QI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V16QI, + RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTQI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V16QI, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, + ~RS6000_BTI_unsigned_V16QI, 0 }, + { VSX_BUILTIN_VEC_LD, VSX_BUILTIN_LXVW4X_V16QI, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, 
~RS6000_BTI_UINTQI, 0 }, + + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVD2X_V2DF, + RS6000_BTI_void, RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVD2X_V2DI, + RS6000_BTI_void, RS6000_BTI_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_V2DI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVD2X_V2DI, + RS6000_BTI_void, RS6000_BTI_unsigned_V2DI, RS6000_BTI_INTSI, + ~RS6000_BTI_unsigned_V2DI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVD2X_V2DI, + RS6000_BTI_void, RS6000_BTI_bool_V2DI, RS6000_BTI_INTSI, + ~RS6000_BTI_bool_V2DI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V4SF, + RS6000_BTI_void, RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_V4SF }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V4SF, + RS6000_BTI_void, RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_float }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V4SI, + RS6000_BTI_void, RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_V4SI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V4SI, + RS6000_BTI_void, RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_INTSI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V4SI, + RS6000_BTI_void, RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, + ~RS6000_BTI_unsigned_V4SI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V4SI, + RS6000_BTI_void, RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, + ~RS6000_BTI_UINTSI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V4SI, + RS6000_BTI_void, RS6000_BTI_bool_V4SI, RS6000_BTI_INTSI, + ~RS6000_BTI_bool_V4SI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V4SI, + RS6000_BTI_void, RS6000_BTI_bool_V4SI, RS6000_BTI_INTSI, + ~RS6000_BTI_UINTSI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V4SI, + RS6000_BTI_void, RS6000_BTI_bool_V4SI, RS6000_BTI_INTSI, + ~RS6000_BTI_INTSI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V8HI, + RS6000_BTI_void, RS6000_BTI_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_V8HI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V8HI, + RS6000_BTI_void, RS6000_BTI_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_INTHI }, + { 
VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V8HI, + RS6000_BTI_void, RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI, + ~RS6000_BTI_unsigned_V8HI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V8HI, + RS6000_BTI_void, RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI, + ~RS6000_BTI_UINTHI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V8HI, + RS6000_BTI_void, RS6000_BTI_bool_V8HI, RS6000_BTI_INTSI, + ~RS6000_BTI_bool_V8HI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V8HI, + RS6000_BTI_void, RS6000_BTI_bool_V8HI, RS6000_BTI_INTSI, + ~RS6000_BTI_UINTHI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V8HI, + RS6000_BTI_void, RS6000_BTI_bool_V8HI, RS6000_BTI_INTSI, + ~RS6000_BTI_INTHI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V16QI, + RS6000_BTI_void, RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_V16QI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V16QI, + RS6000_BTI_void, RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTQI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V16QI, + RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, + ~RS6000_BTI_unsigned_V16QI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V16QI, + RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, + ~RS6000_BTI_UINTQI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V16QI, + RS6000_BTI_void, RS6000_BTI_bool_V16QI, RS6000_BTI_INTSI, + ~RS6000_BTI_bool_V16QI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V16QI, + RS6000_BTI_void, RS6000_BTI_bool_V16QI, RS6000_BTI_INTSI, + ~RS6000_BTI_UINTQI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V16QI, + RS6000_BTI_void, RS6000_BTI_bool_V16QI, RS6000_BTI_INTSI, + ~RS6000_BTI_INTQI }, + { VSX_BUILTIN_VEC_ST, VSX_BUILTIN_STXVW4X_V16QI, + RS6000_BTI_void, RS6000_BTI_pixel_V8HI, RS6000_BTI_INTSI, + ~RS6000_BTI_pixel_V8HI }, + /* Predicates. 
  */
  { ALTIVEC_BUILTIN_VCMPGT_P, ALTIVEC_BUILTIN_VCMPGTUB_P,
    RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_bool_V16QI,
    RS6000_BTI_unsigned_V16QI },
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 169775)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -3316,9 +3316,12 @@ rs6000_option_override_internal (bool gl
   /* If not explicitly specified via option, decide whether to generate indexed
      load/store instructions.  */
   if (TARGET_AVOID_XFORM == -1)
-    /* Avoid indexed addressing when targeting Power6 in order to avoid
-       the DERAT mispredict penalty.  */
-    TARGET_AVOID_XFORM = (rs6000_cpu == PROCESSOR_POWER6 && TARGET_CMPB);
+    /* Avoid indexed addressing when targeting Power6 in order to avoid the
+       DERAT mispredict penalty.  However the LVE and STVE altivec instructions
+       need indexed accesses and the type used is the scalar type of the
+       element being loaded or stored.  */
+    TARGET_AVOID_XFORM = (rs6000_cpu == PROCESSOR_POWER6 && TARGET_CMPB
+			  && !TARGET_ALTIVEC);
 
   /* Set the -mrecip options.
   */
  if (rs6000_recip_name)
@@ -11263,16 +11266,22 @@ altivec_expand_ld_builtin (tree exp, rtx
   switch (fcode)
     {
     case ALTIVEC_BUILTIN_LD_INTERNAL_16qi:
-      icode = CODE_FOR_vector_load_v16qi;
+      icode = CODE_FOR_vector_altivec_load_v16qi;
       break;
     case ALTIVEC_BUILTIN_LD_INTERNAL_8hi:
-      icode = CODE_FOR_vector_load_v8hi;
+      icode = CODE_FOR_vector_altivec_load_v8hi;
       break;
     case ALTIVEC_BUILTIN_LD_INTERNAL_4si:
-      icode = CODE_FOR_vector_load_v4si;
+      icode = CODE_FOR_vector_altivec_load_v4si;
       break;
     case ALTIVEC_BUILTIN_LD_INTERNAL_4sf:
-      icode = CODE_FOR_vector_load_v4sf;
+      icode = CODE_FOR_vector_altivec_load_v4sf;
+      break;
+    case ALTIVEC_BUILTIN_LD_INTERNAL_2df:
+      icode = CODE_FOR_vector_altivec_load_v2df;
+      break;
+    case ALTIVEC_BUILTIN_LD_INTERNAL_2di:
+      icode = CODE_FOR_vector_altivec_load_v2di;
       break;
     default:
       *expandedp = false;
@@ -11316,16 +11325,22 @@ altivec_expand_st_builtin (tree exp, rtx
   switch (fcode)
     {
     case ALTIVEC_BUILTIN_ST_INTERNAL_16qi:
-      icode = CODE_FOR_vector_store_v16qi;
+      icode = CODE_FOR_vector_altivec_store_v16qi;
       break;
     case ALTIVEC_BUILTIN_ST_INTERNAL_8hi:
-      icode = CODE_FOR_vector_store_v8hi;
+      icode = CODE_FOR_vector_altivec_store_v8hi;
       break;
     case ALTIVEC_BUILTIN_ST_INTERNAL_4si:
-      icode = CODE_FOR_vector_store_v4si;
+      icode = CODE_FOR_vector_altivec_store_v4si;
       break;
     case ALTIVEC_BUILTIN_ST_INTERNAL_4sf:
-      icode = CODE_FOR_vector_store_v4sf;
+      icode = CODE_FOR_vector_altivec_store_v4sf;
+      break;
+    case ALTIVEC_BUILTIN_ST_INTERNAL_2df:
+      icode = CODE_FOR_vector_altivec_store_v2df;
+      break;
+    case ALTIVEC_BUILTIN_ST_INTERNAL_2di:
+      icode = CODE_FOR_vector_altivec_store_v2di;
       break;
     default:
       *expandedp = false;
@@ -11557,7 +11572,7 @@ altivec_expand_builtin (tree exp, rtx ta
   switch (fcode)
     {
     case ALTIVEC_BUILTIN_STVX:
-      return altivec_expand_stv_builtin (CODE_FOR_altivec_stvx, exp);
+      return altivec_expand_stv_builtin (CODE_FOR_altivec_stvx_v4si, exp);
     case ALTIVEC_BUILTIN_STVEBX:
      return altivec_expand_stv_builtin (CODE_FOR_altivec_stvebx, exp);
    case ALTIVEC_BUILTIN_STVEHX:
@@ -11576,6 +11591,19 @@ altivec_expand_builtin (tree exp, rtx ta
     case ALTIVEC_BUILTIN_STVRXL:
       return altivec_expand_stv_builtin (CODE_FOR_altivec_stvrxl, exp);
+    case VSX_BUILTIN_STXVD2X_V2DF:
+      return altivec_expand_stv_builtin (CODE_FOR_vsx_store_v2df, exp);
+    case VSX_BUILTIN_STXVD2X_V2DI:
+      return altivec_expand_stv_builtin (CODE_FOR_vsx_store_v2di, exp);
+    case VSX_BUILTIN_STXVW4X_V4SF:
+      return altivec_expand_stv_builtin (CODE_FOR_vsx_store_v4sf, exp);
+    case VSX_BUILTIN_STXVW4X_V4SI:
+      return altivec_expand_stv_builtin (CODE_FOR_vsx_store_v4si, exp);
+    case VSX_BUILTIN_STXVW4X_V8HI:
+      return altivec_expand_stv_builtin (CODE_FOR_vsx_store_v8hi, exp);
+    case VSX_BUILTIN_STXVW4X_V16QI:
+      return altivec_expand_stv_builtin (CODE_FOR_vsx_store_v16qi, exp);
+
     case ALTIVEC_BUILTIN_MFVSCR:
       icode = CODE_FOR_altivec_mfvscr;
       tmode = insn_data[icode].operand[0].mode;
@@ -11700,7 +11728,7 @@ altivec_expand_builtin (tree exp, rtx ta
       return altivec_expand_lv_builtin (CODE_FOR_altivec_lvxl,
					exp, target, false);
     case ALTIVEC_BUILTIN_LVX:
-      return altivec_expand_lv_builtin (CODE_FOR_altivec_lvx,
+      return altivec_expand_lv_builtin (CODE_FOR_altivec_lvx_v4si,
					exp, target, false);
     case ALTIVEC_BUILTIN_LVLX:
       return altivec_expand_lv_builtin (CODE_FOR_altivec_lvlx,
@@ -11714,6 +11742,25 @@ altivec_expand_builtin (tree exp, rtx ta
     case ALTIVEC_BUILTIN_LVRXL:
       return altivec_expand_lv_builtin (CODE_FOR_altivec_lvrxl,
					exp, target, true);
+    case VSX_BUILTIN_LXVD2X_V2DF:
+      return altivec_expand_lv_builtin (CODE_FOR_vsx_load_v2df,
+					exp, target, false);
+    case VSX_BUILTIN_LXVD2X_V2DI:
+      return altivec_expand_lv_builtin (CODE_FOR_vsx_load_v2di,
+					exp, target, false);
+    case VSX_BUILTIN_LXVW4X_V4SF:
+      return altivec_expand_lv_builtin (CODE_FOR_vsx_load_v4sf,
+					exp, target, false);
+    case VSX_BUILTIN_LXVW4X_V4SI:
+      return altivec_expand_lv_builtin (CODE_FOR_vsx_load_v4si,
+					exp, target, false);
+    case VSX_BUILTIN_LXVW4X_V8HI:
+      return altivec_expand_lv_builtin (CODE_FOR_vsx_load_v8hi,
+					exp, target, false);
+    case VSX_BUILTIN_LXVW4X_V16QI:
+      return altivec_expand_lv_builtin (CODE_FOR_vsx_load_v16qi,
+					exp, target, false);
+      break;
     default:
       break;
       /* Fall through.  */
@@ -12331,6 +12378,8 @@ rs6000_init_builtins (void)
   long_integer_type_internal_node = long_integer_type_node;
   long_unsigned_type_internal_node = long_unsigned_type_node;
+  long_long_integer_type_internal_node = long_long_integer_type_node;
+  long_long_unsigned_type_internal_node = long_long_unsigned_type_node;
   intQI_type_internal_node = intQI_type_node;
   uintQI_type_internal_node = unsigned_intQI_type_node;
   intHI_type_internal_node = intHI_type_node;
@@ -12340,7 +12389,7 @@ rs6000_init_builtins (void)
   intDI_type_internal_node = intDI_type_node;
   uintDI_type_internal_node = unsigned_intDI_type_node;
   float_type_internal_node = float_type_node;
-  double_type_internal_node = float_type_node;
+  double_type_internal_node = double_type_node;
   void_type_internal_node = void_type_node;

   /* Initialize the modes for builtin_function_type, mapping a machine mode to
@@ -12872,19 +12921,11 @@ altivec_init_builtins (void)
   size_t i;
   tree ftype;

-  tree pfloat_type_node = build_pointer_type (float_type_node);
-  tree pint_type_node = build_pointer_type (integer_type_node);
-  tree pshort_type_node = build_pointer_type (short_integer_type_node);
-  tree pchar_type_node = build_pointer_type (char_type_node);
-  tree pvoid_type_node = build_pointer_type (void_type_node);
-  tree pcfloat_type_node = build_pointer_type (build_qualified_type (float_type_node, TYPE_QUAL_CONST));
-  tree pcint_type_node = build_pointer_type (build_qualified_type (integer_type_node, TYPE_QUAL_CONST));
-  tree pcshort_type_node = build_pointer_type (build_qualified_type (short_integer_type_node, TYPE_QUAL_CONST));
-  tree pcchar_type_node = build_pointer_type (build_qualified_type (char_type_node, TYPE_QUAL_CONST));
-
-  tree pcvoid_type_node = build_pointer_type (build_qualified_type (void_type_node, TYPE_QUAL_CONST));
+  tree pcvoid_type_node
+    = build_pointer_type (build_qualified_type (void_type_node,
+						TYPE_QUAL_CONST));

   tree int_ftype_opaque
     = build_function_type_list (integer_type_node,
@@ -12907,26 +12948,6 @@ altivec_init_builtins (void)
     = build_function_type_list (integer_type_node,
				integer_type_node, V4SI_type_node,
				V4SI_type_node, NULL_TREE);
-  tree v4sf_ftype_pcfloat
-    = build_function_type_list (V4SF_type_node, pcfloat_type_node, NULL_TREE);
-  tree void_ftype_pfloat_v4sf
-    = build_function_type_list (void_type_node,
-				pfloat_type_node, V4SF_type_node, NULL_TREE);
-  tree v4si_ftype_pcint
-    = build_function_type_list (V4SI_type_node, pcint_type_node, NULL_TREE);
-  tree void_ftype_pint_v4si
-    = build_function_type_list (void_type_node,
-				pint_type_node, V4SI_type_node, NULL_TREE);
-  tree v8hi_ftype_pcshort
-    = build_function_type_list (V8HI_type_node, pcshort_type_node, NULL_TREE);
-  tree void_ftype_pshort_v8hi
-    = build_function_type_list (void_type_node,
-				pshort_type_node, V8HI_type_node, NULL_TREE);
-  tree v16qi_ftype_pcchar
-    = build_function_type_list (V16QI_type_node, pcchar_type_node, NULL_TREE);
-  tree void_ftype_pchar_v16qi
-    = build_function_type_list (void_type_node,
-				pchar_type_node, V16QI_type_node, NULL_TREE);
   tree void_ftype_v4si
     = build_function_type_list (void_type_node, V4SI_type_node, NULL_TREE);
   tree v8hi_ftype_void
@@ -12938,16 +12959,32 @@ altivec_init_builtins (void)

   tree opaque_ftype_long_pcvoid
     = build_function_type_list (opaque_V4SI_type_node,
-				long_integer_type_node, pcvoid_type_node, NULL_TREE);
+				long_integer_type_node, pcvoid_type_node,
+				NULL_TREE);
   tree v16qi_ftype_long_pcvoid
     = build_function_type_list (V16QI_type_node,
-				long_integer_type_node, pcvoid_type_node, NULL_TREE);
+				long_integer_type_node, pcvoid_type_node,
+				NULL_TREE);
   tree v8hi_ftype_long_pcvoid
     = build_function_type_list (V8HI_type_node,
-				long_integer_type_node, pcvoid_type_node, NULL_TREE);
+				long_integer_type_node, pcvoid_type_node,
+				NULL_TREE);
   tree v4si_ftype_long_pcvoid
     = build_function_type_list (V4SI_type_node,
-				long_integer_type_node, pcvoid_type_node, NULL_TREE);
+				long_integer_type_node, pcvoid_type_node,
+				NULL_TREE);
+  tree v4sf_ftype_long_pcvoid
+    = build_function_type_list (V4SF_type_node,
+				long_integer_type_node, pcvoid_type_node,
+				NULL_TREE);
+  tree v2df_ftype_long_pcvoid
+    = build_function_type_list (V2DF_type_node,
+				long_integer_type_node, pcvoid_type_node,
+				NULL_TREE);
+  tree v2di_ftype_long_pcvoid
+    = build_function_type_list (V2DI_type_node,
+				long_integer_type_node, pcvoid_type_node,
+				NULL_TREE);
   tree void_ftype_opaque_long_pvoid
     = build_function_type_list (void_type_node,
@@ -12965,6 +13002,18 @@ altivec_init_builtins (void)
     = build_function_type_list (void_type_node,
				V8HI_type_node, long_integer_type_node,
				pvoid_type_node, NULL_TREE);
+  tree void_ftype_v4sf_long_pvoid
+    = build_function_type_list (void_type_node,
+				V4SF_type_node, long_integer_type_node,
+				pvoid_type_node, NULL_TREE);
+  tree void_ftype_v2df_long_pvoid
+    = build_function_type_list (void_type_node,
+				V2DF_type_node, long_integer_type_node,
+				pvoid_type_node, NULL_TREE);
+  tree void_ftype_v2di_long_pvoid
+    = build_function_type_list (void_type_node,
+				V2DI_type_node, long_integer_type_node,
+				pvoid_type_node, NULL_TREE);
   tree int_ftype_int_v8hi_v8hi
     = build_function_type_list (integer_type_node,
				integer_type_node, V8HI_type_node,
@@ -12996,22 +13045,6 @@ altivec_init_builtins (void)
				pcvoid_type_node, integer_type_node,
				integer_type_node, NULL_TREE);

-  def_builtin (MASK_ALTIVEC, "__builtin_altivec_ld_internal_4sf", v4sf_ftype_pcfloat,
-	       ALTIVEC_BUILTIN_LD_INTERNAL_4sf);
-  def_builtin (MASK_ALTIVEC, "__builtin_altivec_st_internal_4sf", void_ftype_pfloat_v4sf,
-	       ALTIVEC_BUILTIN_ST_INTERNAL_4sf);
-  def_builtin (MASK_ALTIVEC, "__builtin_altivec_ld_internal_4si", v4si_ftype_pcint,
-	       ALTIVEC_BUILTIN_LD_INTERNAL_4si);
-  def_builtin (MASK_ALTIVEC, "__builtin_altivec_st_internal_4si", void_ftype_pint_v4si,
-	       ALTIVEC_BUILTIN_ST_INTERNAL_4si);
-  def_builtin (MASK_ALTIVEC, "__builtin_altivec_ld_internal_8hi", v8hi_ftype_pcshort,
-	       ALTIVEC_BUILTIN_LD_INTERNAL_8hi);
-  def_builtin (MASK_ALTIVEC, "__builtin_altivec_st_internal_8hi", void_ftype_pshort_v8hi,
-	       ALTIVEC_BUILTIN_ST_INTERNAL_8hi);
-  def_builtin (MASK_ALTIVEC, "__builtin_altivec_ld_internal_16qi", v16qi_ftype_pcchar,
-	       ALTIVEC_BUILTIN_LD_INTERNAL_16qi);
-  def_builtin (MASK_ALTIVEC, "__builtin_altivec_st_internal_16qi", void_ftype_pchar_v16qi,
-	       ALTIVEC_BUILTIN_ST_INTERNAL_16qi);
   def_builtin (MASK_ALTIVEC, "__builtin_altivec_mtvscr", void_ftype_v4si, ALTIVEC_BUILTIN_MTVSCR);
   def_builtin (MASK_ALTIVEC, "__builtin_altivec_mfvscr", v8hi_ftype_void, ALTIVEC_BUILTIN_MFVSCR);
   def_builtin (MASK_ALTIVEC, "__builtin_altivec_dssall", void_ftype_void, ALTIVEC_BUILTIN_DSSALL);
@@ -13043,6 +13076,35 @@ altivec_init_builtins (void)
   def_builtin (MASK_ALTIVEC, "__builtin_vec_stvebx", void_ftype_opaque_long_pvoid, ALTIVEC_BUILTIN_VEC_STVEBX);
   def_builtin (MASK_ALTIVEC, "__builtin_vec_stvehx", void_ftype_opaque_long_pvoid, ALTIVEC_BUILTIN_VEC_STVEHX);
+  def_builtin (MASK_VSX, "__builtin_vsx_lxvd2x_v2df", v2df_ftype_long_pcvoid,
+	       VSX_BUILTIN_LXVD2X_V2DF);
+  def_builtin (MASK_VSX, "__builtin_vsx_lxvd2x_v2di", v2di_ftype_long_pcvoid,
+	       VSX_BUILTIN_LXVD2X_V2DI);
+  def_builtin (MASK_VSX, "__builtin_vsx_lxvw4x_v4sf", v4sf_ftype_long_pcvoid,
+	       VSX_BUILTIN_LXVW4X_V4SF);
+  def_builtin (MASK_VSX, "__builtin_vsx_lxvw4x_v4si", v4si_ftype_long_pcvoid,
+	       VSX_BUILTIN_LXVW4X_V4SI);
+  def_builtin (MASK_VSX, "__builtin_vsx_lxvw4x_v8hi",
+	       v8hi_ftype_long_pcvoid, VSX_BUILTIN_LXVW4X_V8HI);
+  def_builtin (MASK_VSX, "__builtin_vsx_lxvw4x_v16qi",
+	       v16qi_ftype_long_pcvoid, VSX_BUILTIN_LXVW4X_V16QI);
+  def_builtin (MASK_VSX, "__builtin_vsx_stxvd2x_v2df",
+	       void_ftype_v2df_long_pvoid, VSX_BUILTIN_STXVD2X_V2DF);
+  def_builtin (MASK_VSX, "__builtin_vsx_stxvd2x_v2di",
+	       void_ftype_v2di_long_pvoid, VSX_BUILTIN_STXVD2X_V2DI);
+  def_builtin (MASK_VSX, "__builtin_vsx_stxvw4x_v4sf",
+	       void_ftype_v4sf_long_pvoid, VSX_BUILTIN_STXVW4X_V4SF);
+  def_builtin (MASK_VSX, "__builtin_vsx_stxvw4x_v4si",
+	       void_ftype_v4si_long_pvoid, VSX_BUILTIN_STXVW4X_V4SI);
+  def_builtin (MASK_VSX, "__builtin_vsx_stxvw4x_v8hi",
+	       void_ftype_v8hi_long_pvoid, VSX_BUILTIN_STXVW4X_V8HI);
+  def_builtin (MASK_VSX, "__builtin_vsx_stxvw4x_v16qi",
+	       void_ftype_v16qi_long_pvoid, VSX_BUILTIN_STXVW4X_V16QI);
+  def_builtin (MASK_VSX, "__builtin_vec_vsx_ld", opaque_ftype_long_pcvoid,
+	       VSX_BUILTIN_VEC_LD);
+  def_builtin (MASK_VSX, "__builtin_vec_vsx_st", void_ftype_opaque_long_pvoid,
+	       VSX_BUILTIN_VEC_ST);
+
   if (rs6000_cpu == PROCESSOR_CELL)
     {
       def_builtin (MASK_ALTIVEC, "__builtin_altivec_lvlx",  v16qi_ftype_long_pcvoid, ALTIVEC_BUILTIN_LVLX);
@@ -27925,4 +27987,29 @@ rs6000_address_for_fpconvert (rtx x)
   return x;
 }

+/* Given a memory reference, if it is not in the form for altivec memory
+   reference instructions (i.e. reg or reg+reg addressing with AND of -16),
+   convert to the altivec format.  */
+
+rtx
+rs6000_address_for_altivec (rtx x)
+{
+  gcc_assert (MEM_P (x));
+  if (!altivec_indexed_or_indirect_operand (x, GET_MODE (x)))
+    {
+      rtx addr = XEXP (x, 0);
+      int strict_p = (reload_in_progress || reload_completed);
+
+      if (!legitimate_indexed_address_p (addr, strict_p)
+	  && !legitimate_indirect_address_p (addr, strict_p))
+	addr = copy_to_mode_reg (Pmode, addr);
+
+      addr = gen_rtx_AND (Pmode, addr, GEN_INT (-16));
+      x = change_address (x, GET_MODE (x), addr);
+    }
+
+  return x;
+}
+
+
 #include "gt-rs6000.h"
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 169776)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -308,6 +308,19 @@ (define_insn "*vsx_movti"
 }
   [(set_attr "type" "vecstore,vecload,vecsimple,*,*,*,vecsimple,*,vecstore,vecload")])

+;; Explicit load/store expanders for the builtin functions
+(define_expand "vsx_load_<mode>"
+  [(set (match_operand:VSX_M 0 "vsx_register_operand" "")
+	(match_operand:VSX_M 1 "memory_operand" ""))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)"
+  "")
+
+(define_expand "vsx_store_<mode>"
+  [(set (match_operand:VEC_M 0 "memory_operand" "")
+	(match_operand:VEC_M 1 "vsx_register_operand" ""))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)"
+  "")
+
 ;; VSX scalar and vector floating point arithmetic instructions
 (define_insn "*vsx_add<mode>3"
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 169775)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -1,7 +1,7 @@
 /* Definitions of target machine for GNU compiler, for IBM RS/6000.
    Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001,
    2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009,
-   2010
+   2010, 2011
    Free Software Foundation, Inc.
    Contributed by Richard Kenner (kenner@vlsi1.ultra.nyu.edu)
@@ -2368,6 +2368,8 @@ enum rs6000_builtin_type_index
   RS6000_BTI_pixel_V8HI,	 /* __vector __pixel */
   RS6000_BTI_long,		 /* long_integer_type_node */
   RS6000_BTI_unsigned_long,	 /* long_unsigned_type_node */
+  RS6000_BTI_long_long,		 /* long_long_integer_type_node */
+  RS6000_BTI_unsigned_long_long, /* long_long_unsigned_type_node */
   RS6000_BTI_INTQI,		 /* intQI_type_node */
   RS6000_BTI_UINTQI,		 /* unsigned_intQI_type_node */
   RS6000_BTI_INTHI,		 /* intHI_type_node */
@@ -2411,6 +2413,8 @@ enum rs6000_builtin_type_index
 #define bool_V2DI_type_node	      (rs6000_builtin_types[RS6000_BTI_bool_V2DI])
 #define pixel_V8HI_type_node	      (rs6000_builtin_types[RS6000_BTI_pixel_V8HI])

+#define long_long_integer_type_internal_node  (rs6000_builtin_types[RS6000_BTI_long_long])
+#define long_long_unsigned_type_internal_node (rs6000_builtin_types[RS6000_BTI_unsigned_long_long])
 #define long_integer_type_internal_node  (rs6000_builtin_types[RS6000_BTI_long])
 #define long_unsigned_type_internal_node (rs6000_builtin_types[RS6000_BTI_unsigned_long])
 #define intQI_type_internal_node	 (rs6000_builtin_types[RS6000_BTI_INTQI])
Index: gcc/config/rs6000/altivec.md
===================================================================
--- gcc/config/rs6000/altivec.md	(revision 169775)
+++ gcc/config/rs6000/altivec.md	(working copy)
@@ -1,5 +1,5 @@
 ;; AltiVec patterns.
-;; Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
+;; Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011
 ;; Free Software Foundation, Inc.
 ;; Contributed by Aldy Hernandez (aldy@quesejoda.com)
@@ -96,7 +96,7 @@ (define_constants
    (UNSPEC_STVE         203)
    (UNSPEC_SET_VSCR     213)
    (UNSPEC_GET_VRSAVE   214)
-   ;; 215 deleted
+   (UNSPEC_LVX          215)
    (UNSPEC_REDUC_PLUS   217)
    (UNSPEC_VECSH        219)
    (UNSPEC_EXTEVEN_V4SI 220)
@@ -1750,17 +1750,19 @@ (define_insn "altivec_lvxl"
  "lvxl %0,%y1"
  [(set_attr "type" "vecload")])

-(define_insn "altivec_lvx"
-  [(set (match_operand:V4SI 0 "register_operand" "=v")
-	(match_operand:V4SI 1 "memory_operand" "Z"))]
+(define_insn "altivec_lvx_<mode>"
+  [(parallel
+    [(set (match_operand:VM2 0 "register_operand" "=v")
+	  (match_operand:VM2 1 "memory_operand" "Z"))
+     (unspec [(const_int 0)] UNSPEC_LVX)])]
  "TARGET_ALTIVEC"
  "lvx %0,%y1"
  [(set_attr "type" "vecload")])

-(define_insn "altivec_stvx"
+(define_insn "altivec_stvx_<mode>"
  [(parallel
-   [(set (match_operand:V4SI 0 "memory_operand" "=Z")
-	 (match_operand:V4SI 1 "register_operand" "v"))
+   [(set (match_operand:VM2 0 "memory_operand" "=Z")
+	 (match_operand:VM2 1 "register_operand" "v"))
    (unspec [(const_int 0)] UNSPEC_STVX)])]
  "TARGET_ALTIVEC"
  "stvx %1,%y0"
Index: gcc/config/rs6000/altivec.h
===================================================================
--- gcc/config/rs6000/altivec.h	(revision 169775)
+++ gcc/config/rs6000/altivec.h	(working copy)
@@ -1,5 +1,5 @@
 /* PowerPC AltiVec include file.
-   Copyright (C) 2002, 2003, 2004, 2005, 2008, 2009, 2010
+   Copyright (C) 2002, 2003, 2004, 2005, 2008, 2009, 2010, 2011
    Free Software Foundation, Inc.
    Contributed by Aldy Hernandez (aldyh@redhat.com).
    Rewritten by Paolo Bonzini (bonzini@gnu.org).
@@ -318,6 +318,8 @@
 #define vec_nearbyint __builtin_vec_nearbyint
 #define vec_rint __builtin_vec_rint
 #define vec_sqrt __builtin_vec_sqrt
+#define vec_vsx_ld __builtin_vec_vsx_ld
+#define vec_vsx_st __builtin_vec_vsx_st
 #endif

 /* Predicates.