Message ID | 87oazxhi4m.fsf@talisman.default |
---|---|
State | New |
Headers | show |
On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford <rdsandiford@googlemail.com> wrote: > wide-int fails to build libitm because of a bad interaction between: > > /* Keep the OI and XI modes from confusing the compiler into thinking > that these modes could actually be used for computation. They are > only holders for vectors during data movement. */ > #define MAX_BITSIZE_MODE_ANY_INT (128) > > and the memcpy folding code: > > /* Make sure we are not copying using a floating-point mode or > a type whose size possibly does not match its precision. */ > if (FLOAT_MODE_P (TYPE_MODE (desttype)) > || TREE_CODE (desttype) == BOOLEAN_TYPE > || TREE_CODE (desttype) == ENUMERAL_TYPE) > { > /* A more suitable int_mode_for_mode would return a vector > integer mode for a vector float mode or a integer complex > mode for a float complex mode if there isn't a regular > integer mode covering the mode of desttype. */ > enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); > if (mode == BLKmode) > desttype = NULL_TREE; > else > desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), > 1); > } > if (FLOAT_MODE_P (TYPE_MODE (srctype)) > || TREE_CODE (srctype) == BOOLEAN_TYPE > || TREE_CODE (srctype) == ENUMERAL_TYPE) > { > enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); > if (mode == BLKmode) > srctype = NULL_TREE; > else > srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), > 1); > } > > The failure occurs for complex long double, which we try to copy as > a 256-bit integer type (OImode). > > This patch tries to do what the comment suggests by introducing a new > form of int_mode_for_mode that replaces vector modes with vector modes > and complex modes with complex modes. The fallback case of using a > MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above > 128 bits on x86_64. > > The question then is what to do about 128-bit types for i386. > MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be > used for optimisation. However, gcc.target/i386/pr49168-1.c only passes > for -m32 -msse2 because we use int128_t to copy a float128_t. > > I handled that by allowing MODE_VECTOR_INT to be used instead of > MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE, > even if the original type wasn't a vector. Hmm. Sounds reasonable unless there are very weird targets that cannot efficiently load/store vectors unaligned but can handle efficient load/store of unaligned scalars. > It might be that other callers to int_mode_for_mode should use > the new function too, but I'll look at that separately. > > I used the attached testcase (with printfs added to gcc) to check that > the right modes and types were being chosen. The patch fixes the > complex float and complex double cases, since the integer type that we > previously picked had a larger alignment than the original complex type. As of complex int modes - are we positively sure that targets even try to do sth "optimal" for loads/stores of those? > One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype)) > is that vectors are copied as integer vectors if the target supports > them directly but are copied as float vectors otherwise, since in the > latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the > test are copied as vector floats and vector doubles both before and > after the patch. That wasn't intended ... the folding should have failed if we can't copy using an integer mode ... Richard. > > Tested against trunk with x86_64-linux-gnu {,-m32}. OK to install? > > Thanks, > Richard > > > gcc/ > * machmode.h (bitwise_mode_for_size): Declare. > * stor-layout.h (bitwise_type_for_mode): Likewise. > * stor-layout.c (bitwise_mode_for_size): New function. > (bitwise_type_for_mode): Likewise. > * builtins.c (fold_builtin_memory_op): Use it instead of > int_mode_for_mode and build_nonstandard_integer_type. > > gcc/testsuite/ > * gcc.dg/memcpy-5.c: New test. > > Index: gcc/machmode.h > =================================================================== > --- gcc/machmode.h 2014-04-18 11:16:12.706092658 +0100 > +++ gcc/machmode.h 2014-04-18 11:16:38.179299261 +0100 > @@ -253,6 +253,8 @@ extern enum machine_mode smallest_mode_f > > extern enum machine_mode int_mode_for_mode (enum machine_mode); > > +extern enum machine_mode bitwise_mode_for_size (enum machine_mode); > + > /* Return a mode that is suitable for representing a vector, > or BLKmode on failure. */ > > Index: gcc/stor-layout.h > =================================================================== > --- gcc/stor-layout.h 2014-04-18 11:16:12.707092667 +0100 > +++ gcc/stor-layout.h 2014-04-18 11:16:12.830093665 +0100 > @@ -98,6 +98,8 @@ extern tree make_unsigned_type (int); > mode_for_size, but is passed a tree. */ > extern enum machine_mode mode_for_size_tree (const_tree, enum mode_class, int); > > +extern tree bitwise_type_for_mode (enum machine_mode); > + > /* Given a VAR_DECL, PARM_DECL or RESULT_DECL, clears the results of > a previous call to layout_decl and calls it again. */ > extern void relayout_decl (tree); > Index: gcc/stor-layout.c > =================================================================== > --- gcc/stor-layout.c 2014-04-18 11:16:12.707092667 +0100 > +++ gcc/stor-layout.c 2014-04-18 16:08:20.033752312 +0100 > @@ -403,6 +403,73 @@ int_mode_for_mode (enum machine_mode mod > return mode; > } > > +/* Find a mode that can be used for efficient bitwise operations on MODE. > + Return BLKmode if no such mode exists. */ > + > +enum machine_mode > +bitwise_mode_for_mode (enum machine_mode mode) > +{ > + /* Quick exit if we already have a suitable mode. */ > + unsigned int bitsize = GET_MODE_BITSIZE (mode); > + if (SCALAR_INT_MODE_P (mode) && bitsize <= MAX_FIXED_MODE_SIZE) > + return mode; > + > + /* Reuse the sanity checks from int_mode_for_mode. */ > + gcc_checking_assert ((int_mode_for_mode (mode), true)); > + > + /* Try to replace complex modes with complex modes. In general we > + expect both components to be processed independently, so we only > + care whether there is a register for the inner mode. */ > + if (COMPLEX_MODE_P (mode)) > + { > + enum machine_mode trial = mode; > + if (GET_MODE_CLASS (mode) != MODE_COMPLEX_INT) > + trial = mode_for_size (bitsize, MODE_COMPLEX_INT, false); > + if (trial != BLKmode > + && have_regs_of_mode[GET_MODE_INNER (trial)]) > + return trial; > + } > + > + /* Try to replace vector modes with vector modes. Also try using vector > + modes if an integer mode would be too big. */ > + if (VECTOR_MODE_P (mode) || bitsize > MAX_FIXED_MODE_SIZE) > + { > + enum machine_mode trial = mode; > + if (GET_MODE_CLASS (mode) != MODE_VECTOR_INT) > + trial = mode_for_size (bitsize, MODE_VECTOR_INT, 0); > + if (trial != BLKmode > + && have_regs_of_mode[trial] > + && targetm.vector_mode_supported_p (trial)) > + return trial; > + } > + > + /* Otherwise fall back on integers while honoring MAX_FIXED_MODE_SIZE. */ > + return mode_for_size (bitsize, MODE_INT, true); > +} > + > +/* Find a type that can be used for efficient bitwise operations on MODE. > + Return null if no such mode exists. */ > + > +tree > +bitwise_type_for_mode (enum machine_mode mode) > +{ > + mode = bitwise_mode_for_mode (mode); > + if (mode == BLKmode) > + return NULL_TREE; > + > + unsigned int inner_size = GET_MODE_UNIT_BITSIZE (mode); > + tree inner_type = build_nonstandard_integer_type (inner_size, true); > + > + if (VECTOR_MODE_P (mode)) > + return build_vector_type_for_mode (inner_type, mode); > + > + if (COMPLEX_MODE_P (mode)) > + return build_complex_type (inner_type); > + > + gcc_checking_assert (GET_MODE_INNER (mode) == VOIDmode); > + return inner_type; > +} > + > /* Find a mode that is suitable for representing a vector with > NUNITS elements of mode INNERMODE. Returns BLKmode if there > is no suitable mode. */ > Index: gcc/builtins.c > =================================================================== > --- gcc/builtins.c 2014-04-18 11:16:12.676092416 +0100 > +++ gcc/builtins.c 2014-04-18 11:22:50.107315234 +0100 > @@ -8921,29 +8921,11 @@ fold_builtin_memory_op (location_t loc, > if (FLOAT_MODE_P (TYPE_MODE (desttype)) > || TREE_CODE (desttype) == BOOLEAN_TYPE > || TREE_CODE (desttype) == ENUMERAL_TYPE) > - { > - /* A more suitable int_mode_for_mode would return a vector > - integer mode for a vector float mode or a integer complex > - mode for a float complex mode if there isn't a regular > - integer mode covering the mode of desttype. */ > - enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); > - if (mode == BLKmode) > - desttype = NULL_TREE; > - else > - desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), > - 1); > - } > + desttype = bitwise_type_for_mode (TYPE_MODE (desttype)); > if (FLOAT_MODE_P (TYPE_MODE (srctype)) > || TREE_CODE (srctype) == BOOLEAN_TYPE > || TREE_CODE (srctype) == ENUMERAL_TYPE) > - { > - enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); > - if (mode == BLKmode) > - srctype = NULL_TREE; > - else > - srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), > - 1); > - } > + srctype = bitwise_type_for_mode (TYPE_MODE (srctype)); > if (!srctype) > srctype = desttype; > if (!desttype) > Index: gcc/testsuite/gcc.dg/memcpy-5.c > =================================================================== > --- /dev/null 2014-04-15 08:10:27.294524132 +0100 > +++ gcc/testsuite/gcc.dg/memcpy-5.c 2014-04-18 11:21:52.036844396 +0100 > @@ -0,0 +1,27 @@ > +/* { dg-options "-O -fdump-tree-optimized" } */ > + > +extern void *memcpy (void *, const void *, __SIZE_TYPE__); > + > +#define TEST(NAME, TYPE) \ > + TYPE NAME##x; \ > + char NAME##y[sizeof (NAME##x)] __attribute__((aligned (__alignof__ (NAME##x)))); \ > + void NAME (void) { memcpy (&NAME##x, &NAME##y, sizeof (NAME##x)); } > + > +TEST (f, float); > +TEST (d, double); > +TEST (ld, long double); > +TEST (cf, _Complex float); > +TEST (cd, _Complex double); > +TEST (cld, _Complex long double); > +TEST (d8f, float __attribute__((vector_size (8)))); > +TEST (d16f, float __attribute__((vector_size (16)))); > +TEST (d32f, float __attribute__((vector_size (32)))); > +TEST (d64f, float __attribute__((vector_size (64)))); > +TEST (d128f, float __attribute__((vector_size (128)))); > +TEST (d16d, double __attribute__((vector_size (16)))); > +TEST (d32d, double __attribute__((vector_size (32)))); > +TEST (d64d, double __attribute__((vector_size (64)))); > +TEST (d128d, double __attribute__((vector_size (128)))); > + > +/* { dg-final { scan-tree-dump-not "memcpy" "optimized" { target x86_64-*-* } } } */ > +/* { dg-final { cleanup-tree-dump "optimized" } } */
Richard Biener <richard.guenther@gmail.com> writes: > On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford > <rdsandiford@googlemail.com> wrote: >> wide-int fails to build libitm because of a bad interaction between: >> >> /* Keep the OI and XI modes from confusing the compiler into thinking >> that these modes could actually be used for computation. They are >> only holders for vectors during data movement. */ >> #define MAX_BITSIZE_MODE_ANY_INT (128) >> >> and the memcpy folding code: >> >> /* Make sure we are not copying using a floating-point mode or >> a type whose size possibly does not match its precision. */ >> if (FLOAT_MODE_P (TYPE_MODE (desttype)) >> || TREE_CODE (desttype) == BOOLEAN_TYPE >> || TREE_CODE (desttype) == ENUMERAL_TYPE) >> { >> /* A more suitable int_mode_for_mode would return a vector >> integer mode for a vector float mode or a integer complex >> mode for a float complex mode if there isn't a regular >> integer mode covering the mode of desttype. */ >> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); >> if (mode == BLKmode) >> desttype = NULL_TREE; >> else >> desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), >> 1); >> } >> if (FLOAT_MODE_P (TYPE_MODE (srctype)) >> || TREE_CODE (srctype) == BOOLEAN_TYPE >> || TREE_CODE (srctype) == ENUMERAL_TYPE) >> { >> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); >> if (mode == BLKmode) >> srctype = NULL_TREE; >> else >> srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), >> 1); >> } >> >> The failure occurs for complex long double, which we try to copy as >> a 256-bit integer type (OImode). >> >> This patch tries to do what the comment suggests by introducing a new >> form of int_mode_for_mode that replaces vector modes with vector modes >> and complex modes with complex modes. The fallback case of using a >> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above >> 128 bits on x86_64. >> >> The question then is what to do about 128-bit types for i386. >> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be >> used for optimisation. However, gcc.target/i386/pr49168-1.c only passes >> for -m32 -msse2 because we use int128_t to copy a float128_t. >> >> I handled that by allowing MODE_VECTOR_INT to be used instead of >> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE, >> even if the original type wasn't a vector. > > Hmm. Sounds reasonable unless there are very weird targets that > cannot efficiently load/store vectors unaligned but can handle > efficient load/store of unaligned scalars. Yeah, in general there's no guarantee that even int_mode_for_mode will return a mode with the same alignment as the original. Callers need to check that (like the memcpy folder does). >> It might be that other callers to int_mode_for_mode should use >> the new function too, but I'll look at that separately. >> >> I used the attached testcase (with printfs added to gcc) to check that >> the right modes and types were being chosen. The patch fixes the >> complex float and complex double cases, since the integer type that we >> previously picked had a larger alignment than the original complex type. > > As of complex int modes - are we positively sure that targets even > try to do sth "optimal" for loads/stores of those? Complex modes usually aren't handled directly by .md patterns, either int or float. They're really treated as a pair of values. So IMO it still makes sense to fold this case. >> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype)) >> is that vectors are copied as integer vectors if the target supports >> them directly but are copied as float vectors otherwise, since in the >> latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the >> test are copied as vector floats and vector doubles both before and >> after the patch. > > That wasn't intended ... the folding should have failed if we can't > copy using an integer mode ... Does that mean that the fold give up if TYPE_MODE is BLKmode? I can do that as a separate patch if so. Thanks, Richard
On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford <rdsandiford@googlemail.com> wrote: > Richard Biener <richard.guenther@gmail.com> writes: >> On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford >> <rdsandiford@googlemail.com> wrote: >>> wide-int fails to build libitm because of a bad interaction between: >>> >>> /* Keep the OI and XI modes from confusing the compiler into thinking >>> that these modes could actually be used for computation. They are >>> only holders for vectors during data movement. */ >>> #define MAX_BITSIZE_MODE_ANY_INT (128) >>> >>> and the memcpy folding code: >>> >>> /* Make sure we are not copying using a floating-point mode or >>> a type whose size possibly does not match its precision. */ >>> if (FLOAT_MODE_P (TYPE_MODE (desttype)) >>> || TREE_CODE (desttype) == BOOLEAN_TYPE >>> || TREE_CODE (desttype) == ENUMERAL_TYPE) >>> { >>> /* A more suitable int_mode_for_mode would return a vector >>> integer mode for a vector float mode or a integer complex >>> mode for a float complex mode if there isn't a regular >>> integer mode covering the mode of desttype. */ >>> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); >>> if (mode == BLKmode) >>> desttype = NULL_TREE; >>> else >>> desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), >>> 1); >>> } >>> if (FLOAT_MODE_P (TYPE_MODE (srctype)) >>> || TREE_CODE (srctype) == BOOLEAN_TYPE >>> || TREE_CODE (srctype) == ENUMERAL_TYPE) >>> { >>> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); >>> if (mode == BLKmode) >>> srctype = NULL_TREE; >>> else >>> srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), >>> 1); >>> } >>> >>> The failure occurs for complex long double, which we try to copy as >>> a 256-bit integer type (OImode). >>> >>> This patch tries to do what the comment suggests by introducing a new >>> form of int_mode_for_mode that replaces vector modes with vector modes >>> and complex modes with complex modes. The fallback case of using a >>> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above >>> 128 bits on x86_64. >>> >>> The question then is what to do about 128-bit types for i386. >>> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be >>> used for optimisation. However, gcc.target/i386/pr49168-1.c only passes >>> for -m32 -msse2 because we use int128_t to copy a float128_t. >>> >>> I handled that by allowing MODE_VECTOR_INT to be used instead of >>> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE, >>> even if the original type wasn't a vector. >> >> Hmm. Sounds reasonable unless there are very weird targets that >> cannot efficiently load/store vectors unaligned but can handle >> efficient load/store of unaligned scalars. > > Yeah, in general there's no guarantee that even int_mode_for_mode > will return a mode with the same alignment as the original. Callers > need to check that (like the memcpy folder does). > >>> It might be that other callers to int_mode_for_mode should use >>> the new function too, but I'll look at that separately. >>> >>> I used the attached testcase (with printfs added to gcc) to check that >>> the right modes and types were being chosen. The patch fixes the >>> complex float and complex double cases, since the integer type that we >>> previously picked had a larger alignment than the original complex type. >> >> As of complex int modes - are we positively sure that targets even >> try to do sth "optimal" for loads/stores of those? > > Complex modes usually aren't handled directly by .md patterns, > either int or float. They're really treated as a pair of values. > So IMO it still makes sense to fold this case. > >>> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype)) >>> is that vectors are copied as integer vectors if the target supports >>> them directly but are copied as float vectors otherwise, since in the >>> latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the >>> test are copied as vector floats and vector doubles both before and >>> after the patch. >> >> That wasn't intended ... the folding should have failed if we can't >> copy using an integer mode ... > > Does that mean that the fold give up if TYPE_MODE is BLKmode? > I can do that as a separate patch if so. Looking at the code again it should always choose an integer mode/type via setting desttype/srctype to NULL for BLKmode and if (!srctype) srctype = desttype; if (!desttype) desttype = srctype; if (!srctype) return NULL_TREE; no? Thus if we can't get a integer type for either src or dest then we fail. But we should never end up with srctype or desttype being a float mode. No? Richard. > Thanks, > Richard
Richard Biener <richard.guenther@gmail.com> writes: > On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford > <rdsandiford@googlemail.com> wrote: >> Richard Biener <richard.guenther@gmail.com> writes: >>> On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford >>> <rdsandiford@googlemail.com> wrote: >>>> wide-int fails to build libitm because of a bad interaction between: >>>> >>>> /* Keep the OI and XI modes from confusing the compiler into thinking >>>> that these modes could actually be used for computation. They are >>>> only holders for vectors during data movement. */ >>>> #define MAX_BITSIZE_MODE_ANY_INT (128) >>>> >>>> and the memcpy folding code: >>>> >>>> /* Make sure we are not copying using a floating-point mode or >>>> a type whose size possibly does not match its precision. */ >>>> if (FLOAT_MODE_P (TYPE_MODE (desttype)) >>>> || TREE_CODE (desttype) == BOOLEAN_TYPE >>>> || TREE_CODE (desttype) == ENUMERAL_TYPE) >>>> { >>>> /* A more suitable int_mode_for_mode would return a vector >>>> integer mode for a vector float mode or a integer complex >>>> mode for a float complex mode if there isn't a regular >>>> integer mode covering the mode of desttype. */ >>>> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); >>>> if (mode == BLKmode) >>>> desttype = NULL_TREE; >>>> else >>>> desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), >>>> 1); >>>> } >>>> if (FLOAT_MODE_P (TYPE_MODE (srctype)) >>>> || TREE_CODE (srctype) == BOOLEAN_TYPE >>>> || TREE_CODE (srctype) == ENUMERAL_TYPE) >>>> { >>>> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); >>>> if (mode == BLKmode) >>>> srctype = NULL_TREE; >>>> else >>>> srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), >>>> 1); >>>> } >>>> >>>> The failure occurs for complex long double, which we try to copy as >>>> a 256-bit integer type (OImode). >>>> >>>> This patch tries to do what the comment suggests by introducing a new >>>> form of int_mode_for_mode that replaces vector modes with vector modes >>>> and complex modes with complex modes. The fallback case of using a >>>> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above >>>> 128 bits on x86_64. >>>> >>>> The question then is what to do about 128-bit types for i386. >>>> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be >>>> used for optimisation. However, gcc.target/i386/pr49168-1.c only passes >>>> for -m32 -msse2 because we use int128_t to copy a float128_t. >>>> >>>> I handled that by allowing MODE_VECTOR_INT to be used instead of >>>> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE, >>>> even if the original type wasn't a vector. >>> >>> Hmm. Sounds reasonable unless there are very weird targets that >>> cannot efficiently load/store vectors unaligned but can handle >>> efficient load/store of unaligned scalars. >> >> Yeah, in general there's no guarantee that even int_mode_for_mode >> will return a mode with the same alignment as the original. Callers >> need to check that (like the memcpy folder does). >> >>>> It might be that other callers to int_mode_for_mode should use >>>> the new function too, but I'll look at that separately. >>>> >>>> I used the attached testcase (with printfs added to gcc) to check that >>>> the right modes and types were being chosen. The patch fixes the >>>> complex float and complex double cases, since the integer type that we >>>> previously picked had a larger alignment than the original complex type. >>> >>> As of complex int modes - are we positively sure that targets even >>> try to do sth "optimal" for loads/stores of those? >> >> Complex modes usually aren't handled directly by .md patterns, >> either int or float. They're really treated as a pair of values. >> So IMO it still makes sense to fold this case. >> >>>> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype)) >>>> is that vectors are copied as integer vectors if the target supports >>>> them directly but are copied as float vectors otherwise, since in the >>>> latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the >>>> test are copied as vector floats and vector doubles both before and >>>> after the patch. >>> >>> That wasn't intended ... the folding should have failed if we can't >>> copy using an integer mode ... >> >> Does that mean that the fold give up if TYPE_MODE is BLKmode? >> I can do that as a separate patch if so. > > Looking at the code again it should always choose an integer mode/type > via setting desttype/srctype to NULL for BLKmode and > > if (!srctype) > srctype = desttype; > if (!desttype) > desttype = srctype; > if (!srctype) > return NULL_TREE; > > no? Thus if we can't get a integer type for either src or dest then > we fail. But we should never end up with srctype or desttype > being a float mode. No? Right, they don't have a float _mode_, because the target doesn't support 1024-bit modes. Like I say, they have BLKmode instead. But they have a float type. Thanks, Richard
On Tue, Apr 22, 2014 at 11:15 AM, Richard Sandiford <rdsandiford@googlemail.com> wrote: > Richard Biener <richard.guenther@gmail.com> writes: >> On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford >> <rdsandiford@googlemail.com> wrote: >>> Richard Biener <richard.guenther@gmail.com> writes: >>>> On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford >>>> <rdsandiford@googlemail.com> wrote: >>>>> wide-int fails to build libitm because of a bad interaction between: >>>>> >>>>> /* Keep the OI and XI modes from confusing the compiler into thinking >>>>> that these modes could actually be used for computation. They are >>>>> only holders for vectors during data movement. */ >>>>> #define MAX_BITSIZE_MODE_ANY_INT (128) >>>>> >>>>> and the memcpy folding code: >>>>> >>>>> /* Make sure we are not copying using a floating-point mode or >>>>> a type whose size possibly does not match its precision. */ >>>>> if (FLOAT_MODE_P (TYPE_MODE (desttype)) >>>>> || TREE_CODE (desttype) == BOOLEAN_TYPE >>>>> || TREE_CODE (desttype) == ENUMERAL_TYPE) >>>>> { >>>>> /* A more suitable int_mode_for_mode would return a vector >>>>> integer mode for a vector float mode or a integer complex >>>>> mode for a float complex mode if there isn't a regular >>>>> integer mode covering the mode of desttype. */ >>>>> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); >>>>> if (mode == BLKmode) >>>>> desttype = NULL_TREE; >>>>> else >>>>> desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), >>>>> 1); >>>>> } >>>>> if (FLOAT_MODE_P (TYPE_MODE (srctype)) >>>>> || TREE_CODE (srctype) == BOOLEAN_TYPE >>>>> || TREE_CODE (srctype) == ENUMERAL_TYPE) >>>>> { >>>>> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); >>>>> if (mode == BLKmode) >>>>> srctype = NULL_TREE; >>>>> else >>>>> srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), >>>>> 1); >>>>> } >>>>> >>>>> The failure occurs for complex long double, which we try to copy as >>>>> a 256-bit integer type (OImode). >>>>> >>>>> This patch tries to do what the comment suggests by introducing a new >>>>> form of int_mode_for_mode that replaces vector modes with vector modes >>>>> and complex modes with complex modes. The fallback case of using a >>>>> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above >>>>> 128 bits on x86_64. >>>>> >>>>> The question then is what to do about 128-bit types for i386. >>>>> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be >>>>> used for optimisation. However, gcc.target/i386/pr49168-1.c only passes >>>>> for -m32 -msse2 because we use int128_t to copy a float128_t. >>>>> >>>>> I handled that by allowing MODE_VECTOR_INT to be used instead of >>>>> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE, >>>>> even if the original type wasn't a vector. >>>> >>>> Hmm. Sounds reasonable unless there are very weird targets that >>>> cannot efficiently load/store vectors unaligned but can handle >>>> efficient load/store of unaligned scalars. >>> >>> Yeah, in general there's no guarantee that even int_mode_for_mode >>> will return a mode with the same alignment as the original. Callers >>> need to check that (like the memcpy folder does). >>> >>>>> It might be that other callers to int_mode_for_mode should use >>>>> the new function too, but I'll look at that separately. >>>>> >>>>> I used the attached testcase (with printfs added to gcc) to check that >>>>> the right modes and types were being chosen. The patch fixes the >>>>> complex float and complex double cases, since the integer type that we >>>>> previously picked had a larger alignment than the original complex type. >>>> >>>> As of complex int modes - are we positively sure that targets even >>>> try to do sth "optimal" for loads/stores of those? >>> >>> Complex modes usually aren't handled directly by .md patterns, >>> either int or float. They're really treated as a pair of values. >>> So IMO it still makes sense to fold this case. >>> >>>>> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype)) >>>>> is that vectors are copied as integer vectors if the target supports >>>>> them directly but are copied as float vectors otherwise, since in the >>>>> latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the >>>>> test are copied as vector floats and vector doubles both before and >>>>> after the patch. >>>> >>>> That wasn't intended ... the folding should have failed if we can't >>>> copy using an integer mode ... >>> >>> Does that mean that the fold give up if TYPE_MODE is BLKmode? >>> I can do that as a separate patch if so. >> >> Looking at the code again it should always choose an integer mode/type >> via setting desttype/srctype to NULL for BLKmode and >> >> if (!srctype) >> srctype = desttype; >> if (!desttype) >> desttype = srctype; >> if (!srctype) >> return NULL_TREE; >> >> no? Thus if we can't get a integer type for either src or dest then >> we fail. But we should never end up with srctype or desttype >> being a float mode. No? > > Right, they don't have a float _mode_, because the target doesn't > support 1024-bit modes. Like I say, they have BLKmode instead. > But they have a float type. Ok. I clearly used FLOAT_MODE_P (TYPE_MODE ()) because I assumed that only float-modes will eventually end up in FP registers (that's what the bug was about - we load into a FP register and that load can trigger a FP exception). Richard. > Thanks, > Richard
Richard Biener <richard.guenther@gmail.com> writes: > On Tue, Apr 22, 2014 at 11:15 AM, Richard Sandiford > <rdsandiford@googlemail.com> wrote: >> Richard Biener <richard.guenther@gmail.com> writes: >>> On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford >>> <rdsandiford@googlemail.com> wrote: >>>> Richard Biener <richard.guenther@gmail.com> writes: >>>>> On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford >>>>> <rdsandiford@googlemail.com> wrote: >>>>>> wide-int fails to build libitm because of a bad interaction between: >>>>>> >>>>>> /* Keep the OI and XI modes from confusing the compiler into thinking >>>>>> that these modes could actually be used for computation. They are >>>>>> only holders for vectors during data movement. */ >>>>>> #define MAX_BITSIZE_MODE_ANY_INT (128) >>>>>> >>>>>> and the memcpy folding code: >>>>>> >>>>>> /* Make sure we are not copying using a floating-point mode or >>>>>> a type whose size possibly does not match its precision. */ >>>>>> if (FLOAT_MODE_P (TYPE_MODE (desttype)) >>>>>> || TREE_CODE (desttype) == BOOLEAN_TYPE >>>>>> || TREE_CODE (desttype) == ENUMERAL_TYPE) >>>>>> { >>>>>> /* A more suitable int_mode_for_mode would return a vector >>>>>> integer mode for a vector float mode or a integer complex >>>>>> mode for a float complex mode if there isn't a regular >>>>>> integer mode covering the mode of desttype. */ >>>>>> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); >>>>>> if (mode == BLKmode) >>>>>> desttype = NULL_TREE; >>>>>> else >>>>>> desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), >>>>>> 1); >>>>>> } >>>>>> if (FLOAT_MODE_P (TYPE_MODE (srctype)) >>>>>> || TREE_CODE (srctype) == BOOLEAN_TYPE >>>>>> || TREE_CODE (srctype) == ENUMERAL_TYPE) >>>>>> { >>>>>> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); >>>>>> if (mode == BLKmode) >>>>>> srctype = NULL_TREE; >>>>>> else >>>>>> srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), >>>>>> 1); >>>>>> } >>>>>> >>>>>> The failure occurs for complex long double, which we try to copy as >>>>>> a 256-bit integer type (OImode). >>>>>> >>>>>> This patch tries to do what the comment suggests by introducing a new >>>>>> form of int_mode_for_mode that replaces vector modes with vector modes >>>>>> and complex modes with complex modes. The fallback case of using a >>>>>> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above >>>>>> 128 bits on x86_64. >>>>>> >>>>>> The question then is what to do about 128-bit types for i386. >>>>>> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be >>>>>> used for optimisation. However, gcc.target/i386/pr49168-1.c only passes >>>>>> for -m32 -msse2 because we use int128_t to copy a float128_t. >>>>>> >>>>>> I handled that by allowing MODE_VECTOR_INT to be used instead of >>>>>> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE, >>>>>> even if the original type wasn't a vector. >>>>> >>>>> Hmm. Sounds reasonable unless there are very weird targets that >>>>> cannot efficiently load/store vectors unaligned but can handle >>>>> efficient load/store of unaligned scalars. >>>> >>>> Yeah, in general there's no guarantee that even int_mode_for_mode >>>> will return a mode with the same alignment as the original. Callers >>>> need to check that (like the memcpy folder does). >>>> >>>>>> It might be that other callers to int_mode_for_mode should use >>>>>> the new function too, but I'll look at that separately. >>>>>> >>>>>> I used the attached testcase (with printfs added to gcc) to check that >>>>>> the right modes and types were being chosen. The patch fixes the >>>>>> complex float and complex double cases, since the integer type that we >>>>>> previously picked had a larger alignment than the original complex type. >>>>> >>>>> As of complex int modes - are we positively sure that targets even >>>>> try to do sth "optimal" for loads/stores of those? >>>> >>>> Complex modes usually aren't handled directly by .md patterns, >>>> either int or float. They're really treated as a pair of values. >>>> So IMO it still makes sense to fold this case. >>>> >>>>>> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype)) >>>>>> is that vectors are copied as integer vectors if the target supports >>>>>> them directly but are copied as float vectors otherwise, since in the >>>>>> latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the >>>>>> test are copied as vector floats and vector doubles both before and >>>>>> after the patch. >>>>> >>>>> That wasn't intended ... the folding should have failed if we can't >>>>> copy using an integer mode ... >>>> >>>> Does that mean that the fold give up if TYPE_MODE is BLKmode? >>>> I can do that as a separate patch if so. >>> >>> Looking at the code again it should always choose an integer mode/type >>> via setting desttype/srctype to NULL for BLKmode and >>> >>> if (!srctype) >>> srctype = desttype; >>> if (!desttype) >>> desttype = srctype; >>> if (!srctype) >>> return NULL_TREE; >>> >>> no? Thus if we can't get a integer type for either src or dest then >>> we fail. But we should never end up with srctype or desttype >>> being a float mode. No? >> >> Right, they don't have a float _mode_, because the target doesn't >> support 1024-bit modes. Like I say, they have BLKmode instead. >> But they have a float type. > > Ok. I clearly used FLOAT_MODE_P (TYPE_MODE ()) because > I assumed that only float-modes will eventually end up in FP > registers (that's what the bug was about - we load into a FP register > and that load can trigger a FP exception). Yeah. I just flagged it up because it wasn't obvious that native vector float types need to be copied as vector ints while non-native ones don't. A non-native vector float might be scalarised into individual float operations and thus might end up using an FP register load. Thanks, Richard
On Tue, Apr 22, 2014 at 11:51 AM, Richard Sandiford <rdsandiford@googlemail.com> wrote: > Richard Biener <richard.guenther@gmail.com> writes: >> On Tue, Apr 22, 2014 at 11:15 AM, Richard Sandiford >> <rdsandiford@googlemail.com> wrote: >>> Richard Biener <richard.guenther@gmail.com> writes: >>>> On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford >>>> <rdsandiford@googlemail.com> wrote: >>>>> Richard Biener <richard.guenther@gmail.com> writes: >>>>>> On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford >>>>>> <rdsandiford@googlemail.com> wrote: >>>>>>> wide-int fails to build libitm because of a bad interaction between: >>>>>>> >>>>>>> /* Keep the OI and XI modes from confusing the compiler into thinking >>>>>>> that these modes could actually be used for computation. They are >>>>>>> only holders for vectors during data movement. */ >>>>>>> #define MAX_BITSIZE_MODE_ANY_INT (128) >>>>>>> >>>>>>> and the memcpy folding code: >>>>>>> >>>>>>> /* Make sure we are not copying using a floating-point mode or >>>>>>> a type whose size possibly does not match its precision. */ >>>>>>> if (FLOAT_MODE_P (TYPE_MODE (desttype)) >>>>>>> || TREE_CODE (desttype) == BOOLEAN_TYPE >>>>>>> || TREE_CODE (desttype) == ENUMERAL_TYPE) >>>>>>> { >>>>>>> /* A more suitable int_mode_for_mode would return a vector >>>>>>> integer mode for a vector float mode or a integer complex >>>>>>> mode for a float complex mode if there isn't a regular >>>>>>> integer mode covering the mode of desttype. */ >>>>>>> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); >>>>>>> if (mode == BLKmode) >>>>>>> desttype = NULL_TREE; >>>>>>> else >>>>>>> desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), >>>>>>> 1); >>>>>>> } >>>>>>> if (FLOAT_MODE_P (TYPE_MODE (srctype)) >>>>>>> || TREE_CODE (srctype) == BOOLEAN_TYPE >>>>>>> || TREE_CODE (srctype) == ENUMERAL_TYPE) >>>>>>> { >>>>>>> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); >>>>>>> if (mode == BLKmode) >>>>>>> srctype = NULL_TREE; >>>>>>> else >>>>>>> srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), >>>>>>> 1); >>>>>>> } >>>>>>> >>>>>>> The failure occurs for complex long double, which we try to copy as >>>>>>> a 256-bit integer type (OImode). >>>>>>> >>>>>>> This patch tries to do what the comment suggests by introducing a new >>>>>>> form of int_mode_for_mode that replaces vector modes with vector modes >>>>>>> and complex modes with complex modes. The fallback case of using a >>>>>>> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above >>>>>>> 128 bits on x86_64. >>>>>>> >>>>>>> The question then is what to do about 128-bit types for i386. >>>>>>> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be >>>>>>> used for optimisation. However, gcc.target/i386/pr49168-1.c only passes >>>>>>> for -m32 -msse2 because we use int128_t to copy a float128_t. >>>>>>> >>>>>>> I handled that by allowing MODE_VECTOR_INT to be used instead of >>>>>>> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE, >>>>>>> even if the original type wasn't a vector. >>>>>> >>>>>> Hmm. Sounds reasonable unless there are very weird targets that >>>>>> cannot efficiently load/store vectors unaligned but can handle >>>>>> efficient load/store of unaligned scalars. >>>>> >>>>> Yeah, in general there's no guarantee that even int_mode_for_mode >>>>> will return a mode with the same alignment as the original. Callers >>>>> need to check that (like the memcpy folder does). >>>>> >>>>>>> It might be that other callers to int_mode_for_mode should use >>>>>>> the new function too, but I'll look at that separately. >>>>>>> >>>>>>> I used the attached testcase (with printfs added to gcc) to check that >>>>>>> the right modes and types were being chosen. The patch fixes the >>>>>>> complex float and complex double cases, since the integer type that we >>>>>>> previously picked had a larger alignment than the original complex type. >>>>>> >>>>>> As of complex int modes - are we positively sure that targets even >>>>>> try to do sth "optimal" for loads/stores of those? >>>>> >>>>> Complex modes usually aren't handled directly by .md patterns, >>>>> either int or float. They're really treated as a pair of values. >>>>> So IMO it still makes sense to fold this case. >>>>> >>>>>>> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype)) >>>>>>> is that vectors are copied as integer vectors if the target supports >>>>>>> them directly but are copied as float vectors otherwise, since in the >>>>>>> latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the >>>>>>> test are copied as vector floats and vector doubles both before and >>>>>>> after the patch. >>>>>> >>>>>> That wasn't intended ... the folding should have failed if we can't >>>>>> copy using an integer mode ... >>>>> >>>>> Does that mean that the fold give up if TYPE_MODE is BLKmode? >>>>> I can do that as a separate patch if so. >>>> >>>> Looking at the code again it should always choose an integer mode/type >>>> via setting desttype/srctype to NULL for BLKmode and >>>> >>>> if (!srctype) >>>> srctype = desttype; >>>> if (!desttype) >>>> desttype = srctype; >>>> if (!srctype) >>>> return NULL_TREE; >>>> >>>> no? Thus if we can't get a integer type for either src or dest then >>>> we fail. But we should never end up with srctype or desttype >>>> being a float mode. No? >>> >>> Right, they don't have a float _mode_, because the target doesn't >>> support 1024-bit modes. Like I say, they have BLKmode instead. >>> But they have a float type. >> >> Ok. I clearly used FLOAT_MODE_P (TYPE_MODE ()) because >> I assumed that only float-modes will eventually end up in FP >> registers (that's what the bug was about - we load into a FP register >> and that load can trigger a FP exception). > > Yeah. I just flagged it up because it wasn't obvious that native vector > float types need to be copied as vector ints while non-native ones don't. > A non-native vector float might be scalarised into individual float > operations and thus might end up using an FP register load. That's true - we should fail for that case. Richard. > Thanks, > Richard
Index: gcc/machmode.h =================================================================== --- gcc/machmode.h 2014-04-18 11:16:12.706092658 +0100 +++ gcc/machmode.h 2014-04-18 11:16:38.179299261 +0100 @@ -253,6 +253,8 @@ extern enum machine_mode smallest_mode_f extern enum machine_mode int_mode_for_mode (enum machine_mode); +extern enum machine_mode bitwise_mode_for_size (enum machine_mode); + /* Return a mode that is suitable for representing a vector, or BLKmode on failure. */ Index: gcc/stor-layout.h =================================================================== --- gcc/stor-layout.h 2014-04-18 11:16:12.707092667 +0100 +++ gcc/stor-layout.h 2014-04-18 11:16:12.830093665 +0100 @@ -98,6 +98,8 @@ extern tree make_unsigned_type (int); mode_for_size, but is passed a tree. */ extern enum machine_mode mode_for_size_tree (const_tree, enum mode_class, int); +extern tree bitwise_type_for_mode (enum machine_mode); + /* Given a VAR_DECL, PARM_DECL or RESULT_DECL, clears the results of a previous call to layout_decl and calls it again. */ extern void relayout_decl (tree); Index: gcc/stor-layout.c =================================================================== --- gcc/stor-layout.c 2014-04-18 11:16:12.707092667 +0100 +++ gcc/stor-layout.c 2014-04-18 16:08:20.033752312 +0100 @@ -403,6 +403,73 @@ int_mode_for_mode (enum machine_mode mod return mode; } +/* Find a mode that can be used for efficient bitwise operations on MODE. + Return BLKmode if no such mode exists. */ + +enum machine_mode +bitwise_mode_for_mode (enum machine_mode mode) +{ + /* Quick exit if we already have a suitable mode. */ + unsigned int bitsize = GET_MODE_BITSIZE (mode); + if (SCALAR_INT_MODE_P (mode) && bitsize <= MAX_FIXED_MODE_SIZE) + return mode; + + /* Reuse the sanity checks from int_mode_for_mode. */ + gcc_checking_assert ((int_mode_for_mode (mode), true)); + + /* Try to replace complex modes with complex modes. In general we + expect both components to be processed independently, so we only + care whether there is a register for the inner mode. */ + if (COMPLEX_MODE_P (mode)) + { + enum machine_mode trial = mode; + if (GET_MODE_CLASS (mode) != MODE_COMPLEX_INT) + trial = mode_for_size (bitsize, MODE_COMPLEX_INT, false); + if (trial != BLKmode + && have_regs_of_mode[GET_MODE_INNER (trial)]) + return trial; + } + + /* Try to replace vector modes with vector modes. Also try using vector + modes if an integer mode would be too big. */ + if (VECTOR_MODE_P (mode) || bitsize > MAX_FIXED_MODE_SIZE) + { + enum machine_mode trial = mode; + if (GET_MODE_CLASS (mode) != MODE_VECTOR_INT) + trial = mode_for_size (bitsize, MODE_VECTOR_INT, 0); + if (trial != BLKmode + && have_regs_of_mode[trial] + && targetm.vector_mode_supported_p (trial)) + return trial; + } + + /* Otherwise fall back on integers while honoring MAX_FIXED_MODE_SIZE. */ + return mode_for_size (bitsize, MODE_INT, true); +} + +/* Find a type that can be used for efficient bitwise operations on MODE. + Return null if no such mode exists. */ + +tree +bitwise_type_for_mode (enum machine_mode mode) +{ + mode = bitwise_mode_for_mode (mode); + if (mode == BLKmode) + return NULL_TREE; + + unsigned int inner_size = GET_MODE_UNIT_BITSIZE (mode); + tree inner_type = build_nonstandard_integer_type (inner_size, true); + + if (VECTOR_MODE_P (mode)) + return build_vector_type_for_mode (inner_type, mode); + + if (COMPLEX_MODE_P (mode)) + return build_complex_type (inner_type); + + gcc_checking_assert (GET_MODE_INNER (mode) == VOIDmode); + return inner_type; +} + /* Find a mode that is suitable for representing a vector with NUNITS elements of mode INNERMODE. Returns BLKmode if there is no suitable mode. */ Index: gcc/builtins.c =================================================================== --- gcc/builtins.c 2014-04-18 11:16:12.676092416 +0100 +++ gcc/builtins.c 2014-04-18 11:22:50.107315234 +0100 @@ -8921,29 +8921,11 @@ fold_builtin_memory_op (location_t loc, if (FLOAT_MODE_P (TYPE_MODE (desttype)) || TREE_CODE (desttype) == BOOLEAN_TYPE || TREE_CODE (desttype) == ENUMERAL_TYPE) - { - /* A more suitable int_mode_for_mode would return a vector - integer mode for a vector float mode or a integer complex - mode for a float complex mode if there isn't a regular - integer mode covering the mode of desttype. */ - enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); - if (mode == BLKmode) - desttype = NULL_TREE; - else - desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), - 1); - } + desttype = bitwise_type_for_mode (TYPE_MODE (desttype)); if (FLOAT_MODE_P (TYPE_MODE (srctype)) || TREE_CODE (srctype) == BOOLEAN_TYPE || TREE_CODE (srctype) == ENUMERAL_TYPE) - { - enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); - if (mode == BLKmode) - srctype = NULL_TREE; - else - srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), - 1); - } + srctype = bitwise_type_for_mode (TYPE_MODE (srctype)); if (!srctype) srctype = desttype; if (!desttype) Index: gcc/testsuite/gcc.dg/memcpy-5.c =================================================================== --- /dev/null 2014-04-15 08:10:27.294524132 +0100 +++ gcc/testsuite/gcc.dg/memcpy-5.c 2014-04-18 11:21:52.036844396 +0100 @@ -0,0 +1,27 @@ +/* { dg-options "-O -fdump-tree-optimized" } */ + +extern void *memcpy (void *, const void *, __SIZE_TYPE__); + +#define TEST(NAME, TYPE) \ + TYPE NAME##x; \ + char NAME##y[sizeof (NAME##x)] __attribute__((aligned (__alignof__ (NAME##x)))); \ + void NAME (void) { memcpy (&NAME##x, &NAME##y, sizeof (NAME##x)); } + +TEST (f, float); +TEST (d, double); +TEST (ld, long double); +TEST (cf, _Complex float); +TEST (cd, _Complex double); +TEST (cld, _Complex long double); +TEST (d8f, float __attribute__((vector_size (8)))); +TEST (d16f, float __attribute__((vector_size (16)))); +TEST (d32f, float __attribute__((vector_size (32)))); +TEST (d64f, float __attribute__((vector_size (64)))); +TEST (d128f, float __attribute__((vector_size (128)))); +TEST (d16d, double __attribute__((vector_size (16)))); +TEST (d32d, double __attribute__((vector_size (32)))); +TEST (d64d, double __attribute__((vector_size (64)))); +TEST (d128d, double __attribute__((vector_size (128)))); + +/* { dg-final { scan-tree-dump-not "memcpy" "optimized" { target x86_64-*-* } } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */