Message ID | alpine.LNX.2.20.13.1901201255380.13981@monopod.intra.ispras.ru |
---|---|
State | New |
Headers | show |
Series | doc: showcase a "union of vectors" pattern (PR 88698) | expand |
On Sun, Jan 20, 2019 at 11:40 AM Alexander Monakov <amonakov@ispras.ru> wrote: > > Hi, > > PR 88698 ("Relax generic vector conversions") is asking to be more permissive > about type compatibility for generic vector types. While I don't think that's > a good idea per se, from a "compiler user" point of view I also see how writing > code for SSE/AVX using both generic vectors and x86 intrinsics is inconvenient > because all intrinsics use the same type such as __m128i, regardless of whether > they operate on 8/16/32/64-bit signed/unsigned lanes. > > My solution to that is to introduce a union type that holds both the type > accepted by intrinsics and generic vector types used by my code. It also has > an interesting effect that each statement spells out the lane type. > > Is it appropriate to showcase this approach in our documentation, as done in > this patch? What to do with the PR, should I leave it open? I think it's appropriate to have this kind of hints in the documentation. I'm not sure whether it's permitted to use this in C++ or not (and whether the C usage is now officially sanctioned other than via a non-normative footnote) - but at least GCC will guarantee that it works. Might be interesting to show how to do argument marshalling with the same union. Richard. > PR c/88698 > * doc/extend.texi (Vector Extensions): Add an example of using vector > types together with x86 intrinsics. > > --- a/gcc/doc/extend.texi > +++ b/gcc/doc/extend.texi > @@ -10632,6 +10632,40 @@ v4si g = __builtin_convertvector (f, v4si); /* g is @{1,-2,3,7@} */ > v4si h = __builtin_convertvector (c, v4si); /* h is @{1,5,0,10@} */ > @end smallexample > > +@cindex vector types, using with x86 intrinsics > +Sometimes it is desirable to write code using a mix of generic vector > +operations (for clarity) and machine-specific vector intrinsics (to > +access vector instructions that are not exposed via generic built-ins). > +On x86, intrinsic functions for integer vectors typically use the same > +vector type @code{__m128i} irrespective of how they interpret the vector, > +making it necessary to cast their arguments and return values from/to > +other vector types. In C, you can make use of a @code{union} type: > +@c In C++ such type punning via a union is not allowed by the language > +@smallexample > +#include <immintrin.h> > + > +typedef unsigned char u8x16 __attribute__ ((vector_size (16))); > +typedef unsigned int u32x4 __attribute__ ((vector_size (16))); > + > +typedef union @{ > + u8x16 u8; > + u32x4 u32; > + __m128i mm; > +@} v128; > +@end smallexample > + > +@noindent > +for variables that can be used with both built-in operations and x86 > +intrinsics: > + > +@smallexample > +v128 x, y = @{ 0 @}; > +memcpy (&x, ptr, sizeof x); > +y.u8 += 0x80; > +x.mm = _mm_adds_epu8 (x.mm, y.mm); > +x.u32 &= 0xffffff; > +@end smallexample > + > @node Offsetof > @section Support for @code{offsetof} > @findex __builtin_offsetof
On Mon, 21 Jan 2019, Richard Biener wrote: > I'm not sure whether it's permitted to use this in C++ or not (and whether > the C usage is now officially sanctioned other than via a non-normative > footnote) - but at least GCC will guarantee that it works. My impression is that, via the "active member" wording, C++ standard says clearly this is UB. I assume on the GCC side the intention is to treat it like in C (for POD types only?..), but I don't want to go there in the "Vector Extensions" section. Hence my patch limits the suggestion to C. I think in C++ it's a bit less of an issue anyway, because people can use operator overloading to achieve a similar end result, with some extra legwork. Do you want me to change the text somehow? > Might be interesting to show how to do argument marshalling with the > same union. Sorry, I don't see what you mean here; can you give an example or elaborate? Thanks! Alexander
On Mon, Jan 21, 2019 at 10:14 AM Alexander Monakov <amonakov@ispras.ru> wrote: > > On Mon, 21 Jan 2019, Richard Biener wrote: > > > I'm not sure whether it's permitted to use this in C++ or not (and whether > > the C usage is now officially sanctioned other than via a non-normative > > footnote) - but at least GCC will guarantee that it works. > > My impression is that, via the "active member" wording, C++ standard says > clearly this is UB. I assume on the GCC side the intention is to treat it > like in C (for POD types only?..), but I don't want to go there in the > "Vector Extensions" section. Hence my patch limits the suggestion to C. > > I think in C++ it's a bit less of an issue anyway, because people can use > operator overloading to achieve a similar end result, with some extra legwork. > > Do you want me to change the text somehow? No, it was just a note from my side. > > Might be interesting to show how to do argument marshalling with the > > same union. > > Sorry, I don't see what you mean here; can you give an example or elaborate? when you call a function with v4si signature but want to pass it a result from a __mm intrinsic you'd need a temporary union to do the marshalling. I think you can use a union argument in the function itself and you can elide the temporary then if you use transparent_union. But I didn't try. Richard. > Thanks! > Alexander
On Mon, 21 Jan 2019, Richard Biener wrote: > > Sorry, I don't see what you mean here; can you give an example or elaborate? > > when you call a function with v4si signature but want to pass it a result > from a __mm intrinsic you'd need a temporary union to do the marshalling. > I think you can use a union argument in the function itself and you can > elide the temporary then if you use transparent_union. But I didn't try. Ah, I see now. I agree transparent_union ought to work, but today both GCC and Clang will reject such attempt; I've filed PR 88955 for the GCC issue. So unfortunately such code would still need a cast or an unnamed temporary, which may be relatively obvious to the reader. Alexander
On Mon, 21 Jan 2019, Alexander Monakov wrote: > Ah, I see now. I agree transparent_union ought to work, but today both GCC > and Clang will reject such attempt; I've filed PR 88955 for the GCC issue. > > So unfortunately such code would still need a cast or an unnamed temporary, > which may be relatively obvious to the reader. In the end it may be preferable to have it anyway, so here's a revised version with an extra line in the example for passing arguments via compound literals. Checked with 'make html', OK to apply? PR c/88698 * doc/extend.texi (Vector Extensions): Add an example of using vector types together with x86 intrinsics. diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 95d22ac1e3c..34a76927b12 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -10632,6 +10632,47 @@ v4si g = __builtin_convertvector (f, v4si); /* g is @{1,-2,3,7@} */ v4si h = __builtin_convertvector (c, v4si); /* h is @{1,5,0,10@} */ @end smallexample +@cindex vector types, using with x86 intrinsics +Sometimes it is desirable to write code using a mix of generic vector +operations (for clarity) and machine-specific vector intrinsics (to +access vector instructions that are not exposed via generic built-ins). +On x86, intrinsic functions for integer vectors typically use the same +vector type @code{__m128i} irrespective of how they interpret the vector, +making it necessary to cast their arguments and return values from/to +other vector types. In C, you can make use of a @code{union} type: +@c In C++ such type punning via a union is not allowed by the language +@smallexample +#include <immintrin.h> + +typedef unsigned char u8x16 __attribute__ ((vector_size (16))); +typedef unsigned int u32x4 __attribute__ ((vector_size (16))); + +typedef union @{ + __m128i mm; + u8x16 u8; + u32x4 u32; +@} v128; +@end smallexample + +@noindent +for variables that can be used with both built-in operators and x86 +intrinsics: + +@smallexample +v128 x, y = @{ 0 @}; +memcpy (&x, ptr, sizeof x); +y.u8 += 0x80; +x.mm = _mm_adds_epu8 (x.mm, y.mm); +x.u32 &= 0xffffff; + +/* Instead of a variable, a compound literal may be used to pass the + return value of an intrinsic call to a function expecting the union: */ +v128 foo (v128); +x = foo ((v128) @{_mm_adds_epu8 (x.mm, y.mm)@}); +@c This could be done implicitly with __attribute__((transparent_union)), +@c but GCC does not accept it for unions of vector types (PR 88955). +@end smallexample + @node Offsetof @section Support for @code{offsetof} @findex __builtin_offsetof
On Wed, Jan 30, 2019 at 3:42 PM Alexander Monakov <amonakov@ispras.ru> wrote: > > On Mon, 21 Jan 2019, Alexander Monakov wrote: > > > Ah, I see now. I agree transparent_union ought to work, but today both GCC > > and Clang will reject such attempt; I've filed PR 88955 for the GCC issue. > > > > So unfortunately such code would still need a cast or an unnamed temporary, > > which may be relatively obvious to the reader. > > In the end it may be preferable to have it anyway, so here's a revised version > with an extra line in the example for passing arguments via compound literals. > > Checked with 'make html', OK to apply? OK unless somebody objects within a reasonable time frame. Richard. > PR c/88698 > * doc/extend.texi (Vector Extensions): Add an example of using vector > types together with x86 intrinsics. > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > index 95d22ac1e3c..34a76927b12 100644 > --- a/gcc/doc/extend.texi > +++ b/gcc/doc/extend.texi > @@ -10632,6 +10632,47 @@ v4si g = __builtin_convertvector (f, v4si); /* g is @{1,-2,3,7@} */ > v4si h = __builtin_convertvector (c, v4si); /* h is @{1,5,0,10@} */ > @end smallexample > > +@cindex vector types, using with x86 intrinsics > +Sometimes it is desirable to write code using a mix of generic vector > +operations (for clarity) and machine-specific vector intrinsics (to > +access vector instructions that are not exposed via generic built-ins). > +On x86, intrinsic functions for integer vectors typically use the same > +vector type @code{__m128i} irrespective of how they interpret the vector, > +making it necessary to cast their arguments and return values from/to > +other vector types. In C, you can make use of a @code{union} type: > +@c In C++ such type punning via a union is not allowed by the language > +@smallexample > +#include <immintrin.h> > + > +typedef unsigned char u8x16 __attribute__ ((vector_size (16))); > +typedef unsigned int u32x4 __attribute__ ((vector_size (16))); > + > +typedef union @{ > + __m128i mm; > + u8x16 u8; > + u32x4 u32; > +@} v128; > +@end smallexample > + > +@noindent > +for variables that can be used with both built-in operators and x86 > +intrinsics: > + > +@smallexample > +v128 x, y = @{ 0 @}; > +memcpy (&x, ptr, sizeof x); > +y.u8 += 0x80; > +x.mm = _mm_adds_epu8 (x.mm, y.mm); > +x.u32 &= 0xffffff; > + > +/* Instead of a variable, a compound literal may be used to pass the > + return value of an intrinsic call to a function expecting the union: */ > +v128 foo (v128); > +x = foo ((v128) @{_mm_adds_epu8 (x.mm, y.mm)@}); > +@c This could be done implicitly with __attribute__((transparent_union)), > +@c but GCC does not accept it for unions of vector types (PR 88955). > +@end smallexample > + > @node Offsetof > @section Support for @code{offsetof} > @findex __builtin_offsetof
--- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -10632,6 +10632,40 @@ v4si g = __builtin_convertvector (f, v4si); /* g is @{1,-2,3,7@} */ v4si h = __builtin_convertvector (c, v4si); /* h is @{1,5,0,10@} */ @end smallexample +@cindex vector types, using with x86 intrinsics +Sometimes it is desirable to write code using a mix of generic vector +operations (for clarity) and machine-specific vector intrinsics (to +access vector instructions that are not exposed via generic built-ins). +On x86, intrinsic functions for integer vectors typically use the same +vector type @code{__m128i} irrespective of how they interpret the vector, +making it necessary to cast their arguments and return values from/to +other vector types. In C, you can make use of a @code{union} type: +@c In C++ such type punning via a union is not allowed by the language +@smallexample +#include <immintrin.h> + +typedef unsigned char u8x16 __attribute__ ((vector_size (16))); +typedef unsigned int u32x4 __attribute__ ((vector_size (16))); + +typedef union @{ + u8x16 u8; + u32x4 u32; + __m128i mm; +@} v128; +@end smallexample + +@noindent +for variables that can be used with both built-in operations and x86 +intrinsics: + +@smallexample +v128 x, y = @{ 0 @}; +memcpy (&x, ptr, sizeof x); +y.u8 += 0x80; +x.mm = _mm_adds_epu8 (x.mm, y.mm); +x.u32 &= 0xffffff; +@end smallexample + @node Offsetof @section Support for @code{offsetof} @findex __builtin_offsetof