Message ID | 8738idzj52.fsf@x240.local.i-did-not-set--mail-host-address--so-tickle-me |
---|---|
State | New |
Hello Ulrich,

On 19 Mar 22:41, Ulrich Drepper wrote:
> Another set of functions missing are those to set all elements of a
> 512-bit vector to the same float or double value. I think the patch
> below uses the optimal code sequence for that. The patch requires the
> previous patch introducing _mm*_undefined_*.
>
>
> 2014-03-19  Ulrich Drepper  <drepper@gmail.com>
>
>     * config/i386/avx512fintrin.h: Define _mm512_set1_ps and
>     _mm512_set1_pd.

Your patch is correct IMHO, but maybe it is worth adding all the missing
`mm512_set1*' stuff?

According to trunk and [1], we're still missing (besides those you mentioned)
the _mm512_set1_epi16 and _mm512_set1_epi8 broadcasts.

[1] - http://software.intel.com/sites/landingpage/IntrinsicsGuide/

--
Thanks, K
On Mon, Mar 24, 2014 at 1:50 AM, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:
> Your patch is correct IMHO, but maybe it is worth adding all the missing
> `mm512_set1*' stuff?
>
> According to trunk and [1], we're still missing (besides those you mentioned)
> the _mm512_set1_epi16 and _mm512_set1_epi8 broadcasts.

Yes, more are missing, but I think those will need new builtins. The
_ps and _pd don't require additional instructions.

_mm512_set1_epi16 might have to map to vpbroadcastw. _mm512_set1_epi8
might have to map to vpbroadcastb. I haven't seen a way to generate
those instructions if needed, and so this work was out of scope for now
due to time constraints. I agree, they should be added as quickly as
possible to avoid releasing headers with incomplete APIs.

What is the verdict on checking these changes in? Too late for the
next release?
On Mon, Mar 24, 2014 at 12:13 PM, Ulrich Drepper <drepper@gmail.com> wrote:
> On Mon, Mar 24, 2014 at 1:50 AM, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:
>> Your patch is correct IMHO, but maybe it is worth adding all the missing
>> `mm512_set1*' stuff?
>>
>> According to trunk and [1], we're still missing (besides those you mentioned)
>> the _mm512_set1_epi16 and _mm512_set1_epi8 broadcasts.
>
> Yes, more are missing, but I think those will need new builtins. The
> _ps and _pd don't require additional instructions.
>
> _mm512_set1_epi16 might have to map to vpbroadcastw. _mm512_set1_epi8
> might have to map to vpbroadcastb. I haven't seen a way to generate
> those instructions if needed, and so this work was out of scope for now
> due to time constraints. I agree, they should be added as quickly as
> possible to avoid releasing headers with incomplete APIs.
>
> What is the verdict on checking these changes in? Too late for the
> next release?

These kinds of changes can also be made for 4.9.1, for example.

Richard.
diff -u b/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
--- b/gcc/config/i386/avx512fintrin.h
+++ b/gcc/config/i386/avx512fintrin.h
@@ -130,6 +130,28 @@
   return __Y;
 }
 
+extern __inline __m512d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_set1_pd (double __A)
+{
+  return (__m512d) __builtin_ia32_broadcastsd512 (__extension__
+                                                  (__v2df) { __A, },
+                                                  (__v8df)
+                                                  _mm512_undefined_pd (),
+                                                  (__mmask8) -1);
+}
+
+extern __inline __m512
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_set1_ps (float __A)
+{
+  return (__m512) __builtin_ia32_broadcastss512 (__extension__
+                                                 (__v4sf) { __A, },
+                                                 (__v16sf)
+                                                 _mm512_undefined_ps (),
+                                                 (__mmask16) -1);
+}
+
 extern __inline __m512
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_setzero_ps (void)
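
For readers following the patch, here is a rough scalar model of why passing `_mm512_undefined_ps ()` as the second operand looks safe when the mask is `(__mmask16) -1`. It assumes the builtin behaves as a merge-masked broadcast: lanes whose mask bit is set receive the scalar, and the rest keep the pass-through value. The name `model_broadcast_ps` and the `LANES` macro are hypothetical and exist only for this illustration. They are not part of GCC or of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 16  /* a 512-bit vector holds 16 single-precision floats */

/* Hypothetical scalar model (not GCC code) of a merge-masked broadcast:
   lane i of dst receives `a` when bit i of `mask` is set; otherwise it
   keeps the corresponding lane of `passthru`.  */
static void
model_broadcast_ps (float dst[LANES], float a,
                    const float passthru[LANES], uint16_t mask)
{
  assert (dst != NULL && passthru != NULL);
  for (size_t i = 0; i < LANES; i++)
    dst[i] = ((mask >> i) & 1u) ? a : passthru[i];
}
```

With every mask bit set, as in the patch, no lane of the pass-through operand reaches the result, so its contents do not matter under this model. With a partial mask, the pass-through lanes would be visible in the output.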