[i386]: Fix PR target/47840, [4.4/4.5/4.6 Regression] incorrect _mm256_insert_epi{32,64} implementations

Submitter Uros Bizjak
Date Feb. 21, 2011, 7:57 p.m.
Message ID <AANLkTi=6vwm4u+ZYy8o=hvoSxgQ2m+72uBABbbJPrThA@mail.gmail.com>
Permalink /patch/83870/
State New

Comments

Uros Bizjak - Feb. 21, 2011, 7:57 p.m.
Hello!

The attached patch fixes a typo in the implementations of
_mm256_insert_epi32 and _mm256_insert_epi64: both called
_mm_insert_epi16 instead of the insert intrinsic that matches their
element width.

2011-02-21  Uros Bizjak  <ubizjak@gmail.com>

	PR target/47840

	* config/i386/avxintrin.h (_mm256_insert_epi32): Use _mm_insert_epi32.
	(_mm256_insert_epi64): Use _mm_insert_epi64.

The patch was tested on x86_64-pc-linux-gnu {,-m32} and will be
committed to all release branches.

Uros.
H.J. Lu - Feb. 21, 2011, 8 p.m.
On Mon, Feb 21, 2011 at 11:57 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> Hello!
>
> Attached patch fixes a typo in the implementations of
> _mm256_insert_epi32 and _mm256_insert_epi64.
>
> 2011-02-21  Uros Bizjak  <ubizjak@gmail.com>
>
>        PR target/47840
>
>        * config/i386/avxintrin.h (_mm256_insert_epi32): Use _mm_insert_epi32.
>        (_mm256_insert_epi64): Use _mm_insert_epi64.
>
> Patch was tested on x86_64-pc-linux-gnu  {,-m32} and will be committed
> to all release branches.
>

Can we add a few testcases?

Thanks.
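
As an illustration of what such a testcase might check, here is a minimal sketch of a scalar reference for the 64-bit insert. It is not the testcase that went into GCC; the helper names (ref_insert_epi64, check_insert_epi64) are hypothetical. A real test would need -mavx and AVX hardware, and would first store the intrinsic's result into an array (for example with _mm256_storeu_si256) before comparing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar reference: copy the four 64-bit elements and
   replace element n (0..3) with d. */
static void ref_insert_epi64(int64_t out[4], const int64_t in[4],
                             int64_t d, int n)
{
    memcpy(out, in, 4 * sizeof in[0]);
    out[n] = d;
}

/* Returns 1 when got equals in with element n replaced by d and all
   other elements untouched, 0 otherwise. */
static int check_insert_epi64(const int64_t got[4], const int64_t in[4],
                              int64_t d, int n)
{
    int64_t want[4];
    ref_insert_epi64(want, in, d, n);
    return memcmp(got, want, sizeof want) == 0;
}

static void demo_check_insert_epi64(void)
{
    const int64_t in[4] = { 1, 2, 3, 4 };
    /* Expected result of inserting 0x0123456789abcdef at index 2. */
    const int64_t fixed[4] = { 1, 2, 0x0123456789abcdefLL, 4 };
    /* What a 16-bit insert would leave behind for the same input:
       only the low 16 bits of element 2 are replaced. */
    const int64_t prepatch[4] = { 1, 2, 0xcdef, 4 };

    assert(check_insert_epi64(fixed, in, 0x0123456789abcdefLL, 2));
    assert(!check_insert_epi64(prepatch, in, 0x0123456789abcdefLL, 2));
}
```

Checking every index with a value that has distinct bytes in each position, as above, would catch both a wrong element width and a wrong slot.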

Patch

Index: avxintrin.h
===================================================================
--- avxintrin.h	(revision 170367)
+++ avxintrin.h	(working copy)
@@ -737,7 +737,7 @@ 
 _mm256_insert_epi32 (__m256i __X, int __D, int const __N)
 {
   __m128i __Y = _mm256_extractf128_si256 (__X, __N >> 2);
-  __Y = _mm_insert_epi16 (__Y, __D, __N % 4);
+  __Y = _mm_insert_epi32 (__Y, __D, __N % 4);
   return _mm256_insertf128_si256 (__X, __Y, __N >> 2);
 }

@@ -762,7 +762,7 @@ 
 _mm256_insert_epi64 (__m256i __X, int __D, int const __N)
 {
   __m128i __Y = _mm256_extractf128_si256 (__X, __N >> 1);
-  __Y = _mm_insert_epi16 (__Y, __D, __N % 2);
+  __Y = _mm_insert_epi64 (__Y, __D, __N % 2);
   return _mm256_insertf128_si256 (__X, __Y, __N >> 1);
 }
 #endif
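
To make the effect of the first hunk concrete, the following is a scalar sketch of the index arithmetic, not the header code itself. The type vec256_model and both functions are hypothetical stand-ins, assuming the usual little-endian element layout in which 16-bit slot 1 of a 128-bit half is the upper half of 32-bit element 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for __m256i: eight 32-bit elements,
   element 0 being the least significant. */
typedef struct { uint32_t e[8]; } vec256_model;

/* Models the patched code: n >> 2 selects the 128-bit half and
   n % 4 selects the 32-bit element inside that half. */
static vec256_model model_insert_epi32_fixed(vec256_model x, int d, int n)
{
    x.e[(n >> 2) * 4 + (n % 4)] = (uint32_t)d;
    return x;
}

/* Models the pre-patch code: same half selection, but a 16-bit insert,
   so n % 4 indexes 16-bit slots and only the low 16 bits of d land. */
static vec256_model model_insert_epi32_buggy(vec256_model x, int d, int n)
{
    int slot16 = (n >> 2) * 8 + (n % 4);   /* 16-bit slot, 0..15 */
    int elem = slot16 / 2;                 /* 32-bit element holding it */
    int shift = (slot16 % 2) * 16;         /* low or high half of element */
    uint32_t mask = (uint32_t)0xFFFF << shift;

    x.e[elem] = (x.e[elem] & ~mask) | (((uint32_t)d & 0xFFFF) << shift);
    return x;
}

static void demo_insert_epi32(void)
{
    vec256_model zero = { { 0 } };
    vec256_model good = model_insert_epi32_fixed(zero, 0x11223344, 5);
    vec256_model bad = model_insert_epi32_buggy(zero, 0x11223344, 5);

    assert(good.e[5] == 0x11223344u);  /* full value in element 5 */
    assert(bad.e[5] == 0);             /* target element never written */
    assert(bad.e[4] == 0x33440000u);   /* truncated value in a neighbour */
}
```

The same reasoning applies to the second hunk with n >> 1 and n % 2: the half selection was already correct, and only the width of the insert inside the half was wrong.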