[AArch64_be] Fix vtbl[34] and vtbx4

Message ID CAKdteOYuOzJgQdn29QJrnUParfk2_bvZDUepivkfzUteG5eHBA@mail.gmail.com
State New

Commit Message

Christophe Lyon Oct. 13, 2015, 1:05 p.m. UTC
On 12 October 2015 at 15:30, James Greenhalgh <james.greenhalgh@arm.com> wrote:
> On Fri, Oct 09, 2015 at 05:16:05PM +0100, Christophe Lyon wrote:
>> On 8 October 2015 at 11:12, James Greenhalgh <james.greenhalgh@arm.com> wrote:
>> > On Wed, Oct 07, 2015 at 09:07:30PM +0100, Christophe Lyon wrote:
>> >> On 7 October 2015 at 17:09, James Greenhalgh <james.greenhalgh@arm.com> wrote:
>> >> > On Tue, Sep 15, 2015 at 05:25:25PM +0100, Christophe Lyon wrote:
>> >> >
>> >> > Why do we want this for vtbx4 rather than putting out a VTBX instruction
>> >> > directly (as in the inline asm versions you replace)?
>> >> >
>> >> I just followed the pattern used for vtbx3.
>> >>
>> >> > This sequence does make sense for vtbx3.
>> >> In fact, I don't see why vtbx3 and vtbx4 should be different?
>> >
>> > The difference between TBL and TBX is in their handling of a request to
>> > select an out-of-range value. For TBL this returns zero, for TBX this
>> > returns the value which was already in the destination register.
>> >
>> > Because the byte-vectors used by the TBX instruction in AArch64 are 128-bit
>> > (so two of them together allow selecting elements in the range 0-31), and
>> > vtbx3 needs to emulate the AArch32 behaviour of picking elements from 3x64-bit
>> > vectors (allowing elements in the range 0-23), we need to manually check for
>> > values which would have been out-of-range on AArch32, but are not out
>> > of range for AArch64 and handle them appropriately. For vtbx4 on the other
>> > hand, 2x128-bit registers give the range 0..31 and 4x64-bit registers give
>> > the range 0..31, so we don't need the special masked handling.
>> >
>> > You can find the suggested instruction sequences for the Neon intrinsics
>> > in this document:
>> >
>> >   http://infocenter.arm.com/help/topic/com.arm.doc.ihi0073a/IHI0073A_arm_neon_intrinsics_ref.pdf
>> >
>>
>> Hi James,
>>
>> Please find attached an updated version which hopefully addresses your comments.
>> Tested on aarch64-none-elf and aarch64_be-none-elf using the Foundation Model.
>>
>> OK?
>
> Looks good to me,
>
> Thanks,
> James
>
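
The TBL/TBX distinction James describes in the quoted text above can be
modelled in a few lines. This is only a sketch of the semantics under stated
assumptions, not the sequence GCC emits: the helper names (tbl, tbx,
vtbx3_emulated, vtbx3_naive) are hypothetical, vectors are plain Python lists
of byte values, and the 24-entry AArch32 table is assumed to be zero-padded to
32 entries when packed into two 128-bit registers.

```python
def tbl(table, indices):
    """TBL semantics: an out-of-range index selects zero."""
    return [table[i] if i < len(table) else 0 for i in indices]


def tbx(dest, table, indices):
    """TBX semantics: an out-of-range index keeps the destination byte."""
    return [table[i] if i < len(table) else d for d, i in zip(dest, indices)]


def vtbx3_naive(dest, tab3, indices):
    """Wrong for vtbx3: a plain 32-entry TBX treats 24..31 as in range."""
    padded = tab3 + [0] * 8  # assumption: padding bytes are zero
    return tbx(dest, padded, indices)


def vtbx3_emulated(dest, tab3, indices):
    """Look up in the padded table, then restore dest where index >= 24."""
    padded = tab3 + [0] * 8
    looked_up = tbl(padded, indices)
    return [v if i < 24 else d for v, d, i in zip(looked_up, dest, indices)]


tab3 = list(range(100, 124))             # 3 x 8 bytes: valid indices 0..23
tab4 = list(range(100, 132))             # 4 x 8 bytes: valid indices 0..31
dest = [1, 2, 3, 4, 5, 6, 7, 8]
idx = [0, 23, 24, 31, 32, 255, 5, 10]

reference3 = tbx(dest, tab3, idx)        # AArch32-style 24-entry behaviour
masked3 = vtbx3_emulated(dest, tab3, idx)
naive3 = vtbx3_naive(dest, tab3, idx)
direct4 = tbx(dest, tab4, idx)           # vtbx4: 32 entries either way
```

With these inputs the masked version agrees with the 24-entry reference,
while the naive version returns the padding bytes for indices 24 and 31.
For the 32-entry table no index falls in that gap, which is why vtbx4 can
use TBX directly.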

I committed this as r228716, and noticed later that
gcc.target/aarch64/table-intrinsics.c failed because of this patch.

This is because that testcase scans the assembly for 'tbl v' or
'tbx v', but since I replaced some asm statements, the space is now
a tab.

I plan to commit this (probably obvious?):
2015-10-13  Christophe Lyon  <christophe.lyon@linaro.org>

	* gcc/testsuite/gcc.target/aarch64/table-intrinsics.c: Fix regexp
	after r228716 (Fix vtbl[34] and vtbx4).

Comments

James Greenhalgh Oct. 13, 2015, 1:07 p.m. UTC | #1
On Tue, Oct 13, 2015 at 02:05:01PM +0100, Christophe Lyon wrote:
> I committed this as r228716, and noticed later that
> gcc.target/aarch64/table-intrinsics.c failed because of this patch.
> 
> This is because that testcase scans the assembly for 'tbl v' or
> 'tbx v', but since I replaced some asm statements, the space is now
> a tab.
> 
> I plan to commit this (probably obvious?):

> 2015-10-13  Christophe Lyon  <christophe.lyon@linaro.org>
> 
> 	* gcc/testsuite/gcc.target/aarch64/table-intrinsics.c: Fix regexp
> 	after r228716 (Fix vtbl[34] and vtbx4).

Bad luck. This is fine (and yes, obvious).

Thanks,
James

> Index: gcc/testsuite/gcc.target/aarch64/table-intrinsics.c
> ===================================================================
> --- gcc/testsuite/gcc.target/aarch64/table-intrinsics.c	(revision 228759)
> +++ gcc/testsuite/gcc.target/aarch64/table-intrinsics.c	(working copy)
> @@ -435,5 +435,5 @@
>    return vqtbx4q_p8 (r, tab, idx);
>  }
>  
> -/* { dg-final { scan-assembler-times "tbl v" 42} }  */
> -/* { dg-final { scan-assembler-times "tbx v" 30} }  */
> +/* { dg-final { scan-assembler-times "tbl\[ |\t\]*v" 42} }  */
> +/* { dg-final { scan-assembler-times "tbx\[ |\t\]*v" 30} }  */

Patch

Index: gcc/testsuite/gcc.target/aarch64/table-intrinsics.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/table-intrinsics.c	(revision 228759)
+++ gcc/testsuite/gcc.target/aarch64/table-intrinsics.c	(working copy)
@@ -435,5 +435,5 @@ 
   return vqtbx4q_p8 (r, tab, idx);
 }
 
-/* { dg-final { scan-assembler-times "tbl v" 42} }  */
-/* { dg-final { scan-assembler-times "tbx v" 30} }  */
+/* { dg-final { scan-assembler-times "tbl\[ |\t\]*v" 42} }  */
+/* { dg-final { scan-assembler-times "tbx\[ |\t\]*v" 30} }  */
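
The effect of the regexp change can be checked outside the testsuite. The
sketch below uses Python's re module as a stand-in for the Tcl regexp engine
that DejaGnu uses, which should behave the same for a pattern this simple.
The two assembly lines are hypothetical samples, not output copied from GCC,
and count_matches is a hypothetical helper that mimics how
scan-assembler-times counts occurrences.

```python
import re

# Pattern before the fix: requires exactly one space after the mnemonic.
OLD_PATTERN = "tbl v"

# Pattern after the fix. In the directive it is written "tbl\[ |\t\]*v";
# after Tcl backslash processing that is the bracket expression below.
# Note that '|' inside brackets is a literal pipe, not alternation.
NEW_PATTERN = "tbl[ |\t]*v"


def count_matches(pattern, asm_text):
    """Count non-overlapping matches of pattern in the assembly text."""
    return len(re.findall(pattern, asm_text))


# Hypothetical sample lines: space-separated (as an inline asm template
# might print it) and tab-separated (as a compiler pattern might print it).
asm_with_space = "\ttbl v0.8b, {v16.16b}, v1.8b\n"
asm_with_tab = "\ttbl\tv0.8b, {v16.16b}, v1.8b\n"

old_space = count_matches(OLD_PATTERN, asm_with_space)
old_tab = count_matches(OLD_PATTERN, asm_with_tab)
new_space = count_matches(NEW_PATTERN, asm_with_space)
new_tab = count_matches(NEW_PATTERN, asm_with_tab)
```

The old pattern misses the tab-separated form, while the new one counts
both. Because of the '*' quantifier and the literal '|' in the bracket
expression, the new pattern is slightly more permissive than strictly
needed, which is harmless for this test.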