From patchwork Fri Feb 21 16:30:41 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tejas Belagod
X-Patchwork-Id: 322945
Message-ID: <53077F31.8070003@arm.com>
Date: Fri, 21 Feb 2014 16:30:41 +0000
From: Tejas Belagod
To: "gcc-patches@gcc.gnu.org"
CC: Marcus Shawcroft
Subject: [Patch, AArch64] Fix shuffle for big-endian.

Hi,

When a shuffle takes more than one input vector, on NEON we end up with a
'mixed-endian' format in the register list that TBL operates on. We don't make
this correction in RTL, so the shuffle operation produces incorrect results.
Here is a patch that fixes up the index table in the selector rtx in RTL to
also be mixed-endian, reflecting what happens on NEON.

As trunk stands, this patch will not be exercised, because constant vector
permute is disabled for big-endian.
I've tested this by locally enabling const vec_perm, and it fixes some of the
regressions we have on big-endian:

aarch64_be-none-elf:

FAIL->PASS: gcc.c-torture/execute/loop-11.c execution, -O3 -fomit-frame-pointer
FAIL->PASS: gcc.c-torture/execute/loop-11.c execution, -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
FAIL->PASS: gcc.c-torture/execute/loop-11.c execution, -O3 -fomit-frame-pointer -funroll-loops
FAIL->PASS: gcc.c-torture/execute/loop-11.c execution, -O3 -g
FAIL->PASS: gcc.dg/torture/vector-shuffle1.c -O0 execution test
FAIL->PASS: gcc.dg/torture/vshuf-v16qi.c -O2 execution test
FAIL->PASS: gcc.dg/torture/vshuf-v2df.c -O2 execution test
FAIL->PASS: gcc.dg/torture/vshuf-v2di.c -O2 execution test
FAIL->PASS: gcc.dg/torture/vshuf-v2sf.c -O2 execution test
FAIL->PASS: gcc.dg/torture/vshuf-v2si.c -O2 execution test
FAIL->PASS: gcc.dg/torture/vshuf-v4sf.c -O2 execution test
FAIL->PASS: gcc.dg/torture/vshuf-v4si.c -O2 execution test
FAIL->PASS: gcc.dg/torture/vshuf-v8hi.c -O2 execution test
FAIL->PASS: gcc.dg/torture/vshuf-v8qi.c -O2 execution test
FAIL->PASS: gcc.dg/vect/vect-114.c -flto -ffat-lto-objects execution test
FAIL->PASS: gcc.dg/vect/vect-114.c execution test
FAIL->PASS: gcc.dg/vect/vect-15.c -flto -ffat-lto-objects execution test
FAIL->PASS: gcc.dg/vect/vect-15.c execution test

Also regression-tested on aarch64-none-elf.

OK for stage-1?

Thanks,
Tejas.

2014-02-21  Tejas Belagod

gcc/
	* config/aarch64/aarch64.c (aarch64_evpc_tbl): Fix index vector for
	big-endian when dealing with more than one input shuffle vector.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ea90311..fd473a3 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8128,7 +8128,28 @@ aarch64_evpc_tbl (struct expand_vec_perm_d *d)
     return false;
 
   for (i = 0; i < nelt; ++i)
-    rperm[i] = GEN_INT (d->perm[i]);
+    {
+      int nunits = GET_MODE_NUNITS (vmode);
+      int elt = d->perm[i];
+
+      /* If two vectors, we end up with a weird mixed-endian mode on NEON.  */
+      if (BYTES_BIG_ENDIAN)
+        {
+          if (!d->one_vector_p && d->perm[i] & nunits)
+            {
+              /* Extract the offset.  */
+              elt = d->perm[i] & (nunits - 1);
+              /* Reverse the top half.  */
+              elt = nunits - 1 - elt;
+              /* Offset it by the bottom half.  */
+              elt += nunits;
+            }
+          else
+            elt = nunits - 1 - d->perm[i];
+        }
+
+      rperm[i] = GEN_INT (elt);
+    }
 
   sel = gen_rtx_CONST_VECTOR (vmode, gen_rtvec_v (nelt, rperm));
   sel = force_reg (vmode, sel);
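
For readers who want to check the index arithmetic outside the compiler, the
remapping done in the loop above can be sketched as a standalone C function.
This is only an illustration: remap_tbl_index and its parameters are
hypothetical names and are not part of GCC. The big_endian flag stands in for
BYTES_BIG_ENDIAN, and nunits is assumed to be a power of two, as the masking
in the patch requires.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of GCC: remaps one selector index in the
   same way as the patched loop in aarch64_evpc_tbl.  PERM is the original
   index, NUNITS the number of elements in one input vector.  */
static int
remap_tbl_index (int perm, int nunits, bool one_vector_p, bool big_endian)
{
  /* The masking below relies on NUNITS being a power of two.  */
  assert (nunits > 0 && (nunits & (nunits - 1)) == 0);

  /* Little-endian selectors are used unchanged.  */
  if (!big_endian)
    return perm;

  if (!one_vector_p && (perm & nunits))
    {
      /* Index into the second vector: extract the offset, reverse it
         within that half, then offset it past the first vector.  */
      int elt = perm & (nunits - 1);
      elt = nunits - 1 - elt;
      return elt + nunits;
    }

  /* Index into the first (or only) vector: reverse within the vector.  */
  return nunits - 1 - perm;
}
```

With nunits = 4 and two input vectors on big-endian, indices 0..3 map to 3..0
and indices 4..7 map to 7..4, so each half is reversed in place rather than
the whole eight-element list being reversed.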