From patchwork Sat Oct 20 11:38:24 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Julian Brown
X-Patchwork-Id: 192913
Date: Sat, 20 Oct 2012 12:38:24 +0100
From: Julian Brown
To: , Richard Earnshaw , Ramana Radhakrishnan
Subject: [PATCH, ARM] Subregs of VFP registers in big-endian mode
Message-ID: <20121020123824.4e9251b5@octopus>
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

Hi,

Quite a few tests currently fail for big-endian multilibs which use VFP
instructions. One reason for many of these is glaringly obvious once
you notice it: for D registers interpreted as two S registers, the
lower-numbered register is always the less-significant part of the
value, and the higher-numbered register the more-significant --
regardless of the endianness the processor is running in. However, for
big-endian mode, when DFmode values are represented in memory (or
indeed core registers), the opposite is true: the more-significant
word comes first.

So, a subreg expression such as the following will work fine on core
registers (or e.g. pseudos assigned to stack slots):

  (subreg:SI (reg:DF) 0)

but, when applied to a VFP register Dn, it should be resolved to the
hard register S(n*2+1). At present though, it resolves to S(n*2) --
i.e. the wrong half of the value (for WORDS_BIG_ENDIAN, such a subreg
should be the most-significant part of the value). For the relatively
few cases where DFmode values are interpreted as a pair of (integer)
words, this means that wrong code is generated.

My feeling is that implementing a "proper" solution to this problem is
probably impractical -- the closest existing macros to control
behaviour aren't sufficient for this case:

 * FLOAT_WORDS_BIG_ENDIAN only refers to memory layout, which is
   correct as it is.

 * REG_WORDS_BIG_ENDIAN controls whether values are stored in
   big-endian order in registers, but refers to *all* registers. We
   only want to change the behaviour for the VFP registers.

Defining a new macro FLOAT_REG_WORDS_BIG_ENDIAN wouldn't do, because
the behaviour would differ depending on the hard register under
observation: that seems like too much to ask of generic machinery in
the middle-end.

So, the attached patch just avoids the problem, by pretending that
greater-than-word-size values in VFP registers, in big-endian mode, are
opaque and cannot be subreg'ed.
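
To make the register numbering described earlier in this message
concrete, here is a small standalone C model. It is only a sketch, not
GCC code: the function names (subreg_is_high_word, correct_sreg,
naive_sreg) are hypothetical, and naive_sreg assumes that the current
behaviour amounts to turning the subreg byte offset directly into a
register index, which matches the S(n*2) result reported for offset 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Does a word-sized subreg at BYTE_OFFSET (0 or 4) of a 64-bit value
   denote the most-significant word?  With big-endian word order the
   word at offset 0 is the high one; with little-endian it is the low
   one.  */
bool subreg_is_high_word(unsigned byte_offset, bool words_big_endian)
{
    assert(byte_offset == 0 || byte_offset == 4);
    return words_big_endian ? byte_offset == 0 : byte_offset == 4;
}

/* S register that really holds the requested word of Dn.  The VFP
   layout is fixed: low word in S(2n), high word in S(2n+1), whatever
   the processor endianness.  */
unsigned correct_sreg(unsigned dn, unsigned byte_offset,
                      bool words_big_endian)
{
    return 2 * dn + (subreg_is_high_word(byte_offset, words_big_endian) ? 1 : 0);
}

/* Assumed model of the present behaviour: the byte offset is mapped
   straight to a register index, ignoring the fixed VFP layout.  */
unsigned naive_sreg(unsigned dn, unsigned byte_offset)
{
    assert(byte_offset == 0 || byte_offset == 4);
    return 2 * dn + byte_offset / 4;
}
```

For D3 in big-endian mode, correct_sreg(3, 0, true) gives 7 while
naive_sreg(3, 0) gives 6, i.e. the two disagree exactly in the case
described above. In little-endian mode the two functions agree.
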
In practice, for at least the test case I looked at, this isn't as much
of a pessimisation as you might expect -- the value in question might
already be stored in core registers (e.g. for function arguments with
-mfloat-abi=softfp), so can be retrieved directly from those rather
than via memory.

This is the testsuite delta for current FSF mainline, with multilibs
adjusted to build for little/big-endian, and using options
"-mbig-endian -mfloat-abi=softfp -mfpu=vfpv3" for testing:

FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C -O1 execution test
FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C -O2 execution test
FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects execution test
FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C -O3 -fomit-frame-pointer execution test
FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C -O3 -g execution test
FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C -Os execution test
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/ieee/copysign1.c execution, -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/ieee/mzero6.c execution, -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr35456.c execution, -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution, -O1
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution, -O2
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution, -O2 -flto -fno-use-linker-plugin -flto-partition=none
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution, -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution, -O3 -fomit-frame-pointer
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution, -O3 -g
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution, -Og -g
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution, -Os
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/compat/scalar-by-value-3 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c -O1 execution test
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c -O2 execution test
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects execution test
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c -O3 -fomit-frame-pointer execution test
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c -O3 -g execution test
FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c -Os execution test

OK for mainline, or any comments? (I've included the multilib tweaks I
used in the attached patch for reference, though I'm not proposing to
apply those.)

Thanks,

Julian

ChangeLog

    gcc/
    * config/arm/arm.h (CANNOT_CHANGE_MODE_CLASS): Avoid subreg'ing
    VFP D registers in big-endian mode.

Index: gcc/config/arm/arm.h
===================================================================
--- gcc/config/arm/arm.h (revision 192576)
+++ gcc/config/arm/arm.h (working copy)
@@ -1205,8 +1205,15 @@ enum reg_class
 /* In VFPv1, VFP registers could only be accessed in the mode they
    were set, so subregs would be invalid there. However, we don't
    support VFPv1 at the moment, and the restriction was lifted in
-   VFPv2. */
-#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS) 0
+   VFPv2.
+   In big-endian mode, modes greater than word size (i.e. DFmode) are stored in
+   VFP registers in little-endian order. We can't describe that accurately to
+   GCC, so avoid taking subregs of such values. */
+#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS) \
+  (TARGET_VFP && TARGET_BIG_END \
+   && (GET_MODE_SIZE (FROM) > UNITS_PER_WORD \
+       || GET_MODE_SIZE (TO) > UNITS_PER_WORD) \
+   && reg_classes_intersect_p (VFP_REGS, (CLASS)))
 
 /* The class value for index registers, and the one for base regs. */
 #define INDEX_REG_CLASS (TARGET_THUMB1 ? LO_REGS : GENERAL_REGS)
Index: gcc/config/arm/t-arm-elf
===================================================================
--- gcc/config/arm/t-arm-elf (revision 192576)
+++ gcc/config/arm/t-arm-elf (working copy)
@@ -17,8 +17,8 @@
 # along with GCC; see the file COPYING3. If not see
 # .
 
-MULTILIB_OPTIONS = marm/mthumb
-MULTILIB_DIRNAMES = arm thumb
+MULTILIB_OPTIONS = marm
+MULTILIB_DIRNAMES = arm
 MULTILIB_EXCEPTIONS =
 MULTILIB_MATCHES =
 
@@ -49,9 +49,9 @@ MULTILIB_EXCEPTIONS += *mthumb/*mfloa
 # MULTILIB_DIRNAMES += ep9312
 # MULTILIB_EXCEPTIONS += *mthumb/*mcpu=ep9312*
 #
-# MULTILIB_OPTIONS += mlittle-endian/mbig-endian
-# MULTILIB_DIRNAMES += le be
-# MULTILIB_MATCHES += mbig-endian=mbe mlittle-endian=mle
+MULTILIB_OPTIONS += mlittle-endian/mbig-endian
+MULTILIB_DIRNAMES += le be
+MULTILIB_MATCHES += mbig-endian=mbe mlittle-endian=mle
 #
 # MULTILIB_OPTIONS += mfloat-abi=hard/mfloat-abi=soft
 # MULTILIB_DIRNAMES += fpu soft
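
For readers who want to check the conditions in the arm.h hunk in
isolation, the new CANNOT_CHANGE_MODE_CLASS definition can be restated
as a plain boolean function. This is a sketch only and not GCC code:
cannot_change_mode, MODEL_UNITS_PER_WORD and the boolean parameters are
hypothetical stand-ins for TARGET_VFP, TARGET_BIG_END, GET_MODE_SIZE
and reg_classes_intersect_p, and the 4-byte word size is an assumption
for 32-bit ARM.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed word size in bytes for 32-bit ARM.  */
#define MODEL_UNITS_PER_WORD 4u

/* Hypothetical model of the patched macro.  A mode change is refused
   only when all four conditions hold: VFP is in use, the target is
   big-endian, at least one of the two modes is wider than a word,
   and the register class overlaps the VFP registers.  */
bool cannot_change_mode(bool target_vfp, bool target_big_end,
                        unsigned from_size, unsigned to_size,
                        bool class_intersects_vfp)
{
    /* Mode sizes are byte counts and must be non-zero.  */
    assert(from_size > 0 && to_size > 0);

    return target_vfp && target_big_end
           && (from_size > MODEL_UNITS_PER_WORD
               || to_size > MODEL_UNITS_PER_WORD)
           && class_intersects_vfp;
}
```

With these stand-ins, an 8-byte to 4-byte change (DFmode to SImode) in
a VFP class is refused in big-endian mode and still allowed in
little-endian mode, and word-sized changes are never refused.
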