From patchwork Thu Mar 24 15:40:53 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 88216
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
 by ozlabs.org (Postfix) with SMTP id F40571007D1
 for ; Fri, 25 Mar 2011 02:41:06 +1100 (EST)
Received: (qmail 25547 invoked by alias); 24 Mar 2011 15:41:04 -0000
Received: (qmail 25537 invoked by uid 22791); 24 Mar 2011 15:41:03 -0000
X-SWARE-Spam-Status: No, hits=-2.3 required=5.0
 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW, TW_DD
X-Spam-Check-By: sourceware.org
Received: from mail-wy0-f175.google.com (HELO mail-wy0-f175.google.com)
 (74.125.82.175) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP;
 Thu, 24 Mar 2011 15:40:58 +0000
Received: by wyb40 with SMTP id 40so60850wyb.20
 for ; Thu, 24 Mar 2011 08:40:56 -0700 (PDT)
Received: by 10.216.142.35 with SMTP id h35mr863035wej.31.1300981256737;
 Thu, 24 Mar 2011 08:40:56 -0700 (PDT)
Received: from richards-thinkpad (gbibp9ph1--blueice2n1.emea.ibm.com [195.212.29.75])
 by mx.google.com with ESMTPS id d54sm10509wej.34.2011.03.24.08.40.54
 (version=TLSv1/SSLv3 cipher=OTHER);
 Thu, 24 Mar 2011 08:40:55 -0700 (PDT)
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Mail-Followup-To: gcc-patches@gcc.gnu.org, patches@linaro.org, richard.sandiford@linaro.org
Subject: Tighten ARM's CANNOT_CHANGE_MODE_CLASS
cc: patches@linaro.org
Date: Thu, 24 Mar 2011 15:40:53 +0000
Message-ID:
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux)
MIME-Version: 1.0
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Unsubscribe:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

We currently generate very poor code for tests like:

#include <arm_neon.h>

void
foo (uint32_t *a, uint32_t *b, uint32_t *c)
{
  uint32x4x3_t x, y;

  x = vld3q_u32 (a);
  y = vld3q_u32 (b);
  x.val[0] = vaddq_u32 (x.val[0], y.val[0]);
  x.val[1] = vaddq_u32 (x.val[1], y.val[1]);
  x.val[2] = vaddq_u32 (x.val[2], y.val[2]);
  vst3q_u32 (a, x);
}

This is because we force the uint32x4x3_t values to the stack and then
load and store the individual vectors.  What we actually want is for the
uint32x4x3_t values to be stored in registers, and for the individual
vectors to be accessed as subregs of those registers.

The first part involves some middle-end mode changes (see recent gcc@
thread), while the second part requires a change to ARM's
CANNOT_CHANGE_MODE_CLASS.

CANNOT_CHANGE_MODE_CLASS is defined as:

/* FPA registers can't do subreg as all values are reformatted to internal
   precision.  VFP registers may only be accessed in the mode they
   were set.  */
#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)       \
  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)           \
   ? reg_classes_intersect_p (FPA_REGS, (CLASS))        \
     || reg_classes_intersect_p (VFP_REGS, (CLASS))     \
   : 0)

But this VFP restriction appears to apply only to VFPv1; thanks to
Peter Maydell for the archaeology.

Tested on arm-linux-gnueabi.  OK to install?  This doesn't have any
direct benefit without the middle-end mode change, but it needs to
go in first in order for that change not to regress.

Richard


gcc/
        * config/arm/arm.h (CANNOT_CHANGE_MODE_CLASS): Restrict VFP_REGS
        case to VFPv1.

Index: gcc/config/arm/arm.h
===================================================================
--- gcc/config/arm/arm.h 2011-03-24 13:47:14.000000000 +0000
+++ gcc/config/arm/arm.h 2011-03-24 15:26:19.000000000 +0000
@@ -1167,12 +1167,14 @@ #define IRA_COVER_CLASSES \
 }
 
 /* FPA registers can't do subreg as all values are reformatted to internal
-   precision.  VFP registers may only be accessed in the mode they
+   precision.  VFPv1 registers may only be accessed in the mode they
    were set.  */
-#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)       \
-  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)           \
-   ? reg_classes_intersect_p (FPA_REGS, (CLASS))        \
-     || reg_classes_intersect_p (VFP_REGS, (CLASS))     \
+#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)               \
+  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)                   \
+   ? (reg_classes_intersect_p (FPA_REGS, (CLASS))               \
+      || (TARGET_VFP                                            \
+          && arm_fpu_desc->rev == 1                             \
+          && reg_classes_intersect_p (VFP_REGS, (CLASS))))      \
    : 0)
 
 /* The class value for index registers, and the one for base regs.  */
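
For anyone who wants to poke at the condition outside GCC, here is a
rough standalone model of the predicate before and after the change.
It is only a sketch: mock_class, mock_target and both function names
are made up for illustration, a simple class comparison stands in for
reg_classes_intersect_p, and the two struct fields stand in for
TARGET_VFP and arm_fpu_desc->rev.  None of it is the real GCC code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's register classes.  */
enum mock_class { MOCK_GENERAL_REGS, MOCK_FPA_REGS, MOCK_VFP_REGS };

/* Hypothetical stand-in for TARGET_VFP and arm_fpu_desc->rev.  */
struct mock_target
{
  bool vfp;
  int fpu_rev;
};

/* Old rule: a size-changing subreg is rejected for any FPA or VFP
   class.  Returns true when the mode change is not allowed.  */
bool
cannot_change_mode_old (unsigned from_size, unsigned to_size,
                        enum mock_class rclass)
{
  return (from_size != to_size
          && (rclass == MOCK_FPA_REGS || rclass == MOCK_VFP_REGS));
}

/* New rule: the VFP restriction only applies when the FPU revision
   is 1.  The FPA restriction is unchanged.  */
bool
cannot_change_mode_new (const struct mock_target *target,
                        unsigned from_size, unsigned to_size,
                        enum mock_class rclass)
{
  return (from_size != to_size
          && (rclass == MOCK_FPA_REGS
              || (target->vfp
                  && target->fpu_rev == 1
                  && rclass == MOCK_VFP_REGS)));
}
```

With a target described as { true, 3 }, viewing a 48-byte value (the
size of three 128-bit vectors) as one 16-byte vector in MOCK_VFP_REGS
is rejected by the old rule and accepted by the new one.  With
{ true, 1 } both rules reject it, and same-size changes are always
accepted.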