From patchwork Tue Feb 2 14:56:46 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 577203
Delivered-To: mailing list gcc-patches@gcc.gnu.org
From: Richard Sandiford
Mail-Followup-To: gcc-patches@gcc.gnu.org, nd@arm.com, ubizjak@gmail.com, jakub@redhat.com, vmakarov@redhat.com, richard.sandiford@arm.com
Subject: PR 69577: Invalid RA of destination subregs
Date: Tue, 2 Feb 2016 14:56:46 +0000
Message-ID: <871t8vjh3l.fsf@e105548-lin.cambridge.arm.com>
User-Agent: Gnus/5.130012 (Ma Gnus v0.12) Emacs/24.3 (gnu/linux)

In PR 69577 we have:

  A: (set (reg:V2TI X) ...)
  B: (set (subreg:TI (reg:V2TI X) 0) ...)

X gets allocated to an AVX register, as usual for V2TI.  The problem is
that the movti for B doesn't then preserve the other half of X, even
though the subreg semantics are supposed to guarantee that.

If instead the same value had been set by:

  A': (set (subreg:TI (reg:V2TI X) 16) ...)
  B: (set (subreg:TI (reg:V2TI X) 0) ...)

the subreg in A' would have prevented the use of AVX registers for X,
since you can't directly access the high part.

IMO these are really the same thing.  An alternative way to view it
is that the original sequence is equivalent to:

  A: (set (reg:V2TI X) ...)
  B1: (set (subreg:TI (reg:V2TI X) 0) ...)
  B2: (set (subreg:TI (reg:V2TI X) 16) (subreg:TI (reg:V2TI X) 16))

in which B2 is a no-op and therefore implicit.  The handling ought
to be the same regardless of whether there is an rtl insn that
explicitly assigns to (subreg:TI (reg:V2TI X) 16).

This patch implements that idea.  Hopefully the comments explain
what's going on.

Tested on x86_64-linux-gnu so far.  Will test on aarch64-linux-gnu and
arm-linux-gnueabihf as well.  OK to install if the additional testing
succeeds?

Thanks,
Richard

diff --git a/gcc/reginfo.c b/gcc/reginfo.c
index 6814eed..afb36aa 100644
--- a/gcc/reginfo.c
+++ b/gcc/reginfo.c
@@ -1244,8 +1244,16 @@ simplifiable_subregs (const subreg_shape &shape)
 static HARD_REG_SET **valid_mode_changes;
 static obstack valid_mode_changes_obstack;
 
+/* Restrict the choice of register for SUBREG_REG (SUBREG) based
+   on information about SUBREG.
+
+   If PARTIAL_DEF, SUBREG is a partial definition of a multipart inner
+   register and we want to ensure that the other parts of the inner
+   register are correctly preserved.  If !PARTIAL_DEF we need to
+   ensure that SUBREG itself can be formed.  */
+
 static void
-record_subregs_of_mode (rtx subreg)
+record_subregs_of_mode (rtx subreg, bool partial_def)
 {
   unsigned int regno;
 
@@ -1256,15 +1264,41 @@ record_subregs_of_mode (rtx subreg)
   if (regno < FIRST_PSEUDO_REGISTER)
     return;
 
+  subreg_shape shape (shape_of_subreg (subreg));
+  if (partial_def)
+    {
+      /* The number of independently-accessible SHAPE.outer_mode values
+         in SHAPE.inner_mode is GET_MODE_SIZE (SHAPE.inner_mode) / SIZE.
+         We need to check that the assignment will preserve all the other
+         SIZE-byte chunks in the inner register besides the one that
+         includes SUBREG.
+
+         In practice it is enough to check whether an equivalent
+         SHAPE.inner_mode value in an adjacent SIZE-byte chunk can be formed.
+         If the underlying registers are small enough, both subregs will
+         be valid.  If the underlying registers are too large, one of the
+         subregs will be invalid.
+
+         This relies on the fact that we've already been passed
+         SUBREG with PARTIAL_DEF set to false.  */
+      unsigned int size = MAX (REGMODE_NATURAL_SIZE (shape.inner_mode),
+                               GET_MODE_SIZE (shape.outer_mode));
+      gcc_checking_assert (size < GET_MODE_SIZE (shape.inner_mode));
+      if (shape.offset >= size)
+        shape.offset -= size;
+      else
+        shape.offset += size;
+    }
+
   if (valid_mode_changes[regno])
     AND_HARD_REG_SET (*valid_mode_changes[regno],
-                      simplifiable_subregs (shape_of_subreg (subreg)));
+                      simplifiable_subregs (shape));
   else
     {
       valid_mode_changes[regno]
         = XOBNEW (&valid_mode_changes_obstack, HARD_REG_SET);
       COPY_HARD_REG_SET (*valid_mode_changes[regno],
-                         simplifiable_subregs (shape_of_subreg (subreg)));
+                         simplifiable_subregs (shape));
     }
 }
 
@@ -1277,7 +1311,7 @@ find_subregs_of_mode (rtx x)
   int i;
 
   if (code == SUBREG)
-    record_subregs_of_mode (x);
+    record_subregs_of_mode (x, false);
 
   /* Time for some deep diving.  */
   for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
@@ -1304,8 +1338,15 @@ init_subregs_of_mode (void)
 
   FOR_EACH_BB_FN (bb, cfun)
     FOR_BB_INSNS (bb, insn)
-      if (NONDEBUG_INSN_P (insn))
-        find_subregs_of_mode (PATTERN (insn));
+      {
+        if (NONDEBUG_INSN_P (insn))
+          find_subregs_of_mode (PATTERN (insn));
+        df_ref def;
+        FOR_EACH_INSN_DEF (def, insn)
+          if (DF_REF_FLAGS_IS_SET (def, DF_REF_PARTIAL)
+              && df_read_modify_subreg_p (DF_REF_REG (def)))
+            record_subregs_of_mode (DF_REF_REG (def), true);
+      }
 }
 
 const HARD_REG_SET *
diff --git a/gcc/testsuite/gcc.target/i386/pr69577.c b/gcc/testsuite/gcc.target/i386/pr69577.c
new file mode 100644
index 0000000..d680539
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr69577.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-require-effective-target avx } */
+/* { dg-require-effective-target int128 } */
+/* { dg-options "-O -fno-forward-propagate -fno-split-wide-types -mavx" } */
+
+typedef unsigned int u32;
+typedef unsigned __int128 u128;
+typedef unsigned __int128 v32u128 __attribute__ ((vector_size (32)));
+
+u128 __attribute__ ((noinline, noclone))
+foo (u32 u32_0, v32u128 v32u128_0)
+{
+  v32u128_0[0] >>= u32_0;
+  v32u128_0 += (v32u128) {u32_0, 0};
+  return u32_0 + v32u128_0[0] + v32u128_0[1];
+}
+
+int
+main()
+{
+  u128 x = foo (1, (v32u128) {1, 4});
+  if (x != 6)
+    __builtin_abort ();
+  return 0;
+}
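
Illustration only, not part of the patch: the standalone C sketch below
mirrors the offset adjustment that the PARTIAL_DEF path of
record_subregs_of_mode makes before asking simplifiable_subregs about the
neighbouring chunk.  The function name adjacent_chunk_offset and its plain
integer parameters are hypothetical stand-ins for REGMODE_NATURAL_SIZE and
GET_MODE_SIZE; nothing here calls into GCC.

```c
#include <assert.h>

/* Hypothetical helper, not a GCC function.  Given the byte OFFSET of the
   subreg being written, return the offset of an adjacent chunk of the
   same size inside the inner register.

   NATURAL_SIZE stands in for REGMODE_NATURAL_SIZE (inner_mode),
   OUTER_SIZE for GET_MODE_SIZE (outer_mode) and INNER_SIZE for
   GET_MODE_SIZE (inner_mode).  All values are in bytes.  */
unsigned int
adjacent_chunk_offset (unsigned int offset, unsigned int natural_size,
                       unsigned int outer_size, unsigned int inner_size)
{
  /* SIZE is the granularity at which parts of the inner register can be
     written independently, as in the patch's MAX (...) expression.  */
  unsigned int size = natural_size > outer_size ? natural_size : outer_size;

  /* A partial definition only makes sense if the inner register holds
     more than one SIZE-byte chunk.  */
  assert (size < inner_size);

  /* Step to the previous chunk when there is one, otherwise to the next.  */
  return offset >= size ? offset - size : offset + size;
}
```

For the case in the PR, and assuming an 8-byte natural size purely for the
sake of the example, adjacent_chunk_offset (0, 8, 16, 32) gives 16, so the
extra check is made against (subreg:TI (reg:V2TI X) 16).  An offset of 16
maps back to 0 in the same way.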