From patchwork Fri Oct 19 21:16:12 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Peter Bergner X-Patchwork-Id: 987046 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-487913-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="ZfpU0iXr"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 42cJcw4n01z9vZs for ; Sat, 20 Oct 2018 08:16:30 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :subject:to:cc:date:mime-version:content-type :content-transfer-encoding:message-id; q=dns; s=default; b=MajKH tnIhc2HF0c6NVZFov01KCIJKNrUL72D6oXM8ovHczXebdiHZvdqIjUi2YS4defLG /5xXSe/IT+WWgNEEy5ppXCM3VugkeEEjRVu9cbOPViEirld8q/Yw/wj4M62sc5sy AJs1goDlfMnhU9K0rJfFN0Gb35yCZWMhD22iJY= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :subject:to:cc:date:mime-version:content-type :content-transfer-encoding:message-id; s=default; bh=xm7+rOZSUze 1X4CenBv2m5t9Iwg=; b=ZfpU0iXrJGtf/2QZdEaBOHZqQ423lUyZ+IMwpKVnnEr 7D87nPZl5vahgZ9ryvEsfVY28g9NYRvZ4BCbFxMqM+f5VGw4GR8X6Eyb0kffZss7 PMGkwRzGQQ4/C6QR4trVVq0HYCfKBaGGwwDt5m7XaYMEuXbQ4t4hgtif11M1CIGE = Received: (qmail 38248 invoked by alias); 19 Oct 2018 21:16:23 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 38198 invoked by uid 89); 19 Oct 2018 21:16:22 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-10.1 required=5.0 tests=BAYES_00, GIT_PATCH_2, GIT_PATCH_3, KAM_NUMSUBJECT, KHOP_DYNAMIC, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 spammy=proved, new1, x16, vlad X-HELO: mx0a-001b2d01.pphosted.com Received: from mx0b-001b2d01.pphosted.com (HELO mx0a-001b2d01.pphosted.com) (148.163.158.5) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 19 Oct 2018 21:16:19 +0000 Received: from pps.filterd (m0098417.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w9JLDga9134625 for ; Fri, 19 Oct 2018 17:16:17 -0400 Received: from e15.ny.us.ibm.com (e15.ny.us.ibm.com [129.33.205.205]) by mx0a-001b2d01.pphosted.com with ESMTP id 2n7ka4s5me-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Fri, 19 Oct 2018 17:16:17 -0400 Received: from localhost by e15.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 19 Oct 2018 17:16:16 -0400 Received: from b01cxnp23033.gho.pok.ibm.com (9.57.198.28) by e15.ny.us.ibm.com (146.89.104.202) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Fri, 19 Oct 2018 17:16:14 -0400 Received: from b01ledav004.gho.pok.ibm.com (b01ledav004.gho.pok.ibm.com [9.57.199.109]) by b01cxnp23033.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w9JLGDe829163680 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Fri, 19 Oct 2018 21:16:13 GMT Received: from b01ledav004.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 1C32E112066; Fri, 19 Oct 2018 21:16:13 +0000 (GMT) Received: from b01ledav004.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id A9FE5112064; Fri, 19 Oct 2018 21:16:12 +0000 (GMT) Received: from otta.local (unknown [9.85.196.123]) by b01ledav004.gho.pok.ibm.com (Postfix) with ESMTP; Fri, 19 Oct 2018 21:16:12 +0000 (GMT) From: Peter Bergner Subject: [RFC][PATCH LRA] WIP patch to fix one part of PR87507 To: Vladimir N Makarov , Jeff Law , Segher Boessenkool Cc: GCC Patches Date: Fri, 19 Oct 2018 16:16:12 -0500 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 x-cbid: 18101921-0068-0000-0000-000003504B0C X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00009903; HX=3.00000242; KW=3.00000007; PH=3.00000004; SC=3.00000268; SDB=6.01105099; UDB=6.00572151; IPR=6.00885149; MB=3.00023828; MTD=3.00000008; XFM=3.00000015; UTC=2018-10-19 21:16:15 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18101921-0069-0000-0000-00004621C309 Message-Id: <2c69adfc-b9db-687d-3a32-08f2efdde647@linux.ibm.com> X-IsSubscribed: yes Vlad, Jeff and Segher, I think I have determined what is happening with the aarch64 test case that is failing after my r264897 commit. It appears my patch is just exposing an issue in lra-constraints.c:process_alt_operands() when processing an insn with early clobber operands. Jeff & Segher, I have reverted all of my previous patches we were making to lra-lives.c for the analysis below and am just using straight trunk revision 264897 (ie, my last patch). The aarch64 test case is below. Note that the first 2 input/output operands are early clobber and are used with asm defined hard regs (x0 and x1). Also notice that oldval1 and oldval2 are pseudos with the same values as those in/out operands input values. long __cmpxchg_double (unsigned long arg) { unsigned long old1 = 1; unsigned long old2 = arg; unsigned long new1 = 2; unsigned long new2 = 3; void *ptr = 0; unsigned long oldval1 = old1; unsigned long oldval2 = old2; register unsigned long x0 asm ("x0") = old1; register unsigned long x1 asm ("x1") = old2; register unsigned long x2 asm ("x2") = new1; register unsigned long x3 asm ("x3") = new2; register unsigned long x4 asm ("x4") = (unsigned long) ptr; asm volatile ("casp %[old1], %[old2], %[new1], %[new2], %[v]\n" "eor %[old1], %[old1], %[oldval1]\n" "eor %[old2], %[old2], %[oldval2]\n" "orr %[old1], %[old1], %[old2]\n" : [old1] "+&r" (x0), [old2] "+&r" (x1), [v] "+Q" (* (unsigned long *) ptr) : [new1] "r" (x2), [new2] "r" (x3), [ptr] "r" (x4), [oldval1] "r" (oldval1), [oldval2] "r" (oldval2) : "x16", "x17", "x30"); return x0; } After IRA, we have the following register assignment and rtl. The only difference between before my patch and after my patch is that r92 used to be assigned to a hardreg that is not x0 - x4, which hides the lurking problem. r92 -> x1, r93 -> x5, r94 -> x6 (insn 2 3 8 2 (set (reg/v:DI 92 [arg]) (reg:DI 0 x0)) (REG_DEAD (reg:DI 0 x0))) (insn 8 2 7 2 (set (reg/v:DI 2 x2 [*x2]) (const_int 2))) (insn 7 8 6 2 (set (reg/v:DI 1 x1 [*x1]) (reg/v:DI 92 [arg]))) (insn 6 7 9 2 (set (reg/v:DI 0 x0 [*x0]) (const_int 1))) (insn 9 6 12 2 (set (reg/v:DI 3 x3 [*x3]) (const_int 3))) (insn 12 9 10 2 (set (reg:DI 94 [*x0]) (reg/v:DI 0 x0 [*x0]))) (insn 10 12 11 2 (set (reg/v:DI 4 x4 [*x4]) (const_int 0))) (insn 11 10 14 2 (set (reg/f:DI 93) (const_int 0))) (insn 14 11 21 2 (parallel [ (set (reg/v:DI 0 x0 [*x0]) (asm_operands/v:DI ("casp %0, %1, %3, %4, %2 eor %0, %0, %6 eor %1, %1, %7 orr %0, %0, %1 ") ("=&r") 0 [(reg/v:DI 2 x2 [*x2]) (reg/v:DI 3 x3 [*x3]) (reg/v:DI 4 x4 [*x4]) (reg:DI 94 [*x0]) (reg/v:DI 92 [arg]) (reg/v:DI 0 x0 [*x0]) (reg/v:DI 1 x1 [*x1]) (mem:DI (reg/f:DI 93))] [(asm_input:DI ("r")) (asm_input:DI ("r")) (asm_input:DI ("r")) (asm_input:DI ("r")) (asm_input:DI ("r")) (asm_input:DI ("0")) (asm_input:DI ("1")) (asm_input:DI ("Q"))]) (set (reg/v:DI 1 x1 [*x1]) ") ("=&r") 1 [] (set (mem:DI (reg/f:DI 93) [1 MEM[(long unsigned intD.11 *)0B]+0 S8 A128]) ") ("=Q") 2 [] (clobber (reg:DI 30 x30)) (clobber (reg:DI 17 x17)) (clobber (reg:DI 16 x16)) ]) "slub-min.c":17 -1 (expr_list:REG_DEAD (reg:DI 94 [*x0]) (expr_list:REG_DEAD (reg/f:DI 93) (expr_list:REG_DEAD (reg/v:DI 92 [arg]) (expr_list:REG_DEAD (reg/v:DI 4 x4 [*x4]) (expr_list:REG_DEAD (reg/v:DI 3 x3 [*x3]) (expr_list:REG_DEAD (reg/v:DI 2 x2 [*x2]) (expr_list:REG_UNUSED (reg:DI 30 x30) (expr_list:REG_UNUSED (reg:DI 17 x17) (expr_list:REG_UNUSED (reg:DI 16 x16) (expr_list:REG_UNUSED (reg/v:DI 1 x1 [*x1]) (nil)))))))))))) (insn 21 14 23 2 (use (reg/i:DI 0 x0))) In lra-constraints.c:process_alt_operands(), we notice that pseudo 92 is assigned to x1 and that an early clobber operand is also assigned to x1, or rather, that it uses x1 explicitly. This is enough to trigger reload(s), but the problem is we end up trying to reload the early clobber operand which has been forced into x1 via register asm assignment, instead of pseudo 92 which conflicts with it. The problematic code in question is: /* If earlyclobber operand conflicts with another non-matching operand which is actually the same register as the earlyclobber operand, it is better to reload the another operand as an operand matching the earlyclobber operand can be also the same. */ if (first_conflict_j == last_conflict_j && operand_reg[last_conflict_j] != NULL_RTX && ! curr_alt_match_win[last_conflict_j] && REGNO (operand_reg[i]) == REGNO (operand_reg[last_conflict_j])) { curr_alt_win[last_conflict_j] = false; curr_alt_dont_inherit_ops[curr_alt_dont_inherit_ops_num++] = last_conflict_j; losers++; /* Early clobber was already reflected in REJECT. */ lra_assert (reject > 0); if (lra_dump_file != NULL) fprintf (lra_dump_file, " %d Conflict early clobber reload: reject--\n", i); reject--; overall += LRA_LOSER_COST_FACTOR - 1; } else { /* We need to reload early clobbered register and the matched registers. */ .... In our test case, first_conflict_j == last_conflict_j and they point to the operand pseudo 92. The first three tests in the "if" condition are all true and then we check: REGNO (operand_reg[i]) == REGNO (operand_reg[last_conflict_j]) Now REGNO (operand_reg[i]) is our 2nd early clobber operand and is regno x1. However, REGNO (operand_reg[last_conflict_j]) returns regno 92, not the hard register pseudo 92 was assigned to (ie, x1). This causes us to execute the "else" clause of this code which tries to reload the early clobber operand. We end up replacing the hard coded x1 in the 2nd early clobber operand and then assign it to some reg like x7 and then the assembler barfs because it isn't x1 which it has to be. Now the REGNO (...) == REGNO (...) test predates the code that added first_conflict_j and last_conflict_j, so I think we can just eliminate that code altogether, since we've already proved there is a conflict in register usage between the early clobber operand and this non-matching operand. If I eliminate that check, we correctly reload input pseudo 92 and everything is good. Secondly, I don't think it's ever legal to reload an operand that has been hard coded by the user to a hard register via a register asm definition like in the test case above. With that in mind, what do people think of the patch below? This fixes the AARCH64 test case. However, it ICE's on the following test case: long foo (long arg) { register long var asm("x0"); asm("bla %0 %1" : "+&r"(var) : "r"(arg)); return var; } ...but that is due to a combine bug where combine replaces the use of "arg"'s pseudo in the inline asm with the incoming argument reg x0 which should be illegal. Ie, it's taking a valid inline asm and creating an invalid one since the earlyclobber op and non-matching op have both been forced to use the same hard reg. Segher has a combine patch to stop that which he is going to submit (commit?) soon. With my patch, we are also now able to catch the following user error, when before we could not: bergner@pike:~/gcc/BUGS/PR87507/AARCH64$ cat ice4.i long foo (void) { register long arg asm("x1"); register long var asm("x1"); asm ("bla %0 %1" : "=&r"(arg) : "r"(var)); return arg; } bergner@pike:~/gcc/BUGS/PR87507/AARCH64$ .../xgcc -B.../gcc -O2 -march=armv8.1-a -S ice4.i ice4.i: In function ‘foo’: ice4.i:7:1: error: unable to generate reloads for: 7 | } | ^ (insn 5 6 10 2 (set (reg/v:DI 1 x1 [ arg ]) (asm_operands:DI ("bla %0 %1") ("=&r") 0 [ (reg/v:DI 1 x1 [ var ]) ] [ (asm_input:DI ("r") ice4.i:5) ] [] ice4.i:5)) "ice4.i":5 -1 (nil)) during RTL pass: reload ice4.i:7:1: internal compiler error: in process_alt_operands, at lra-constraints.c:2911 ... Thoughts? I'll note that this does not fix the S390 bugs, since those seem to be due to problems with early clobber operands and "matching" constraint operands. I'm still working on that and hope to have something soon. I am currently bootstrapping this on s390x just to make sure I don't break anything there, before debugging the existing s390x issue(s). I can't test this on aarch64 or arm, other than knowing my aarch64 cross doesn't ICE on the test case above. Peter * lra-constraints.c (process_alt_operands): Abort on illegal hard register usage. Prefer reloading non hard register operands. Index: gcc/lra-constraints.c =================================================================== --- gcc/lra-constraints.c (revision 264897) +++ gcc/lra-constraints.c (working copy) @@ -2904,18 +2904,29 @@ process_alt_operands (int only_alternati if (first_conflict_j < 0) first_conflict_j = j; last_conflict_j = j; + /* Both the earlyclobber operand and conflicting operand + cannot both be hard registers. */ + if (REGNO (operand_reg[i]) < FIRST_PSEUDO_REGISTER + && operand_reg[j] != NULL_RTX + && REGNO (operand_reg[j]) < FIRST_PSEUDO_REGISTER) + fatal_insn ("unable to generate reloads for:", curr_insn); } if (last_conflict_j < 0) continue; - /* If earlyclobber operand conflicts with another - non-matching operand which is actually the same register - as the earlyclobber operand, it is better to reload the - another operand as an operand matching the earlyclobber - operand can be also the same. */ - if (first_conflict_j == last_conflict_j - && operand_reg[last_conflict_j] != NULL_RTX - && ! curr_alt_match_win[last_conflict_j] - && REGNO (operand_reg[i]) == REGNO (operand_reg[last_conflict_j])) + + /* If an earlyclobber operand conflicts with another non-matching + operand (ie, they have been assigned the same hard register), + then it is better to reload the other operand, as there may + exist yet another operand with a matching constraint associated + with the earlyclobber operand. However, if one of the operands + is an explicit use of a hard register, then we must reload the + other non-hard register operand. */ + if (REGNO (operand_reg[i]) < FIRST_PSEUDO_REGISTER + || (operand_reg[last_conflict_j] != NULL_RTX + && REGNO (operand_reg[last_conflict_j]) + >= FIRST_PSEUDO_REGISTER + && first_conflict_j == last_conflict_j + && ! curr_alt_match_win[last_conflict_j])) { curr_alt_win[last_conflict_j] = false; curr_alt_dont_inherit_ops[curr_alt_dont_inherit_ops_num++]