From patchwork Thu Oct 6 07:47:54 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 117954 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id DCF2DB6F97 for ; Thu, 6 Oct 2011 18:48:26 +1100 (EST) Received: (qmail 4445 invoked by alias); 6 Oct 2011 07:48:18 -0000 Received: (qmail 4429 invoked by uid 22791); 6 Oct 2011 07:48:15 -0000 X-SWARE-Spam-Status: No, hits=-2.4 required=5.0 tests=AWL, BAYES_00, DKIM_SIGNED, DKIM_VALID, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW, TW_FW X-Spam-Check-By: sourceware.org Received: from mail-ww0-f51.google.com (HELO mail-ww0-f51.google.com) (74.125.82.51) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 06 Oct 2011 07:47:58 +0000 Received: by wwf10 with SMTP id 10so3270252wwf.8 for ; Thu, 06 Oct 2011 00:47:57 -0700 (PDT) Received: by 10.227.167.1 with SMTP id o1mr612838wby.6.1317887277115; Thu, 06 Oct 2011 00:47:57 -0700 (PDT) Received: from yakj.usersys.redhat.com (93-34-218-143.ip51.fastwebnet.it. [93.34.218.143]) by mx.google.com with ESMTPS id h14sm8724658wbo.7.2011.10.06.00.47.55 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 06 Oct 2011 00:47:55 -0700 (PDT) Message-ID: <4E8D5D2A.1000208@gnu.org> Date: Thu, 06 Oct 2011 09:47:54 +0200 From: Paolo Bonzini User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:6.0.2) Gecko/20110906 Thunderbird/6.0.2 MIME-Version: 1.0 To: "William J. Schmidt" CC: gcc-patches@gcc.gnu.org, bergner@vnet.ibm.com, rguenth@gcc.gnu.org Subject: Re: [PATCH] Fix PR46556 (poor address generation) References: <1317831212.4199.45.camel@oc2474580526.ibm.com> <4E8C8426.2090502@gnu.org> <1317835364.4199.55.camel@oc2474580526.ibm.com> <4E8CA99E.8000303@gnu.org> <1317845812.4199.59.camel@oc2474580526.ibm.com> In-Reply-To: <1317845812.4199.59.camel@oc2474580526.ibm.com> Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org On 10/05/2011 10:16 PM, William J. Schmidt wrote: > OK, I see. If there's a better place downstream to make a swizzle, I'm > certainly fine with that. > > I disabled locally_poor_mem_replacement and added some dump information > in should_replace_address to show the costs for the replacement I'm > trying to avoid: > > In should_replace_address: > old_rtx = (reg/f:DI 125 [ D.2036 ]) > new_rtx = (plus:DI (reg/v/f:DI 126 [ p ]) > (reg:DI 128)) > address_cost (old_rtx) = 0 > address_cost (new_rtx) = 0 > set_src_cost (old_rtx) = 0 > set_src_cost (new_rtx) = 4 > > In insn 11, replacing > (mem/s:SI (reg/f:DI 125 [ D.2036 ]) [2 p_1(D)->a S4 A32]) > with (mem/s:SI (plus:DI (reg/v/f:DI 126 [ p ]) > (reg:DI 128)) [2 p_1(D)->a S4 A32]) > Changed insn 11 > deferring rescan insn with uid = 11. > deferring rescan insn with uid = 11. And IIUC the other address is based on pseudo 125 as well, but the combination is (plus (plus (reg 126) (reg 128)) (const_int X)) and cannot be represented on ppc. I think _this_ is the problem, so I'm afraid your patch could cause pessimizations on x86 for example. On x86, which has a cheap REG+REG+CONST addressing mode, it is much better to propagate pseudo 125 so that you can delete the set altogether. However, indeed there is no downstream pass that undoes the transformation. Perhaps we can do it in CSE, since this _is_ CSE after all. :) The attached untested (uncompiled) patch is an attempt. Paolo Index: cse.c =================================================================== --- cse.c (revision 177688) +++ cse.c (working copy) @@ -3136,6 +3136,75 @@ find_comparison_args (enum rtx_code code return code; } +static rtx +lookup_addr (rtx insn, rtx *loc, enum machine_mode mode) +{ + struct table_elt *elt, *p; + int regno; + int hash; + int base_cost; + rtx addr = *loc; + rtx exp; + + /* Try to reuse existing registers for addresses, in hope of shortening + live ranges for the registers that compose the addresses. This happens + when you have + + (set (reg C) (plus (reg A) (reg B)) + (set (reg D) (mem (reg C))) + (set (reg E) (mem (plus (reg C) (const_int X)))) + + In this case fwprop will try to propagate into the addresses, but + if propagation into reg E fails, the only result will have been to + uselessly lengthen the live range of A and B. */ + + if (!REG_P (addr)) + return; + + regno = REGNO (addr); + if (regno == FRAME_POINTER_REGNUM + || regno == HARD_FRAME_POINTER_REGNUM + || regno == ARG_POINTER_REGNUM) + return; + + /* If this address is not in the hash table, we can't look for equivalences + of the whole address. Also, ignore if volatile. */ + + { + int save_do_not_record = do_not_record; + int save_hash_arg_in_memory = hash_arg_in_memory; + int addr_volatile; + + do_not_record = 0; + hash = HASH (addr, Pmode); + addr_volatile = do_not_record; + do_not_record = save_do_not_record; + hash_arg_in_memory = save_hash_arg_in_memory; + + if (addr_volatile) + return; + } + + /* Try to find a REG that holds the same address. */ + + elt = lookup (addr, hash, Pmode); + if (!elt) + return; + + base_cost = address_cost (*loc, mode); + for (p = elt->first_same_value; p; p = p->next_same_value) + { + exp = p->exp; + if (REG_P (exp) + && exp_equiv_p (exp, exp, 1, false) + && address_cost (exp, mode) > base_cost) + break; + } + + if (p) + validate_change (insn, loc, canon_reg (copy_rtx (exp), NULL_RTX), 0)); +} + /* If X is a nontrivial arithmetic operation on an argument for which a constant value can be determined, return the result of operating on that value, as a constant. Otherwise, return X, possibly with @@ -3180,6 +3249,12 @@ fold_rtx (rtx x, rtx insn) switch (code) { case MEM: + if ((new_rtx = equiv_constant (x)) != NULL_RTX) + return new_rtx; + if (insn) + lookup_addr (insn, &XEXP (x, 0), GET_MODE (x)); + return x; + case SUBREG: if ((new_rtx = equiv_constant (x)) != NULL_RTX) return new_rtx; Index: passes.c =================================================================== --- passes.c (revision 177688) +++ passes.c (working copy) @@ -1448,9 +1448,9 @@ init_optimization_passes (void) } NEXT_PASS (pass_web); NEXT_PASS (pass_rtl_cprop); + NEXT_PASS (pass_rtl_fwprop_addr); NEXT_PASS (pass_cse2); NEXT_PASS (pass_rtl_dse1); - NEXT_PASS (pass_rtl_fwprop_addr); NEXT_PASS (pass_inc_dec); NEXT_PASS (pass_initialize_regs); NEXT_PASS (pass_ud_rtl_dce);