From patchwork Fri Nov 25 18:48:50 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Jelinek X-Patchwork-Id: 127744 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 5986DB6F76 for ; Sat, 26 Nov 2011 05:49:16 +1100 (EST) Received: (qmail 24894 invoked by alias); 25 Nov 2011 18:49:12 -0000 Received: (qmail 24589 invoked by uid 22791); 25 Nov 2011 18:49:10 -0000 X-SWARE-Spam-Status: No, hits=-7.2 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, SPF_HELO_PASS X-Spam-Check-By: sourceware.org Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 25 Nov 2011 18:48:52 +0000 Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id pAPImpdl030270 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 25 Nov 2011 13:48:51 -0500 Received: from tyan-ft48-01.lab.bos.redhat.com (tyan-ft48-01.lab.bos.redhat.com [10.16.42.4]) by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id pAPImovm005973 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 25 Nov 2011 13:48:51 -0500 Received: from tyan-ft48-01.lab.bos.redhat.com (tyan-ft48-01.lab.bos.redhat.com [127.0.0.1]) by tyan-ft48-01.lab.bos.redhat.com (8.14.4/8.14.4) with ESMTP id pAPImofl027119 for ; Fri, 25 Nov 2011 19:48:50 +0100 Received: (from jakub@localhost) by tyan-ft48-01.lab.bos.redhat.com (8.14.4/8.14.4/Submit) id pAPImolB027117 for gcc-patches@gcc.gnu.org; Fri, 25 Nov 2011 19:48:50 +0100 Date: Fri, 25 Nov 2011 19:48:50 +0100 From: Jakub Jelinek To: gcc-patches@gcc.gnu.org Subject: [PATCH] Make sibcall argument overlap check less pessimistic (PR middle-end/50074) Message-ID: <20111125184850.GQ27242@tyan-ft48-01.lab.bos.redhat.com> Reply-To: Jakub Jelinek MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Hi! Kirill's recent change to mem_overlaps_already_clobbered_arg_p resulted in various code quality regressions, many calls that used to be tail call optimized no longer are. Here is an attempt to make the check more complete (e.g. the change wouldn't see overlap if addr was PLUS of two REGs, where one of the REGs was based on internal_arg_pointer, etc.) and less pessimistic. As tree-tailcall.c doesn't allow tail calls from functions that have address of any of the caller's parameters taken, IMHO it is enough to look for internal_arg_pointer based pseudos initialized in the tail call sequence. This patch scans the tail call sequence and notes which pseudos are based on internal_arg_pointer (and what offset from that pointer they have) and uses that in mem_overlaps_already_clobbered_arg_p. Bootstrapped/regtested on x86_64-linux and i686-linux, tested on some testcases on ia64-linux (as an example of target which doesn't have reg + disp addressing and thus forces everything into registers). Ok for trunk? 2011-11-25 Jakub Jelinek PR middle-end/50074 * calls.c (internal_arg_pointer_seq_start, internal_arg_pointer_cache): New variables. (internal_arg_pointer_based_reg_1): New function. (internal_arg_pointer_based_reg): New function. (mem_overlaps_already_clobbered_arg_p): Use it. (expand_call): Free internal_arg_pointer_cache vector and clear internal_arg_pointer_seq_start. Jakub --- gcc/calls.c.jj 2011-11-08 23:35:12.000000000 +0100 +++ gcc/calls.c 2011-11-25 17:24:52.445878841 +0100 @@ -1658,6 +1658,106 @@ rtx_for_function_call (tree fndecl, tree return funexp; } +/* Last insn that has been already scanned by internal_arg_pointer_based_reg, + or NULL_RTX if none has been scanned yet. */ +static rtx internal_arg_pointer_seq_start; +/* Vector indexed by REGNO () - FIRST_PSEUDO_REGISTER, recoding if a pseudo + is based on crtl->args.internal_arg_pointer. It is NULL_RTX if not based + on it, some CONST_INT as offset from crtl->args.internal_arg_pointer + or PC for unknown offset from it. */ +static VEC(rtx, heap) *internal_arg_pointer_cache; + +static rtx internal_arg_pointer_based_reg (rtx, bool); + +/* Helper function for internal_arg_pointer_based_reg, called through + for_each_rtx. Return 1 if a crtl->args.internal_arg_pointer based + register is seen anywhere. */ + +static int +internal_arg_pointer_based_reg_1 (rtx *loc, void *data ATTRIBUTE_UNUSED) +{ + if (REG_P (*loc) && internal_arg_pointer_based_reg (*loc, false) != NULL_RTX) + return 1; + if (MEM_P (*loc)) + return -1; + return 0; +} + +/* If REG is based on crtl->args.internal_arg_pointer, return either + a CONST_INT offset from crtl->args.internal_arg_pointer if + offset from it is known constant, or PC if the offset is unknown. + Return NULL_RTX if REG isn't based on crtl->args.internal_arg_pointer. */ + +static rtx +internal_arg_pointer_based_reg (rtx reg, bool scan) +{ + rtx insn; + + if (CONSTANT_P (reg)) + return NULL_RTX; + + if (reg == crtl->args.internal_arg_pointer) + return const0_rtx; + + if (REG_P (reg) && REGNO (reg) < FIRST_PSEUDO_REGISTER) + return NULL_RTX; + + if (GET_CODE (reg) == PLUS && CONST_INT_P (XEXP (reg, 1))) + { + rtx val = internal_arg_pointer_based_reg (XEXP (reg, 0), scan); + if (val == NULL_RTX || val == pc_rtx) + return val; + return plus_constant (val, INTVAL (XEXP (reg, 1))); + } + + if (!scan) + insn = NULL_RTX; + else if (internal_arg_pointer_seq_start == NULL_RTX) + insn = get_insns (); + else + insn = NEXT_INSN (internal_arg_pointer_seq_start); + while (insn) + { + rtx set = single_set (insn); + if (set + && REG_P (SET_DEST (set)) + && REGNO (SET_DEST (set)) >= FIRST_PSEUDO_REGISTER) + { + rtx val = NULL_RTX; + unsigned int idx = REGNO (SET_DEST (set)) - FIRST_PSEUDO_REGISTER; + /* Punt on pseudos set multiple times. */ + if (idx < VEC_length (rtx, internal_arg_pointer_cache) + && VEC_index (rtx, internal_arg_pointer_cache, idx) + != NULL_RTX) + val = pc_rtx; + else + val = internal_arg_pointer_based_reg (SET_SRC (set), false); + if (val != NULL_RTX) + { + VEC_safe_grow_cleared (rtx, heap, internal_arg_pointer_cache, + idx + 1); + VEC_replace (rtx, internal_arg_pointer_cache, idx, val); + } + } + if (NEXT_INSN (insn) == NULL_RTX) + internal_arg_pointer_seq_start = insn; + insn = NEXT_INSN (insn); + } + + if (REG_P (reg)) + { + unsigned int idx = REGNO (reg) - FIRST_PSEUDO_REGISTER; + if (idx < VEC_length (rtx, internal_arg_pointer_cache)) + return VEC_index (rtx, internal_arg_pointer_cache, idx); + else + return NULL_RTX; + } + + if (for_each_rtx (®, internal_arg_pointer_based_reg_1, NULL)) + return pc_rtx; + return NULL_RTX; +} + /* Return true if and only if SIZE storage units (usually bytes) starting from address ADDR overlap with already clobbered argument area. This function is used to determine if we should give up a @@ -1667,24 +1767,15 @@ static bool mem_overlaps_already_clobbered_arg_p (rtx addr, unsigned HOST_WIDE_INT size) { HOST_WIDE_INT i; + rtx val; - if (addr == crtl->args.internal_arg_pointer) - i = 0; - else if (GET_CODE (addr) == PLUS - && XEXP (addr, 0) == crtl->args.internal_arg_pointer - && CONST_INT_P (XEXP (addr, 1))) - i = INTVAL (XEXP (addr, 1)); - /* Return true for arg pointer based indexed addressing. */ - else if (GET_CODE (addr) == PLUS - && (XEXP (addr, 0) == crtl->args.internal_arg_pointer - || XEXP (addr, 1) == crtl->args.internal_arg_pointer)) - return true; - /* If the address comes in a register, we have no idea of its origin so - give up and conservatively return true. */ - else if (REG_P(addr)) + val = internal_arg_pointer_based_reg (addr, true); + if (val == NULL_RTX) + return false; + else if (val == pc_rtx) return true; else - return false; + i = INTVAL (val); #ifdef ARGS_GROW_DOWNWARD i = -i - size; @@ -3292,6 +3383,8 @@ expand_call (tree exp, rtx target, int i } sbitmap_free (stored_args_map); + internal_arg_pointer_seq_start = NULL_RTX; + VEC_free (rtx, heap, internal_arg_pointer_cache); } else {