From patchwork Thu Feb 28 12:26:42 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: JiangNing OS X-Patchwork-Id: 1049421 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-497151-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=os.amperecomputing.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="ZW8Zif2C"; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=amperemail.onmicrosoft.com header.i=@amperemail.onmicrosoft.com header.b="Z0suAMmW"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 449Bd137Cmz9s5R for ; Thu, 28 Feb 2019 23:26:59 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:date:message-id:content-type :content-transfer-encoding:mime-version; q=dns; s=default; b=UQE 273/S4svw6UayeQwnEmZP4XVKM3i4Ycm4ZeBa1Qso2CP8rp1r89JWpOKjGBGVTyX OdrC70wysUTg8H+dLE9CBv+6CB/kM5Gb8Mfw0i35PlLHhQG90/Q0LSrSuz3DK4jX MXS/EeK9C0jQfuhvFtMYu5MFdYEjiypI3iWj2T7s= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:date:message-id:content-type :content-transfer-encoding:mime-version; s=default; bh=GB+0H6OSD mnWaw0ontYhWXwOGPI=; b=ZW8Zif2CuPdGIq/L9qaRpwvUZahtybQXFqwAPcEw/ XAa0VQC3q/SsCNv/Af6oO5p8jsFN97UC/txA5qklveADd+8mH1X2bbi6cRyrHLIt j6E4SHbNerQ0moK2xtG1QMWjVW7GbHJSyDsstS42mMOy8fGIDbaKu2bWR61fm3IF iU= Received: (qmail 44158 invoked by alias); 28 Feb 2019 12:26:51 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 44117 invoked by uid 89); 28 Feb 2019 12:26:50 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-8.6 required=5.0 tests=BAYES_00, GIT_PATCH_2, GIT_PATCH_3, KAM_ASCII_DIVIDERS, KAM_NUMSUBJECT, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_PASS, SPF_PASS, UNSUBSCRIBE_BODY autolearn=ham version=3.3.2 spammy=if-then, rtlanalc, UD:rtlanal.c, xexp X-HELO: NAM05-CO1-obe.outbound.protection.outlook.com Received: from mail-eopbgr720108.outbound.protection.outlook.com (HELO NAM05-CO1-obe.outbound.protection.outlook.com) (40.107.72.108) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 28 Feb 2019 12:26:47 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amperemail.onmicrosoft.com; s=selector1-os-amperecomputing-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=jOfqeyUFKppQpfpzS0AvWV0dgJsoLOKIc7gXPOwnTRY=; b=Z0suAMmWdfenmQAgkeFFz05QRjDAJt4jLh1ZLKlIFH+aFJUguCCHKibp0ji6l7eSDoM+gTg+3x3SVLcnHQPkhI/mSOsdurZmF9ZaG+LNoll8Io22Zc4NAd8P48DoAQdDeJdHUG/sFRiK+w3bn6OZ3nb0TDd7qCX/hklHAhJvuek= Received: from BL0PR0102MB3588.prod.exchangelabs.com (52.132.18.138) by BL0PR0102MB3475.prod.exchangelabs.com (52.132.17.156) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.1665.15; Thu, 28 Feb 2019 12:26:43 +0000 Received: from BL0PR0102MB3588.prod.exchangelabs.com ([fe80::c0a1:c97:e53b:2a69]) by BL0PR0102MB3588.prod.exchangelabs.com ([fe80::c0a1:c97:e53b:2a69%2]) with mapi id 15.20.1665.015; Thu, 28 Feb 2019 12:26:42 +0000 From: JiangNing OS To: "gcc-patches@gcc.gnu.org" , Richard Biener , "pinskia@gcc.gnu.org" Subject: Fixing ifcvt issue as exposed by BZ89430 Date: Thu, 28 Feb 2019 12:26:42 +0000 Message-ID: authentication-results: spf=none (sender IP is ) smtp.mailfrom=jiangning@os.amperecomputing.com; x-ms-exchange-purlcount: 2 received-spf: None (protection.outlook.com: os.amperecomputing.com does not designate permitted sender hosts) x-ms-exchange-senderadcheck: 1 MIME-Version: 1.0 X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-IsSubscribed: yes To solve BZ89430 the followings are needed, (1) The code below in noce_try_cmove_arith needs to be fixed. /* ??? We could handle this if we knew that a load from A or B could not trap or fault. This is also true if we've already loaded from the address along the path from ENTRY. */ else if (may_trap_or_fault_p (a) || may_trap_or_fault_p (b)) return FALSE; Finding dominating memory access could help to decide whether the memory access following the conditional move is valid or not. * If there is a dominating memory write to the same memory address in test_bb as the one from x=a, it is always safe. * When the dominating memory access to the same memory address in test_bb is a read, for the load out of x=a, it is always safe. For the store out of x=a, if the memory is on stack, it is still safe. Besides, there is a bug around rearranging the then_bb and else_bb layout, which needs to be fixed. (2) The fix by https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02998.html is an overkill. If the write target following conditional move is a memory access, it exits shortly. if (!set_b && MEM_P (orig_x)) /* We want to avoid store speculation to avoid cases like if (pthread_mutex_trylock(mutex)) ++global_variable; Rather than go to much effort here, we rely on the SSA optimizers, which do a good enough job these days. */ return FALSE; It looks a bit hard for compiler to know the program semantic is related to mutex and avoid store speculation. Without removing this overkill, the fix mentioned by (1) would not work. Any idea? An alternative solution is to detect the following pattern conservatively, if (a_function_call(...)) ++local_variable; If the local_variable doesn't have address taken at all in current function, it mustn't be a pthread mutex lock semantic, and then the following patch would work to solve (1) and pass the last case as described in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89430. Thanks, -Jiangning Index: gcc/cse.c =================================================================== --- gcc/cse.c (revision 268256) +++ gcc/cse.c (working copy) @@ -540,7 +540,6 @@ already as part of an already processed extended basic block. */ static sbitmap cse_visited_basic_blocks; -static bool fixed_base_plus_p (rtx x); static int notreg_cost (rtx, machine_mode, enum rtx_code, int); static int preferable (int, int, int, int); static void new_basic_block (void); @@ -606,30 +605,7 @@ static const struct rtl_hooks cse_rtl_hooks = RTL_HOOKS_INITIALIZER; -/* Nonzero if X has the form (PLUS frame-pointer integer). */ -static bool -fixed_base_plus_p (rtx x) -{ - switch (GET_CODE (x)) - { - case REG: - if (x == frame_pointer_rtx || x == hard_frame_pointer_rtx) - return true; - if (x == arg_pointer_rtx && fixed_regs[ARG_POINTER_REGNUM]) - return true; - return false; - - case PLUS: - if (!CONST_INT_P (XEXP (x, 1))) - return false; - return fixed_base_plus_p (XEXP (x, 0)); - - default: - return false; - } -} - /* Dump the expressions in the equivalence class indicated by CLASSP. This function is used only for debugging. */ DEBUG_FUNCTION void Index: gcc/ifcvt.c =================================================================== --- gcc/ifcvt.c (revision 268256) +++ gcc/ifcvt.c (working copy) @@ -76,6 +76,9 @@ /* Whether conditional execution changes were made. */ static int cond_exec_changed_p; +/* bitmap for stack frame pointer definition insns. */ +static bitmap bba_sets_sfp; + /* Forward references. */ static int count_bb_insns (const_basic_block); static bool cheap_bb_rtx_cost_p (const_basic_block, profile_probability, int); @@ -99,6 +102,7 @@ edge, int); static void noce_emit_move_insn (rtx, rtx); static rtx_insn *block_has_only_trap (basic_block); +static void collect_all_fp_insns (void); /* Count the number of non-jump active insns in BB. */ @@ -2029,6 +2033,110 @@ return true; } +/* Return true if MEM x is on stack. a_insn contains x, if it exists. */ + +static bool +noce_mem_is_on_stack (rtx_insn *a_insn, const_rtx x) +{ + df_ref use; + + gcc_assert (x); + gcc_assert (MEM_P (x)); + + /* Early exits if find base register is a stack register. */ + rtx a = XEXP (x, 0); + if (fixed_base_plus_p(a)) + return true; + + if (!a_insn) + return false; + + /* Check if x is on stack. Assume a mem expression using registers + related to stack register is always on stack. */ + FOR_EACH_INSN_USE (use, a_insn) + { + if (bitmap_bit_p (bba_sets_sfp, DF_REF_REGNO (use))) + return true; + } + + return false; +} + +/* Always return true, if there is a dominating write. + + When there is a dominating read from memory on stack, + 1) if x = a is a memory read, return true. + 2) if x = a is a memory write, return true if the memory is on stack. + This is the guarantee the memory is *not* readonly. */ + +static bool +noce_valid_for_dominating (basic_block bb, rtx_insn *a_insn, + const_rtx x, bool is_store) +{ + rtx_insn *insn; + rtx set; + bool find_load = false; + + gcc_assert (MEM_P (x)); + + FOR_BB_INSNS (bb, insn) + { + set = single_set (insn); + if (!set) + continue; + + /* Dominating store */ + if (rtx_equal_p (x, SET_DEST (set))) + return true; + + /* Dominating load */ + if (rtx_equal_p (x, SET_SRC (set))) + find_load = true; + } + + if (find_load) + { + if (is_store && noce_mem_is_on_stack (a_insn, x)) + return true; + } + + return false; +} + +/* Return false if the memory a or b must be valid. + This function must be called before latent swap of a and b. */ + +static bool +noce_mem_maybe_invalid_p (struct noce_if_info *if_info) +{ + /* for memory load */ + if (if_info->a && MEM_P (if_info->a)) + { + return !noce_valid_for_dominating (if_info->test_bb, + if_info->insn_a, + if_info->a, false); + } + + /* for memory store */ + else if (if_info->b && MEM_P (if_info->b)) + { + if (!if_info->else_bb) + return !noce_valid_for_dominating (if_info->test_bb, + if_info->insn_a, + if_info->b, true); + else + return !noce_valid_for_dominating (if_info->test_bb, + if_info->insn_b, + if_info->b, true); + } + + /* ??? We could handle this if we knew that a load from A or B could + not trap or fault. This is also true if we've already loaded + from the address along the path from ENTRY. */ + return (may_trap_or_fault_p (if_info->a) + || may_trap_or_fault_p (if_info->b)); +} + /* Try more complex cases involving conditional_move. */ static int @@ -2065,10 +2173,7 @@ is_mem = 1; } - /* ??? We could handle this if we knew that a load from A or B could - not trap or fault. This is also true if we've already loaded - from the address along the path from ENTRY. */ - else if (may_trap_or_fault_p (a) || may_trap_or_fault_p (b)) + else if (noce_mem_maybe_invalid_p (if_info)) return FALSE; /* if (test) x = a + b; else x = c - d; @@ -2234,7 +2339,7 @@ /* If insn to set up A clobbers any registers B depends on, try to swap insn that sets up A with the one that sets up B. If even that doesn't help, punt. */ - if (modified_in_a && !modified_in_b) + if (!modified_in_a && modified_in_b) { if (!noce_emit_bb (emit_b, else_bb, b_simple)) goto end_seq_and_fail; @@ -2242,7 +2347,7 @@ if (!noce_emit_bb (emit_a, then_bb, a_simple)) goto end_seq_and_fail; } - else if (!modified_in_a) + else if (!modified_in_b) { if (!noce_emit_bb (emit_a, then_bb, a_simple)) goto end_seq_and_fail; @@ -5347,6 +5452,30 @@ return FALSE; } + +static void +collect_all_fp_insns (void) +{ + rtx_insn *a_insn; + bba_sets_sfp = BITMAP_ALLOC (®_obstack); + df_ref def; + basic_block bb; + + FOR_ALL_BB_FN (bb, cfun) + FOR_BB_INSNS (bb, a_insn) + { + rtx sset_a = single_set (a_insn); + if (!sset_a) + continue; + + if (fixed_base_plus_p (SET_SRC (sset_a))) + { + FOR_EACH_INSN_DEF (def, a_insn) + bitmap_set_bit (bba_sets_sfp, DF_REF_REGNO (def)); + } + } +} + /* Main entry point for all if-conversion. AFTER_COMBINE is true if we are after combine pass. */ @@ -5381,6 +5510,8 @@ df_set_flags (DF_LR_RUN_DCE); + collect_all_fp_insns (); + /* Go through each of the basic blocks looking for things to convert. If we have conditional execution, we make multiple passes to allow us to handle IF-THEN{-ELSE} blocks within other IF-THEN{-ELSE} blocks. */ @@ -5413,6 +5544,8 @@ } while (cond_exec_changed_p); + BITMAP_FREE (bba_sets_sfp); + #ifdef IFCVT_MULTIPLE_DUMPS if (dump_file) fprintf (dump_file, "\n\n========== no more changes\n"); Index: gcc/rtl.h =================================================================== --- gcc/rtl.h (revision 268256) +++ gcc/rtl.h (working copy) @@ -3751,6 +3751,8 @@ #define hard_frame_pointer_rtx (global_rtl[GR_HARD_FRAME_POINTER]) #define arg_pointer_rtx (global_rtl[GR_ARG_POINTER]) +extern bool fixed_base_plus_p (rtx x); + #ifndef GENERATOR_FILE /* Return the attributes of a MEM rtx. */ static inline const struct mem_attrs * Index: gcc/rtlanal.c =================================================================== --- gcc/rtlanal.c (revision 268256) +++ gcc/rtlanal.c (working copy) @@ -341,6 +341,30 @@ return 0; } +/* Nonzero if X has the form (PLUS frame-pointer integer). */ + +bool +fixed_base_plus_p (rtx x) +{ + switch (GET_CODE (x)) + { + case REG: + if (x == frame_pointer_rtx || x == hard_frame_pointer_rtx) + return true; + if (x == arg_pointer_rtx && fixed_regs[ARG_POINTER_REGNUM]) + return true; + return false; + + case PLUS: + if (!CONST_INT_P (XEXP (x, 1))) + return false; + return fixed_base_plus_p (XEXP (x, 0)); + + default: + return false; + } +} + /* Compute an approximation for the offset between the register FROM and TO for the current function, as it was at the start of the routine. */