From patchwork Wed Jun 30 10:56:01 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 1498867
Date: Wed, 30 Jun 2021 12:56:01 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Cc: richard.sandiford@arm.com
Subject: [PATCH] tree-optimization/101267 - fix SLP vect with masked operations
Message-ID: <62q8s5q8-s764-852q-6qo9-598r17s49s9@fhfr.qr>
List-Id: Gcc-patches mailing list

This fixes the missed handling of external/constant mask SLP
operations; in the testcase these are masked loads in particular.
The patch adjusts the vect_check_scalar_mask API to match the
SLP-compatible vect_is_simple_use API and accounts for the special
handling of masked loads in SLP discovery.  The issue is likely latent.

Lightly tested: it fixes the 521.wrf_r build and is clean on
vect.exp and i386.exp on x86_64.

A full bootstrap and regtest is running on x86_64-unknown-linux-gnu;
I'll push it unless I hear otherwise.  I'm quite sure that test
coverage for SLP masked operations is weak, though.  Maybe somebody
can throw it at SVE[2], which should expose more masking (but possibly
not SLP; I don't know the state of SLP and masking with respect
to SVE).

Thanks,
Richard.

2021-06-30  Richard Biener

        PR tree-optimization/101267
        * tree-vect-stmts.c (vect_check_scalar_mask): Adjust
        API and use SLP compatible interface of vect_is_simple_use.
        Reject not vectorized SLP defs for callers that do not support
        that.
        (vect_check_store_rhs): Handle masked stores and pass down the
        appropriate operator index.
        (vectorizable_call): Adjust.
        (vectorizable_store): Likewise.
        (vectorizable_load): Likewise.  Handle SLP peculiarity of
        masked loads.
        (vect_is_simple_use): Remove special-casing of masked stores.

        * gfortran.dg/pr101267.f90: New testcase.
---
 gcc/testsuite/gfortran.dg/pr101267.f90 | 23 +++++++
 gcc/tree-vect-stmts.c                  | 92 +++++++++++++++-----------
 2 files changed, 77 insertions(+), 38 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr101267.f90

diff --git a/gcc/testsuite/gfortran.dg/pr101267.f90 b/gcc/testsuite/gfortran.dg/pr101267.f90
new file mode 100644
index 00000000000..12723cf9c22
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr101267.f90
@@ -0,0 +1,23 @@
+! { dg-do compile }
+! { dg-options "-Ofast" }
+! { dg-additional-options "-march=znver2" { target x86_64-*-* i?86-*-* } }
+   SUBROUTINE sfddagd( regime, znt,ite ,jte )
+   REAL, DIMENSION( ime, IN) :: regime, znt
+   REAL, DIMENSION( ite, jte) :: wndcor_u
+   LOGICAL wrf_dm_on_monitor
+   IF( int4 == 1 ) THEN
+     DO j=jts,jtf
+     DO i=itsu,itf
+       reg = regime(i, j)
+       IF( reg > 10.0 ) THEN
+         znt0 = znt(i-1, j) + znt(i, j)
+         IF( znt0 <= 0.2) THEN
+           wndcor_u(i,j) = 0.2
+         ENDIF
+       ENDIF
+     ENDDO
+     ENDDO
+     IF ( wrf_dm_on_monitor()) THEN
+     ENDIF
+   ENDIF
+   END
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 4ee11b2041a..e590f34d75d 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -2439,17 +2439,31 @@ get_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info,
   return true;
 }
 
-/* Return true if boolean argument MASK is suitable for vectorizing
-   conditional operation STMT_INFO.  When returning true, store the type
-   of the definition in *MASK_DT_OUT and the type of the vectorized mask
-   in *MASK_VECTYPE_OUT.  */
+/* Return true if boolean argument at MASK_INDEX is suitable for vectorizing
+   conditional operation STMT_INFO.  When returning true, store the mask
+   in *MASK, the type of its definition in *MASK_DT_OUT, the type of the
+   vectorized mask in *MASK_VECTYPE_OUT and the SLP node corresponding
+   to the mask in *MASK_NODE if MASK_NODE is not NULL.  */
 
 static bool
-vect_check_scalar_mask (vec_info *vinfo, stmt_vec_info stmt_info, tree mask,
-                        vect_def_type *mask_dt_out,
-                        tree *mask_vectype_out)
+vect_check_scalar_mask (vec_info *vinfo, stmt_vec_info stmt_info,
+                        slp_tree slp_node, unsigned mask_index,
+                        tree *mask, slp_tree *mask_node,
+                        vect_def_type *mask_dt_out, tree *mask_vectype_out)
 {
-  if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (mask)))
+  enum vect_def_type mask_dt;
+  tree mask_vectype;
+  slp_tree mask_node_1;
+  if (!vect_is_simple_use (vinfo, stmt_info, slp_node, mask_index,
+                           mask, &mask_node_1, &mask_dt, &mask_vectype))
+    {
+      if (dump_enabled_p ())
+        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                         "mask use not simple.\n");
+      return false;
+    }
+
+  if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (*mask)))
     {
       if (dump_enabled_p ())
         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -2457,7 +2471,7 @@ vect_check_scalar_mask (vec_info *vinfo, stmt_vec_info stmt_info, tree mask,
       return false;
     }
 
-  if (TREE_CODE (mask) != SSA_NAME)
+  if (TREE_CODE (*mask) != SSA_NAME)
     {
       if (dump_enabled_p ())
         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -2465,13 +2479,15 @@ vect_check_scalar_mask (vec_info *vinfo, stmt_vec_info stmt_info, tree mask,
       return false;
     }
 
-  enum vect_def_type mask_dt;
-  tree mask_vectype;
-  if (!vect_is_simple_use (mask, vinfo, &mask_dt, &mask_vectype))
+  /* If the caller is not prepared for adjusting an external/constant
+     SLP mask vector type fail.  */
+  if (slp_node
+      && !mask_node
+      && SLP_TREE_DEF_TYPE (mask_node_1) != vect_internal_def)
     {
       if (dump_enabled_p ())
         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-                         "mask use not simple.\n");
+                         "SLP mask argument is not vectorized.\n");
       return false;
     }
 
@@ -2501,6 +2517,8 @@ vect_check_scalar_mask (vec_info *vinfo, stmt_vec_info stmt_info, tree mask,
 
   *mask_dt_out = mask_dt;
   *mask_vectype_out = mask_vectype;
+  if (mask_node)
+    *mask_node = mask_node_1;
   return true;
 }
 
@@ -2525,10 +2543,18 @@ vect_check_store_rhs (vec_info *vinfo, stmt_vec_info stmt_info,
       return false;
     }
 
+  unsigned op_no = 0;
+  if (gcall *call = dyn_cast <gcall *> (stmt_info->stmt))
+    {
+      if (gimple_call_internal_p (call)
+          && internal_store_fn_p (gimple_call_internal_fn (call)))
+        op_no = internal_fn_stored_value_index (gimple_call_internal_fn (call));
+    }
+
   enum vect_def_type rhs_dt;
   tree rhs_vectype;
   slp_tree slp_op;
-  if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 0,
+  if (!vect_is_simple_use (vinfo, stmt_info, slp_node, op_no,
                            &rhs, &slp_op, &rhs_dt, &rhs_vectype))
     {
       if (dump_enabled_p ())
@@ -3163,9 +3189,8 @@ vectorizable_call (vec_info *vinfo,
     {
       if ((int) i == mask_opno)
         {
-          op = gimple_call_arg (stmt, i);
-          if (!vect_check_scalar_mask (vinfo,
-                                       stmt_info, op, &dt[i], &vectypes[i]))
+          if (!vect_check_scalar_mask (vinfo, stmt_info, slp_node, mask_opno,
+                                       &op, &slp_op[i], &dt[i], &vectypes[i]))
             return false;
           continue;
         }
@@ -7213,13 +7238,10 @@ vectorizable_store (vec_info *vinfo,
         }
 
       int mask_index = internal_fn_mask_index (ifn);
-      if (mask_index >= 0)
-        {
-          mask = gimple_call_arg (call, mask_index);
-          if (!vect_check_scalar_mask (vinfo, stmt_info, mask, &mask_dt,
-                                       &mask_vectype))
-            return false;
-        }
+      if (mask_index >= 0
+          && !vect_check_scalar_mask (vinfo, stmt_info, slp_node, mask_index,
+                                      &mask, NULL, &mask_dt, &mask_vectype))
+        return false;
     }
 
   op = vect_get_store_rhs (stmt_info);
@@ -8494,13 +8516,13 @@ vectorizable_load (vec_info *vinfo,
         return false;
 
       int mask_index = internal_fn_mask_index (ifn);
-      if (mask_index >= 0)
-        {
-          mask = gimple_call_arg (call, mask_index);
-          if (!vect_check_scalar_mask (vinfo, stmt_info, mask, &mask_dt,
-                                       &mask_vectype))
-            return false;
-        }
+      if (mask_index >= 0
+          && !vect_check_scalar_mask (vinfo, stmt_info, slp_node,
+                                      /* ??? For SLP we only have operands for
+                                         the mask operand.  */
+                                      slp_node ? 0 : mask_index,
+                                      &mask, NULL, &mask_dt, &mask_vectype))
+        return false;
     }
 
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
@@ -11484,13 +11506,7 @@ vect_is_simple_use (vec_info *vinfo, stmt_vec_info stmt, slp_tree slp_node,
             *op = gimple_op (ass, operand + 1);
         }
       else if (gcall *call = dyn_cast <gcall *> (stmt->stmt))
-        {
-          if (gimple_call_internal_p (call)
-              && internal_store_fn_p (gimple_call_internal_fn (call)))
-            operand = internal_fn_stored_value_index (gimple_call_internal_fn
-                                                      (call));
-          *op = gimple_call_arg (call, operand);
-        }
+        *op = gimple_call_arg (call, operand);
       else
         gcc_unreachable ();
       return vect_is_simple_use (*op, vinfo, dt, vectype, def_stmt_info_out);
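
As a reading aid only (not part of the patch): a minimal scalar sketch
of what masked loads and stores mean for a loop shaped like the one in
the testcase, assuming a hypothetical vector width of four lanes.  The
names LANES, mask_load, mask_store and update_block are made up for
illustration and are not GCC internals; the vectorizer itself
represents these operations as internal function calls.  The sketch
also assumes that inactive lanes of a masked load read as zero, which
is a simplification.

```c
#include <stdbool.h>
#include <stddef.h>

enum { LANES = 4 };  /* hypothetical vector width, an assumption */

/* Emulate a masked load: only lanes with a true mask bit touch memory;
   inactive lanes are set to zero (assumption of this sketch).  */
static void
mask_load (float *dst, const float *src, const bool *mask)
{
  for (size_t l = 0; l < LANES; l++)
    dst[l] = mask[l] ? src[l] : 0.0f;
}

/* Emulate a masked store: only lanes with a true mask bit are written.  */
static void
mask_store (float *dst, const float *val, const bool *mask)
{
  for (size_t l = 0; l < LANES; l++)
    if (mask[l])
      dst[l] = val[l];
}

/* One "vector iteration" over elements [i, i + LANES), mirroring
     IF (regime(i) > 10.0) THEN
       IF (znt(i-1) + znt(i) <= 0.2) wndcor_u(i) = 0.2
   Requires i >= 1 because of the znt(i-1) access.  */
static void
update_block (float *wndcor_u, const float *regime, const float *znt,
              size_t i)
{
  bool m1[LANES], m2[LANES];
  float a[LANES], b[LANES], v[LANES];

  /* Outer condition becomes the mask for both loads.  */
  for (size_t l = 0; l < LANES; l++)
    m1[l] = regime[i + l] > 10.0f;

  mask_load (a, znt + i - 1, m1);
  mask_load (b, znt + i, m1);

  /* Inner condition is combined with the outer one for the store.  */
  for (size_t l = 0; l < LANES; l++)
    {
      m2[l] = m1[l] && a[l] + b[l] <= 0.2f;
      v[l] = 0.2f;
    }

  mask_store (wndcor_u + i, v, m2);
}
```

In this sketch the masks are computed inside the loop body.  The patch
is about the case where, under SLP, the mask operand is an external or
constant def instead, which the mask check previously did not handle.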