From patchwork Thu Dec 17 14:51:17 2020
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 1417706
Date: Thu, 17 Dec 2020 15:51:17 +0100
From: Jakub Jelinek
To: Richard Biener
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] store-merging: Handle vector CONSTRUCTORs using bswap [PR96239]
Message-ID: <20201217145117.GS3788@tucnak>
References: <20201215091505.GN3788@tucnak>

On Wed, Dec 16, 2020 at 09:29:31AM +0100, Richard Biener wrote:
> I think it probably makes sense to have some helper split out that
> collects & classifies vector constructor components we can use from
> both forwprop (where matching the V_C_E from integer could be done
> as well IMHO) and bswap (when a permute is involved) and store-merging.

I've tried to add such a helper, but handing over just the analysis and
letting each pass handle the result differently seems complicated given
the limitations of the bswap infrastructure.

So, this patch just hooks the optimization into store-merging as well, so
that the original testcase from the PR can be fixed.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Note, I have no idea why the bswap code needs TODO_update_ssa when it has
changed something; as for the vuses, it copies them from the surrounding
vuses, which looks correct to me.  Perhaps because it uses
force_gimple_operand_gsi* in a few spots in bswap_replace?  Confused...

2020-12-17  Jakub Jelinek

	PR tree-optimization/96239
	* gimple-ssa-store-merging.c (maybe_optimize_vector_constructor): New
	function.
	(get_status_for_store_merging): Don't return BB_INVALID for blocks
	with potential bswap optimizable CONSTRUCTORs.
	(pass_store_merging::execute): Optimize vector CONSTRUCTORs with bswap
	if possible.

	* gcc.dg/tree-ssa/pr96239.c: New test.

	Jakub

--- gcc/gimple-ssa-store-merging.c.jj	2020-12-16 13:07:51.729733816 +0100
+++ gcc/gimple-ssa-store-merging.c	2020-12-16 16:02:06.238868137 +0100
@@ -1255,6 +1255,75 @@ bswap_replace (gimple_stmt_iterator gsi,
   return tgt;
 }
 
+/* Try to optimize an assignment CUR_STMT with CONSTRUCTOR on the rhs
+   using bswap optimizations.  CDI_DOMINATORS need to be
+   computed on entry.  Return true if it has been optimized and
+   TODO_update_ssa is needed.  */
+
+static bool
+maybe_optimize_vector_constructor (gimple *cur_stmt)
+{
+  tree fndecl = NULL_TREE, bswap_type = NULL_TREE, load_type;
+  struct symbolic_number n;
+  bool bswap;
+
+  gcc_assert (is_gimple_assign (cur_stmt)
+              && gimple_assign_rhs_code (cur_stmt) == CONSTRUCTOR);
+
+  tree rhs = gimple_assign_rhs1 (cur_stmt);
+  if (!VECTOR_TYPE_P (TREE_TYPE (rhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (rhs)))
+      || gimple_assign_lhs (cur_stmt) == NULL_TREE)
+    return false;
+
+  HOST_WIDE_INT sz = int_size_in_bytes (TREE_TYPE (rhs)) * BITS_PER_UNIT;
+  switch (sz)
+    {
+    case 16:
+      load_type = bswap_type = uint16_type_node;
+      break;
+    case 32:
+      if (builtin_decl_explicit_p (BUILT_IN_BSWAP32)
+          && optab_handler (bswap_optab, SImode) != CODE_FOR_nothing)
+        {
+          load_type = uint32_type_node;
+          fndecl = builtin_decl_explicit (BUILT_IN_BSWAP32);
+          bswap_type = TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fndecl)));
+        }
+      else
+        return false;
+      break;
+    case 64:
+      if (builtin_decl_explicit_p (BUILT_IN_BSWAP64)
+          && (optab_handler (bswap_optab, DImode) != CODE_FOR_nothing
+              || (word_mode == SImode
+                  && builtin_decl_explicit_p (BUILT_IN_BSWAP32)
+                  && optab_handler (bswap_optab, SImode) != CODE_FOR_nothing)))
+        {
+          load_type = uint64_type_node;
+          fndecl = builtin_decl_explicit (BUILT_IN_BSWAP64);
+          bswap_type = TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fndecl)));
+        }
+      else
+        return false;
+      break;
+    default:
+      return false;
+    }
+
+  gimple *ins_stmt = find_bswap_or_nop (cur_stmt, &n, &bswap);
+  if (!ins_stmt || n.range != (unsigned HOST_WIDE_INT) sz)
+    return false;
+
+  if (bswap && !fndecl && n.range != 16)
+    return false;
+
+  memset (&nop_stats, 0, sizeof (nop_stats));
+  memset (&bswap_stats, 0, sizeof (bswap_stats));
+  return bswap_replace (gsi_for_stmt (cur_stmt), ins_stmt, fndecl,
+                        bswap_type, load_type, &n, bswap) != NULL_TREE;
+}
+
 /* Find manual byte swap implementations as well as load in a given
    endianness. Byte swaps are turned into a bswap builtin invokation
    while endian loads are converted to bswap builtin invokation or
@@ -5126,6 +5195,7 @@ static enum basic_block_status
 get_status_for_store_merging (basic_block bb)
 {
   unsigned int num_statements = 0;
+  unsigned int num_constructors = 0;
   gimple_stmt_iterator gsi;
   edge e;
 
@@ -5138,9 +5208,27 @@ get_status_for_store_merging (basic_bloc
 
       if (store_valid_for_store_merging_p (stmt) && ++num_statements >= 2)
         break;
+
+      if (is_gimple_assign (stmt)
+          && gimple_assign_rhs_code (stmt) == CONSTRUCTOR)
+        {
+          tree rhs = gimple_assign_rhs1 (stmt);
+          if (VECTOR_TYPE_P (TREE_TYPE (rhs))
+              && INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (rhs)))
+              && gimple_assign_lhs (stmt) != NULL_TREE)
+            {
+              HOST_WIDE_INT sz
+                = int_size_in_bytes (TREE_TYPE (rhs)) * BITS_PER_UNIT;
+              if (sz == 16 || sz == 32 || sz == 64)
+                {
+                  num_constructors = 1;
+                  break;
+                }
+            }
+        }
     }
 
-  if (num_statements == 0)
+  if (num_statements == 0 && num_constructors == 0)
     return BB_INVALID;
 
   if (cfun->can_throw_non_call_exceptions && cfun->eh
@@ -5149,7 +5237,7 @@ get_status_for_store_merging (basic_bloc
       && e->dest == bb->next_bb)
     return BB_EXTENDED_VALID;
 
-  return num_statements >= 2 ? BB_VALID : BB_INVALID;
+  return (num_statements >= 2 || num_constructors) ? BB_VALID : BB_INVALID;
 }
 
 /* Entry point for the pass.  Go over each basic block recording chains of
@@ -5163,6 +5251,7 @@ pass_store_merging::execute (function *f
   basic_block bb;
   hash_set<gimple *> orig_stmts;
   bool changed = false, open_chains = false;
+  unsigned todo = 0;
 
   /* If the function can throw and catch non-call exceptions, we'll be trying
      to merge stores across different basic blocks so we need to first unsplit
@@ -5189,9 +5278,10 @@ pass_store_merging::execute (function *f
       if (dump_file && (dump_flags & TDF_DETAILS))
         fprintf (dump_file, "Processing basic block <%d>:\n", bb->index);
 
-      for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+      for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); )
         {
           gimple *stmt = gsi_stmt (gsi);
+          gsi_next (&gsi);
 
           if (is_gimple_debug (stmt))
             continue;
@@ -5207,6 +5297,14 @@ pass_store_merging::execute (function *f
               continue;
             }
 
+          if (is_gimple_assign (stmt)
+              && gimple_assign_rhs_code (stmt) == CONSTRUCTOR
+              && maybe_optimize_vector_constructor (stmt))
+            {
+              todo = TODO_update_ssa;
+              continue;
+            }
+
           if (store_valid_for_store_merging_p (stmt))
             changed |= process_store (stmt);
           else
@@ -5230,10 +5328,10 @@ pass_store_merging::execute (function *f
   if (cfun->can_throw_non_call_exceptions && cfun->eh && changed)
     {
       free_dominance_info (CDI_DOMINATORS);
-      return TODO_cleanup_cfg;
+      return TODO_cleanup_cfg | todo;
     }
 
-  return 0;
+  return todo;
 }
 
 } // anon namespace
--- gcc/testsuite/gcc.dg/tree-ssa/pr96239.c.jj	2020-12-16 16:10:30.013256862 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr96239.c	2020-12-16 16:11:08.802822537 +0100
@@ -0,0 +1,18 @@
+/* PR tree-optimization/96239 */
+/* { dg-do compile { target { ilp32 || lp64 } } } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump " r>> 8;" "optimized" { target bswap } } } */
+
+union U { unsigned char c[2]; unsigned short s; };
+
+unsigned short
+foo (unsigned short x)
+{
+  union U u;
+  u.s = x;
+  unsigned char v = u.c[0];
+  unsigned char w = u.c[1];
+  u.c[0] = w;
+  u.c[1] = v;
+  return u.s;
+}
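
For reference, here is a minimal standalone sketch of the source-level equivalence that the new test's scan for " r>> 8;" relies on.  It is not part of the patch.  The names swap_via_union, swap_via_rotate and check_swap_equivalence are made up for illustration, it assumes a 16-bit unsigned short (as on the ilp32/lp64 targets the test is restricted to), and it uses the GCC/Clang builtin __builtin_bswap16.

```c
#include <assert.h>

/* Same shape as foo () in pr96239.c: swap the two bytes of a 16-bit
   value by going through a union.  */
union U { unsigned char c[2]; unsigned short s; };

unsigned short
swap_via_union (unsigned short x)
{
  union U u;
  u.s = x;
  unsigned char v = u.c[0];
  unsigned char w = u.c[1];
  u.c[0] = w;
  u.c[1] = v;
  return u.s;
}

/* Hypothetical hand-written form of the rotate by 8 that the test
   scans for in the optimized dump.  Assumes unsigned short is 16 bits.  */
unsigned short
swap_via_rotate (unsigned short x)
{
  return (unsigned short) ((x << 8) | (x >> 8));
}

/* Exhaustively check that the union swap, the rotate and the 16-bit
   bswap builtin agree for every 16-bit input.  */
void
check_swap_equivalence (void)
{
  for (unsigned int i = 0; i <= 0xffff; i++)
    {
      unsigned short x = (unsigned short) i;
      assert (swap_via_union (x) == swap_via_rotate (x));
      assert (swap_via_union (x) == __builtin_bswap16 (x));
    }
}
```

Calling check_swap_equivalence () from a small driver should pass on either endianness, since exchanging the two bytes of a 16-bit object gives the same value whichever byte is stored first.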