From patchwork Sun Mar 23 16:13:46 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Glisse X-Patchwork-Id: 332896 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id ADCE22C00C5 for ; Mon, 24 Mar 2014 03:14:08 +1100 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:in-reply-to:message-id:references :mime-version:content-type; q=dns; s=default; b=bydCM0sVG1rFBTCZ fze8QnjB4RdXrsM/71CsdDjqksCg4D/fo94mQaDz4u1Knf0GjbpAx78xzP2Qf4Vv m4ARMUX3hDd9xbkgJ5oTti/7OCK3idqpNkP7QrFDOf4pc3eZQRVMBtb3ojHAs1up ygQiVo2yTzLvH2/hw+SaVfl+9Y0= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:in-reply-to:message-id:references :mime-version:content-type; s=default; bh=7pvbl7JMKahLlfNBLn9162 kxfN8=; b=Wiz+w0d7/pYNeGoHRYr5bkiATk4LlsrxzK1WSBovbT2gcd1sWJtyIl HB6Dnz2PN5Th8Zo1IeQ5Gb1YDfjiSMeSymRpBENhBrvnIrD6lowQjht07dJSRgX1 Z+ZJen4UM4cu0d27IvjQQUZu4y0bP4w9hFmVsUwFIlklN5dsm8CHU= Received: (qmail 3992 invoked by alias); 23 Mar 2014 16:13:58 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 3973 invoked by uid 89); 23 Mar 2014 16:13:55 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-3.6 required=5.0 tests=AWL, BAYES_00, RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mail3-relais-sop.national.inria.fr Received: from mail3-relais-sop.national.inria.fr (HELO mail3-relais-sop.national.inria.fr) (192.134.164.104) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (CAMELLIA256-SHA encrypted) ESMTPS; Sun, 23 Mar 2014 16:13:50 +0000 Received: from stedding.saclay.inria.fr ([193.55.250.194]) by mail3-relais-sop.national.inria.fr with ESMTP/TLS/DHE-RSA-AES128-SHA; 23 Mar 2014 17:13:46 +0100 Received: from glisse (helo=localhost) by stedding.saclay.inria.fr with local-esmtp (Exim 4.82) (envelope-from ) id 1WRl1u-0003tS-Cr; Sun, 23 Mar 2014 17:13:46 +0100 Date: Sun, 23 Mar 2014 17:13:46 +0100 (CET) From: Marc Glisse To: Richard Biener cc: GCC Patches Subject: Re: calloc = malloc + memset In-Reply-To: Message-ID: References: User-Agent: Alpine 2.02 (DEB 1266 2009-07-14) MIME-Version: 1.0 On Mon, 3 Mar 2014, Richard Biener wrote: > That's a bit much of ad-hoc pattern-matching ... wouldn't be > p = malloc (n); > memset (p, 0, n); > > transform better suited to the strlen opt pass? After all that tracks > what 'string' is associated with a SSA name pointer through > arbitrary satements using a lattice. Like this? I had to move the strlen pass after the loop passes (and after dom or everything was too dirty) but long enough before the end (some optimizations are necessary after strlen). As a bonus, one more strlen is optimized in the current testcases :-) Running the pass twice would be another option I guess (it would require implementing the clone method), but without a testcase showing it is needed... Passes bootstrap+testsuite on x86_64-linux-gnu. 2014-03-23 Marc Glisse PR tree-optimization/57742 gcc/ * tree-ssa-strlen.c (get_string_length): Ignore malloc. (handle_builtin_malloc, handle_builtin_memset): New functions. (strlen_optimize_stmt): Call them. * passes.def: Move strlen after loop+dom. gcc/testsuite/ * g++.dg/tree-ssa/calloc.C: New testcase. * gcc.dg/tree-ssa/calloc-1.c: Likewise. * gcc.dg/tree-ssa/calloc-2.c: Likewise. * gcc.dg/strlenopt-9.c: Adapt. Index: gcc/passes.def =================================================================== --- gcc/passes.def (revision 208772) +++ gcc/passes.def (working copy) @@ -176,21 +176,20 @@ along with GCC; see the file COPYING3. DOM and erroneous path isolation should be due to degenerate PHI nodes. So rather than run the full propagators, run a specialized pass which only examines PHIs to discover const/copy propagation opportunities. */ NEXT_PASS (pass_phi_only_cprop); NEXT_PASS (pass_dse); NEXT_PASS (pass_reassoc); NEXT_PASS (pass_dce); NEXT_PASS (pass_forwprop); NEXT_PASS (pass_phiopt); - NEXT_PASS (pass_strlen); NEXT_PASS (pass_ccp); /* After CCP we rewrite no longer addressed locals into SSA form if possible. */ NEXT_PASS (pass_copy_prop); NEXT_PASS (pass_cse_sincos); NEXT_PASS (pass_optimize_bswap); NEXT_PASS (pass_split_crit_edges); NEXT_PASS (pass_pre); NEXT_PASS (pass_sink_code); NEXT_PASS (pass_asan); @@ -235,20 +234,21 @@ along with GCC; see the file COPYING3. NEXT_PASS (pass_cse_reciprocals); NEXT_PASS (pass_reassoc); NEXT_PASS (pass_strength_reduction); NEXT_PASS (pass_dominator); /* The only const/copy propagation opportunities left after DOM should be due to degenerate PHI nodes. So rather than run the full propagators, run a specialized pass which only examines PHIs to discover const/copy propagation opportunities. */ NEXT_PASS (pass_phi_only_cprop); + NEXT_PASS (pass_strlen); NEXT_PASS (pass_vrp); NEXT_PASS (pass_cd_dce); NEXT_PASS (pass_tracer); NEXT_PASS (pass_dse); NEXT_PASS (pass_forwprop); NEXT_PASS (pass_phiopt); NEXT_PASS (pass_fold_builtins); NEXT_PASS (pass_optimize_widening_mul); NEXT_PASS (pass_tail_calls); NEXT_PASS (pass_rename_ssa_copies); Index: gcc/testsuite/g++.dg/tree-ssa/calloc.C =================================================================== --- gcc/testsuite/g++.dg/tree-ssa/calloc.C (revision 0) +++ gcc/testsuite/g++.dg/tree-ssa/calloc.C (working copy) @@ -0,0 +1,35 @@ +/* { dg-do compile { target c++11 } } */ +/* { dg-options "-O3 -fdump-tree-optimized" } */ + +#include +#include +#include + +void g(void*); +inline void* operator new(std::size_t sz) +{ + void *p; + + if (sz == 0) + sz = 1; + + // Slightly modified from the libsupc++ version, that one has 2 calls + // to malloc which makes it too hard to optimize. + while ((p = std::malloc (sz)) == 0) + { + std::new_handler handler = std::get_new_handler (); + if (! handler) + throw std::bad_alloc(); + handler (); + } + return p; +} + +void f(void*p,int n){ + new(p)std::vector(n); +} + +/* { dg-final { scan-tree-dump-times "calloc" 1 "optimized" } } */ +/* { dg-final { scan-tree-dump-not "malloc" "optimized" } } */ +/* { dg-final { scan-tree-dump-not "memset" "optimized" } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ Property changes on: gcc/testsuite/g++.dg/tree-ssa/calloc.C ___________________________________________________________________ Added: svn:keywords ## -0,0 +1 ## +Author Date Id Revision URL \ No newline at end of property Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Index: gcc/testsuite/gcc.dg/strlenopt-9.c =================================================================== --- gcc/testsuite/gcc.dg/strlenopt-9.c (revision 208772) +++ gcc/testsuite/gcc.dg/strlenopt-9.c (working copy) @@ -11,21 +11,21 @@ fn1 (int r) optimized away. */ return strchr (p, '\0'); } __attribute__((noinline, noclone)) size_t fn2 (int r) { char *p, q[10]; strcpy (q, "abc"); p = r ? "a" : q; - /* String length for p varies, therefore strlen below isn't + /* String length is constant for both alternatives, and strlen is optimized away. */ return strlen (p); } __attribute__((noinline, noclone)) size_t fn3 (char *p, int n) { int i; p = strchr (p, '\0'); /* strcat here can be optimized into memcpy. */ @@ -91,19 +91,19 @@ main () || memcmp (b, "aaaabcdabcdefgabcdefgefgabcdabcd", 33) != 0 || memcmp (buf, "aaaabcdabcdefgabcdefgefgabcdefg", 32) != 0) abort (); if (fn4 (b, 256) != 4 || memcmp (b, "aaaabcdabcdefgabcdefgefgabcdabcdabcd", 37) != 0 || memcmp (buf, "aaaabcdabcdefgabcdefgefgabcdabcdefgefg", 39) != 0) abort (); return 0; } -/* { dg-final { scan-tree-dump-times "strlen \\(" 4 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "strlen \\(" 3 "strlen" } } */ /* { dg-final { scan-tree-dump-times "memcpy \\(" 6 "strlen" } } */ /* { dg-final { scan-tree-dump-times "strcpy \\(" 1 "strlen" } } */ /* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */ /* { dg-final { scan-tree-dump-times "strchr \\(" 3 "strlen" } } */ /* { dg-final { scan-tree-dump-times "stpcpy \\(" 0 "strlen" } } */ /* { dg-final { cleanup-tree-dump "strlen" } } */ /* { dg-final { scan-tree-dump-times "return 4;" 1 "optimized" } } */ /* { dg-final { cleanup-tree-dump "optimized" } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/calloc-1.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/calloc-1.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/calloc-1.c (working copy) @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +#include +#include + +extern int a; +extern int* b; +int n; +void* f(long*q){ + int*p=malloc(n); + ++*q; + if(p){ + ++*q; + a=2; + memset(p,0,n); + *b=3; + } + return p; +} +void* g(void){ + float*p=calloc(8,4); + return memset(p,0,24); // not 32 +} + +/* { dg-final { scan-tree-dump-times "calloc" 2 "optimized" } } */ +/* { dg-final { scan-tree-dump-not "malloc" "optimized" } } */ +/* { dg-final { scan-tree-dump-not "memset" "optimized" } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ Property changes on: gcc/testsuite/gcc.dg/tree-ssa/calloc-1.c ___________________________________________________________________ Added: svn:keywords ## -0,0 +1 ## +Author Date Id Revision URL \ No newline at end of property Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Index: gcc/tree-ssa-strlen.c =================================================================== --- gcc/tree-ssa-strlen.c (revision 208772) +++ gcc/tree-ssa-strlen.c (working copy) @@ -496,20 +496,23 @@ get_string_length (strinfo si) case BUILT_IN_STPCPY_CHK: gcc_assert (lhs != NULL_TREE); loc = gimple_location (stmt); si->endptr = lhs; si->stmt = NULL; lhs = fold_convert_loc (loc, size_type_node, lhs); si->length = fold_convert_loc (loc, size_type_node, si->ptr); si->length = fold_build2_loc (loc, MINUS_EXPR, size_type_node, lhs, si->length); break; + case BUILT_IN_MALLOC: + break; + /* calloc always has si->length set. */ default: gcc_unreachable (); break; } } return si->length; } /* Invalidate string length information for strings whose length @@ -521,20 +524,21 @@ maybe_invalidate (gimple stmt) strinfo si; unsigned int i; bool nonempty = false; for (i = 1; vec_safe_iterate (stridx_to_strinfo, i, &si); ++i) if (si != NULL) { if (!si->dont_invalidate) { ao_ref r; + // Do not use si->length. ao_ref_init_from_ptr_and_size (&r, si->ptr, NULL_TREE); if (stmt_may_clobber_ref_p_1 (stmt, &r)) { set_strinfo (i, NULL); free_strinfo (si); continue; } } si->dont_invalidate = false; nonempty = true; @@ -1608,20 +1612,90 @@ handle_builtin_strcat (enum built_in_fun { laststmt.stmt = stmt; laststmt.len = srclen; laststmt.stridx = dsi->idx; } } else if (dump_file && (dump_flags & TDF_DETAILS) != 0) fprintf (dump_file, "not possible.\n"); } +/* Handle a call to malloc or calloc. */ + +static void +handle_builtin_malloc (enum built_in_function bcode, gimple_stmt_iterator *gsi) +{ + gimple stmt = gsi_stmt (*gsi); + tree lhs = gimple_call_lhs (stmt); + gcc_assert (get_stridx (lhs) == 0); + int idx = new_stridx (lhs); + tree length = NULL_TREE; + if (bcode == BUILT_IN_CALLOC) + length = build_int_cst (size_type_node, 0); + strinfo si = new_strinfo (lhs, idx, length); + if (bcode == BUILT_IN_CALLOC) + si->endptr = lhs; + set_strinfo (idx, si); + si->writable = true; + si->stmt = stmt; + si->dont_invalidate = true; +} + +/* Handle a call to memset. + After a call to calloc, memset(,0,) is unnecessary. + memset(malloc(n),0,n) is calloc(n,1). */ + +static bool +handle_builtin_memset (gimple_stmt_iterator *gsi) +{ + gimple stmt1, stmt2 = gsi_stmt (*gsi); + if (!integer_zerop (gimple_call_arg (stmt2, 1))) + return true; + tree ptr = gimple_call_arg (stmt2, 0); + int idx1 = get_stridx (ptr); + if (idx1 <= 0) + return true; + strinfo si1 = get_strinfo (idx1); + if (!si1 || !(stmt1 = si1->stmt) || !is_gimple_call (stmt1)) + return true; + tree callee1 = gimple_call_fndecl (stmt1); + if (!gimple_call_builtin_p (stmt1, BUILT_IN_NORMAL)) + return true; + enum built_in_function code1 = DECL_FUNCTION_CODE (callee1); + tree size = gimple_call_arg (stmt2, 2); + if (code1 == BUILT_IN_CALLOC) + /* Not touching stmt1 */ ; + else if (code1 == BUILT_IN_MALLOC + && operand_equal_p (gimple_call_arg (stmt1, 0), size, 0)) + { + gimple_stmt_iterator gsi1 = gsi_for_stmt (stmt1); + update_gimple_call (&gsi1, builtin_decl_implicit (BUILT_IN_CALLOC), 2, + size, build_one_cst (size_type_node)); + } + else + return true; + tree lhs = gimple_call_lhs (stmt2); + unlink_stmt_vdef (stmt2); + if (lhs) + { + gimple assign = gimple_build_assign (lhs, ptr); + gsi_replace (gsi, assign, false); + } + else + { + gsi_remove (gsi, true); + release_defs (stmt2); + } + + return false; +} + /* Handle a POINTER_PLUS_EXPR statement. For p = "abcd" + 2; compute associated length, or if p = q + off is pointing to a '\0' character of a string, call zero_length_string on it. */ static void handle_pointer_plus (gimple_stmt_iterator *gsi) { gimple stmt = gsi_stmt (*gsi); tree lhs = gimple_assign_lhs (stmt), off; @@ -1845,20 +1919,28 @@ strlen_optimize_stmt (gimple_stmt_iterat case BUILT_IN_MEMCPY: case BUILT_IN_MEMCPY_CHK: case BUILT_IN_MEMPCPY: case BUILT_IN_MEMPCPY_CHK: handle_builtin_memcpy (DECL_FUNCTION_CODE (callee), gsi); break; case BUILT_IN_STRCAT: case BUILT_IN_STRCAT_CHK: handle_builtin_strcat (DECL_FUNCTION_CODE (callee), gsi); break; + case BUILT_IN_MALLOC: + case BUILT_IN_CALLOC: + handle_builtin_malloc (DECL_FUNCTION_CODE (callee), gsi); + break; + case BUILT_IN_MEMSET: + if (!handle_builtin_memset (gsi)) + return false; + break; default: break; } } else if (is_gimple_assign (stmt)) { tree lhs = gimple_assign_lhs (stmt); if (TREE_CODE (lhs) == SSA_NAME && POINTER_TYPE_P (TREE_TYPE (lhs))) {