From patchwork Tue Feb 12 16:28:11 2013
X-Patchwork-Submitter: Dodji Seketeli
X-Patchwork-Id: 219879
From: Dodji Seketeli
To: Jakub Jelinek
Cc: GCC Patches, Kostya Serebryany, Dmitry Vyukov
Subject: Re: [PATCH 2/2] [asan] Avoid instrumenting duplicated memory access
 in the same basic block
References: <87mwvtxalf.fsf@redhat.com> <87ehh5xabo.fsf@redhat.com>
	<20130129150022.GS4385@tucnak.redhat.com> <87obfpfwcm.fsf@redhat.com>
	<20130212142734.GE4385@tucnak.redhat.com>
X-URL: http://www.redhat.com
Date: Tue, 12 Feb 2013 17:28:11 +0100
In-Reply-To: <20130212142734.GE4385@tucnak.redhat.com> (Jakub Jelinek's
	message of "Tue, 12 Feb 2013 15:27:34 +0100")
Message-ID: <87y5etebtw.fsf@redhat.com>
Delivered-To: mailing list gcc-patches@gcc.gnu.org

Jakub Jelinek writes:

> On Tue, Feb 12, 2013 at 03:19:37PM +0100, Dodji Seketeli wrote:
>> gcc/
>> 	* Makefile.in (asan.o): Add new dependency on hash-table.h.
>> 	* asan.c (struct asan_mem_ref, struct mem_ref_hasher): New types.
>> 	(asan_mem_ref_init, asan_mem_ref_get_end, get_mem_ref_hash_table)
>> 	(has_stmt_been_instrumented_p, empty_mem_ref_hash_table)
>> 	(free_mem_ref_resources, has_mem_ref_been_instrumented)
>> 	(has_stmt_been_instrumented_p, update_mem_ref_hash_table)
>> 	(get_mem_ref_of_assignment): New functions.
>> 	(get_mem_refs_of_builtin_call): Extract from
>> 	instrument_builtin_call and tweak a little bit to make it fit with
>> 	the new signature.
>> 	(instrument_builtin_call): Use the new
>> 	get_mem_refs_of_builtin_call.  Use gimple_call_builtin_p instead
>> 	of is_gimple_builtin_call.
>> 	(instrument_derefs, instrument_mem_region_access): Insert the
>> 	instrumented memory reference into the hash table.
>> 	(maybe_instrument_assignment): Renamed instrument_assignment into
>> 	this, and change it to advance the iterator when instrumentation
>> 	actually happened and return true in that case.  This makes it
>> 	homogeneous with maybe_instrument_call, and thus gives a
>> 	chance to callers to be more 'regular'.
>> 	(transform_statements): Clear the memory reference hash table
>> 	whenever we enter a new BB, when we cross a function call, or when
>> 	we are done transforming statements.  Use
>> 	maybe_instrument_assignment instead of instrument_assignment.  No
>> 	more need to special case maybe_instrument_assignment and advance
>> 	the iterator after calling it; it's now handled just like
>> 	maybe_instrument_call.  Update comment.
>
> Ok.  Just some testsuite nits.

Thanks.  Here is the updated patch that hopefully addresses your
comments.  Tested against trunk on x86_64-unknown-linux-gnu.

gcc/
	* Makefile.in (asan.o): Add new dependency on hash-table.h.
	* asan.c (struct asan_mem_ref, struct mem_ref_hasher): New types.
	(asan_mem_ref_init, asan_mem_ref_get_end, get_mem_ref_hash_table)
	(has_stmt_been_instrumented_p, empty_mem_ref_hash_table)
	(free_mem_ref_resources, has_mem_ref_been_instrumented)
	(has_stmt_been_instrumented_p, update_mem_ref_hash_table)
	(get_mem_ref_of_assignment): New functions.
	(get_mem_refs_of_builtin_call): Extract from
	instrument_builtin_call and tweak a little bit to make it fit with
	the new signature.
	(instrument_builtin_call): Use the new
	get_mem_refs_of_builtin_call.  Use gimple_call_builtin_p instead
	of is_gimple_builtin_call.
	(instrument_derefs, instrument_mem_region_access): Insert the
	instrumented memory reference into the hash table.
	(maybe_instrument_assignment): Renamed instrument_assignment into
	this, and change it to advance the iterator when instrumentation
	actually happened and return true in that case.  This makes it
	homogeneous with maybe_instrument_call, and thus gives a chance to
	callers to be more 'regular'.
	(transform_statements): Clear the memory reference hash table
	whenever we enter a new BB, when we cross a function call, or when
	we are done transforming statements.  Use
	maybe_instrument_assignment instead of instrument_assignment.
	No more need to special case maybe_instrument_assignment and
	advance the iterator after calling it; it's now handled just like
	maybe_instrument_call.  Update comment.

gcc/testsuite/
	* c-c++-common/asan/no-redundant-instrumentation-1.c: New test.
	* c-c++-common/asan/no-redundant-instrumentation-2.c: Likewise.
	* c-c++-common/asan/no-redundant-instrumentation-3.c: Likewise.
	* c-c++-common/asan/inc.c: Likewise.
---
 gcc/Makefile.in                                  |    3 +-
 gcc/asan.c                                       | 1258 +++++++++++++-------
 gcc/testsuite/c-c++-common/asan/inc.c            |   19 +
 .../asan/no-redundant-instrumentation-1.c        |   64 +
 .../asan/no-redundant-instrumentation-2.c        |   24 +
 .../asan/no-redundant-instrumentation-3.c        |   16 +
 6 files changed, 974 insertions(+), 410 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/asan/inc.c
 create mode 100644 gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-1.c
 create mode 100644 gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-2.c
 create mode 100644 gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-3.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 375d5f5..f3bb168 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2226,7 +2226,8 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
 asan.o : asan.c asan.h $(CONFIG_H) $(SYSTEM_H) $(GIMPLE_H) \
    output.h coretypes.h $(GIMPLE_PRETTY_PRINT_H) \
    tree-iterator.h $(TREE_FLOW_H) $(TREE_PASS_H) \
-   $(TARGET_H) $(EXPR_H) $(OPTABS_H) $(TM_P_H) langhooks.h
+   $(TARGET_H) $(EXPR_H) $(OPTABS_H) $(TM_P_H) langhooks.h \
+   $(HASH_TABLE_H) alloc-pool.h
 tsan.o : $(CONFIG_H) $(SYSTEM_H) $(TREE_H) $(TREE_INLINE_H) \
    $(GIMPLE_H) $(DIAGNOSTIC_H) langhooks.h \
    $(TM_H) coretypes.h $(TREE_DUMP_H) $(TREE_PASS_H) $(CGRAPH_H) $(GGC_H) \
diff --git a/gcc/asan.c b/gcc/asan.c
index f05e36c..3cb9511 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -34,6 +34,8 @@ along with GCC; see the file COPYING3.
If not see
 #include "output.h"
 #include "tm_p.h"
 #include "langhooks.h"
+#include "hash-table.h"
+#include "alloc-pool.h"
 
 /* AddressSanitizer finds out-of-bounds and use-after-free bugs with
    <2x slowdown on average.
@@ -212,6 +214,620 @@ alias_set_type asan_shadow_set = -1;
    alias set is used for all shadow memory accesses.  */
 static GTY(()) tree shadow_ptr_types[2];
 
+/* Hashtable support for memory references used by gimple
+   statements.  */
+
+/* This type represents a reference to a memory region.  */
+struct asan_mem_ref
+{
+  /* The expression of the beginning of the memory region.  */
+  tree start;
+
+  /* The size of the access (can be 1, 2, 4, 8, 16 for now).  */
+  char access_size;
+};
+
+static alloc_pool asan_mem_ref_alloc_pool;
+
+/* This creates the alloc pool used to store the instances of
+   asan_mem_ref that are stored in the hash table asan_mem_ref_ht.  */
+
+static alloc_pool
+asan_mem_ref_get_alloc_pool ()
+{
+  if (asan_mem_ref_alloc_pool == NULL)
+    asan_mem_ref_alloc_pool = create_alloc_pool ("asan_mem_ref",
+						 sizeof (asan_mem_ref),
+						 10);
+  return asan_mem_ref_alloc_pool;
+}
+
+/* Initializes an instance of asan_mem_ref.  */
+
+static void
+asan_mem_ref_init (asan_mem_ref *ref, tree start, char access_size)
+{
+  ref->start = start;
+  ref->access_size = access_size;
+}
+
+/* Allocates memory for an instance of asan_mem_ref into the memory
+   pool returned by asan_mem_ref_get_alloc_pool and initialize it.
+   START is the address of (or the expression pointing to) the
+   beginning of memory reference.  ACCESS_SIZE is the size of the
+   access to the referenced memory.  */
+
+static asan_mem_ref*
+asan_mem_ref_new (tree start, char access_size)
+{
+  asan_mem_ref *ref =
+    (asan_mem_ref *) pool_alloc (asan_mem_ref_get_alloc_pool ());
+
+  asan_mem_ref_init (ref, start, access_size);
+  return ref;
+}
+
+/* This builds and returns a pointer to the end of the memory region
+   that starts at START and of length LEN.
+   */
+
+tree
+asan_mem_ref_get_end (tree start, tree len)
+{
+  if (len == NULL_TREE || integer_zerop (len))
+    return start;
+
+  return fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE (start), start, len);
+}
+
+/* Return a tree expression that represents the end of the referenced
+   memory region.  Beware that this function can actually build a new
+   tree expression.  */
+
+tree
+asan_mem_ref_get_end (const asan_mem_ref *ref, tree len)
+{
+  return asan_mem_ref_get_end (ref->start, len);
+}
+
+struct asan_mem_ref_hasher
+  : typed_noop_remove <asan_mem_ref>
+{
+  typedef asan_mem_ref value_type;
+  typedef asan_mem_ref compare_type;
+
+  static inline hashval_t hash (const value_type *);
+  static inline bool equal (const value_type *, const compare_type *);
+};
+
+/* Hash a memory reference.  */
+
+inline hashval_t
+asan_mem_ref_hasher::hash (const asan_mem_ref *mem_ref)
+{
+  hashval_t h = iterative_hash_expr (mem_ref->start, 0);
+  h = iterative_hash_hashval_t (h, mem_ref->access_size);
+  return h;
+}
+
+/* Compare two memory references.  We accept the length of either
+   memory reference to be NULL_TREE.  */
+
+inline bool
+asan_mem_ref_hasher::equal (const asan_mem_ref *m1,
+			    const asan_mem_ref *m2)
+{
+  return (m1->access_size == m2->access_size
+	  && operand_equal_p (m1->start, m2->start, 0));
+}
+
+static hash_table <asan_mem_ref_hasher> asan_mem_ref_ht;
+
+/* Returns a reference to the hash table containing memory references.
+   This function ensures that the hash table is created.  Note that
+   this hash table is updated by the function
+   update_mem_ref_hash_table.  */
+
+static hash_table <asan_mem_ref_hasher> &
+get_mem_ref_hash_table ()
+{
+  if (!asan_mem_ref_ht.is_created ())
+    asan_mem_ref_ht.create (10);
+
+  return asan_mem_ref_ht;
+}
+
+/* Clear all entries from the memory references hash table.  */
+
+static void
+empty_mem_ref_hash_table ()
+{
+  if (asan_mem_ref_ht.is_created ())
+    asan_mem_ref_ht.empty ();
+}
+
+/* Free the memory references hash table.
+   */
+
+static void
+free_mem_ref_resources ()
+{
+  if (asan_mem_ref_ht.is_created ())
+    asan_mem_ref_ht.dispose ();
+
+  if (asan_mem_ref_alloc_pool)
+    {
+      free_alloc_pool (asan_mem_ref_alloc_pool);
+      asan_mem_ref_alloc_pool = NULL;
+    }
+}
+
+/* Return true iff the memory reference REF has been instrumented.  */
+
+static bool
+has_mem_ref_been_instrumented (tree ref, char access_size)
+{
+  asan_mem_ref r;
+  asan_mem_ref_init (&r, ref, access_size);
+
+  return (get_mem_ref_hash_table ().find (&r) != NULL);
+}
+
+/* Return true iff the memory reference REF has been instrumented.  */
+
+static bool
+has_mem_ref_been_instrumented (const asan_mem_ref *ref)
+{
+  return has_mem_ref_been_instrumented (ref->start, ref->access_size);
+}
+
+/* Return true iff access to memory region starting at REF and of
+   length LEN has been instrumented.  */
+
+static bool
+has_mem_ref_been_instrumented (const asan_mem_ref *ref, tree len)
+{
+  /* First let's see if the address of the beginning of REF has been
+     instrumented.  */
+  if (!has_mem_ref_been_instrumented (ref))
+    return false;
+
+  if (len != 0)
+    {
+      /* Let's see if the end of the region has been instrumented.  */
+      if (!has_mem_ref_been_instrumented (asan_mem_ref_get_end (ref, len),
+					  ref->access_size))
+	return false;
+    }
+  return true;
+}
+
+/* Set REF to the memory reference present in a gimple assignment
+   ASSIGNMENT.  Return true upon successful completion, false
+   otherwise.
+   */
+
+static bool
+get_mem_ref_of_assignment (const gimple assignment,
+			   asan_mem_ref *ref,
+			   bool *ref_is_store)
+{
+  gcc_assert (gimple_assign_single_p (assignment));
+
+  if (gimple_store_p (assignment))
+    {
+      ref->start = gimple_assign_lhs (assignment);
+      *ref_is_store = true;
+    }
+  else if (gimple_assign_load_p (assignment))
+    {
+      ref->start = gimple_assign_rhs1 (assignment);
+      *ref_is_store = false;
+    }
+  else
+    return false;
+
+  ref->access_size = int_size_in_bytes (TREE_TYPE (ref->start));
+  return true;
+}
+
+/* Return the memory references contained in a gimple statement
+   representing a builtin call that has to do with memory access.  */
+
+static bool
+get_mem_refs_of_builtin_call (const gimple call,
+			      asan_mem_ref *src0,
+			      tree *src0_len,
+			      bool *src0_is_store,
+			      asan_mem_ref *src1,
+			      tree *src1_len,
+			      bool *src1_is_store,
+			      asan_mem_ref *dst,
+			      tree *dst_len,
+			      bool *dst_is_store,
+			      bool *dest_is_deref)
+{
+  gcc_checking_assert (gimple_call_builtin_p (call, BUILT_IN_NORMAL));
+
+  tree callee = gimple_call_fndecl (call);
+  tree source0 = NULL_TREE, source1 = NULL_TREE,
+    dest = NULL_TREE, len = NULL_TREE;
+  bool is_store = true, got_reference_p = false;
+  char access_size = 1;
+
+  switch (DECL_FUNCTION_CODE (callee))
+    {
+      /* (s, s, n) style memops.  */
+    case BUILT_IN_BCMP:
+    case BUILT_IN_MEMCMP:
+      source0 = gimple_call_arg (call, 0);
+      source1 = gimple_call_arg (call, 1);
+      len = gimple_call_arg (call, 2);
+      break;
+
+      /* (src, dest, n) style memops.  */
+    case BUILT_IN_BCOPY:
+      source0 = gimple_call_arg (call, 0);
+      dest = gimple_call_arg (call, 1);
+      len = gimple_call_arg (call, 2);
+      break;
+
+      /* (dest, src, n) style memops.  */
+    case BUILT_IN_MEMCPY:
+    case BUILT_IN_MEMCPY_CHK:
+    case BUILT_IN_MEMMOVE:
+    case BUILT_IN_MEMMOVE_CHK:
+    case BUILT_IN_MEMPCPY:
+    case BUILT_IN_MEMPCPY_CHK:
+      dest = gimple_call_arg (call, 0);
+      source0 = gimple_call_arg (call, 1);
+      len = gimple_call_arg (call, 2);
+      break;
+
+      /* (dest, n) style memops.
+	 */
+    case BUILT_IN_BZERO:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, x, n) style memops.  */
+    case BUILT_IN_MEMSET:
+    case BUILT_IN_MEMSET_CHK:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 2);
+      break;
+
+    case BUILT_IN_STRLEN:
+      source0 = gimple_call_arg (call, 0);
+      len = gimple_call_lhs (call);
+      break;
+
+      /* And now the __atomic* and __sync builtins.
+	 These are handled differently from the classical memory
+	 access builtins above.  */
+
+    case BUILT_IN_ATOMIC_LOAD_1:
+    case BUILT_IN_ATOMIC_LOAD_2:
+    case BUILT_IN_ATOMIC_LOAD_4:
+    case BUILT_IN_ATOMIC_LOAD_8:
+    case BUILT_IN_ATOMIC_LOAD_16:
+      is_store = false;
+      /* fall through.  */
+
+    case BUILT_IN_SYNC_FETCH_AND_ADD_1:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_2:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_4:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_8:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_SUB_1:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_2:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_4:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_8:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_OR_1:
+    case BUILT_IN_SYNC_FETCH_AND_OR_2:
+    case BUILT_IN_SYNC_FETCH_AND_OR_4:
+    case BUILT_IN_SYNC_FETCH_AND_OR_8:
+    case BUILT_IN_SYNC_FETCH_AND_OR_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_AND_1:
+    case BUILT_IN_SYNC_FETCH_AND_AND_2:
+    case BUILT_IN_SYNC_FETCH_AND_AND_4:
+    case BUILT_IN_SYNC_FETCH_AND_AND_8:
+    case BUILT_IN_SYNC_FETCH_AND_AND_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_XOR_1:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_2:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_4:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_8:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_NAND_1:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_2:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_4:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_8:
+
+    case BUILT_IN_SYNC_ADD_AND_FETCH_1:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_2:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_4:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_8:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_SUB_AND_FETCH_1:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_2:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_4:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_8:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_OR_AND_FETCH_1:
+    case BUILT_IN_SYNC_OR_AND_FETCH_2:
+    case BUILT_IN_SYNC_OR_AND_FETCH_4:
+    case BUILT_IN_SYNC_OR_AND_FETCH_8:
+    case BUILT_IN_SYNC_OR_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_AND_AND_FETCH_1:
+    case BUILT_IN_SYNC_AND_AND_FETCH_2:
+    case BUILT_IN_SYNC_AND_AND_FETCH_4:
+    case BUILT_IN_SYNC_AND_AND_FETCH_8:
+    case BUILT_IN_SYNC_AND_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_XOR_AND_FETCH_1:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_2:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_4:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_8:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_NAND_AND_FETCH_1:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_2:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_4:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_8:
+
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_1:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_2:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_4:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_8:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_16:
+
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_1:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_2:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_4:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_8:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_16:
+
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_1:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_2:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_4:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_8:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_16:
+
+    case BUILT_IN_SYNC_LOCK_RELEASE_1:
+    case BUILT_IN_SYNC_LOCK_RELEASE_2:
+    case BUILT_IN_SYNC_LOCK_RELEASE_4:
+    case BUILT_IN_SYNC_LOCK_RELEASE_8:
+    case BUILT_IN_SYNC_LOCK_RELEASE_16:
+
+    case BUILT_IN_ATOMIC_EXCHANGE_1:
+    case BUILT_IN_ATOMIC_EXCHANGE_2:
+    case BUILT_IN_ATOMIC_EXCHANGE_4:
+    case BUILT_IN_ATOMIC_EXCHANGE_8:
+    case BUILT_IN_ATOMIC_EXCHANGE_16:
+
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_1:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_2:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_4:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_8:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_16:
+
+    case BUILT_IN_ATOMIC_STORE_1:
+    case BUILT_IN_ATOMIC_STORE_2:
+    case BUILT_IN_ATOMIC_STORE_4:
+    case BUILT_IN_ATOMIC_STORE_8:
+    case BUILT_IN_ATOMIC_STORE_16:
+
+    case BUILT_IN_ATOMIC_ADD_FETCH_1:
+    case BUILT_IN_ATOMIC_ADD_FETCH_2:
+    case BUILT_IN_ATOMIC_ADD_FETCH_4:
+    case BUILT_IN_ATOMIC_ADD_FETCH_8:
+    case BUILT_IN_ATOMIC_ADD_FETCH_16:
+
+    case BUILT_IN_ATOMIC_SUB_FETCH_1:
+    case BUILT_IN_ATOMIC_SUB_FETCH_2:
+    case BUILT_IN_ATOMIC_SUB_FETCH_4:
+    case BUILT_IN_ATOMIC_SUB_FETCH_8:
+    case BUILT_IN_ATOMIC_SUB_FETCH_16:
+
+    case BUILT_IN_ATOMIC_AND_FETCH_1:
+    case BUILT_IN_ATOMIC_AND_FETCH_2:
+    case BUILT_IN_ATOMIC_AND_FETCH_4:
+    case BUILT_IN_ATOMIC_AND_FETCH_8:
+    case BUILT_IN_ATOMIC_AND_FETCH_16:
+
+    case BUILT_IN_ATOMIC_NAND_FETCH_1:
+    case BUILT_IN_ATOMIC_NAND_FETCH_2:
+    case BUILT_IN_ATOMIC_NAND_FETCH_4:
+    case BUILT_IN_ATOMIC_NAND_FETCH_8:
+    case BUILT_IN_ATOMIC_NAND_FETCH_16:
+
+    case BUILT_IN_ATOMIC_XOR_FETCH_1:
+    case BUILT_IN_ATOMIC_XOR_FETCH_2:
+    case BUILT_IN_ATOMIC_XOR_FETCH_4:
+    case BUILT_IN_ATOMIC_XOR_FETCH_8:
+    case BUILT_IN_ATOMIC_XOR_FETCH_16:
+
+    case BUILT_IN_ATOMIC_OR_FETCH_1:
+    case BUILT_IN_ATOMIC_OR_FETCH_2:
+    case BUILT_IN_ATOMIC_OR_FETCH_4:
+    case BUILT_IN_ATOMIC_OR_FETCH_8:
+    case BUILT_IN_ATOMIC_OR_FETCH_16:
+
+    case BUILT_IN_ATOMIC_FETCH_ADD_1:
+    case BUILT_IN_ATOMIC_FETCH_ADD_2:
+    case BUILT_IN_ATOMIC_FETCH_ADD_4:
+    case BUILT_IN_ATOMIC_FETCH_ADD_8:
+    case BUILT_IN_ATOMIC_FETCH_ADD_16:
+
+    case BUILT_IN_ATOMIC_FETCH_SUB_1:
+    case BUILT_IN_ATOMIC_FETCH_SUB_2:
+    case BUILT_IN_ATOMIC_FETCH_SUB_4:
+    case BUILT_IN_ATOMIC_FETCH_SUB_8:
+    case BUILT_IN_ATOMIC_FETCH_SUB_16:
+
+    case BUILT_IN_ATOMIC_FETCH_AND_1:
+    case BUILT_IN_ATOMIC_FETCH_AND_2:
+    case BUILT_IN_ATOMIC_FETCH_AND_4:
+    case BUILT_IN_ATOMIC_FETCH_AND_8:
+    case BUILT_IN_ATOMIC_FETCH_AND_16:
+
+    case BUILT_IN_ATOMIC_FETCH_NAND_1:
+    case BUILT_IN_ATOMIC_FETCH_NAND_2:
+    case BUILT_IN_ATOMIC_FETCH_NAND_4:
+    case BUILT_IN_ATOMIC_FETCH_NAND_8:
+    case BUILT_IN_ATOMIC_FETCH_NAND_16:
+
+    case BUILT_IN_ATOMIC_FETCH_XOR_1:
+    case BUILT_IN_ATOMIC_FETCH_XOR_2:
+    case BUILT_IN_ATOMIC_FETCH_XOR_4:
+    case BUILT_IN_ATOMIC_FETCH_XOR_8:
+    case BUILT_IN_ATOMIC_FETCH_XOR_16:
+
+    case BUILT_IN_ATOMIC_FETCH_OR_1:
+    case BUILT_IN_ATOMIC_FETCH_OR_2:
+    case BUILT_IN_ATOMIC_FETCH_OR_4:
+    case BUILT_IN_ATOMIC_FETCH_OR_8:
+    case BUILT_IN_ATOMIC_FETCH_OR_16:
+      {
+	dest = gimple_call_arg (call, 0);
+	/* DEST represents the address of a memory location.
+	   instrument_derefs wants the memory location, so let's
+	   dereference the address DEST before handing it to
+	   instrument_derefs.  */
+	if (TREE_CODE (dest) == ADDR_EXPR)
+	  dest = TREE_OPERAND (dest, 0);
+	else if (TREE_CODE (dest) == SSA_NAME)
+	  dest = build2 (MEM_REF, TREE_TYPE (TREE_TYPE (dest)),
+			 dest, build_int_cst (TREE_TYPE (dest), 0));
+	else
+	  gcc_unreachable ();
+
+	access_size = int_size_in_bytes (TREE_TYPE (dest));
+      }
+
+    default:
+      /* The other builtins memory access are not instrumented in this
+	 function because they either don't have any length parameter,
+	 or their length parameter is just a limit.
+	 */
+      break;
+    }
+
+  if (len != NULL_TREE)
+    {
+      if (source0 != NULL_TREE)
+	{
+	  src0->start = source0;
+	  src0->access_size = access_size;
+	  *src0_len = len;
+	  *src0_is_store = false;
+	}
+
+      if (source1 != NULL_TREE)
+	{
+	  src1->start = source1;
+	  src1->access_size = access_size;
+	  *src1_len = len;
+	  *src1_is_store = false;
+	}
+
+      if (dest != NULL_TREE)
+	{
+	  dst->start = dest;
+	  dst->access_size = access_size;
+	  *dst_len = len;
+	  *dst_is_store = true;
+	}
+
+      got_reference_p = true;
+    }
+  else
+    {
+      if (dest)
+	{
+	  dst->start = dest;
+	  dst->access_size = access_size;
+	  *dst_len = NULL_TREE;
+	  *dst_is_store = is_store;
+	  *dest_is_deref = true;
+	  got_reference_p = true;
+	}
+    }
+
+  return got_reference_p;
+}
+
+/* Return true iff a given gimple statement has been instrumented.
+   Note that the statement is "defined" by the memory references it
+   contains.  */
+
+static bool
+has_stmt_been_instrumented_p (gimple stmt)
+{
+  if (gimple_assign_single_p (stmt))
+    {
+      bool r_is_store;
+      asan_mem_ref r;
+      asan_mem_ref_init (&r, NULL, 1);
+
+      if (get_mem_ref_of_assignment (stmt, &r, &r_is_store))
+	return has_mem_ref_been_instrumented (&r);
+    }
+  else if (gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
+    {
+      asan_mem_ref src0, src1, dest;
+      asan_mem_ref_init (&src0, NULL, 1);
+      asan_mem_ref_init (&src1, NULL, 1);
+      asan_mem_ref_init (&dest, NULL, 1);
+
+      tree src0_len = NULL_TREE, src1_len = NULL_TREE, dest_len = NULL_TREE;
+      bool src0_is_store = false, src1_is_store = false,
+	dest_is_store = false, dest_is_deref = false;
+      if (get_mem_refs_of_builtin_call (stmt,
+					&src0, &src0_len, &src0_is_store,
+					&src1, &src1_len, &src1_is_store,
+					&dest, &dest_len, &dest_is_store,
+					&dest_is_deref))
+	{
+	  if (src0.start != NULL_TREE
+	      && !has_mem_ref_been_instrumented (&src0, src0_len))
+	    return false;
+
+	  if (src1.start != NULL_TREE
+	      && !has_mem_ref_been_instrumented (&src1, src1_len))
+	    return false;
+
+	  if (dest.start != NULL_TREE
+	      && !has_mem_ref_been_instrumented (&dest, dest_len))
+	    return false;
+
+	  return true;
+	}
+    }
+  return false;
+}
+
+/* Insert a memory reference into the hash table.  */
+
+static void
+update_mem_ref_hash_table (tree ref, char access_size)
+{
+  hash_table <asan_mem_ref_hasher> ht = get_mem_ref_hash_table ();
+
+  asan_mem_ref r;
+  asan_mem_ref_init (&r, ref, access_size);
+
+  asan_mem_ref **slot = ht.find_slot (&r, INSERT);
+  if (*slot == NULL)
+    *slot = asan_mem_ref_new (ref, access_size);
+}
+
 /* Initialize shadow_ptr_types array.  */
 
 static void
@@ -835,7 +1451,7 @@ build_check_stmt (location_t location, tree base, gimple_stmt_iterator *iter,
 
 static void
 instrument_derefs (gimple_stmt_iterator *iter, tree t,
-                  location_t location, bool is_store)
+		   location_t location, bool is_store)
 {
   tree type, base;
   HOST_WIDE_INT size_in_bytes;
@@ -878,8 +1494,14 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
     }
 
   base = build_fold_addr_expr (t);
-  build_check_stmt (location, base, iter, /*before_p=*/true,
-		    is_store, size_in_bytes);
+  if (!has_mem_ref_been_instrumented (base, size_in_bytes))
+    {
+      build_check_stmt (location, base, iter, /*before_p=*/true,
+			is_store, size_in_bytes);
+      update_mem_ref_hash_table (base, size_in_bytes);
+      update_mem_ref_hash_table (t, size_in_bytes);
+    }
 }
 
 /* Instrument an access to a contiguous memory region that starts at
@@ -903,6 +1525,12 @@ instrument_mem_region_access (tree base, tree len,
   gimple_stmt_iterator gsi = *iter;
   basic_block fallthrough_bb = NULL, then_bb = NULL;
+
+  /* If the beginning of the memory region has already been
+     instrumented, do not instrument it.
+     */
+  if (has_mem_ref_been_instrumented (base, 1))
+    goto after_first_instrumentation;
+
   if (!is_gimple_constant (len))
     {
       /* So, the length of the memory area to asan-protect is
@@ -945,9 +1573,19 @@ instrument_mem_region_access (tree base, tree len,
   else
     *iter = gsi;
 
+  update_mem_ref_hash_table (base, 1);
+
+ after_first_instrumentation:
+
   /* We want to instrument the access at the end of the memory region,
      which is at (base + len - 1).  */
 
+  /* If the end of the memory region has already been instrumented, do
+     not instrument it.  */
+  tree end = asan_mem_ref_get_end (base, len);
+  if (has_mem_ref_been_instrumented (end, 1))
+    return;
+
   /* offset = len - 1;  */
   len = unshare_expr (len);
   tree offset;
@@ -982,434 +1620,221 @@ instrument_mem_region_access (tree base, tree len,
       g = gimple_build_assign_with_ops (MINUS_EXPR, t, len,
					 build_int_cst (size_type_node, 1));
       gimple_set_location (g, location);
-      gimple_seq_add_stmt_without_update (&seq, g);
-      offset = gimple_assign_lhs (g);
-    }
-
-  /* _1 = base;  */
-  base = unshare_expr (base);
-  gimple region_end =
-    gimple_build_assign_with_ops (TREE_CODE (base),
-				  make_ssa_name (TREE_TYPE (base), NULL),
-				  base, NULL);
-  gimple_set_location (region_end, location);
-  gimple_seq_add_stmt_without_update (&seq, region_end);
-  gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT);
-  gsi_prev (&gsi);
-
-  /* _2 = _1 + offset;  */
-  region_end =
-    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
-				  make_ssa_name (TREE_TYPE (base), NULL),
-				  gimple_assign_lhs (region_end),
-				  offset);
-  gimple_set_location (region_end, location);
-  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
-
-  /* instrument access at _2;  */
-  build_check_stmt (location, gimple_assign_lhs (region_end),
-		    &gsi, /*before_p=*/false, is_store, 1);
-}
-
-/* Instrument the call (to the builtin strlen function) pointed to by
-   ITER.
-
-   This function instruments the access to the first byte of the
-   argument, right before the call.  After the call it instruments the
-   access to the last byte of the argument; it uses the result of the
-   call to deduce the offset of that last byte.
-
-   Upon completion, iff the call has actullay been instrumented, this
-   function returns TRUE and *ITER points to the statement logically
-   following the built-in strlen function call *ITER was initially
-   pointing to.  Otherwise, the function returns FALSE and *ITER
-   remains unchanged.  */
-
-static bool
-instrument_strlen_call (gimple_stmt_iterator *iter)
-{
-  gimple call = gsi_stmt (*iter);
-  gcc_assert (is_gimple_call (call));
-
-  tree callee = gimple_call_fndecl (call);
-  gcc_assert (is_builtin_fn (callee)
-	      && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL
-	      && DECL_FUNCTION_CODE (callee) == BUILT_IN_STRLEN);
-
-  tree len = gimple_call_lhs (call);
-  if (len == NULL)
-    /* Some passes might clear the return value of the strlen call;
-       bail out in that case.  Return FALSE as we are not advancing
-       *ITER.  */
-    return false;
-  gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (len)));
-
-  location_t loc = gimple_location (call);
-  tree str_arg = gimple_call_arg (call, 0);
-
-  /* Instrument the access to the first byte of str_arg.  i.e:
-
-     _1 = str_arg;  instrument (_1); */
-  gimple str_arg_ssa =
-    gimple_build_assign_with_ops (NOP_EXPR,
-				  make_ssa_name (build_pointer_type
-						 (char_type_node), NULL),
-				  str_arg, NULL);
-  gimple_set_location (str_arg_ssa, loc);
-  gimple_stmt_iterator gsi = *iter;
-  gsi_insert_before (&gsi, str_arg_ssa, GSI_NEW_STMT);
-  build_check_stmt (loc, gimple_assign_lhs (str_arg_ssa), &gsi,
-		    /*before_p=*/false, /*is_store=*/false, 1);
-
-  /* If we initially had an instruction like:
-
-	 int n = strlen (str)
-
-     we now want to instrument the access to str[n], after the
-     instruction above.*/
-
-  /* So let's build the access to str[n] that is, access through the
-     pointer_plus expr: (_1 + len).
-     */
-  gimple stmt =
-    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
-				  make_ssa_name (TREE_TYPE (str_arg),
-						 NULL),
-				  gimple_assign_lhs (str_arg_ssa),
-				  len);
-  gimple_set_location (stmt, loc);
-  gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
-
-  build_check_stmt (loc, gimple_assign_lhs (stmt), &gsi,
-		    /*before_p=*/false, /*is_store=*/false, 1);
-
-  /* Ensure that iter points to the statement logically following the
-     one it was initially pointing to.  */
-  *iter = gsi;
-  /* As *ITER has been advanced to point to the next statement, let's
-     return true to inform transform_statements that it shouldn't
-     advance *ITER anymore; otherwises it will skip that next
-     statement, which wouldn't be instrumented.  */
-  return true;
-}
-
-/* Instrument the call to a built-in memory access function that is
-   pointed to by the iterator ITER.
-
-   Upon completion, return TRUE iff *ITER has been advanced to the
-   statement following the one it was originally pointing to.  */
-
-static bool
-instrument_builtin_call (gimple_stmt_iterator *iter)
-{
-  gimple call = gsi_stmt (*iter);
-
-  gcc_checking_assert (is_gimple_builtin_call (call));
-
-  tree callee = gimple_call_fndecl (call);
-  location_t loc = gimple_location (call);
-  tree source0 = NULL_TREE, source1 = NULL_TREE,
-    dest = NULL_TREE, len = NULL_TREE;
-  bool is_store = true;
-
-  switch (DECL_FUNCTION_CODE (callee))
-    {
-      /* (s, s, n) style memops.  */
-    case BUILT_IN_BCMP:
-    case BUILT_IN_MEMCMP:
-      source0 = gimple_call_arg (call, 0);
-      source1 = gimple_call_arg (call, 1);
-      len = gimple_call_arg (call, 2);
-      break;
-
-      /* (src, dest, n) style memops.  */
-    case BUILT_IN_BCOPY:
-      source0 = gimple_call_arg (call, 0);
-      dest = gimple_call_arg (call, 1);
-      len = gimple_call_arg (call, 2);
-      break;
-
-      /* (dest, src, n) style memops.
*/ - case BUILT_IN_MEMCPY: - case BUILT_IN_MEMCPY_CHK: - case BUILT_IN_MEMMOVE: - case BUILT_IN_MEMMOVE_CHK: - case BUILT_IN_MEMPCPY: - case BUILT_IN_MEMPCPY_CHK: - dest = gimple_call_arg (call, 0); - source0 = gimple_call_arg (call, 1); - len = gimple_call_arg (call, 2); - break; - - /* (dest, n) style memops. */ - case BUILT_IN_BZERO: - dest = gimple_call_arg (call, 0); - len = gimple_call_arg (call, 1); - break; - - /* (dest, x, n) style memops*/ - case BUILT_IN_MEMSET: - case BUILT_IN_MEMSET_CHK: - dest = gimple_call_arg (call, 0); - len = gimple_call_arg (call, 2); - break; - - case BUILT_IN_STRLEN: - return instrument_strlen_call (iter); - - /* And now the __atomic* and __sync builtins. - These are handled differently from the classical memory memory - access builtins above. */ - - case BUILT_IN_ATOMIC_LOAD_1: - case BUILT_IN_ATOMIC_LOAD_2: - case BUILT_IN_ATOMIC_LOAD_4: - case BUILT_IN_ATOMIC_LOAD_8: - case BUILT_IN_ATOMIC_LOAD_16: - is_store = false; - /* fall through. */ - - case BUILT_IN_SYNC_FETCH_AND_ADD_1: - case BUILT_IN_SYNC_FETCH_AND_ADD_2: - case BUILT_IN_SYNC_FETCH_AND_ADD_4: - case BUILT_IN_SYNC_FETCH_AND_ADD_8: - case BUILT_IN_SYNC_FETCH_AND_ADD_16: - - case BUILT_IN_SYNC_FETCH_AND_SUB_1: - case BUILT_IN_SYNC_FETCH_AND_SUB_2: - case BUILT_IN_SYNC_FETCH_AND_SUB_4: - case BUILT_IN_SYNC_FETCH_AND_SUB_8: - case BUILT_IN_SYNC_FETCH_AND_SUB_16: - - case BUILT_IN_SYNC_FETCH_AND_OR_1: - case BUILT_IN_SYNC_FETCH_AND_OR_2: - case BUILT_IN_SYNC_FETCH_AND_OR_4: - case BUILT_IN_SYNC_FETCH_AND_OR_8: - case BUILT_IN_SYNC_FETCH_AND_OR_16: - - case BUILT_IN_SYNC_FETCH_AND_AND_1: - case BUILT_IN_SYNC_FETCH_AND_AND_2: - case BUILT_IN_SYNC_FETCH_AND_AND_4: - case BUILT_IN_SYNC_FETCH_AND_AND_8: - case BUILT_IN_SYNC_FETCH_AND_AND_16: - - case BUILT_IN_SYNC_FETCH_AND_XOR_1: - case BUILT_IN_SYNC_FETCH_AND_XOR_2: - case BUILT_IN_SYNC_FETCH_AND_XOR_4: - case BUILT_IN_SYNC_FETCH_AND_XOR_8: - case BUILT_IN_SYNC_FETCH_AND_XOR_16: - - case BUILT_IN_SYNC_FETCH_AND_NAND_1: - 
case BUILT_IN_SYNC_FETCH_AND_NAND_2: - case BUILT_IN_SYNC_FETCH_AND_NAND_4: - case BUILT_IN_SYNC_FETCH_AND_NAND_8: - - case BUILT_IN_SYNC_ADD_AND_FETCH_1: - case BUILT_IN_SYNC_ADD_AND_FETCH_2: - case BUILT_IN_SYNC_ADD_AND_FETCH_4: - case BUILT_IN_SYNC_ADD_AND_FETCH_8: - case BUILT_IN_SYNC_ADD_AND_FETCH_16: - - case BUILT_IN_SYNC_SUB_AND_FETCH_1: - case BUILT_IN_SYNC_SUB_AND_FETCH_2: - case BUILT_IN_SYNC_SUB_AND_FETCH_4: - case BUILT_IN_SYNC_SUB_AND_FETCH_8: - case BUILT_IN_SYNC_SUB_AND_FETCH_16: - - case BUILT_IN_SYNC_OR_AND_FETCH_1: - case BUILT_IN_SYNC_OR_AND_FETCH_2: - case BUILT_IN_SYNC_OR_AND_FETCH_4: - case BUILT_IN_SYNC_OR_AND_FETCH_8: - case BUILT_IN_SYNC_OR_AND_FETCH_16: + gimple_seq_add_stmt_without_update (&seq, g); + offset = gimple_assign_lhs (g); + } - case BUILT_IN_SYNC_AND_AND_FETCH_1: - case BUILT_IN_SYNC_AND_AND_FETCH_2: - case BUILT_IN_SYNC_AND_AND_FETCH_4: - case BUILT_IN_SYNC_AND_AND_FETCH_8: - case BUILT_IN_SYNC_AND_AND_FETCH_16: + /* _1 = base; */ + base = unshare_expr (base); + gimple region_end = + gimple_build_assign_with_ops (TREE_CODE (base), + make_ssa_name (TREE_TYPE (base), NULL), + base, NULL); + gimple_set_location (region_end, location); + gimple_seq_add_stmt_without_update (&seq, region_end); + gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT); + gsi_prev (&gsi); - case BUILT_IN_SYNC_XOR_AND_FETCH_1: - case BUILT_IN_SYNC_XOR_AND_FETCH_2: - case BUILT_IN_SYNC_XOR_AND_FETCH_4: - case BUILT_IN_SYNC_XOR_AND_FETCH_8: - case BUILT_IN_SYNC_XOR_AND_FETCH_16: + /* _2 = _1 + offset; */ + region_end = + gimple_build_assign_with_ops (POINTER_PLUS_EXPR, + make_ssa_name (TREE_TYPE (base), NULL), + gimple_assign_lhs (region_end), + offset); + gimple_set_location (region_end, location); + gsi_insert_after (&gsi, region_end, GSI_NEW_STMT); - case BUILT_IN_SYNC_NAND_AND_FETCH_1: - case BUILT_IN_SYNC_NAND_AND_FETCH_2: - case BUILT_IN_SYNC_NAND_AND_FETCH_4: - case BUILT_IN_SYNC_NAND_AND_FETCH_8: + /* instrument access at _2; */ + build_check_stmt 
(location, gimple_assign_lhs (region_end), + &gsi, /*before_p=*/false, is_store, 1); - case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_1: - case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_2: - case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_4: - case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_8: - case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_16: + update_mem_ref_hash_table (end, 1); +} - case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_1: - case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_2: - case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_4: - case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_8: - case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_16: +/* Instrument the call (to the builtin strlen function) pointed to by + ITER. - case BUILT_IN_SYNC_LOCK_TEST_AND_SET_1: - case BUILT_IN_SYNC_LOCK_TEST_AND_SET_2: - case BUILT_IN_SYNC_LOCK_TEST_AND_SET_4: - case BUILT_IN_SYNC_LOCK_TEST_AND_SET_8: - case BUILT_IN_SYNC_LOCK_TEST_AND_SET_16: + This function instruments the access to the first byte of the + argument, right before the call. After the call it instruments the + access to the last byte of the argument; it uses the result of the + call to deduce the offset of that last byte. - case BUILT_IN_SYNC_LOCK_RELEASE_1: - case BUILT_IN_SYNC_LOCK_RELEASE_2: - case BUILT_IN_SYNC_LOCK_RELEASE_4: - case BUILT_IN_SYNC_LOCK_RELEASE_8: - case BUILT_IN_SYNC_LOCK_RELEASE_16: + Upon completion, iff the call has actullay been instrumented, this + function returns TRUE and *ITER points to the statement logically + following the built-in strlen function call *ITER was initially + pointing to. Otherwise, the function returns FALSE and *ITER + remains unchanged. 
*/ - case BUILT_IN_ATOMIC_EXCHANGE_1: - case BUILT_IN_ATOMIC_EXCHANGE_2: - case BUILT_IN_ATOMIC_EXCHANGE_4: - case BUILT_IN_ATOMIC_EXCHANGE_8: - case BUILT_IN_ATOMIC_EXCHANGE_16: +static bool +instrument_strlen_call (gimple_stmt_iterator *iter) +{ + gimple call = gsi_stmt (*iter); + gcc_assert (is_gimple_call (call)); - case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_1: - case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_2: - case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_4: - case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_8: - case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_16: + tree callee = gimple_call_fndecl (call); + gcc_assert (is_builtin_fn (callee) + && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL + && DECL_FUNCTION_CODE (callee) == BUILT_IN_STRLEN); - case BUILT_IN_ATOMIC_STORE_1: - case BUILT_IN_ATOMIC_STORE_2: - case BUILT_IN_ATOMIC_STORE_4: - case BUILT_IN_ATOMIC_STORE_8: - case BUILT_IN_ATOMIC_STORE_16: + tree len = gimple_call_lhs (call); + if (len == NULL) + /* Some passes might clear the return value of the strlen call; + bail out in that case. Return FALSE as we are not advancing + *ITER. */ + return false; + gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (len))); - case BUILT_IN_ATOMIC_ADD_FETCH_1: - case BUILT_IN_ATOMIC_ADD_FETCH_2: - case BUILT_IN_ATOMIC_ADD_FETCH_4: - case BUILT_IN_ATOMIC_ADD_FETCH_8: - case BUILT_IN_ATOMIC_ADD_FETCH_16: + location_t loc = gimple_location (call); + tree str_arg = gimple_call_arg (call, 0); - case BUILT_IN_ATOMIC_SUB_FETCH_1: - case BUILT_IN_ATOMIC_SUB_FETCH_2: - case BUILT_IN_ATOMIC_SUB_FETCH_4: - case BUILT_IN_ATOMIC_SUB_FETCH_8: - case BUILT_IN_ATOMIC_SUB_FETCH_16: + /* Instrument the access to the first byte of str_arg. 
i.e: - case BUILT_IN_ATOMIC_AND_FETCH_1: - case BUILT_IN_ATOMIC_AND_FETCH_2: - case BUILT_IN_ATOMIC_AND_FETCH_4: - case BUILT_IN_ATOMIC_AND_FETCH_8: - case BUILT_IN_ATOMIC_AND_FETCH_16: + _1 = str_arg; instrument (_1); */ + gimple str_arg_ssa = + gimple_build_assign_with_ops (NOP_EXPR, + make_ssa_name (build_pointer_type + (char_type_node), NULL), + str_arg, NULL); + gimple_set_location (str_arg_ssa, loc); + gimple_stmt_iterator gsi = *iter; + gsi_insert_before (&gsi, str_arg_ssa, GSI_NEW_STMT); + build_check_stmt (loc, gimple_assign_lhs (str_arg_ssa), &gsi, + /*before_p=*/false, /*is_store=*/false, 1); - case BUILT_IN_ATOMIC_NAND_FETCH_1: - case BUILT_IN_ATOMIC_NAND_FETCH_2: - case BUILT_IN_ATOMIC_NAND_FETCH_4: - case BUILT_IN_ATOMIC_NAND_FETCH_8: - case BUILT_IN_ATOMIC_NAND_FETCH_16: + /* If we initially had an instruction like: - case BUILT_IN_ATOMIC_XOR_FETCH_1: - case BUILT_IN_ATOMIC_XOR_FETCH_2: - case BUILT_IN_ATOMIC_XOR_FETCH_4: - case BUILT_IN_ATOMIC_XOR_FETCH_8: - case BUILT_IN_ATOMIC_XOR_FETCH_16: + int n = strlen (str) - case BUILT_IN_ATOMIC_OR_FETCH_1: - case BUILT_IN_ATOMIC_OR_FETCH_2: - case BUILT_IN_ATOMIC_OR_FETCH_4: - case BUILT_IN_ATOMIC_OR_FETCH_8: - case BUILT_IN_ATOMIC_OR_FETCH_16: + we now want to instrument the access to str[n], after the + instruction above.*/ - case BUILT_IN_ATOMIC_FETCH_ADD_1: - case BUILT_IN_ATOMIC_FETCH_ADD_2: - case BUILT_IN_ATOMIC_FETCH_ADD_4: - case BUILT_IN_ATOMIC_FETCH_ADD_8: - case BUILT_IN_ATOMIC_FETCH_ADD_16: + /* So let's build the access to str[n] that is, access through the + pointer_plus expr: (_1 + len). 
*/ + gimple stmt = + gimple_build_assign_with_ops (POINTER_PLUS_EXPR, + make_ssa_name (TREE_TYPE (str_arg), + NULL), + gimple_assign_lhs (str_arg_ssa), + len); + gimple_set_location (stmt, loc); + gsi_insert_after (&gsi, stmt, GSI_NEW_STMT); - case BUILT_IN_ATOMIC_FETCH_SUB_1: - case BUILT_IN_ATOMIC_FETCH_SUB_2: - case BUILT_IN_ATOMIC_FETCH_SUB_4: - case BUILT_IN_ATOMIC_FETCH_SUB_8: - case BUILT_IN_ATOMIC_FETCH_SUB_16: + build_check_stmt (loc, gimple_assign_lhs (stmt), &gsi, + /*before_p=*/false, /*is_store=*/false, 1); - case BUILT_IN_ATOMIC_FETCH_AND_1: - case BUILT_IN_ATOMIC_FETCH_AND_2: - case BUILT_IN_ATOMIC_FETCH_AND_4: - case BUILT_IN_ATOMIC_FETCH_AND_8: - case BUILT_IN_ATOMIC_FETCH_AND_16: + /* Ensure that iter points to the statement logically following the + one it was initially pointing to. */ + *iter = gsi; + /* As *ITER has been advanced to point to the next statement, let's + return true to inform transform_statements that it shouldn't + advance *ITER anymore; otherwises it will skip that next + statement, which wouldn't be instrumented. */ + return true; +} - case BUILT_IN_ATOMIC_FETCH_NAND_1: - case BUILT_IN_ATOMIC_FETCH_NAND_2: - case BUILT_IN_ATOMIC_FETCH_NAND_4: - case BUILT_IN_ATOMIC_FETCH_NAND_8: - case BUILT_IN_ATOMIC_FETCH_NAND_16: +/* Instrument the call to a built-in memory access function that is + pointed to by the iterator ITER. - case BUILT_IN_ATOMIC_FETCH_XOR_1: - case BUILT_IN_ATOMIC_FETCH_XOR_2: - case BUILT_IN_ATOMIC_FETCH_XOR_4: - case BUILT_IN_ATOMIC_FETCH_XOR_8: - case BUILT_IN_ATOMIC_FETCH_XOR_16: + Upon completion, return TRUE iff *ITER has been advanced to the + statement following the one it was originally pointing to. */ - case BUILT_IN_ATOMIC_FETCH_OR_1: - case BUILT_IN_ATOMIC_FETCH_OR_2: - case BUILT_IN_ATOMIC_FETCH_OR_4: - case BUILT_IN_ATOMIC_FETCH_OR_8: - case BUILT_IN_ATOMIC_FETCH_OR_16: - { - dest = gimple_call_arg (call, 0); - /* So DEST represents the address of a memory location. 
- instrument_derefs wants the memory location, so lets - dereference the address DEST before handing it to - instrument_derefs. */ - if (TREE_CODE (dest) == ADDR_EXPR) - dest = TREE_OPERAND (dest, 0); - else if (TREE_CODE (dest) == SSA_NAME) - dest = build2 (MEM_REF, TREE_TYPE (TREE_TYPE (dest)), - dest, build_int_cst (TREE_TYPE (dest), 0)); - else - gcc_unreachable (); +static bool +instrument_builtin_call (gimple_stmt_iterator *iter) +{ + bool iter_advanced_p = false; + gimple call = gsi_stmt (*iter); - instrument_derefs (iter, dest, loc, is_store); - return false; - } + gcc_checking_assert (gimple_call_builtin_p (call, BUILT_IN_NORMAL)); - default: - /* The other builtins memory access are not instrumented in this - function because they either don't have any length parameter, - or their length parameter is just a limit. */ - break; - } + tree callee = gimple_call_fndecl (call); + location_t loc = gimple_location (call); - if (len != NULL_TREE) + if (DECL_FUNCTION_CODE (callee) == BUILT_IN_STRLEN) + iter_advanced_p = instrument_strlen_call (iter); + else { - if (source0 != NULL_TREE) - instrument_mem_region_access (source0, len, iter, - loc, /*is_store=*/false); - if (source1 != NULL_TREE) - instrument_mem_region_access (source1, len, iter, - loc, /*is_store=*/false); - else if (dest != NULL_TREE) - instrument_mem_region_access (dest, len, iter, - loc, /*is_store=*/true); - - *iter = gsi_for_stmt (call); - return false; + asan_mem_ref src0, src1, dest; + asan_mem_ref_init (&src0, NULL, 1); + asan_mem_ref_init (&src1, NULL, 1); + asan_mem_ref_init (&dest, NULL, 1); + + tree src0_len = NULL_TREE, src1_len = NULL_TREE, dest_len = NULL_TREE; + bool src0_is_store = false, src1_is_store = false, + dest_is_store = false, dest_is_deref = false; + + if (get_mem_refs_of_builtin_call (call, + &src0, &src0_len, &src0_is_store, + &src1, &src0_len, &src1_is_store, + &dest, &dest_len, &dest_is_store, + &dest_is_deref)) + { + if (dest_is_deref) + { + instrument_derefs (iter, 
dest.start, loc, dest_is_store); + gsi_next (iter); + iter_advanced_p = true; + } + else if (src0_len || src1_len || dest_len) + { + if (src0.start) + instrument_mem_region_access (src0.start, src0_len, + iter, loc, /*is_store=*/false); + if (src1.start != NULL_TREE) + instrument_mem_region_access (src1.start, src1_len, + iter, loc, /*is_store=*/false); + if (dest.start != NULL_TREE) + instrument_mem_region_access (dest.start, dest_len, + iter, loc, /*is_store=*/true); + *iter = gsi_for_stmt (call); + gsi_next (iter); + iter_advanced_p = true; + } + } } - return false; + return iter_advanced_p; } /* Instrument the assignment statement ITER if it is subject to - instrumentation. */ + instrumentation. Return TRUE iff instrumentation actually + happened. In that case, the iterator ITER is advanced to the next + logical expression following the one initially pointed to by ITER, + and the relevant memory reference that which access has been + instrumented is added to the memory references hash table. 
*/ -static void -instrument_assignment (gimple_stmt_iterator *iter) +static bool +maybe_instrument_assignment (gimple_stmt_iterator *iter) { gimple s = gsi_stmt (*iter); gcc_assert (gimple_assign_single_p (s)); + tree ref_expr = NULL_TREE; + bool is_store, is_instrumented = false; + if (gimple_store_p (s)) - instrument_derefs (iter, gimple_assign_lhs (s), - gimple_location (s), true); + { + ref_expr = gimple_assign_lhs (s); + is_store = true; + instrument_derefs (iter, ref_expr, + gimple_location (s), + is_store); + is_instrumented = true; + } + if (gimple_assign_load_p (s)) - instrument_derefs (iter, gimple_assign_rhs1 (s), - gimple_location (s), false); + { + ref_expr = gimple_assign_rhs1 (s); + is_store = false; + instrument_derefs (iter, ref_expr, + gimple_location (s), + is_store); + is_instrumented = true; + } + + if (is_instrumented) + gsi_next (iter); + + return is_instrumented; } /* Instrument the function call pointed to by the iterator ITER, if it @@ -1424,10 +1849,11 @@ static bool maybe_instrument_call (gimple_stmt_iterator *iter) { gimple stmt = gsi_stmt (*iter); - bool is_builtin = is_gimple_builtin_call (stmt); - if (is_builtin - && instrument_builtin_call (iter)) + bool is_builtin = gimple_call_builtin_p (stmt, BUILT_IN_NORMAL); + + if (is_builtin && instrument_builtin_call (iter)) return true; + if (gimple_call_noreturn_p (stmt)) { if (is_builtin) @@ -1449,11 +1875,10 @@ maybe_instrument_call (gimple_stmt_iterator *iter) return false; } -/* asan: this looks too complex. Can this be done simpler? */ -/* Transform - 1) Memory references. - 2) BUILTIN_ALLOCA calls. -*/ +/* Walk each instruction of all basic block and instrument those that + represent memory references: loads, stores, or function calls. + In a given basic block, this function avoids instrumenting memory + references that have already been instrumented. 
*/ static void transform_statements (void) @@ -1464,23 +1889,38 @@ transform_statements (void) FOR_EACH_BB (bb) { + empty_mem_ref_hash_table (); + if (bb->index >= saved_last_basic_block) continue; for (i = gsi_start_bb (bb); !gsi_end_p (i);) { gimple s = gsi_stmt (i); - if (gimple_assign_single_p (s)) - instrument_assignment (&i); - else if (is_gimple_call (s)) + if (has_stmt_been_instrumented_p (s)) + gsi_next (&i); + else if (gimple_assign_single_p (s) + && maybe_instrument_assignment (&i)) + /* Nothing to do as maybe_instrument_assignment advanced + the iterator I. */; + else if (is_gimple_call (s) && maybe_instrument_call (&i)) + /* Nothing to do as maybe_instrument_call + advanced the iterator I. */; + else { - if (maybe_instrument_call (&i)) - /* Avoid gsi_next (&i), because maybe_instrument_call - advanced the I iterator already. */ - continue; + /* No instrumentation happened. + + If the current instruction is a function call, let's + forget about the memory references that got + instrumented. Otherwise we might miss some + instrumentation opportunities. 
*/ + if (is_gimple_call (s)) + empty_mem_ref_hash_table (); + + gsi_next (&i); } - gsi_next (&i); } } + free_mem_ref_resources (); } /* Build diff --git a/gcc/testsuite/c-c++-common/asan/inc.c b/gcc/testsuite/c-c++-common/asan/inc.c new file mode 100644 index 0000000..b7a8902 --- /dev/null +++ b/gcc/testsuite/c-c++-common/asan/inc.c @@ -0,0 +1,19 @@ +/* { dg-options "-fdump-tree-asan0" } */ +/* { dg-do compile } */ + +void +foo(int *a) +{ + (*a)++; +} + +int +main () +{ + int a = 0; + foo (&a); + return 0; +} + +/* { dg-final { scan-tree-dump-times "__builtin___asan_report" 1 "asan0" } } */ +/* { dg-final { scan-tree-dump "__builtin___asan_report_load4" "asan0" } } */ diff --git a/gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-1.c b/gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-1.c new file mode 100644 index 0000000..f87b1c5 --- /dev/null +++ b/gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-1.c @@ -0,0 +1,64 @@ +/* This tests that when faced with two references to the same memory + location in the same basic block, the second reference should not + be instrumented by the Address Sanitizer. */ + +/* { dg-options "-fdump-tree-asan0" } */ +/* { dg-do compile } */ + +static char tab[4] = {0}; + +static int +test0 () +{ + /* __builtin___asan_report_store1 called 2 times for the two stores + below. */ + tab[0] = 1; + tab[1] = 2; + + /* __builtin___asan_report_load1 called 1 time for the store + below. */ + char t0 = tab[1]; + + /* This load should not be instrumented because it is to the same + memory location as above. */ + char t1 = tab[1]; + + return t0 + t1; +} + +static int +test1 () +{ + /*__builtin___asan_report_store1 called 1 time here to instrument + the initialization. */ + char foo[4] = {1}; + + /*__builtin___asan_report_store1 called 2 times here to instrument + the store to the memory region of tab. */ + __builtin_memset (tab, 3, sizeof (tab)); + + /* There is no instrumentation for the two memset calls below. 
*/ + __builtin_memset (tab, 4, sizeof (tab)); + __builtin_memset (tab, 5, sizeof (tab)); + + /* There are 2 calls to __builtin___asan_report_store1 and 2 calls + to __builtin___asan_report_load1 to instrument the store to + (subset of) the memory region of tab. */ + __builtin_memcpy (&tab[1], foo, sizeof (tab) - 1); + + /* This should not generate a __builtin___asan_report_load1 because + the reference to tab[1] has been already instrumented above. */ + return tab[1]; + + /* So for these function, there should be 7 calls to + __builtin___asan_report_store1. */ +} + +int +main () +{ + return test0 () && test1 (); +} + +/* { dg-final { scan-tree-dump-times "__builtin___asan_report_store1" 7 "asan0" } } */ +/* { dg-final { scan-tree-dump-times "__builtin___asan_report_load" 2 "asan0" } } */ diff --git a/gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-2.c b/gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-2.c new file mode 100644 index 0000000..2dc55eb --- /dev/null +++ b/gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-2.c @@ -0,0 +1,24 @@ +/* This tests that when faced with two references to the same memory + location in the same basic block, the second reference should not + be instrumented by the Address Sanitizer. But in case of access to + overlapping regions we must be precise. */ + +/* { dg-options "-fdump-tree-asan0" } */ +/* { dg-do compile } */ + +int +main () +{ + char tab[5]; + + /* Here, we instrument the access at offset 0 and access at offset + 4. */ + __builtin_memset (tab, 1, sizeof (tab)); + /* We instrumented access at offset 0 above already, so only access + at offset 3 is instrumented. 
*/ + __builtin_memset (tab, 1, 3); +} + +/* { dg-final { scan-tree-dump-times "__builtin___asan_report_store1" 3 "asan0" } } */ +/* { dg-final { scan-tree-dump-times "__builtin___asan_report" 3 "asan0" } } */ + diff --git a/gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-3.c b/gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-3.c new file mode 100644 index 0000000..1858759 --- /dev/null +++ b/gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-3.c @@ -0,0 +1,16 @@ +/* { dg-options "-fdump-tree-asan0" } */ +/* { dg-do compile } */ + +char +foo (__INT32_TYPE__ *p) +{ + /* This generates a __builtin___asan_report_load1. */ + __INT32_TYPE__ ret = *(char *) p; + /* This generates a __builtin___asan_report_store4 depending on the. */ + *p = 26; + return ret; +} + +/* { dg-final { scan-tree-dump-times "__builtin___asan_report" 2 "asan0" } } */ +/* { dg-final { scan-tree-dump-times "__builtin___asan_report_load1" 1 "asan0" } } */ +/* { dg-final { scan-tree-dump-times "__builtin___asan_report_store" 1 "asan0" } } */