From patchwork Mon Aug 2 14:21:50 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aldy Hernandez X-Patchwork-Id: 60539 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id EA8F71007D1 for ; Tue, 3 Aug 2010 00:22:12 +1000 (EST) Received: (qmail 24578 invoked by alias); 2 Aug 2010 14:22:08 -0000 Received: (qmail 24561 invoked by uid 22791); 2 Aug 2010 14:22:05 -0000 X-SWARE-Spam-Status: No, hits=-6.2 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_HI, SPF_HELO_PASS, TW_CX, TW_TM, T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 02 Aug 2010 14:21:56 +0000 Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id o72ELsC6032648 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 2 Aug 2010 10:21:54 -0400 Received: from redhat.com (vpn-10-11.rdu.redhat.com [10.11.10.11]) by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id o72ELpU4019882 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO); Mon, 2 Aug 2010 10:21:53 -0400 Date: Mon, 2 Aug 2010 10:21:50 -0400 From: Aldy Hernandez To: Richard Henderson Cc: gcc-patches@gcc.gnu.org Subject: Re: [trans-mem] optimize exception object memory Message-ID: <20100802142149.GA3088@redhat.com> References: <20100719212255.GA12644@redhat.com> <4C44D0D4.2020109@redhat.com> <20100729174317.GA2836@redhat.com> <4C51CB13.1090506@redhat.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <4C51CB13.1090506@redhat.com> User-Agent: Mutt/1.5.20 (2009-08-17) Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org > This does not follow in general. It does if we're also told > that it's new storage (i.e. malloc attribute). Well, we can check for ECF_MALLOC and tag _ITM_cxa_allocate_exception() with the malloc attribute. See patch. > Now, the irritating part here is that we Know that begin_catch > must be returning transaction_local storage, but only because > we know it must be part of a properly nested try-catch block. > > Unfortunately, begin_catch can't be marked malloc because the > constructor for the thrown data type can do anything at all, > including storing the value of "this" in some global storage. Hmmm, I guess we won't optimize the memory returned by _ITM_cxa_begin_catch(), unless we want to go back to special casing all this, which btw, doesn't seem like such a bad idea now. I do wonder how much use we'll get out of optimizing the exception object anyhow. By definition, exceptions are rare. Is it worth this hassle? Hmmm... ?? libitm/ * libitm.h (THROW): Define. Add attributes to _ITM_cxa_allocate_exception and _ITM_cxa_begin_catch. * eh_cpp.cc (_ITM_cxa_throw): Fix prototype argument. (_ITM_cxa_throw) Fix argument type. gcc/ * trans-mem.c (thread_private_new_memory): Handle `thread_local' attribute. (is_thread_local): New. * c-common.c (handle_thread_local_attribute): New. Add `thread_local' entry to c_common_att. * doc/extend.texi: Add documentation for `thread_local' attribute. Index: libitm/libitm.h =================================================================== --- libitm/libitm.h (revision 162234) +++ libitm/libitm.h (working copy) @@ -33,7 +33,10 @@ #include #ifdef __cplusplus +#define THROW throw() extern "C" { +#else +#define THROW #endif #ifdef __i386__ @@ -278,9 +281,9 @@ extern void *_ITM_getTMCloneSafe (void * extern void _ITM_registerTMCloneTable (void *, size_t); extern void _ITM_deregisterTMCloneTable (void *); -extern void *_ITM_cxa_allocate_exception (size_t); -extern void _ITM_cxa_throw (void *obj, void *tinfo, void *dest); -extern void *_ITM_cxa_begin_catch (void *exc_ptr); +extern void *_ITM_cxa_allocate_exception (size_t) THROW __attribute__((malloc, thread_local)); +extern void _ITM_cxa_throw (void *obj, void *tinfo, void (*)(void*)); +extern void *_ITM_cxa_begin_catch (void *exc_ptr) __attribute__((thread_local)); extern void _ITM_cxa_end_catch (void); extern void _ITM_commitTransactionEH(void *exc_ptr) ITM_REGPARM; Index: libitm/eh_cpp.cc =================================================================== --- libitm/eh_cpp.cc (revision 162234) +++ libitm/eh_cpp.cc (working copy) @@ -34,7 +34,7 @@ using namespace GTM; extern "C" { extern void *__cxa_allocate_exception (size_t) WEAK; -extern void __cxa_throw (void *, void *, void *) WEAK; +extern void __cxa_throw (void *, void *, void (*) (void*)) WEAK; extern void *__cxa_begin_catch (void *) WEAK; extern void *__cxa_end_catch (void) WEAK; extern void __cxa_tm_cleanup (void *, void *, unsigned int) WEAK; @@ -51,7 +51,7 @@ _ITM_cxa_allocate_exception (size_t size } void -_ITM_cxa_throw (void *obj, void *tinfo, void *dest) +_ITM_cxa_throw (void *obj, void *tinfo, void (*dest)(void*)) { gtm_tx()->cxa_unthrown = NULL; __cxa_throw (obj, tinfo, dest); Index: gcc/doc/extend.texi =================================================================== --- gcc/doc/extend.texi (revision 161318) +++ gcc/doc/extend.texi (working copy) @@ -1916,6 +1916,7 @@ attributes are currently defined for fun @code{destructor}, @code{used}, @code{unused}, @code{deprecated}, @code{weak}, @code{malloc}, @code{alias}, @code{warn_unused_result}, @code{nonnull}, @code{gnu_inline}, @code{externally_visible}, +@code{thread_local}, @code{hot}, @code{cold}, @code{artificial}, @code{error} and @code{warning}. Several other attributes are defined for functions on particular target systems. Other attributes, including @code{section} @@ -2572,6 +2573,19 @@ void __attribute__ ((interrupt, use_shad use_debug_exception_return)) v7 (); @end smallexample +@item thread_local +@cindex @code{thread_local} attribute. +This attribute, attached to a function definition, specifies that the +memory returned by this function lives entirely within the current +thread and is not visible to any other threads. This attribute is +only applicable to functions returning pointers. + +If this attribute is present, the compiler is able to optimize certain +memory accesses that are known to only affect the current thread. + +Currently, GCC only uses this attribute for transactional memory +code (available with @option{-fgnu-tm}). + @item interrupt_handler @cindex interrupt handler functions on the Blackfin, m68k, H8/300 and SH processors Use this attribute on the Blackfin, m68k, H8/300, H8/300H, H8S, and SH to Index: gcc/testsuite/c-c++-common/tm/thread-local.c =================================================================== --- gcc/testsuite/c-c++-common/tm/thread-local.c (revision 0) +++ gcc/testsuite/c-c++-common/tm/thread-local.c (revision 0) @@ -0,0 +1,17 @@ +// { dg-do compile } +// { dg-options "-fgnu-tm -O -fdump-tree-optimized" } + +/* Test that `thread_local' attribute inhibits memory instrumentation. */ + +extern char *funky (void) __attribute__((transaction_pure, thread_local, malloc)); + +void tranfunc() { + __transaction { + char *p; + p = funky(); + *p = 5; /* This should not be instrumented. */ + } +} + +// { dg-final { scan-tree-dump-times "_ITM_WU" 0 "optimized" } } +// { dg-final { cleanup-tree-dump "optimized" } } Index: gcc/trans-mem.c =================================================================== --- gcc/trans-mem.c (revision 162577) +++ gcc/trans-mem.c (working copy) @@ -222,6 +222,18 @@ is_tm_safe (tree x) return false; } +/* Return true if X has been marked `thread_local'. */ + +static bool +is_thread_local (tree x) +{ + tree attrs = get_attrs_for (x); + + if (attrs) + return lookup_attribute ("thread_local", attrs) != NULL; + return false; +} + /* Return true if CALL is const, or tm_pure. */ static bool @@ -1258,6 +1270,7 @@ thread_private_new_memory (basic_block e tm_new_mem_map_t elt, *elt_p; tree val = x; enum thread_memory_type retval = mem_transaction_local; + bool escapes_function_p = false; if (!entry_block || TREE_CODE (x) != SSA_NAME @@ -1283,17 +1296,14 @@ thread_private_new_memory (basic_block e do { if (ptr_deref_may_alias_global_p (x)) - { - /* Address escapes. This is not thread-private. */ - retval = mem_non_local; - goto new_memory_ret; - } + escapes_function_p = true; stmt = SSA_NAME_DEF_STMT (x); /* If the malloc call is outside the transaction, this is thread-local. */ if (retval != mem_thread_local + && gimple_code (stmt) != GIMPLE_NOP && !dominated_by_p (CDI_DOMINATORS, gimple_bb (stmt), entry_block)) retval = mem_thread_local; @@ -1357,6 +1367,34 @@ thread_private_new_memory (basic_block e retval = mem_non_local; new_memory_ret: + if (escapes_function_p) + { + if (is_gimple_call (stmt) && gimple_call_flags (stmt) & ECF_MALLOC) + { + tree fn = gimple_call_fn (stmt); + + /* If the address has been declared thread-local, believe it. */ + if (is_thread_local (TREE_TYPE (fn))) + { + /* If the thread-local call is inside the transaction, + the memory is transaction local. */ + if (dominated_by_p (CDI_DOMINATORS, + gimple_bb (stmt), entry_block)) + retval = mem_transaction_local; + else + retval = mem_thread_local; + } + else + { + /* Otherwise, we have to assume that any address that + escapes the function also escapes the thread. */ + retval = mem_non_local; + } + } + else + retval = mem_non_local; + } + elt_p->local_new_memory = retval; return retval; } Index: gcc/c-common.c =================================================================== --- gcc/c-common.c (revision 161318) +++ gcc/c-common.c (working copy) @@ -517,6 +517,7 @@ static tree handle_no_limit_stack_attrib static tree handle_pure_attribute (tree *, tree, tree, int, bool *); static tree handle_tm_attribute (tree *, tree, tree, int, bool *); static tree handle_tm_wrap_attribute (tree *, tree, tree, int, bool *); +static tree handle_thread_local_attribute (tree *, tree, tree, int, bool *); static tree handle_novops_attribute (tree *, tree, tree, int, bool *); static tree handle_deprecated_attribute (tree *, tree, tree, int, bool *); @@ -807,6 +808,8 @@ const struct attribute_spec c_common_att handle_tm_attribute }, { "transaction_wrap", 1, 1, true, false, false, handle_tm_wrap_attribute }, + { "thread_local", 0, 0, false, true, false, + handle_thread_local_attribute }, /* For internal use (marking of builtins) only. The name contains space to prevent its usage in source code. */ { "no vops", 0, 0, true, false, false, @@ -7205,6 +7208,30 @@ handle_pure_attribute (tree *node, tree return NULL_TREE; } +/* Handle a "thread_local" attribute; arguments as in + struct attribute_spec.handler. */ + +static tree +handle_thread_local_attribute (tree *node, tree name, tree ARG_UNUSED (args), + int ARG_UNUSED (flags), bool *no_add_attrs) +{ + if (TREE_CODE (*node) == FUNCTION_TYPE + && TREE_CODE (TREE_TYPE (*node)) == POINTER_TYPE) + { + /* Do nothing. No sense wasting a bit to store this + information. We can just get at the attribute string + directly later. */ + ; + } + else + { + warning (OPT_Wattributes, "%qE attribute ignored", name); + *no_add_attrs = true; + } + + return NULL_TREE; +} + /* Digest an attribute list destined for a transactional memory statement. ALLOWED is the set of attributes that are allowed for this statement; return the attribute we parsed. Multiple attributes are never allowed. */