From patchwork Tue Mar 11 02:54:18 2014
X-Patchwork-Submitter: John Carr
X-Patchwork-Id: 328905
Message-Id: <201403110254.s2B2sJ1g023579@outgoing.mit.edu>
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] ARM: Weaker memory barriers
Date: Mon, 10 Mar 2014 22:54:18 -0400
From: John Carr

A comment in arm/sync.md notes "We should consider issuing a inner
shareability zone barrier here instead."

Here is my first attempt at a patch to emit weaker memory barriers.
Three instructions seem to be relevant for user mode code on my
Cortex A9 Linux box:

	dmb ishst, dmb ish, dmb sy

I believe these correspond to a release barrier, a full barrier
with respect to other CPUs, and a full barrier that also orders
relative to I/O.

Consider this a request for comments on whether the approach is
correct.  I haven't done any testing yet (beyond eyeballing the
assembly output).

2014-03-10  John F. Carr

	* config/arm/sync.md (mem_thread_fence): New expander for weaker
	memory barriers.
	* config/arm/arm.c (arm_pre_atomic_barrier, arm_post_atomic_barrier):
	Emit only as strong a fence as needed.

Index: config/arm/arm.c
===================================================================
--- config/arm/arm.c	(revision 208470)
+++ config/arm/arm.c	(working copy)
@@ -29813,7 +29813,12 @@
 arm_pre_atomic_barrier (enum memmodel model)
 {
   if (need_atomic_barrier_p (model, true))
-    emit_insn (gen_memory_barrier ());
+    {
+      if (HAVE_mem_thread_fence)
+	emit_insn (gen_mem_thread_fence (GEN_INT ((int) model)));
+      else
+	emit_insn (gen_memory_barrier ());
+    }
 }
 
 static void
@@ -29820,7 +29825,12 @@
 arm_post_atomic_barrier (enum memmodel model)
 {
   if (need_atomic_barrier_p (model, false))
-    emit_insn (gen_memory_barrier ());
+    {
+      if (HAVE_mem_thread_fence)
+	emit_insn (gen_mem_thread_fence (GEN_INT ((int) model)));
+      else
+	emit_insn (gen_memory_barrier ());
+    }
 }
 
 /* Emit the load-exclusive and store-exclusive instructions.
Index: config/arm/sync.md
===================================================================
--- config/arm/sync.md	(revision 208470)
+++ config/arm/sync.md	(working copy)
@@ -34,26 +34,54 @@
 (define_mode_attr sync_sfx
   [(QI "b") (HI "h") (SI "") (DI "d")])
 
+(define_expand "mem_thread_fence"
+  [(set (match_dup 1)
+	(unspec:BLK
+	 [(match_dup 1)
+	  (match_operand:SI 0 "const_int_operand")]
+	 UNSPEC_MEMORY_BARRIER))]
+  "TARGET_HAVE_DMB"
+{
+  if (INTVAL(operands[0]) == MEMMODEL_RELAXED)
+    DONE;
+  operands[1] = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
+  MEM_VOLATILE_P (operands[1]) = 1;
+})
+
 (define_expand "memory_barrier"
   [(set (match_dup 0)
-	(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))]
+	(unspec:BLK [(match_dup 0) (match_dup 1)]
+		    UNSPEC_MEMORY_BARRIER))]
   "TARGET_HAVE_MEMORY_BARRIER"
 {
   operands[0] = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
   MEM_VOLATILE_P (operands[0]) = 1;
+  operands[1] = GEN_INT((int) MEMMODEL_SEQ_CST);
 })
 
 (define_insn "*memory_barrier"
   [(set (match_operand:BLK 0 "" "")
-	(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))]
-  "TARGET_HAVE_MEMORY_BARRIER"
+	(unspec:BLK
+	 [(match_dup 0) (match_operand:SI 1 "const_int_operand")]
+	 UNSPEC_MEMORY_BARRIER))]
+  "TARGET_HAVE_DMB || TARGET_HAVE_MEMORY_BARRIER"
 {
   if (TARGET_HAVE_DMB)
     {
-      /* Note we issue a system level barrier. We should consider issuing
-         a inner shareabilty zone barrier here instead, ie. "DMB ISH".  */
-      /* ??? Differentiate based on SEQ_CST vs less strict?  */
-      return "dmb\tsy";
+      switch (INTVAL(operands[1]))
+	{
+	case MEMMODEL_RELEASE:
+	  return "dmb\tishst";
+	case MEMMODEL_CONSUME:
+	case MEMMODEL_ACQUIRE:
+	case MEMMODEL_ACQ_REL:
+	  return "dmb\tish";
+	case MEMMODEL_SEQ_CST:
+	  return "dmb\tsy";
+	case MEMMODEL_RELAXED:
+	default:
+	  gcc_unreachable ();
+	}
     }
 
   if (TARGET_HAVE_DMB_MCR)
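
For anyone who wants the proposed mapping in one place without reading
the .md pattern, here is a minimal standalone sketch of the selection
the new switch in "*memory_barrier" makes.  The enum sketch_memmodel and
the functions dmb_for_model and check_mapping are hypothetical names
made up for this illustration; they are not GCC identifiers, and the
enum values are not meant to match GCC's enum memmodel.  The table only
restates the patch; whether "dmb ishst" is strong enough for a release
fence is part of what the request for comments above asks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for GCC's enum memmodel; names and values
   are illustrative only.  */
enum sketch_memmodel
{
  SKETCH_RELAXED,
  SKETCH_CONSUME,
  SKETCH_ACQUIRE,
  SKETCH_RELEASE,
  SKETCH_ACQ_REL,
  SKETCH_SEQ_CST
};

/* Return the barrier string the patch would print for MODEL, or NULL
   when no barrier is emitted.  In the patch the relaxed case never
   reaches the insn because the expander returns early; NULL stands
   in for that here.  */
static const char *
dmb_for_model (enum sketch_memmodel model)
{
  switch (model)
    {
    case SKETCH_RELEASE:
      /* Store-only barrier, inner shareable domain.  */
      return "dmb\tishst";
    case SKETCH_CONSUME:
    case SKETCH_ACQUIRE:
    case SKETCH_ACQ_REL:
      /* Full barrier, inner shareable domain.  */
      return "dmb\tish";
    case SKETCH_SEQ_CST:
      /* Full system barrier, as emitted before the patch.  */
      return "dmb\tsy";
    case SKETCH_RELAXED:
    default:
      return NULL;
    }
}

/* Small self-check of the table above.  */
static void
check_mapping (void)
{
  assert (dmb_for_model (SKETCH_RELAXED) == NULL);
  assert (strcmp (dmb_for_model (SKETCH_RELEASE), "dmb\tishst") == 0);
  assert (strcmp (dmb_for_model (SKETCH_CONSUME), "dmb\tish") == 0);
  assert (strcmp (dmb_for_model (SKETCH_ACQUIRE), "dmb\tish") == 0);
  assert (strcmp (dmb_for_model (SKETCH_ACQ_REL), "dmb\tish") == 0);
  assert (strcmp (dmb_for_model (SKETCH_SEQ_CST), "dmb\tsy") == 0);
}
```

Calling check_mapping () from a test driver is enough to exercise every
case.  The old memory_barrier expander still passes MEMMODEL_SEQ_CST, so
callers that do not go through mem_thread_fence keep getting "dmb sy".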