From patchwork Tue Jun 23 13:34:39 2015
X-Patchwork-Submitter: Torvald Riegel
X-Patchwork-Id: 487629
Message-ID: <1435066479.25759.79.camel@localhost.localdomain>
Subject: [PATCH v2] Fix atomic_full_barrier on x86 and x86_64.
From: Torvald Riegel
To: GLIBC Devel
Cc: "H.J. Lu"
Date: Tue, 23 Jun 2015 15:34:39 +0200
Mime-Version: 1.0

This is a revision of https://sourceware.org/ml/libc-alpha/2014-10/msg00702.

This fixes BZ #17403 by defining atomic_full_barrier, atomic_read_barrier,
and atomic_write_barrier on x86 and x86_64.  A full barrier is implemented
through an atomic, idempotent modification to the stack rather than through
mfence, because the latter can supposedly be somewhat slower due to having
to provide stronger guarantees, for example with respect to self-modifying
code.

Tested on x86_64-linux.  H.J., is this OK for x86 and x32?

2015-06-23  Torvald Riegel

	[BZ #17403]
	* sysdeps/x86_64/bits/atomic.h (atomic_full_barrier,
	atomic_read_barrier, atomic_write_barrier): Define.
	* sysdeps/i386/i486/bits/atomic.h (atomic_full_barrier,
	atomic_read_barrier, atomic_write_barrier): Define.

commit ae5defc04e383b92b7828833783b36d1d49c188c
Author: Torvald Riegel
Date:   Wed Oct 29 10:34:36 2014 +0100

    Fix atomic_full_barrier on x86 and x86_64.

    This fixes BZ #17403 by defining atomic_full_barrier, atomic_read_barrier,
    and atomic_write_barrier on x86 and x86_64.  A full barrier is implemented
    through an atomic, idempotent modification to the stack rather than through
    mfence, because the latter can supposedly be somewhat slower due to having
    to provide stronger guarantees with respect to self-modifying code, for
    example.
diff --git a/NEWS b/NEWS
index 4e21210..323ea52 100644
--- a/NEWS
+++ b/NEWS
@@ -12,18 +12,18 @@ Version 2.22
   438, 4719, 6792, 13028, 13064, 14094, 14841, 14906, 14958, 15319, 15467,
   15790, 15969, 16159, 16339, 16350, 16351, 16352, 16353, 16361, 16512,
   16560, 16704, 16783, 16850, 17053, 17090, 17195, 17269, 17293, 17322,
-  17523, 17542, 17569, 17581, 17588, 17596, 17620, 17621, 17628, 17631,
-  17692, 17711, 17715, 17776, 17779, 17792, 17836, 17912, 17916, 17930,
-  17932, 17944, 17949, 17964, 17965, 17967, 17969, 17978, 17987, 17991,
-  17996, 17998, 17999, 18007, 18019, 18020, 18029, 18030, 18032, 18034,
-  18036, 18038, 18039, 18042, 18043, 18046, 18047, 18049, 18068, 18080,
-  18093, 18100, 18104, 18110, 18111, 18116, 18125, 18128, 18138, 18185,
-  18196, 18197, 18206, 18210, 18211, 18217, 18220, 18221, 18234, 18244,
-  18247, 18287, 18319, 18324, 18333, 18346, 18397, 18409, 18410, 18412,
-  18418, 18422, 18434, 18444, 18468, 18469, 18470, 18479, 18483, 18495,
-  18496, 18497, 18498, 18507, 18512, 18513, 18519, 18520, 18522, 18527,
-  18528, 18529, 18530, 18532, 18533, 18534, 18536, 18539, 18540, 18542,
-  18544, 18545, 18546, 18547, 18553, 18558, 18569.
+  17403, 17523, 17542, 17569, 17581, 17588, 17596, 17620, 17621, 17628,
+  17631, 17692, 17711, 17715, 17776, 17779, 17792, 17836, 17912, 17916,
+  17930, 17932, 17944, 17949, 17964, 17965, 17967, 17969, 17978, 17987,
+  17991, 17996, 17998, 17999, 18007, 18019, 18020, 18029, 18030, 18032,
+  18034, 18036, 18038, 18039, 18042, 18043, 18046, 18047, 18049, 18068,
+  18080, 18093, 18100, 18104, 18110, 18111, 18116, 18125, 18128, 18138,
+  18185, 18196, 18197, 18206, 18210, 18211, 18217, 18220, 18221, 18234,
+  18244, 18247, 18287, 18319, 18324, 18333, 18346, 18397, 18409, 18410,
+  18412, 18418, 18422, 18434, 18444, 18468, 18469, 18470, 18479, 18483,
+  18495, 18496, 18497, 18498, 18507, 18512, 18513, 18519, 18520, 18522,
+  18527, 18528, 18529, 18530, 18532, 18533, 18534, 18536, 18539, 18540,
+  18542, 18544, 18545, 18546, 18547, 18553, 18558, 18569.
 
 * Cache information can be queried via sysconf() function on s390 e.g. with
   _SC_LEVEL1_ICACHE_SIZE as argument.
diff --git a/sysdeps/i386/i486/bits/atomic.h b/sysdeps/i386/i486/bits/atomic.h
index 01c2b94..59165be 100644
--- a/sysdeps/i386/i486/bits/atomic.h
+++ b/sysdeps/i386/i486/bits/atomic.h
@@ -535,3 +535,10 @@ typedef uintmax_t uatomic_max_t;
 #define atomic_or(mem, mask) __arch_or_body (LOCK_PREFIX, mem, mask)
 
 #define catomic_or(mem, mask) __arch_or_body (__arch_cprefix, mem, mask)
+
+/* We don't use mfence because it is supposedly slower due to having to
+   provide stronger guarantees (e.g., regarding self-modifying code).  */
+#define atomic_full_barrier() \
+  __asm __volatile (LOCK_PREFIX "orl $0, (%%esp)" ::: "memory")
+#define atomic_read_barrier() __asm ("" ::: "memory")
+#define atomic_write_barrier() __asm ("" ::: "memory")
diff --git a/sysdeps/x86_64/bits/atomic.h b/sysdeps/x86_64/bits/atomic.h
index 5d08146..203d92c 100644
--- a/sysdeps/x86_64/bits/atomic.h
+++ b/sysdeps/x86_64/bits/atomic.h
@@ -472,3 +472,10 @@ typedef uintmax_t uatomic_max_t;
 #define atomic_or(mem, mask) __arch_or_body (LOCK_PREFIX, mem, mask)
 
 #define catomic_or(mem, mask) __arch_or_body (__arch_cprefix, mem, mask)
+
+/* We don't use mfence because it is supposedly slower due to having to
+   provide stronger guarantees (e.g., regarding self-modifying code).  */
+#define atomic_full_barrier() \
+  __asm __volatile (LOCK_PREFIX "orl $0, (%%rsp)" ::: "memory")
+#define atomic_read_barrier() __asm ("" ::: "memory")
+#define atomic_write_barrier() __asm ("" ::: "memory")