From patchwork Sun Feb 19 17:33:27 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak <ubizjak@gmail.com>
X-Patchwork-Id: 729578
In-Reply-To: 
 <CAFULd4bkusi=wqarJt0pHhZSLHKqnXKZwR4j_6AsjsxdU3i-vg@mail.gmail.com>
References: 
 <CAFULd4ZB8jehEJZBDmn10HGqQvOho9MJ9wDZVorRmbZMduJxDA@mail.gmail.com>
	<20170217163022.GK1849@tucnak>
	<CAFULd4bkusi=wqarJt0pHhZSLHKqnXKZwR4j_6AsjsxdU3i-vg@mail.gmail.com>
From: Uros Bizjak <ubizjak@gmail.com>
Date: Sun, 19 Feb 2017 18:33:27 +0100
Message-ID: 
 <CAFULd4ac5F_nbtuSi42t7uf6SNCiHyZ0NvHnRx_1y2FV6ZdmmQ@mail.gmail.com>
Subject: Re: [RFC PATCH, i386]: Use "lock orl $0, -4(%esp)" in mfence_nosse
To: Jakub Jelinek <jakub@redhat.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>, peter@cordes.ca

On Fri, Feb 17, 2017 at 5:59 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Fri, Feb 17, 2017 at 5:30 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Sun, May 29, 2016 at 11:10:15PM +0200, Uros Bizjak wrote:
>>> As explained in PR71245, comment #3 [1], it is better to use an offset
>>> of -4 from %esp to implement a non-SSE memory fence instruction:
>>>
>>> -q-
>>>
>>> I guess it costs a code byte for a disp8 in the addressing mode, but
>>> it avoids adding a lot of latency to a critical path involving a
>>> spill/reload to (%esp), in functions where there is something at
>>> (%esp).
>>>
>>> If it's an object larger than 4B, the lock orl could even cause a
>>> store-forwarding stall when the object is reloaded.  (e.g. a double or
>>> a vector).
>>>
>>> Ideally we could do the  lock orl  on some padding between two locals,
>>> or on something in memory that wasn't going to be loaded soon, to
>>> avoid touching more stack memory (which might be in the next page
>>> down).  But we still want to do it on a cache line that's hot, so
>>> going way up above our own stack frame isn't good either.
>>
>> Unfortunately, this makes valgrind unhappy:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1423434
>> I assume it will complain now on anything pre-SSE2 that contains the memory
>> barrier in 32-bit code.
>> Perhaps we should decrement and increment %esp around it or something
>> similar (or push/pop)?  Of course, that would mean we need to take care
>> of async unwind info.
>
> Or we can simply revert the patch? Not that the barrier performance
> of non-SSE 32-bit targets matters...

The attached patch was committed to mainline to revert the 2016-05-30 change.

2017-02-19  Uros Bizjak  <ubizjak@gmail.com>

    Revert:
    2016-05-30  Uros Bizjak  <ubizjak@gmail.com>

    * config/i386/sync.md (mfence_nosse): Use "lock orl $0, -4(%esp)".

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}
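
For readers unfamiliar with the idiom, here is a minimal sketch in C of what
the restored pattern amounts to, written with GNU extended asm. It is only an
illustration, not the sync.md implementation: the names fence_lock_or and
store_fence_load are hypothetical, and the x86_64 and generic branches are
assumptions added so the sketch builds outside 32-bit x86. The locked OR of
zero leaves the stack word unchanged but acts as a full barrier, and since it
touches (%esp) rather than -4(%esp), it stays at the stack pointer rather
than below it.

```c
/* Hypothetical helper, not part of GCC: a full memory barrier built from
   a locked read-modify-write on the word at the stack pointer.  */
static inline void
fence_lock_or (void)
{
#if defined(__i386__)
  /* Same form as the reverted-to pattern: OR zero into (%esp).  */
  __asm__ __volatile__ ("lock orl $0, (%%esp)" : : : "memory", "cc");
#elif defined(__x86_64__)
  /* Assumption for illustration only; 64-bit targets use mfence in GCC.  */
  __asm__ __volatile__ ("lock orl $0, (%%rsp)" : : : "memory", "cc");
#else
  /* Generic fallback so the sketch compiles on other targets.  */
  __sync_synchronize ();
#endif
}

/* Hypothetical usage: store a value, fence, then read it back.  */
static int
store_fence_load (volatile int *slot, int value)
{
  *slot = value;
  fence_lock_or ();
  return *slot;
}
```

The "memory" clobber keeps the compiler from moving loads and stores across
the asm, and "cc" is listed because OR writes the flags.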

Uros.

--cut here--

Index: config/i386/sync.md
===================================================================
--- config/i386/sync.md (revision 245574)
+++ config/i386/sync.md (working copy)
@@ -98,7 +98,7 @@
        (unspec:BLK [(match_dup 0)] UNSPEC_MFENCE))
    (clobber (reg:CC FLAGS_REG))]
   "!(TARGET_64BIT || TARGET_SSE2)"
-  "lock{%;} or{l}\t{$0, -4(%%esp)|DWORD PTR [esp-4], 0}"
+  "lock{%;} or{l}\t{$0, (%%esp)|DWORD PTR [esp], 0}"
   [(set_attr "memory" "unknown")])

--cut here--