From patchwork Tue Apr 5 12:22:58 2016
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 606447
Date: Tue, 5 Apr 2016 05:22:58 -0700
From: "H.J. Lu"
To: GNU C Library
Subject: [committed, PATCH] Force 32-bit displacement in memset-vec-unaligned-erms.S
Message-ID: <20160405122258.GA6778@intel.com>

	* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Force
	32-bit displacement to avoid long nop between instructions.
---
 sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
index 9383517..fe0f745 100644
--- a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
@@ -159,10 +159,23 @@ L(return):
 	.p2align 4
 L(loop_start):
 	leaq	(VEC_SIZE * 4)(%rdi), %rcx
+# if VEC_SIZE == 32 || VEC_SIZE == 64
+	/* Force 32-bit displacement to avoid long nop between
+	   instructions. */
+	VMOVU.d32 %VEC(0), (%rdi)
+# else
 	VMOVU	%VEC(0), (%rdi)
+# endif
 	andq	$-(VEC_SIZE * 4), %rcx
+# if VEC_SIZE == 32
+	/* Force 32-bit displacement to avoid long nop between
+	   instructions. */
+	VMOVU.d32 %VEC(0), -VEC_SIZE(%rdi,%rdx)
+	VMOVU.d32 %VEC(0), VEC_SIZE(%rdi)
+# else
 	VMOVU	%VEC(0), -VEC_SIZE(%rdi,%rdx)
 	VMOVU	%VEC(0), VEC_SIZE(%rdi)
+# endif
 	VMOVU	%VEC(0), -(VEC_SIZE * 2)(%rdi,%rdx)
 	VMOVU	%VEC(0), (VEC_SIZE * 2)(%rdi)
 	VMOVU	%VEC(0), -(VEC_SIZE * 3)(%rdi,%rdx)
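
The reasoning behind "avoid long nop" can be illustrated with a little byte arithmetic. When the assembler reaches an alignment directive such as `.p2align 4`, it fills the gap to the next 16-byte boundary with nops. Forcing a 32-bit displacement makes the preceding stores longer, so the same gap is covered by useful instruction bytes instead. The Python sketch below only models that arithmetic. The instruction lengths are hand-computed for the `VEC_SIZE == 32` case, assuming `VMOVU` expands to `vmovdqu` with a 2-byte VEX prefix, and the starting offset of 7 is purely hypothetical; neither is taken from the patch, and both should be checked against `objdump -d` output.

```python
def padding_for(offset: int, align: int = 16) -> int:
    """Bytes of nop padding needed to advance OFFSET to the next ALIGN boundary."""
    return -offset % align


# Assumed encoded lengths in bytes (hand-computed, not from the patch):
#   vmovdqu %ymm0, (%rdi)           4 bytes, 8 with a forced disp32
#   vmovdqu %ymm0, -32(%rdi,%rdx)   6 bytes, 9 with a forced disp32
#   vmovdqu %ymm0, 32(%rdi)         5 bytes, 8 with a forced disp32
SHORT_FORMS = [4, 6, 5]
D32_FORMS = [8, 9, 8]

# Hypothetical number of bytes already emitted since the last 16-byte boundary.
START_OFFSET = 7


def nop_bytes(start: int, lengths: list[int], align: int = 16) -> int:
    """Padding required after emitting instructions of the given lengths."""
    return padding_for(start + sum(lengths), align)


if __name__ == "__main__":
    print("short forms:", nop_bytes(START_OFFSET, SHORT_FORMS), "nop bytes")
    print("d32 forms:  ", nop_bytes(START_OFFSET, D32_FORMS), "nop bytes")
```

In this made-up layout the short encodings end 10 bytes before the boundary, while the `.d32` encodings end exactly on it, so no padding nop is needed. The real layout depends on every instruction between the two alignment points, which is why the patch applies `.d32` selectively per `VEC_SIZE`.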