Message ID | 1507869000-24336-1-git-send-email-wei.guo.simon@gmail.com (mailing list archive) |
---|---|
Headers | From: wei.guo.simon@gmail.com; To: linuxppc-dev@lists.ozlabs.org; Cc: Simon Guo <wei.guo.simon@gmail.com>, David Laight <David.Laight@ACULAB.COM>, "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>, Cyril Bur <cyrilbur@gmail.com>; Subject: [PATCH v3 0/3] powerpc/64: memcmp() optimization; Date: Fri, 13 Oct 2017 12:29:57 +0800; Message-Id: <1507869000-24336-1-git-send-email-wei.guo.simon@gmail.com>; X-Mailer: git-send-email 1.8.3.1 |
Series | powerpc/64: memcmp() optimization |
From: Simon Guo <wei.guo.simon@gmail.com>

There is some room to optimize the 64-bit powerpc memcmp() in the
following 2 cases:
(1) Even if the src/dst addresses are not 8-byte aligned at the
beginning, memcmp() can align them and proceed in .Llong comparison
mode, without falling back to .Lshort comparison mode, which compares
the buffers byte by byte.
(2) VMX instructions can be used to speed up large-size comparisons;
currently the threshold is set to 4K bytes.

glibc commit dec4a7105e (powerpc: Improve memcmp performance for POWER8)
did something similar. Thanks to Cyril Bur for the information.

This patch set also updates the memcmp selftest case so that it compiles
and covers the large-size comparison case.

v2 -> v3:
- add optimization for src/dst with different offsets against the
  8-byte boundary.
- renamed some labels.
- reworked the code per comments from Cyril Bur, such as filling the
  pipeline and using VMX when size == 4K.
- fix an enter/exit_vmx_ops pairing bug, and revise the test case to
  check whether enter/exit_vmx_ops are paired.

v1 -> v2:
- update the comparison method for bytes not aligned to 8 bytes.
- fix a VMX comparison bug.
- enhance the original memcmp() selftest.
- add powerpc/64 to subject/commit message.

Simon Guo (3):
  powerpc/64: Align bytes before fall back to .Lshort in powerpc64 memcmp().
  powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision
  powerpc:selftest update memcmp_64 selftest for VMX implementation

 arch/powerpc/include/asm/asm-prototypes.h          |   4 +-
 arch/powerpc/lib/copypage_power7.S                 |   4 +-
 arch/powerpc/lib/memcmp_64.S                       | 374 ++++++++++++++++++++-
 arch/powerpc/lib/memcpy_power7.S                   |   6 +-
 arch/powerpc/lib/vmx-helper.c                      |   4 +-
 .../selftests/powerpc/copyloops/asm/ppc_asm.h      |   4 +-
 .../selftests/powerpc/stringloops/asm/ppc_asm.h    |  22 ++
 .../testing/selftests/powerpc/stringloops/memcmp.c |  98 ++++--
 8 files changed, 476 insertions(+), 40 deletions(-)
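
For readers who do not follow the assembly, the idea behind case (1) in the cover letter can be sketched in portable C. This is only an illustration under stated assumptions, not the code in memcmp_64.S: the names sketch_memcmp and cmp_bytes are hypothetical, only the first pointer is brought to an 8-byte boundary, and the VMX path from case (2) is not modelled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: byte-by-byte compare, a rough analogue of the
 * short path described in the cover letter. */
static int cmp_bytes(const unsigned char *a, const unsigned char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}

/* Hypothetical name; illustrates the align-then-compare-wide idea only. */
int sketch_memcmp(const void *s1, const void *s2, size_t n)
{
    const unsigned char *a = s1, *b = s2;

    assert(a != NULL && b != NULL);

    /* Step 1: consume leading bytes until `a` sits on an 8-byte boundary. */
    size_t head = (size_t)(-(uintptr_t)a & 7);
    if (head > n)
        head = n;
    int r = cmp_bytes(a, b, head);
    if (r)
        return r;
    a += head;
    b += head;
    n -= head;

    /* Step 2: compare 8 bytes per iteration; memcpy keeps the loads
     * well-defined even when `b` is not aligned. */
    while (n >= 8) {
        uint64_t x, y;
        memcpy(&x, a, 8);
        memcpy(&y, b, 8);
        if (x != y)
            return cmp_bytes(a, b, 8); /* locate the first differing byte */
        a += 8;
        b += 8;
        n -= 8;
    }

    /* Step 3: compare whatever tail is left, byte by byte. */
    return cmp_bytes(a, b, n);
}
```

Resolving a mismatching 8-byte word by rescanning its bytes keeps the sketch independent of endianness; an assembly implementation would normally derive the result from the loaded words directly.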