From patchwork Thu May 24 16:17:17 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christophe Leroy
X-Patchwork-Id: 919972
Message-Id: <54d0574dbb33251ba241620d038b716d90fe0632.1527178313.git.christophe.leroy@c-s.fr>
From: Christophe Leroy
Subject: [PATCH] powerpc/32: implement strlen() in assembly
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
 segher@kernel.crashing.org
Date: Thu, 24 May 2018 16:17:17 +0000 (UTC)
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
List-Id: Linux on PowerPC Developers Mail List

The generic implementation of strlen() reads strings byte by byte.

This patch implements strlen() in assembly for PPC32 by reading
entire words at a time, in the same spirit as what some other arches
and glibc do.

For long strings, the time spent in strlen is reduced by 50-60%.

Signed-off-by: Christophe Leroy
---
Applies after the patch 'powerpc/lib: move PPC32 specific functions out of string.S'

 arch/powerpc/include/asm/string.h |  3 +++
 arch/powerpc/lib/string_32.S      | 40 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)

diff --git a/arch/powerpc/include/asm/string.h b/arch/powerpc/include/asm/string.h
index 0f41686b6243..23ee2a0f2b21 100644
--- a/arch/powerpc/include/asm/string.h
+++ b/arch/powerpc/include/asm/string.h
@@ -15,6 +15,9 @@
 #define __HAVE_ARCH_MEMCHR
 #define __HAVE_ARCH_MEMSET16
 #define __HAVE_ARCH_MEMCPY_FLUSHCACHE
+#ifdef CONFIG_PPC32
+#define __HAVE_ARCH_STRLEN
+#endif
 
 extern char * strcpy(char *,const char *);
 extern __kernel_size_t strlen(const char *);
diff --git a/arch/powerpc/lib/string_32.S b/arch/powerpc/lib/string_32.S
index c4e70123d245..31575a698c97 100644
--- a/arch/powerpc/lib/string_32.S
+++ b/arch/powerpc/lib/string_32.S
@@ -62,6 +62,46 @@ _GLOBAL(memcmp)
 	blr
 EXPORT_SYMBOL(memcmp)
 
+_GLOBAL(strlen)
+	andi.   r9, r3, 3
+	addi	r10, r3, -4
+	beq+	2f
+1:	lbz	r9, 4(r10)
+	addi	r10, r10, 1
+	cmpwi	cr0, r9, 0
+	beq	19f
+	andi.   r9, r10, 3
+	bne	1b
+2:	lis	r6, 0x8080
+	ori	r6, r6, 0x8080
+	rlwinm	r7, r6, 1, 0xffffffff
+3:	lwzu	r9, 4(r10)
+	subf	r8, r7, r9
+	andc	r11, r6, r9
+	and.	r8, r8, r11
+	beq+	3b
+	rlwinm.	r8, r9, 0, 0xff000000
+	beq	20f
+	rlwinm.	r8, r9, 0, 0x00ff0000
+	beq	21f
+	rlwinm.	r8, r9, 0, 0x0000ff00
+	beq	22f
+	rlwinm.	r8, r9, 0, 0x000000ff
+	bne	3b
+23:	subf	r3, r3, r10
+	addi	r3, r3, 3
+	blr
+22:	subf	r3, r3, r10
+	addi	r3, r3, 2
+	blr
+21:	subf	r3, r3, r10
+	addi	r3, r3, 1
+	blr
+19:	addi	r10, r10, 3
+20:	subf	r3, r3, r10
+	blr
+EXPORT_SYMBOL(strlen)
+
 CACHELINE_BYTES = L1_CACHE_BYTES
 LG_CACHELINE_BYTES = L1_CACHE_SHIFT
 CACHELINE_MASK = (L1_CACHE_BYTES-1)
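
For readers less familiar with the word-at-a-time trick, the scan done
by the assembly can be sketched in C roughly as below. This is only an
illustration, not code from the patch: the names word_strlen() and
has_zero_byte() are made up for the example. Like the assembly, it
assumes that reading a whole aligned 32-bit word which contains the
terminating NUL is harmless, even though up to 3 bytes after the NUL
are read as well (an aligned word never crosses a page boundary).
Strict ISO C does not guarantee that, so treat it as a sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  0x01010101u	/* same constant the assembly keeps in r7 */
#define HIGHS 0x80808080u	/* same constant the assembly keeps in r6 */

/* Returns nonzero if and only if at least one byte of w is zero. */
int has_zero_byte(uint32_t w)
{
	return ((w - ONES) & ~w & HIGHS) != 0;
}

/* Hypothetical C counterpart of the assembly strlen() above. */
size_t word_strlen(const char *s)
{
	const char *p = s;
	size_t i;

	/* Head: go byte by byte until p is 4-byte aligned. */
	while (((uintptr_t)p & 3) != 0) {
		if (*p == '\0')
			return (size_t)(p - s);
		p++;
	}

	/* Body: test one aligned word per iteration. */
	for (;;) {
		uint32_t w;

		/* ASSUMPTION: may read up to 3 bytes past the NUL. */
		memcpy(&w, p, sizeof(w));

		if (has_zero_byte(w)) {
			/* Find the first zero byte in memory order. */
			for (i = 0; i < 4; i++)
				if (p[i] == '\0')
					return (size_t)(p - s) + i;
		}
		p += 4;
	}
}
```

The sketch locates the zero byte by looking at the bytes in memory
order, so it does not depend on endianness. The assembly instead tests
the four byte lanes of r9 from the most significant one down, which is
memory order on big-endian PPC32.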