From patchwork Tue Apr 17 07:38:39 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christophe Leroy
X-Patchwork-Id: 899117
Message-Id: <9a28569116a3ddc89f2c14ebed81dcca74dac7bc.1523950415.git.christophe.leroy@c-s.fr>
From: Christophe Leroy
Subject: [PATCH 3/7] powerpc/lib: optimise PPC32 memcmp
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Date: Tue, 17 Apr 2018 09:38:39 +0200 (CEST)
List-Id: Linux on PowerPC Developers Mail List
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org

Currently, memcmp() compares two chunks of memory byte by byte.

This patch optimises the comparison by comparing word by word.

A small benchmark on an 8xx, comparing two chunks of 512 bytes
100000 times, gives:

Before: 5852274 TB ticks
After:  1488638 TB ticks

This is almost 4 times faster (a ratio of about 3.93).

Signed-off-by: Christophe Leroy
---
 arch/powerpc/lib/string_32.S | 42 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 35 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/lib/string_32.S b/arch/powerpc/lib/string_32.S
index 94e9c9bc31c3..5b2a73fb07be 100644
--- a/arch/powerpc/lib/string_32.S
+++ b/arch/powerpc/lib/string_32.S
@@ -19,13 +19,41 @@ _GLOBAL(memcmp)
 	PPC_LCMPI 0,r5,0
 	beq-	2f
 #endif
-	mtctr	r5
-	addi	r6,r3,-1
-	addi	r4,r4,-1
-1:	lbzu	r3,1(r6)
-	lbzu	r0,1(r4)
-	subf.	r3,r0,r3
-	bdnzt	2,1b
+	srawi.	r7, r5, 2		/* Divide len by 4 */
+	mr	r6, r3
+	beq-	3f
+	mtctr	r7
+	li	r7, 0
+1:
+#ifdef __LITTLE_ENDIAN__
+	lwbrx	r3, r6, r7
+	lwbrx	r0, r4, r7
+#else
+	lwzx	r3, r6, r7
+	lwzx	r0, r4, r7
+#endif
+	addi	r7, r7, 4
+	subf.	r3, r0, r3
+	bdnzt	eq, 1b
+	bnelr
+	andi.	r5, r5, 3
+	beqlr
+3:	cmplwi	cr1, r5, 2
+	blt-	cr1, 4f
+#ifdef __LITTLE_ENDIAN__
+	lhbrx	r3, r6, r7
+	lhbrx	r0, r4, r7
+#else
+	lhzx	r3, r6, r7
+	lhzx	r0, r4, r7
+#endif
+	addi	r7, r7, 2
+	subf.	r3, r0, r3
+	beqlr	cr1
+	bnelr
+4:	lbzx	r3, r6, r7
+	lbzx	r0, r4, r7
+	subf.	r3, r0, r3
 	blr
 #ifdef CONFIG_FORTIFY_SOURCE
 2:	li	r3,0
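
For readers less familiar with PowerPC assembly, the word-by-word idea can be sketched in portable C. This is only an illustration under stated assumptions, not the code of the patch: memcmp_wordwise() and load_be32() are hypothetical names, and the tail is handled one byte at a time instead of the halfword-then-byte sequence used in the assembly. Words are read in big-endian order so that their numeric order matches byte order, which is the role played by lwzx on big-endian and lwbrx on little-endian.

```c
#include <stddef.h>
#include <stdint.h>

/* Read 4 bytes as a big-endian 32-bit value, so that numeric order
 * equals byte-by-byte order regardless of host endianness. */
static uint32_t load_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Hypothetical helper for illustration; not a kernel function. */
static int memcmp_wordwise(const void *s1, const void *s2, size_t n)
{
	const unsigned char *a = s1;
	const unsigned char *b = s2;
	size_t i = 0;

	/* Main loop: one 32-bit comparison per 4 bytes. */
	for (; i + 4 <= n; i += 4) {
		uint32_t x = load_be32(a + i);
		uint32_t y = load_be32(b + i);

		if (x != y)
			return x > y ? 1 : -1;
	}

	/* Tail: at most 3 bytes remain (assumption: byte-wise here). */
	for (; i < n; i++) {
		if (a[i] != b[i])
			return a[i] > b[i] ? 1 : -1;
	}
	return 0;
}
```

The sketch returns 1 or -1 from an unsigned comparison rather than the difference of the two words. That is a choice made for the sketch: the difference of two 32-bit unsigned values can wrap around, so its sign does not always reflect their order.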