From patchwork Wed Aug 23 14:54:36 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 805043 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xcrKc4Nz5z9s9Y for ; Thu, 24 Aug 2017 01:03:48 +1000 (AEST) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3xcrKc28bnzDqlN for ; Thu, 24 Aug 2017 01:03:48 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3xcr740RMszDqZ1 for ; Thu, 24 Aug 2017 00:54:39 +1000 (AEST) Received: from localhost (mailhub1-int [192.168.12.234]) by localhost (Postfix) with ESMTP id 3xcr6n6dL8z9ttFQ; Wed, 23 Aug 2017 16:54:25 +0200 (CEST) X-Virus-Scanned: Debian amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [192.168.12.234]) (amavisd-new, port 10024) with ESMTP id aF7pWbqCQLSP; Wed, 23 Aug 2017 16:54:25 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 3xcr6n62C9z9ttBy; Wed, 23 Aug 2017 16:54:25 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id CB93C8B80C; Wed, 23 Aug 2017 16:54:36 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id 7XIolaqBH41I; Wed, 23 Aug 2017 16:54:36 +0200 (CEST) Received: from po15668-vm-win7.idsi0.si.c-s.fr (po15451.idsi0.si.c-s.fr [172.25.231.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id A82568B802; Wed, 23 Aug 2017 16:54:36 +0200 (CEST) Received: by po15668-vm-win7.idsi0.si.c-s.fr (Postfix, from userid 0) id 9A01B679CE; Wed, 23 Aug 2017 16:54:36 +0200 (CEST) Message-Id: <7c2934cbc0d787eda6f740327449d7e6e3becd9a.1503277387.git.christophe.leroy@c-s.fr> In-Reply-To: References: From: Christophe Leroy Subject: [PATCH 3/4] powerpc/32: optimise memset() To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Scott Wood Date: Wed, 23 Aug 2017 16:54:36 +0200 (CEST) X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" There is no need to extend the set value to an int when the length is lower than 4 as in that case we only do byte stores. We can therefore immediately branch to the part handling it. By separating it from the normal case, we are able to eliminate a few actions on the destination pointer. Signed-off-by: Christophe Leroy --- arch/powerpc/lib/copy_32.S | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/lib/copy_32.S b/arch/powerpc/lib/copy_32.S index a3ffeac69eca..05aaee20590f 100644 --- a/arch/powerpc/lib/copy_32.S +++ b/arch/powerpc/lib/copy_32.S @@ -91,17 +91,17 @@ EXPORT_SYMBOL(memset16) * replaced by a nop once cache is active. This is done in machine_init() */ _GLOBAL(memset) + cmplwi 0,r5,4 + blt 7f + rlwimi r4,r4,8,16,23 rlwimi r4,r4,16,0,15 - addi r6,r3,-4 - cmplwi 0,r5,4 - blt 7f - stwu r4,4(r6) + stw r4,0(r3) beqlr - andi. r0,r6,3 + andi. r0,r3,3 add r5,r0,r5 - subf r6,r0,r6 + subf r6,r0,r3 cmplwi 0,r4,0 bne 2f /* Use normal procedure if r4 is not zero */ _GLOBAL(memset_nocache_branch) @@ -132,13 +132,20 @@ _GLOBAL(memset_nocache_branch) 1: stwu r4,4(r6) bdnz 1b 6: andi. r5,r5,3 -7: cmpwi 0,r5,0 beqlr mtctr r5 addi r6,r6,3 8: stbu r4,1(r6) bdnz 8b blr + +7: cmpwi 0,r5,0 + beqlr + mtctr r5 + addi r6,r3,-1 +9: stbu r4,1(r6) + bdnz 9b + blr EXPORT_SYMBOL(memset) /*