From patchwork Thu May 24 11:22:27 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 919791 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [203.11.71.2]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40s6Th1CMwz9s1w for ; Thu, 24 May 2018 21:24:08 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=c-s.fr Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 40s6Tg6XCczF1RP for ; Thu, 24 May 2018 21:24:07 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=c-s.fr X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=c-s.fr (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@c-s.fr; receiver=) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=c-s.fr Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 40s6Rr2Z8mzF0pl for ; Thu, 24 May 2018 21:22:32 +1000 (AEST) Received: from localhost (mailhub1-int [192.168.12.234]) by localhost (Postfix) with ESMTP id 40s6Rl74S0z9ttr5; Thu, 24 May 2018 13:22:27 +0200 (CEST) X-Virus-Scanned: Debian amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [192.168.12.234]) (amavisd-new, port 10024) with ESMTP id V2D8z4pqq7q7; Thu, 24 May 2018 13:22:27 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 40s6Rl6ZS6z9ttqg; Thu, 24 May 2018 13:22:27 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 2FE248B8EC; Thu, 24 May 2018 13:22:28 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id Jk0odTdZI7eg; Thu, 24 May 2018 13:22:28 +0200 (CEST) Received: from PO15451.localdomain (po15451.idsi0.si.c-s.fr [172.25.231.2]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 135F58B8B8; Thu, 24 May 2018 13:22:28 +0200 (CEST) Received: by localhost.localdomain (Postfix, from userid 0) id CE2C06C991; Thu, 24 May 2018 11:22:27 +0000 (UTC) Message-Id: <484bcfaccc1ec3d91b74aeaaa26a0ae66fe0955a.1527160868.git.christophe.leroy@c-s.fr> From: Christophe Leroy Subject: [PATCH] powerpc/32: Optimise __csum_partial() To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , segher@kernel.crashing.org Date: Thu, 24 May 2018 11:22:27 +0000 (UTC) X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Improve __csum_partial by interleaving loads and adds. On a 8xx, it brings neither improvement nor degradation. On a 83xx, it brings a 25% improvement. Signed-off-by: Christophe Leroy Reviewed-by: Segher Boessenkool --- arch/powerpc/lib/checksum_32.S | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/lib/checksum_32.S b/arch/powerpc/lib/checksum_32.S index d2238ea82209..aa224069f93a 100644 --- a/arch/powerpc/lib/checksum_32.S +++ b/arch/powerpc/lib/checksum_32.S @@ -47,16 +47,25 @@ _GLOBAL(__csum_partial) bdnz 2b 21: srwi. r6,r4,4 /* # blocks of 4 words to do */ beq 3f + lwz r0,4(r3) mtctr r6 -22: lwz r0,4(r3) lwz r6,8(r3) + adde r5,r5,r0 lwz r7,12(r3) + adde r5,r5,r6 lwzu r8,16(r3) + adde r5,r5,r7 + bdz 23f +22: lwz r0,4(r3) + adde r5,r5,r8 + lwz r6,8(r3) adde r5,r5,r0 + lwz r7,12(r3) adde r5,r5,r6 + lwzu r8,16(r3) adde r5,r5,r7 - adde r5,r5,r8 bdnz 22b +23: adde r5,r5,r8 3: andi. r0,r4,2 beq+ 4f lhz r0,4(r3)