From patchwork Tue Apr 10 06:34:35 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 896502 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40KyB03pxSz9rxx for ; Tue, 10 Apr 2018 16:36:24 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=c-s.fr Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 40KyB01pFlzF1RV for ; Tue, 10 Apr 2018 16:36:24 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=c-s.fr X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=c-s.fr (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@c-s.fr; receiver=) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=c-s.fr Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 40Ky8015bbzDqRk for ; Tue, 10 Apr 2018 16:34:39 +1000 (AEST) Received: from localhost (mailhub1-int [192.168.12.234]) by localhost (Postfix) with ESMTP id 40Ky7t4Y51z9ttfv; Tue, 10 Apr 2018 08:34:34 +0200 (CEST) X-Virus-Scanned: Debian amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [192.168.12.234]) (amavisd-new, port 10024) with ESMTP id EuhCsruoRGFN; Tue, 10 Apr 2018 08:34:34 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 40Ky7t4345z9ttfs; Tue, 10 Apr 2018 08:34:34 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 987258B791; Tue, 10 Apr 2018 08:34:35 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id j4IE3zHSTqMY; Tue, 10 Apr 2018 08:34:35 +0200 (CEST) Received: from po15720vm.idsi0.si.c-s.fr (unknown [192.168.232.3]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 669E78B750; Tue, 10 Apr 2018 08:34:35 +0200 (CEST) Received: by po15720vm.idsi0.si.c-s.fr (Postfix, from userid 0) id 272F8653BC; Tue, 10 Apr 2018 08:34:35 +0200 (CEST) From: Christophe Leroy Subject: [PATCH] powerpc/64: optimises from64to32() To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Scott Wood Message-Id: <20180410063435.272F8653BC@po15720vm.idsi0.si.c-s.fr> Date: Tue, 10 Apr 2018 08:34:35 +0200 (CEST) X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" The current implementation of from64to32() gives a poor result: 0000000000000270 <.from64to32>: 270: 38 00 ff ff li r0,-1 274: 78 69 00 22 rldicl r9,r3,32,32 278: 78 00 00 20 clrldi r0,r0,32 27c: 7c 60 00 38 and r0,r3,r0 280: 7c 09 02 14 add r0,r9,r0 284: 78 09 00 22 rldicl r9,r0,32,32 288: 7c 00 4a 14 add r0,r0,r9 28c: 78 03 00 20 clrldi r3,r0,32 290: 4e 80 00 20 blr This patch modifies from64to32() to operate in the same spirit as csum_fold() It swaps the two 32-bit halves of sum then it adds it with the unswapped sum. If there is a carry from adding the two 32-bit halves, it will carry from the lower half into the upper half, giving us the correct sum in the upper half. The resulting code is: 0000000000000260 <.from64to32>: 260: 78 60 00 02 rotldi r0,r3,32 264: 7c 60 1a 14 add r3,r0,r3 268: 78 63 00 22 rldicl r3,r3,32,32 26c: 4e 80 00 20 blr Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/checksum.h | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/include/asm/checksum.h b/arch/powerpc/include/asm/checksum.h index 4e63787dc3be..54065caa40b3 100644 --- a/arch/powerpc/include/asm/checksum.h +++ b/arch/powerpc/include/asm/checksum.h @@ -12,6 +12,7 @@ #ifdef CONFIG_GENERIC_CSUM #include #else +#include /* * Computes the checksum of a memory block at src, length len, * and adds in "sum" (32-bit), while copying the block to dst. @@ -55,11 +56,7 @@ static inline __sum16 csum_fold(__wsum sum) static inline u32 from64to32(u64 x) { - /* add up 32-bit and 32-bit for 32+c bit */ - x = (x & 0xffffffff) + (x >> 32); - /* add up carry.. */ - x = (x & 0xffffffff) + (x >> 32); - return (u32)x; + return (x + ror64(x, 32)) >> 32; } static inline __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr, __u32 len,