From patchwork Wed Dec 19 23:45:50 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Miller X-Patchwork-Id: 207570 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id AE4DD2C009B for ; Thu, 20 Dec 2012 10:45:51 +1100 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751489Ab2LSXpv (ORCPT ); Wed, 19 Dec 2012 18:45:51 -0500 Received: from shards.monkeyblade.net ([149.20.54.216]:50865 "EHLO shards.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751384Ab2LSXpu (ORCPT ); Wed, 19 Dec 2012 18:45:50 -0500 Received: from localhost (74-93-104-98-Washington.hfc.comcastbusiness.net [74.93.104.98]) (Authenticated sender: davem-davemloft) by shards.monkeyblade.net (Postfix) with ESMTPSA id D6697584149 for ; Wed, 19 Dec 2012 15:45:53 -0800 (PST) Date: Wed, 19 Dec 2012 15:45:50 -0800 (PST) Message-Id: <20121219.154550.765668728083453355.davem@davemloft.net> To: sparclinux@vger.kernel.org Subject: [PATCH 1/4] sparc64: Fix unrolled AES 256-bit key loops. From: David Miller X-Mailer: Mew version 6.5 on Emacs 24.1 / Mule 6.0 (HANACHIRUSATO) Mime-Version: 1.0 Sender: sparclinux-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: sparclinux@vger.kernel.org The basic scheme of the block mode assembler is that we start by enabling the FPU, loading the key into the floating point registers, then iterate calling the encrypt/decrypt routine for each block. For the 256-bit key cases, we run short on registers in the unrolled loops. So the {ENCRYPT,DECRYPT}_256_2() macros reload the key registers that get clobbered. The unrolled macros, {ENCRYPT,DECRYPT}_256(), are not mindful of this. So if we have a mix of multi-block and single-block calls, the single-block unrolled 256-bit encrypt/decrypt can run with some of the key registers clobbered. Handle this by always explicitly loading those registers before using the non-unrolled 256-bit macro. This was discovered thanks to all of the new test cases added by Jussi Kivilinna. Signed-off-by: David S. Miller --- arch/sparc/crypto/aes_asm.S | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/arch/sparc/crypto/aes_asm.S b/arch/sparc/crypto/aes_asm.S index 23f6cbb..1cda8aa 100644 --- a/arch/sparc/crypto/aes_asm.S +++ b/arch/sparc/crypto/aes_asm.S @@ -1024,7 +1024,11 @@ ENTRY(aes_sparc64_ecb_encrypt_256) add %o2, 0x20, %o2 brlz,pt %o3, 11f nop -10: ldx [%o1 + 0x00], %g3 +10: ldd [%o0 + 0xd0], %f56 + ldd [%o0 + 0xd8], %f58 + ldd [%o0 + 0xe0], %f60 + ldd [%o0 + 0xe8], %f62 + ldx [%o1 + 0x00], %g3 ldx [%o1 + 0x08], %g7 xor %g1, %g3, %g3 xor %g2, %g7, %g7 @@ -1128,9 +1132,9 @@ ENTRY(aes_sparc64_ecb_decrypt_256) /* %o0=&key[key_len], %o1=input, %o2=output, %o3=len */ ldx [%o0 - 0x10], %g1 subcc %o3, 0x10, %o3 + ldx [%o0 - 0x08], %g2 be 10f - ldx [%o0 - 0x08], %g2 - sub %o0, 0xf0, %o0 + sub %o0, 0xf0, %o0 1: ldx [%o1 + 0x00], %g3 ldx [%o1 + 0x08], %g7 ldx [%o1 + 0x10], %o4 @@ -1154,7 +1158,11 @@ ENTRY(aes_sparc64_ecb_decrypt_256) add %o2, 0x20, %o2 brlz,pt %o3, 11f nop -10: ldx [%o1 + 0x00], %g3 +10: ldd [%o0 + 0x18], %f56 + ldd [%o0 + 0x10], %f58 + ldd [%o0 + 0x08], %f60 + ldd [%o0 + 0x00], %f62 + ldx [%o1 + 0x00], %g3 ldx [%o1 + 0x08], %g7 xor %g1, %g3, %g3 xor %g2, %g7, %g7 @@ -1511,11 +1519,11 @@ ENTRY(aes_sparc64_ctr_crypt_256) add %o2, 0x20, %o2 brlz,pt %o3, 11f nop - ldd [%o0 + 0xd0], %f56 +10: ldd [%o0 + 0xd0], %f56 ldd [%o0 + 0xd8], %f58 ldd [%o0 + 0xe0], %f60 ldd [%o0 + 0xe8], %f62 -10: xor %g1, %g3, %o5 + xor %g1, %g3, %o5 MOVXTOD_O5_F0 xor %g2, %g7, %o5 MOVXTOD_O5_F2