From: Paul Mackerras
To: linuxppc-dev@ozlabs.org
Subject: [PATCH RFC 3/7] powerpc: Make load/store emulation use larger memory accesses
Date: Wed, 23 Aug 2017 09:47:59 +1000
Message-Id: <1503445683-12011-4-git-send-email-paulus@ozlabs.org>
In-Reply-To: <1503445683-12011-1-git-send-email-paulus@ozlabs.org>

At the moment, emulation of loads and stores of up to 8 bytes to
unaligned addresses on a little-endian system uses a sequence of
single-byte loads or stores to memory. This is rather inefficient,
and the code is hard to follow because it has many ifdefs. In
addition, the Power ISA has requirements on how unaligned accesses
are performed, which are not met by doing all accesses as
sequences of single-byte accesses.

Emulation of VSX loads and stores uses __copy_{to,from}_user,
which means the emulation code has no control over the size of
accesses.

To simplify this, we add new copy_mem_in() and copy_mem_out()
functions for accessing memory.
These use a sequence of the largest possible aligned accesses, up to
8 bytes (or 4 on 32-bit systems), to copy memory between a local
buffer and user memory. We then rewrite {read,write}_mem_unaligned
and the VSX load/store emulation using these new functions.

These new functions also simplify the code in do_fp_load() and
do_fp_store() for the unaligned cases.

Signed-off-by: Paul Mackerras
---
 arch/powerpc/lib/sstep.c | 237 +++++++++++++++++++++--------------------------
 1 file changed, 106 insertions(+), 131 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index becb486..e280ed1 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -194,7 +194,6 @@ static nokprobe_inline unsigned long max_align(unsigned long x)
 	return x & -x;	/* isolates rightmost bit */
 }
 
-
 static nokprobe_inline unsigned long byterev_2(unsigned long x)
 {
 	return ((x >> 8) & 0xff) | ((x & 0xff) << 8);
@@ -240,56 +239,68 @@ static nokprobe_inline int read_mem_aligned(unsigned long *dest,
 	return err;
 }
 
-static nokprobe_inline int read_mem_unaligned(unsigned long *dest,
-				unsigned long ea, int nb, struct pt_regs *regs)
+/*
+ * Copy from userspace to a buffer, using the largest possible
+ * aligned accesses, up to sizeof(long).
+ */
+static int nokprobe_inline copy_mem_in(u8 *dest, unsigned long ea, int nb)
 {
-	int err;
-	unsigned long x, b, c;
-#ifdef __LITTLE_ENDIAN__
-	int len = nb; /* save a copy of the length for byte reversal */
-#endif
+	int err = 0;
+	int c;
 
-	/* unaligned, do this in pieces */
-	x = 0;
 	for (; nb > 0; nb -= c) {
-#ifdef __LITTLE_ENDIAN__
-		c = 1;
-#endif
-#ifdef __BIG_ENDIAN__
 		c = max_align(ea);
-#endif
 		if (c > nb)
 			c = max_align(nb);
-		err = read_mem_aligned(&b, ea, c);
+		switch (c) {
+		case 1:
+			err = __get_user(*dest, (unsigned char __user *) ea);
+			break;
+		case 2:
+			err = __get_user(*(u16 *)dest,
+					 (unsigned short __user *) ea);
+			break;
+		case 4:
+			err = __get_user(*(u32 *)dest,
+					 (unsigned int __user *) ea);
+			break;
+#ifdef __powerpc64__
+		case 8:
+			err = __get_user(*(unsigned long *)dest,
+					 (unsigned long __user *) ea);
+			break;
+#endif
+		}
 		if (err)
 			return err;
-		x = (x << (8 * c)) + b;
+		dest += c;
 		ea += c;
 	}
-#ifdef __LITTLE_ENDIAN__
-	switch (len) {
-	case 2:
-		*dest = byterev_2(x);
-		break;
-	case 4:
-		*dest = byterev_4(x);
-		break;
-#ifdef __powerpc64__
-	case 8:
-		*dest = byterev_8(x);
-		break;
-#endif
-	}
-#endif
-#ifdef __BIG_ENDIAN__
-	*dest = x;
-#endif
 	return 0;
 }
 
+static nokprobe_inline int read_mem_unaligned(unsigned long *dest,
+					      unsigned long ea, int nb)
+{
+	union {
+		unsigned long ul;
+		u8 b[sizeof(unsigned long)];
+	} u;
+	int i;
+	int err;
+
+	u.ul = 0;
+	i = IS_BE ? sizeof(unsigned long) - nb : 0;
+	err = copy_mem_in(&u.b[i], ea, nb);
+	if (!err)
+		*dest = u.ul;
+	return err;
+}
+
 /*
  * Read memory at address ea for nb bytes, return 0 for success
- * or -EFAULT if an error occurred.
+ * or -EFAULT if an error occurred.  N.B. nb must be 1, 2, 4 or 8.
+ * If nb < sizeof(long), the result is right-justified on BE systems.
  */
 static int read_mem(unsigned long *dest, unsigned long ea, int nb,
 			struct pt_regs *regs)
@@ -298,7 +309,7 @@ static int read_mem(unsigned long *dest, unsigned long ea, int nb,
 		return -EFAULT;
 	if ((ea & (nb - 1)) == 0)
 		return read_mem_aligned(dest, ea, nb);
-	return read_mem_unaligned(dest, ea, nb, regs);
+	return read_mem_unaligned(dest, ea, nb);
 }
 NOKPROBE_SYMBOL(read_mem);
 
@@ -326,48 +337,63 @@ static nokprobe_inline int write_mem_aligned(unsigned long val,
 	return err;
 }
 
-static nokprobe_inline int write_mem_unaligned(unsigned long val,
-				unsigned long ea, int nb, struct pt_regs *regs)
+/*
+ * Copy from a buffer to userspace, using the largest possible
+ * aligned accesses, up to sizeof(long).
+ */
+static int nokprobe_inline copy_mem_out(u8 *dest, unsigned long ea, int nb)
 {
-	int err;
-	unsigned long c;
+	int err = 0;
+	int c;
 
-#ifdef __LITTLE_ENDIAN__
-	switch (nb) {
-	case 2:
-		val = byterev_2(val);
-		break;
-	case 4:
-		val = byterev_4(val);
-		break;
-#ifdef __powerpc64__
-	case 8:
-		val = byterev_8(val);
-		break;
-#endif
-	}
-#endif
-	/* unaligned or little-endian, do this in pieces */
 	for (; nb > 0; nb -= c) {
-#ifdef __LITTLE_ENDIAN__
-		c = 1;
-#endif
-#ifdef __BIG_ENDIAN__
 		c = max_align(ea);
-#endif
 		if (c > nb)
 			c = max_align(nb);
-		err = write_mem_aligned(val >> (nb - c) * 8, ea, c);
+		switch (c) {
+		case 1:
+			err = __put_user(*dest, (unsigned char __user *) ea);
+			break;
+		case 2:
+			err = __put_user(*(u16 *)dest,
+					 (unsigned short __user *) ea);
+			break;
+		case 4:
+			err = __put_user(*(u32 *)dest,
+					 (unsigned int __user *) ea);
+			break;
+#ifdef __powerpc64__
+		case 8:
+			err = __put_user(*(unsigned long *)dest,
+					 (unsigned long __user *) ea);
+			break;
+#endif
+		}
 		if (err)
 			return err;
+		dest += c;
 		ea += c;
 	}
 	return 0;
 }
 
+static nokprobe_inline int write_mem_unaligned(unsigned long val,
+					       unsigned long ea, int nb)
+{
+	union {
+		unsigned long ul;
+		u8 b[sizeof(unsigned long)];
+	} u;
+	int i;
+
+	u.ul = val;
+	i = IS_BE ? sizeof(unsigned long) - nb : 0;
+	return copy_mem_out(&u.b[i], ea, nb);
+}
+
 /*
  * Write memory at address ea for nb bytes, return 0 for success
- * or -EFAULT if an error occurred.
+ * or -EFAULT if an error occurred.  N.B. nb must be 1, 2, 4 or 8.
  */
 static int write_mem(unsigned long val, unsigned long ea, int nb,
 			struct pt_regs *regs)
@@ -376,7 +402,7 @@ static int write_mem(unsigned long val, unsigned long ea, int nb,
 		return -EFAULT;
 	if ((ea & (nb - 1)) == 0)
 		return write_mem_aligned(val, ea, nb);
-	return write_mem_unaligned(val, ea, nb, regs);
+	return write_mem_unaligned(val, ea, nb);
 }
 NOKPROBE_SYMBOL(write_mem);
 
@@ -390,40 +416,17 @@ static int do_fp_load(int rn, int (*func)(int, unsigned long),
 				struct pt_regs *regs)
 {
 	int err;
-	union {
-		double dbl;
-		unsigned long ul[2];
-		struct {
-#ifdef __BIG_ENDIAN__
-			unsigned _pad_;
-			unsigned word;
-#endif
-#ifdef __LITTLE_ENDIAN__
-			unsigned word;
-			unsigned _pad_;
-#endif
-		} single;
-	} data;
-	unsigned long ptr;
+	u8 buf[sizeof(double)] __attribute__((aligned(sizeof(double))));
 
 	if (!address_ok(regs, ea, nb))
 		return -EFAULT;
-	if ((ea & 3) == 0)
-		return (*func)(rn, ea);
-	ptr = (unsigned long) &data.ul;
-	if (sizeof(unsigned long) == 8 || nb == 4) {
-		err = read_mem_unaligned(&data.ul[0], ea, nb, regs);
-		if (nb == 4)
-			ptr = (unsigned long)&(data.single.word);
-	} else {
-		/* reading a double on 32-bit */
-		err = read_mem_unaligned(&data.ul[0], ea, 4, regs);
-		if (!err)
-			err = read_mem_unaligned(&data.ul[1], ea + 4, 4, regs);
+	if (ea & 3) {
+		err = copy_mem_in(buf, ea, nb);
+		if (err)
+			return err;
+		ea = (unsigned long) buf;
 	}
-	if (err)
-		return err;
-	return (*func)(rn, ptr);
+	return (*func)(rn, ea);
 }
 NOKPROBE_SYMBOL(do_fp_load);
 
@@ -432,43 +435,15 @@ static int do_fp_store(int rn, int (*func)(int, unsigned long),
 				struct pt_regs *regs)
 {
 	int err;
-	union {
-		double dbl;
-		unsigned long ul[2];
-		struct {
-#ifdef __BIG_ENDIAN__
-			unsigned _pad_;
-			unsigned word;
-#endif
-#ifdef __LITTLE_ENDIAN__
-			unsigned word;
-			unsigned _pad_;
-#endif
-		} single;
-	} data;
-	unsigned long ptr;
+	u8 buf[sizeof(double)] __attribute__((aligned(sizeof(double))));
 
 	if (!address_ok(regs, ea, nb))
 		return -EFAULT;
 	if ((ea & 3) == 0)
 		return (*func)(rn, ea);
-	ptr = (unsigned long) &data.ul[0];
-	if (sizeof(unsigned long) == 8 || nb == 4) {
-		if (nb == 4)
-			ptr = (unsigned long)&(data.single.word);
-		err = (*func)(rn, ptr);
-		if (err)
-			return err;
-		err = write_mem_unaligned(data.ul[0], ea, nb, regs);
-	} else {
-		/* writing a double on 32-bit */
-		err = (*func)(rn, ptr);
-		if (err)
-			return err;
-		err = write_mem_unaligned(data.ul[0], ea, 4, regs);
-		if (!err)
-			err = write_mem_unaligned(data.ul[1], ea + 4, 4, regs);
-	}
+	err = (*func)(rn, (unsigned long) buf);
+	if (!err)
+		err = copy_mem_out(buf, ea, nb);
 	return err;
 }
 NOKPROBE_SYMBOL(do_fp_store);
@@ -2570,11 +2545,11 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)
 #endif
 #ifdef CONFIG_VSX
 	case LOAD_VSX: {
-		char mem[16];
+		u8 mem[16];
 		union vsx_reg buf;
 
 		if (!address_ok(regs, op.ea, size) ||
-		    __copy_from_user(mem, (void __user *)op.ea, size))
+		    copy_mem_in(mem, op.ea, size))
 			return 0;
 
 		emulate_vsx_load(&op, &buf, mem);
@@ -2632,7 +2607,7 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)
 #endif
 #ifdef CONFIG_VSX
 	case STORE_VSX: {
-		char mem[16];
+		u8 mem[16];
 		union vsx_reg buf;
 
 		if (!address_ok(regs, op.ea, size))
@@ -2640,7 +2615,7 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)
 
 		store_vsrn(op.reg, &buf);
 		emulate_vsx_store(&op, &buf, mem);
-		if (__copy_to_user((void __user *)op.ea, mem, size))
+		if (copy_mem_out(mem, op.ea, size))
 			return 0;
 		goto ldst_done;
 	}
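
For readers who want to see which access sizes the loop in copy_mem_in() and copy_mem_out() ends up choosing, the following is a minimal userspace sketch, not the kernel code. It models memory as a plain byte array and treats ea as an offset into it. The name copy_in_model() and its sizes/max_chunks parameters are hypothetical and exist only for illustration. The sketch also assumes that max_align() caps its result at sizeof(unsigned long); only the return line of max_align() is visible in the hunk context above, so that cap is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Largest power of two that divides x, capped at sizeof(unsigned long).
 * ASSUMPTION: the cap (the "|=" line) is not visible in the patch
 * context; it is added here so a chunk never exceeds one long.
 */
static unsigned long max_align(unsigned long x)
{
	x |= sizeof(unsigned long);
	return x & -x;		/* isolates rightmost set bit */
}

/*
 * Hypothetical userspace model of the copy_mem_in() loop: "memory" is
 * the byte array mem, and ea is an offset into it, not a user address.
 * The size of each chunk is recorded in sizes[] (at most max_chunks
 * entries). Returns the number of chunks used.
 */
static int copy_in_model(uint8_t *dest, const uint8_t *mem,
			 unsigned long ea, int nb,
			 int *sizes, int max_chunks)
{
	int n = 0;
	int c;

	for (; nb > 0; nb -= c) {
		c = (int)max_align(ea);
		if (c > nb)
			c = (int)max_align((unsigned long)nb);
		assert(n < max_chunks);
		/* memcpy stands in for one aligned __get_user() of c bytes */
		memcpy(dest, mem + ea, (size_t)c);
		sizes[n++] = c;
		dest += c;
		ea += (unsigned long)c;
	}
	return n;
}
```

With ea = 1 and nb = 8, this model splits the copy into chunks of 1, 2, 4 and 1 bytes, compared with eight single-byte accesses in the old little-endian path. With ea = 6 and nb = 4 it uses two 2-byte chunks.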