Patch Detail
GET: Show a patch.
PATCH: Update a patch.
PUT: Update a patch.
GET /api/1.2/patches/805772/?format=api
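The request above follows the Patchwork API pattern `/api/<version>/patches/<id>/`. A minimal stdlib-only Python sketch of building that URL (the `patch_url` helper name is our own, not part of Patchwork; the actual fetch is shown commented out since it needs network access):

```python
import json
import urllib.request

BASE = "http://patchwork.ozlabs.org/api/1.2"

def patch_url(patch_id: int) -> str:
    """Build the detail URL for a single patch, mirroring the request above."""
    return f"{BASE}/patches/{patch_id}/?format=api"

print(patch_url(805772))

# To actually retrieve and decode the response shown below (network required):
# with urllib.request.urlopen(patch_url(805772)) as resp:
#     patch = json.load(resp)
# print(patch["state"], patch["submitter"]["name"])
```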
{ "id": 805772, "url": "http://patchwork.ozlabs.org/api/1.2/patches/805772/?format=api", "web_url": "http://patchwork.ozlabs.org/project/linuxppc-dev/patch/1503639722-19121-10-git-send-email-paulus@ozlabs.org/", "project": { "id": 2, "url": "http://patchwork.ozlabs.org/api/1.2/projects/2/?format=api", "name": "Linux PPC development", "link_name": "linuxppc-dev", "list_id": "linuxppc-dev.lists.ozlabs.org", "list_email": "linuxppc-dev@lists.ozlabs.org", "web_url": "https://github.com/linuxppc/wiki/wiki", "scm_url": "https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git", "webscm_url": "https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/", "list_archive_url": "https://lore.kernel.org/linuxppc-dev/", "list_archive_url_format": "https://lore.kernel.org/linuxppc-dev/{}/", "commit_url_format": "https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?id={}" }, "msgid": "<1503639722-19121-10-git-send-email-paulus@ozlabs.org>", "list_archive_url": "https://lore.kernel.org/linuxppc-dev/1503639722-19121-10-git-send-email-paulus@ozlabs.org/", "date": "2017-08-25T05:42:01", "name": "[v2,09/10] powerpc: Handle opposite-endian processes in emulation code", "commit_ref": null, "pull_url": null, "state": "superseded", "archived": true, "hash": "40418212167723006be431b7c4bab6ae27e1a75e", "submitter": { "id": 67079, "url": "http://patchwork.ozlabs.org/api/1.2/people/67079/?format=api", "name": "Paul Mackerras", "email": "paulus@ozlabs.org" }, "delegate": null, "mbox": "http://patchwork.ozlabs.org/project/linuxppc-dev/patch/1503639722-19121-10-git-send-email-paulus@ozlabs.org/mbox/", "series": [], "comments": "http://patchwork.ozlabs.org/api/patches/805772/comments/", "check": "pending", "checks": "http://patchwork.ozlabs.org/api/patches/805772/checks/", "tags": {}, "related": [], "headers": { "Return-Path": "<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>", "X-Original-To": [ "patchwork-incoming@ozlabs.org", 
"linuxppc-dev@lists.ozlabs.org" ], "Delivered-To": [ "patchwork-incoming@ozlabs.org", "linuxppc-dev@lists.ozlabs.org", "linuxppc-dev@ozlabs.org" ], "Received": [ "from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xdrhH5d6yz9sxR\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri, 25 Aug 2017 16:23:27 +1000 (AEST)", "from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3xdrhH4HlNzDqY9\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri, 25 Aug 2017 16:23:27 +1000 (AEST)", "from ozlabs.org (bilbo.ozlabs.org [103.22.144.67])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3xdrDW6LtHzDrS2\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tFri, 25 Aug 2017 16:02:51 +1000 (AEST)", "by ozlabs.org (Postfix)\n\tid 3xdrDW5T2nz9sR9; Fri, 25 Aug 2017 16:02:51 +1000 (AEST)", "from authenticated.ozlabs.org (localhost [127.0.0.1])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPSA id 3xdrDW4MKRz9t0M\n\tfor <linuxppc-dev@ozlabs.org>; Fri, 25 Aug 2017 16:02:51 +1000 (AEST)" ], "Authentication-Results": [ "ozlabs.org; dkim=pass (2048-bit key;\n\tsecure) header.d=ozlabs.org header.i=@ozlabs.org header.b=\"f4Zy/v5W\";\n\tdkim-atps=neutral", "lists.ozlabs.org; dkim=pass (2048-bit key;\n\tsecure) header.d=ozlabs.org header.i=@ozlabs.org header.b=\"f4Zy/v5W\";\n\tdkim-atps=neutral", "lists.ozlabs.org; dkim=pass (2048-bit key;\n\tsecure) header.d=ozlabs.org header.i=@ozlabs.org header.b=\"f4Zy/v5W\"; \n\tdkim-atps=neutral" ], "DKIM-Signature": "v=1; a=rsa-sha256; c=relaxed/simple; d=ozlabs.org; s=201707; \n\tt=1503640971; 
bh=PAFmNwi3qkLZ+eJwE2nK31T9+oRh9naBoh8q88Sp4z0=;\n\th=From:To:Subject:Date:In-Reply-To:References:From;\n\tb=f4Zy/v5WBm2QpTIo485v1VayPYfVVgd4NJwKO5v9177ME1UFALuFI3XJ+uJ8p4m8M\n\tFeDcRbP+Bt0gAmELksgzLwtlzlq0eEESKFoibY+SRtRyR6zdLn61L9yf8/Yp3jVwZB\n\tZcqIrbAfufWVwZbcxyloFfAiBjSeFLMsa3eFESNFICrYSqAY/hE4zQ+AACEX/bZsOi\n\tvNKnQugLVxQiey/1hGgWFHUOH8hKYm5Uqhjf7RsmwrMnkZ2ElU3ubp+ahDhAK5wKme\n\ttpp7E02B1MvznYrmNsECofbqkR2z92whHhbVHSuljZZSzMpimVlCctgzwbSC0oR1t9\n\tLI2bu+dJe2JPg==", "From": "Paul Mackerras <paulus@ozlabs.org>", "To": "linuxppc-dev@ozlabs.org", "Subject": "[PATCH v2 09/10] powerpc: Handle opposite-endian processes in\n\temulation code", "Date": "Fri, 25 Aug 2017 15:42:01 +1000", "Message-Id": "<1503639722-19121-10-git-send-email-paulus@ozlabs.org>", "X-Mailer": "git-send-email 2.7.4", "In-Reply-To": "<1503639722-19121-1-git-send-email-paulus@ozlabs.org>", "References": "<1503639722-19121-1-git-send-email-paulus@ozlabs.org>", "X-BeenThere": "linuxppc-dev@lists.ozlabs.org", "X-Mailman-Version": "2.1.23", "Precedence": "list", "List-Id": "Linux on PowerPC Developers Mail List\n\t<linuxppc-dev.lists.ozlabs.org>", "List-Unsubscribe": "<https://lists.ozlabs.org/options/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>", "List-Archive": "<http://lists.ozlabs.org/pipermail/linuxppc-dev/>", "List-Post": "<mailto:linuxppc-dev@lists.ozlabs.org>", "List-Help": "<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>", "List-Subscribe": "<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>", "Errors-To": "linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org", "Sender": "\"Linuxppc-dev\"\n\t<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>" }, "content": "This adds code to the load and store emulation code to byte-swap\nthe data appropriately when the process being emulated is set to\nthe opposite endianness to that of the kernel.\n\nThis 
also enables the emulation for the multiple-register loads\nand stores (lmw, stmw, lswi, stswi, lswx, stswx) to work for\nlittle-endian. In little-endian mode, the partial word at the\nend of a transfer for lsw*/stsw* (when the byte count is not a\nmultiple of 4) is loaded/stored at the least-significant end of\nthe register. Additionally, this fixes a bug in the previous\ncode in that it could call read_mem/write_mem with a byte count\nthat was not 1, 2, 4 or 8.\n\nSigned-off-by: Paul Mackerras <paulus@ozlabs.org>\n---\n arch/powerpc/include/asm/sstep.h | 4 +-\n arch/powerpc/lib/sstep.c | 202 ++++++++++++++++++++++++++-------------\n 2 files changed, 135 insertions(+), 71 deletions(-)", "diff": "diff --git a/arch/powerpc/include/asm/sstep.h b/arch/powerpc/include/asm/sstep.h\nindex 0e5dd23..5a3d3d4 100644\n--- a/arch/powerpc/include/asm/sstep.h\n+++ b/arch/powerpc/include/asm/sstep.h\n@@ -149,6 +149,6 @@ void emulate_update_regs(struct pt_regs *reg, struct instruction_op *op);\n extern int emulate_step(struct pt_regs *regs, unsigned int instr);\n \n extern void emulate_vsx_load(struct instruction_op *op, union vsx_reg *reg,\n-\t\t\t const void *mem);\n+\t\t\t const void *mem, bool cross_endian);\n extern void emulate_vsx_store(struct instruction_op *op, const union vsx_reg *reg,\n-\t\t\t void *mem);\n+\t\t\t void *mem, bool cross_endian);\ndiff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c\nindex 4773055..7afb8ef 100644\n--- a/arch/powerpc/lib/sstep.c\n+++ b/arch/powerpc/lib/sstep.c\n@@ -210,6 +210,33 @@ static nokprobe_inline unsigned long byterev_8(unsigned long x)\n }\n #endif\n \n+static nokprobe_inline void do_byte_reverse(void *ptr, int nb)\n+{\n+\tswitch (nb) {\n+\tcase 2:\n+\t\t*(u16 *)ptr = byterev_2(*(u16 *)ptr);\n+\t\tbreak;\n+\tcase 4:\n+\t\t*(u32 *)ptr = byterev_4(*(u32 *)ptr);\n+\t\tbreak;\n+#ifdef __powerpc64__\n+\tcase 8:\n+\t\t*(unsigned long *)ptr = byterev_8(*(unsigned long *)ptr);\n+\t\tbreak;\n+\tcase 16: {\n+\t\tunsigned long 
*up = (unsigned long *)ptr;\n+\t\tunsigned long tmp;\n+\t\ttmp = byterev_8(up[0]);\n+\t\tup[0] = byterev_8(up[1]);\n+\t\tup[1] = tmp;\n+\t\tbreak;\n+\t}\n+#endif\n+\tdefault:\n+\t\tWARN_ON_ONCE(1);\n+\t}\n+}\n+\n static nokprobe_inline int read_mem_aligned(unsigned long *dest,\n \t\t\t\t\tunsigned long ea, int nb)\n {\n@@ -409,7 +436,8 @@ NOKPROBE_SYMBOL(write_mem);\n * These access either the real FP register or the image in the\n * thread_struct, depending on regs->msr & MSR_FP.\n */\n-static int do_fp_load(int rn, unsigned long ea, int nb, struct pt_regs *regs)\n+static int do_fp_load(int rn, unsigned long ea, int nb, struct pt_regs *regs,\n+\t\t bool cross_endian)\n {\n \tint err;\n \tunion {\n@@ -424,6 +452,11 @@ static int do_fp_load(int rn, unsigned long ea, int nb, struct pt_regs *regs)\n \terr = copy_mem_in(u.b, ea, nb);\n \tif (err)\n \t\treturn err;\n+\tif (unlikely(cross_endian)) {\n+\t\tdo_byte_reverse(u.b, min(nb, 8));\n+\t\tif (nb == 16)\n+\t\t\tdo_byte_reverse(&u.b[8], 8);\n+\t}\n \tpreempt_disable();\n \tif (nb == 4)\n \t\tconv_sp_to_dp(&u.f, &u.d[0]);\n@@ -444,7 +477,8 @@ static int do_fp_load(int rn, unsigned long ea, int nb, struct pt_regs *regs)\n }\n NOKPROBE_SYMBOL(do_fp_load);\n \n-static int do_fp_store(int rn, unsigned long ea, int nb, struct pt_regs *regs)\n+static int do_fp_store(int rn, unsigned long ea, int nb, struct pt_regs *regs,\n+\t\t bool cross_endian)\n {\n \tunion {\n \t\tfloat f;\n@@ -470,6 +504,11 @@ static int do_fp_store(int rn, unsigned long ea, int nb, struct pt_regs *regs)\n \t\t\tu.l[1] = current->thread.TS_FPR(rn);\n \t}\n \tpreempt_enable();\n+\tif (unlikely(cross_endian)) {\n+\t\tdo_byte_reverse(u.b, min(nb, 8));\n+\t\tif (nb == 16)\n+\t\t\tdo_byte_reverse(&u.b[8], 8);\n+\t}\n \treturn copy_mem_out(u.b, ea, nb);\n }\n NOKPROBE_SYMBOL(do_fp_store);\n@@ -478,7 +517,8 @@ NOKPROBE_SYMBOL(do_fp_store);\n #ifdef CONFIG_ALTIVEC\n /* For Altivec/VMX, no need to worry about alignment */\n static nokprobe_inline int 
do_vec_load(int rn, unsigned long ea,\n-\t\t\t\t int size, struct pt_regs *regs)\n+\t\t\t\t int size, struct pt_regs *regs,\n+\t\t\t\t bool cross_endian)\n {\n \tint err;\n \tunion {\n@@ -493,7 +533,8 @@ static nokprobe_inline int do_vec_load(int rn, unsigned long ea,\n \terr = copy_mem_in(&u.b[ea & 0xf], ea, size);\n \tif (err)\n \t\treturn err;\n-\n+\tif (unlikely(cross_endian))\n+\t\tdo_byte_reverse(&u.b[ea & 0xf], size);\n \tpreempt_disable();\n \tif (regs->msr & MSR_VEC)\n \t\tput_vr(rn, &u.v);\n@@ -504,7 +545,8 @@ static nokprobe_inline int do_vec_load(int rn, unsigned long ea,\n }\n \n static nokprobe_inline int do_vec_store(int rn, unsigned long ea,\n-\t\t\t\t\tint size, struct pt_regs *regs)\n+\t\t\t\t\tint size, struct pt_regs *regs,\n+\t\t\t\t\tbool cross_endian)\n {\n \tunion {\n \t\t__vector128 v;\n@@ -522,94 +564,105 @@ static nokprobe_inline int do_vec_store(int rn, unsigned long ea,\n \telse\n \t\tu.v = current->thread.vr_state.vr[rn];\n \tpreempt_enable();\n+\tif (unlikely(cross_endian))\n+\t\tdo_byte_reverse(&u.b[ea & 0xf], size);\n \treturn copy_mem_out(&u.b[ea & 0xf], ea, size);\n }\n #endif /* CONFIG_ALTIVEC */\n \n #ifdef __powerpc64__\n static nokprobe_inline int emulate_lq(struct pt_regs *regs, unsigned long ea,\n-\t\t\t\t int reg)\n+\t\t\t\t int reg, bool cross_endian)\n {\n \tint err;\n \n \tif (!address_ok(regs, ea, 16))\n \t\treturn -EFAULT;\n \t/* if aligned, should be atomic */\n-\tif ((ea & 0xf) == 0)\n-\t\treturn do_lq(ea, ®s->gpr[reg]);\n-\n-\terr = read_mem(®s->gpr[reg + IS_LE], ea, 8, regs);\n-\tif (!err)\n-\t\terr = read_mem(®s->gpr[reg + IS_BE], ea + 8, 8, regs);\n+\tif ((ea & 0xf) == 0) {\n+\t\terr = do_lq(ea, ®s->gpr[reg]);\n+\t} else {\n+\t\terr = read_mem(®s->gpr[reg + IS_LE], ea, 8, regs);\n+\t\tif (!err)\n+\t\t\terr = read_mem(®s->gpr[reg + IS_BE], ea + 8, 8, regs);\n+\t}\n+\tif (!err && unlikely(cross_endian))\n+\t\tdo_byte_reverse(®s->gpr[reg], 16);\n \treturn err;\n }\n \n static nokprobe_inline int emulate_stq(struct 
pt_regs *regs, unsigned long ea,\n-\t\t\t\t int reg)\n+\t\t\t\t int reg, bool cross_endian)\n {\n \tint err;\n+\tunsigned long vals[2];\n \n \tif (!address_ok(regs, ea, 16))\n \t\treturn -EFAULT;\n+\tvals[0] = regs->gpr[reg];\n+\tvals[1] = regs->gpr[reg + 1];\n+\tif (unlikely(cross_endian))\n+\t\tdo_byte_reverse(vals, 16);\n+\n \t/* if aligned, should be atomic */\n \tif ((ea & 0xf) == 0)\n-\t\treturn do_stq(ea, regs->gpr[reg], regs->gpr[reg + 1]);\n+\t\treturn do_stq(ea, vals[0], vals[1]);\n \n-\terr = write_mem(regs->gpr[reg + IS_LE], ea, 8, regs);\n+\terr = write_mem(vals[IS_LE], ea, 8, regs);\n \tif (!err)\n-\t\terr = write_mem(regs->gpr[reg + IS_BE], ea + 8, 8, regs);\n+\t\terr = write_mem(vals[IS_BE], ea + 8, 8, regs);\n \treturn err;\n }\n #endif /* __powerpc64 */\n \n #ifdef CONFIG_VSX\n void emulate_vsx_load(struct instruction_op *op, union vsx_reg *reg,\n-\t\t const void *mem)\n+\t\t const void *mem, bool cross_endian)\n {\n \tint size, read_size;\n \tint i, j;\n-\tunion vsx_reg buf;\n+\tbool rev = cross_endian;\n \tconst unsigned int *wp;\n \tconst unsigned short *hp;\n \tconst unsigned char *bp;\n \n \tsize = GETSIZE(op->type);\n-\tbuf.d[0] = buf.d[1] = 0;\n+\treg->d[0] = reg->d[1] = 0;\n \n \tswitch (op->element_size) {\n \tcase 16:\n \t\t/* whole vector; lxv[x] or lxvl[l] */\n \t\tif (size == 0)\n \t\t\tbreak;\n-\t\tmemcpy(&buf, mem, size);\n-\t\tif (IS_LE && (op->vsx_flags & VSX_LDLEFT)) {\n-\t\t\t/* reverse 16 bytes */\n-\t\t\tunsigned long tmp;\n-\t\t\ttmp = byterev_8(buf.d[0]);\n-\t\t\tbuf.d[0] = byterev_8(buf.d[1]);\n-\t\t\tbuf.d[1] = tmp;\n-\t\t}\n+\t\tmemcpy(reg, mem, size);\n+\t\tif (IS_LE && (op->vsx_flags & VSX_LDLEFT))\n+\t\t\trev = !rev;\n+\t\tif (rev)\n+\t\t\tdo_byte_reverse(reg, 16);\n \t\tbreak;\n \tcase 8:\n \t\t/* scalar loads, lxvd2x, lxvdsx */\n \t\tread_size = (size >= 8) ? 8 : size;\n \t\ti = IS_LE ? 
8 : 8 - read_size;\n-\t\tmemcpy(&buf.b[i], mem, read_size);\n+\t\tmemcpy(®->b[i], mem, read_size);\n+\t\tif (rev)\n+\t\t\tdo_byte_reverse(®->b[i], 8);\n \t\tif (size < 8) {\n \t\t\tif (op->type & SIGNEXT) {\n \t\t\t\t/* size == 4 is the only case here */\n-\t\t\t\tbuf.d[IS_LE] = (signed int) buf.d[IS_LE];\n+\t\t\t\treg->d[IS_LE] = (signed int) reg->d[IS_LE];\n \t\t\t} else if (op->vsx_flags & VSX_FPCONV) {\n \t\t\t\tpreempt_disable();\n-\t\t\t\tconv_sp_to_dp(&buf.fp[1 + IS_LE],\n-\t\t\t\t\t &buf.dp[IS_LE]);\n+\t\t\t\tconv_sp_to_dp(®->fp[1 + IS_LE],\n+\t\t\t\t\t ®->dp[IS_LE]);\n \t\t\t\tpreempt_enable();\n \t\t\t}\n \t\t} else {\n-\t\t\tif (size == 16)\n-\t\t\t\tbuf.d[IS_BE] = *(unsigned long *)(mem + 8);\n-\t\t\telse if (op->vsx_flags & VSX_SPLAT)\n-\t\t\t\tbuf.d[IS_BE] = buf.d[IS_LE];\n+\t\t\tif (size == 16) {\n+\t\t\t\tunsigned long v = *(unsigned long *)(mem + 8);\n+\t\t\t\treg->d[IS_BE] = !rev ? v : byterev_8(v);\n+\t\t\t} else if (op->vsx_flags & VSX_SPLAT)\n+\t\t\t\treg->d[IS_BE] = reg->d[IS_LE];\n \t\t}\n \t\tbreak;\n \tcase 4:\n@@ -617,13 +670,13 @@ void emulate_vsx_load(struct instruction_op *op, union vsx_reg *reg,\n \t\twp = mem;\n \t\tfor (j = 0; j < size / 4; ++j) {\n \t\t\ti = IS_LE ? 3 - j : j;\n-\t\t\tbuf.w[i] = *wp++;\n+\t\t\treg->w[i] = !rev ? *wp++ : byterev_4(*wp++);\n \t\t}\n \t\tif (op->vsx_flags & VSX_SPLAT) {\n-\t\t\tu32 val = buf.w[IS_LE ? 3 : 0];\n+\t\t\tu32 val = reg->w[IS_LE ? 3 : 0];\n \t\t\tfor (; j < 4; ++j) {\n \t\t\t\ti = IS_LE ? 3 - j : j;\n-\t\t\t\tbuf.w[i] = val;\n+\t\t\t\treg->w[i] = val;\n \t\t\t}\n \t\t}\n \t\tbreak;\n@@ -632,7 +685,7 @@ void emulate_vsx_load(struct instruction_op *op, union vsx_reg *reg,\n \t\thp = mem;\n \t\tfor (j = 0; j < size / 2; ++j) {\n \t\t\ti = IS_LE ? 7 - j : j;\n-\t\t\tbuf.h[i] = *hp++;\n+\t\t\treg->h[i] = !rev ? 
*hp++ : byterev_2(*hp++);\n \t\t}\n \t\tbreak;\n \tcase 1:\n@@ -640,20 +693,20 @@ void emulate_vsx_load(struct instruction_op *op, union vsx_reg *reg,\n \t\tbp = mem;\n \t\tfor (j = 0; j < size; ++j) {\n \t\t\ti = IS_LE ? 15 - j : j;\n-\t\t\tbuf.b[i] = *bp++;\n+\t\t\treg->b[i] = *bp++;\n \t\t}\n \t\tbreak;\n \t}\n-\t*reg = buf;\n }\n EXPORT_SYMBOL_GPL(emulate_vsx_load);\n NOKPROBE_SYMBOL(emulate_vsx_load);\n \n void emulate_vsx_store(struct instruction_op *op, const union vsx_reg *reg,\n-\t\t void *mem)\n+\t\t void *mem, bool cross_endian)\n {\n \tint size, write_size;\n \tint i, j;\n+\tbool rev = cross_endian;\n \tunion vsx_reg buf;\n \tunsigned int *wp;\n \tunsigned short *hp;\n@@ -666,7 +719,9 @@ void emulate_vsx_store(struct instruction_op *op, const union vsx_reg *reg,\n \t\t/* stxv, stxvx, stxvl, stxvll */\n \t\tif (size == 0)\n \t\t\tbreak;\n-\t\tif (IS_LE && (op->vsx_flags & VSX_LDLEFT)) {\n+\t\tif (IS_LE && (op->vsx_flags & VSX_LDLEFT))\n+\t\t\trev = !rev;\n+\t\tif (rev) {\n \t\t\t/* reverse 16 bytes */\n \t\t\tbuf.d[0] = byterev_8(reg->d[1]);\n \t\t\tbuf.d[1] = byterev_8(reg->d[0]);\n@@ -688,13 +743,18 @@ void emulate_vsx_store(struct instruction_op *op, const union vsx_reg *reg,\n \t\tmemcpy(mem, ®->b[i], write_size);\n \t\tif (size == 16)\n \t\t\tmemcpy(mem + 8, ®->d[IS_BE], 8);\n+\t\tif (unlikely(rev)) {\n+\t\t\tdo_byte_reverse(mem, write_size);\n+\t\t\tif (size == 16)\n+\t\t\t\tdo_byte_reverse(mem + 8, 8);\n+\t\t}\n \t\tbreak;\n \tcase 4:\n \t\t/* stxvw4x */\n \t\twp = mem;\n \t\tfor (j = 0; j < size / 4; ++j) {\n \t\t\ti = IS_LE ? 3 - j : j;\n-\t\t\t*wp++ = reg->w[i];\n+\t\t\t*wp++ = !rev ? reg->w[i] : byterev_4(reg->w[i]);\n \t\t}\n \t\tbreak;\n \tcase 2:\n@@ -702,7 +762,7 @@ void emulate_vsx_store(struct instruction_op *op, const union vsx_reg *reg,\n \t\thp = mem;\n \t\tfor (j = 0; j < size / 2; ++j) {\n \t\t\ti = IS_LE ? 7 - j : j;\n-\t\t\t*hp++ = reg->h[i];\n+\t\t\t*hp++ = !rev ? 
reg->h[i] : byterev_2(reg->h[i]);\n \t\t}\n \t\tbreak;\n \tcase 1:\n@@ -719,7 +779,7 @@ EXPORT_SYMBOL_GPL(emulate_vsx_store);\n NOKPROBE_SYMBOL(emulate_vsx_store);\n \n static nokprobe_inline int do_vsx_load(struct instruction_op *op,\n-\t\t\t\t struct pt_regs *regs)\n+\t\t\t\t struct pt_regs *regs, bool cross_endian)\n {\n \tint reg = op->reg;\n \tu8 mem[16];\n@@ -729,7 +789,7 @@ static nokprobe_inline int do_vsx_load(struct instruction_op *op,\n \tif (!address_ok(regs, op->ea, size) || copy_mem_in(mem, op->ea, size))\n \t\treturn -EFAULT;\n \n-\temulate_vsx_load(op, &buf, mem);\n+\temulate_vsx_load(op, &buf, mem, cross_endian);\n \tpreempt_disable();\n \tif (reg < 32) {\n \t\t/* FP regs + extensions */\n@@ -750,7 +810,7 @@ static nokprobe_inline int do_vsx_load(struct instruction_op *op,\n }\n \n static nokprobe_inline int do_vsx_store(struct instruction_op *op,\n-\t\t\t\t\tstruct pt_regs *regs)\n+\t\t\t\t\tstruct pt_regs *regs, bool cross_endian)\n {\n \tint reg = op->reg;\n \tu8 mem[16];\n@@ -776,7 +836,7 @@ static nokprobe_inline int do_vsx_store(struct instruction_op *op,\n \t\t\tbuf.v = current->thread.vr_state.vr[reg - 32];\n \t}\n \tpreempt_enable();\n-\temulate_vsx_store(op, &buf, mem);\n+\temulate_vsx_store(op, &buf, mem, cross_endian);\n \treturn copy_mem_out(mem, op->ea, size);\n }\n #endif /* CONFIG_VSX */\n@@ -2731,6 +2791,7 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)\n \tunsigned long val;\n \tunsigned int cr;\n \tint i, rd, nb;\n+\tbool cross_endian;\n \n \tr = analyse_instr(&op, regs, instr);\n \tif (r < 0)\n@@ -2742,6 +2803,7 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)\n \n \terr = 0;\n \tsize = GETSIZE(op.type);\n+\tcross_endian = (regs->msr & MSR_LE) != (MSR_KERNEL & MSR_LE);\n \tswitch (op.type & INSTR_TYPE_MASK) {\n \tcase CACHEOP:\n \t\tif (!address_ok(regs, op.ea, 8))\n@@ -2841,7 +2903,7 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)\n \tcase LOAD:\n #ifdef __powerpc64__\n \t\tif (size == 
16) {\n-\t\t\terr = emulate_lq(regs, op.ea, op.reg);\n+\t\t\terr = emulate_lq(regs, op.ea, op.reg, cross_endian);\n \t\t\tgoto ldst_done;\n \t\t}\n #endif\n@@ -2849,39 +2911,40 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)\n \t\tif (!err) {\n \t\t\tif (op.type & SIGNEXT)\n \t\t\t\tdo_signext(®s->gpr[op.reg], size);\n-\t\t\tif (op.type & BYTEREV)\n+\t\t\tif ((op.type & BYTEREV) == (cross_endian ? 0 : BYTEREV))\n \t\t\t\tdo_byterev(®s->gpr[op.reg], size);\n \t\t}\n \t\tgoto ldst_done;\n \n #ifdef CONFIG_PPC_FPU\n \tcase LOAD_FP:\n-\t\terr = do_fp_load(op.reg, op.ea, size, regs);\n+\t\terr = do_fp_load(op.reg, op.ea, size, regs, cross_endian);\n \t\tgoto ldst_done;\n #endif\n #ifdef CONFIG_ALTIVEC\n \tcase LOAD_VMX:\n-\t\terr = do_vec_load(op.reg, op.ea, size, regs);\n+\t\terr = do_vec_load(op.reg, op.ea, size, regs, cross_endian);\n \t\tgoto ldst_done;\n #endif\n #ifdef CONFIG_VSX\n \tcase LOAD_VSX:\n-\t\terr = do_vsx_load(&op, regs);\n+\t\terr = do_vsx_load(&op, regs, cross_endian);\n \t\tgoto ldst_done;\n #endif\n \tcase LOAD_MULTI:\n-\t\tif (regs->msr & MSR_LE)\n-\t\t\treturn 0;\n \t\trd = op.reg;\n \t\tfor (i = 0; i < size; i += 4) {\n+\t\t\tunsigned int v32 = 0;\n+\n \t\t\tnb = size - i;\n \t\t\tif (nb > 4)\n \t\t\t\tnb = 4;\n-\t\t\terr = read_mem(®s->gpr[rd], op.ea, nb, regs);\n+\t\t\terr = copy_mem_in((u8 *) &v32, op.ea, nb);\n \t\t\tif (err)\n \t\t\t\treturn 0;\n-\t\t\tif (nb < 4)\t/* left-justify last bytes */\n-\t\t\t\tregs->gpr[rd] <<= 32 - 8 * nb;\n+\t\t\tif (unlikely(cross_endian))\n+\t\t\t\tv32 = byterev_4(v32);\n+\t\t\tregs->gpr[rd] = v32;\n \t\t\top.ea += 4;\n \t\t\t++rd;\n \t\t}\n@@ -2890,7 +2953,7 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)\n \tcase STORE:\n #ifdef __powerpc64__\n \t\tif (size == 16) {\n-\t\t\terr = emulate_stq(regs, op.ea, op.reg);\n+\t\t\terr = emulate_stq(regs, op.ea, op.reg, cross_endian);\n \t\t\tgoto ldst_done;\n \t\t}\n #endif\n@@ -2901,36 +2964,37 @@ int emulate_step(struct pt_regs *regs, 
unsigned int instr)\n \t\t\terr = handle_stack_update(op.ea, regs);\n \t\t\tgoto ldst_done;\n \t\t}\n+\t\tif (unlikely(cross_endian))\n+\t\t\tdo_byterev(&op.val, size);\n \t\terr = write_mem(op.val, op.ea, size, regs);\n \t\tgoto ldst_done;\n \n #ifdef CONFIG_PPC_FPU\n \tcase STORE_FP:\n-\t\terr = do_fp_store(op.reg, op.ea, size, regs);\n+\t\terr = do_fp_store(op.reg, op.ea, size, regs, cross_endian);\n \t\tgoto ldst_done;\n #endif\n #ifdef CONFIG_ALTIVEC\n \tcase STORE_VMX:\n-\t\terr = do_vec_store(op.reg, op.ea, size, regs);\n+\t\terr = do_vec_store(op.reg, op.ea, size, regs, cross_endian);\n \t\tgoto ldst_done;\n #endif\n #ifdef CONFIG_VSX\n \tcase STORE_VSX:\n-\t\terr = do_vsx_store(&op, regs);\n+\t\terr = do_vsx_store(&op, regs, cross_endian);\n \t\tgoto ldst_done;\n #endif\n \tcase STORE_MULTI:\n-\t\tif (regs->msr & MSR_LE)\n-\t\t\treturn 0;\n \t\trd = op.reg;\n \t\tfor (i = 0; i < size; i += 4) {\n-\t\t\tval = regs->gpr[rd];\n+\t\t\tunsigned int v32 = regs->gpr[rd];\n+\n \t\t\tnb = size - i;\n \t\t\tif (nb > 4)\n \t\t\t\tnb = 4;\n-\t\t\telse\n-\t\t\t\tval >>= 32 - 8 * nb;\n-\t\t\terr = write_mem(val, op.ea, nb, regs);\n+\t\t\tif (unlikely(cross_endian))\n+\t\t\t\tv32 = byterev_4(v32);\n+\t\t\terr = copy_mem_out((u8 *) &v32, op.ea, nb);\n \t\t\tif (err)\n \t\t\t\treturn 0;\n \t\t\top.ea += 4;\n", "prefixes": [ "v2", "09/10" ] }
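The core helper the patch adds, `do_byte_reverse`, reverses an element of 2, 4, 8 or (on 64-bit) 16 bytes in place; for 16 bytes it byte-reverses each 8-byte half and swaps the halves, which is equivalent to reversing all 16 bytes. A Python sketch of that behaviour, as an illustration only (not the kernel code):

```python
def do_byte_reverse(buf: bytearray, nb: int) -> None:
    """Reverse nb bytes of buf in place, mirroring the patch's do_byte_reverse.

    The kernel helper accepts nb of 2, 4 or 8 (and 16 on __powerpc64__);
    other widths trigger WARN_ON_ONCE, modelled here as an exception.
    """
    if nb not in (2, 4, 8, 16):
        raise ValueError(f"unsupported width: {nb}")
    buf[:nb] = buf[nb - 1::-1]

# Example: a 4-byte big-endian word becomes little-endian.
word = bytearray(b"\x01\x02\x03\x04")
do_byte_reverse(word, 4)
print(word.hex())  # -> 04030201
```

In the patch, `emulate_step()` computes `cross_endian = (regs->msr & MSR_LE) != (MSR_KERNEL & MSR_LE)` and, when it is set, applies this reversal to data after loads and before stores, so emulated memory accesses see the endianness the traced process expects.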