From patchwork Fri Apr 1 14:30:41 2011
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 89284
From: Peter Maydell <peter.maydell@linaro.org>
To: Anthony Liguori, qemu-devel@nongnu.org
Cc: Aurelien Jarno
Date: Fri, 1 Apr 2011 15:30:41 +0100
Message-Id: <1301668243-29886-9-git-send-email-peter.maydell@linaro.org>
In-Reply-To: <1301668243-29886-1-git-send-email-peter.maydell@linaro.org>
References: <1301668243-29886-1-git-send-email-peter.maydell@linaro.org>
Subject: [Qemu-devel] [PATCH 08/10] target-arm: Fix VLD of single element to all lanes

Fix several bugs in VLD of single element to all lanes:

 * The "single element to all lanes" form of VLD1 differs from those for
   VLD2, VLD3 and VLD4 in that bit 5 indicates whether the loaded element
   should be written to one or two Dregs (rather than being a register
   stride). Handle this by special-casing VLD1 rather than trying to have
   one loop which deals with both VLD1 and 2/3/4.
 * Handle VLD4.32 with 16 byte alignment specified, rather than UNDEFfing.
 * UNDEF for the invalid size and alignment combinations.
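As a rough illustration of the bit 5 handling described above (this sketch is
not part of the patch; the struct and helper names are made up, but the field
extraction mirrors the decode in disas_neon_ls_insn() below):

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decode helper for the "load single element to all lanes"
 * form, showing how bit 5 means different things for VLD1 vs VLD2/3/4.
 */
struct all_lanes_decode {
    int nregs;         /* VLDn: ((insn >> 8) & 3) + 1 */
    int size;          /* element size: (insn >> 6) & 3 */
    int dregs_written; /* VLD1 only: one or two D registers */
    int reg_stride;    /* VLD2/3/4 only: stride between D registers */
};

static struct all_lanes_decode decode_all_lanes(uint32_t insn)
{
    struct all_lanes_decode d;
    bool bit5 = (insn >> 5) & 1;

    d.nregs = ((insn >> 8) & 3) + 1;
    d.size = (insn >> 6) & 3;
    if (d.nregs == 1) {
        /* VLD1: bit 5 selects writing one or two D registers */
        d.dregs_written = bit5 ? 2 : 1;
        d.reg_stride = 1;
    } else {
        /* VLD2/3/4: bit 5 is the register stride (1 or 2) */
        d.dregs_written = d.nregs;
        d.reg_stride = bit5 ? 2 : 1;
    }
    return d;
}

The old code applied the "stride" interpretation to all four forms, which is
why VLD1 to all lanes with bit 5 set wrote only one D register instead of two.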
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate.c |   84 +++++++++++++++++++++++++++++++++--------------
 1 files changed, 59 insertions(+), 25 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 6ce8b7a..e79ea03 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -2648,6 +2648,28 @@ static void gen_neon_dup_high16(TCGv var)
     tcg_temp_free_i32(tmp);
 }
 
+static TCGv gen_load_and_replicate(DisasContext *s, TCGv addr, int size)
+{
+    /* Load a single Neon element and replicate into a 32 bit TCG reg */
+    TCGv tmp;
+    switch (size) {
+    case 0:
+        tmp = gen_ld8u(addr, IS_USER(s));
+        gen_neon_dup_u8(tmp, 0);
+        break;
+    case 1:
+        tmp = gen_ld16u(addr, IS_USER(s));
+        gen_neon_dup_low16(tmp);
+        break;
+    case 2:
+        tmp = gen_ld32(addr, IS_USER(s));
+        break;
+    default: /* Avoid compiler warnings. */
+        abort();
+    }
+    return tmp;
+}
+
 /* Disassemble a VFP instruction. Returns nonzero if an error occured
    (ie. an undefined instruction). */
 static int disas_vfp_insn(CPUState * env, DisasContext *s, uint32_t insn)
@@ -3890,36 +3912,48 @@ static int disas_neon_ls_insn(CPUState * env, DisasContext *s, uint32_t insn)
         size = (insn >> 10) & 3;
         if (size == 3) {
             /* Load single element to all lanes. */
-            if (!load)
+            int a = (insn >> 4) & 1;
+            if (!load) {
                 return 1;
+            }
             size = (insn >> 6) & 3;
             nregs = ((insn >> 8) & 3) + 1;
-            stride = (insn & (1 << 5)) ? 2 : 1;
-            load_reg_var(s, addr, rn);
-            for (reg = 0; reg < nregs; reg++) {
-                switch (size) {
-                case 0:
-                    tmp = gen_ld8u(addr, IS_USER(s));
-                    gen_neon_dup_u8(tmp, 0);
-                    break;
-                case 1:
-                    tmp = gen_ld16u(addr, IS_USER(s));
-                    gen_neon_dup_low16(tmp);
-                    break;
-                case 2:
-                    tmp = gen_ld32(addr, IS_USER(s));
-                    break;
-                case 3:
+
+            if (size == 3) {
+                if (nregs != 4 || a == 0) {
                     return 1;
-                default: /* Avoid compiler warnings. */
-                    abort();
                 }
-                tcg_gen_addi_i32(addr, addr, 1 << size);
-                tmp2 = tcg_temp_new_i32();
-                tcg_gen_mov_i32(tmp2, tmp);
-                neon_store_reg(rd, 0, tmp2);
-                neon_store_reg(rd, 1, tmp);
-                rd += stride;
+                /* For VLD4 size==3 a == 1 means 32 bits at 16 byte alignment */
+                size = 2;
+            }
+            if (nregs == 1 && a == 1 && size == 0) {
+                return 1;
+            }
+            if (nregs == 3 && a == 1) {
+                return 1;
+            }
+            load_reg_var(s, addr, rn);
+            if (nregs == 1) {
+                /* VLD1 to all lanes: bit 5 indicates how many Dregs to write */
+                tmp = gen_load_and_replicate(s, addr, size);
+                tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 0));
+                tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 1));
+                if (insn & (1 << 5)) {
+                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd + 1, 0));
+                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd + 1, 1));
+                }
+                tcg_temp_free_i32(tmp);
+            } else {
+                /* VLD2/3/4 to all lanes: bit 5 indicates register stride */
+                stride = (insn & (1 << 5)) ? 2 : 1;
+                for (reg = 0; reg < nregs; reg++) {
+                    tmp = gen_load_and_replicate(s, addr, size);
+                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 0));
+                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 1));
+                    tcg_temp_free_i32(tmp);
+                    tcg_gen_addi_i32(addr, addr, 1 << size);
+                    rd += stride;
+                }
             }
             stride = (1 << size) * nregs;
         } else {
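For reference, the combinations the new code rejects as UNDEF can be summed up
in a small standalone check (hypothetical helper, not part of the patch; the
conditions are lifted directly from the hunk above):

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: return true if a "load single element to all lanes"
 * encoding should UNDEF, using the same conditions the patch adds.
 */
static bool all_lanes_undef(uint32_t insn)
{
    int a = (insn >> 4) & 1;          /* alignment specified */
    int size = (insn >> 6) & 3;
    int nregs = ((insn >> 8) & 3) + 1;

    if (size == 3) {
        /* size==3 is only valid as VLD4.32 with 16 byte alignment */
        return nregs != 4 || a == 0;
    }
    if (nregs == 1 && a == 1 && size == 0) {
        return true;  /* VLD1.8 to all lanes cannot specify alignment */
    }
    if (nregs == 3 && a == 1) {
        return true;  /* VLD3 to all lanes cannot specify alignment */
    }
    return false;
}

Everything else falls through to the load path, with size forced to 2 for the
VLD4 size==3 case.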