From patchwork Tue Jun 8 22:34:33 2010
X-Patchwork-Submitter: Bernd Schmidt
X-Patchwork-Id: 55039
Message-ID: <4C0EC579.5010603@codesourcery.com>
Date: Wed, 09 Jun 2010 00:34:33 +0200
From: Bernd Schmidt
To: GCC Patches
Subject: ARM patch (PR42172): Rework the {sign,zero}-extend patterns

PR42172 is a case where we generate terrible code for bitfield assignments on Thumb1.
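For concreteness, this is the shape of source code involved: a run of narrow bitfield assignments, mirroring the gcc.target/arm/pr42172-1.c test added at the end of this patch (the comments and parameter name are mine):

```c
#include <assert.h>  /* only for sanity checks, not part of the testcase */

/* Consecutive bitfield assignments to the same underlying byte.  Each
   one is a read-modify-write, so the value should stay live in a
   register rather than being re-loaded from memory after every store.  */
struct A {
  unsigned int f1 : 3;
  unsigned int f2 : 3;
  unsigned int f3 : 1;
  unsigned int f4 : 1;
};

void init_A (struct A *a)
{
  a->f1 = 0;
  a->f2 = 1;
  a->f3 = 0;
  a->f4 = 0;
}
```

The test below checks that the generated code performs only a single ldr, which is what the patch achieves for this function.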
Part of the problem is that we fail to optimize

  strb    r3, [r0]
  ldrb    r3, [r0]

(insn 11 (set (mem:QI (reg:SI 133)) (subreg:QI (reg:SI 136) 0)))
(insn 12 (set (reg:SI 140) (zero_extend:SI (mem:QI (reg/v/f:SI 133)))))

cse will of course try to replace the second MEM with the previous value in a register, but the ARM backend (if !arm_arch6) does not accept a zero_extend from a reg, only from a mem. This makes some sense at first glance, since there is no specific instruction for it; the expander instead creates a sequence of shifts when given a REG input. However, to let CSE do its job, I believe it's best to make the ARM extension patterns accept registers as well. We should still avoid creating such patterns where possible, both at expand time (generating the sequence of shifts as before) and in the combiner (by setting appropriate rtx_costs), since the two-shift sequence can often be optimized further.

The following patch does that, and it also fixes a few other problems:

* The expanders are a little ugly:

  [(set (match_dup 2)
	(ashift:SI (match_operand:HI 1 "nonimmediate_operand" "")
		   (const_int 16)))

  This is invalid RTL; it only works because we modify operands[1] in the
  expander code. I've changed the patterns to avoid this (and to make
  them a lot smaller and, IMO, clearer).

* Slightly broken Thumb1 patterns, where the second test can never be
  reached:

  if (which_alternative == 0)
    return \"sxtb\\t%0, %1\";
  [....]
  if (which_alternative == 0)
    return \"ldrsb\\t%0, %1\";

* Removed massive duplication of Thumb1 patterns for v6 and !v6.

* Removed apparently useless code to handle LABEL_REFs in addresses in
  Thumb1 extend patterns.
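For reference, a host-side sketch (not GCC source) of the two-shift fallback sequences the !arm_arch6 expanders emit for register inputs, checking that they really are equivalent to the plain extension operations:

```c
#include <assert.h>
#include <stdint.h>

/* zero_extendqisi2 via lsl #24 / lsr #24 -- the same result as
   "and rX, rX, #255" on the low byte.  */
static uint32_t zext_qi_shifts (uint32_t r)
{
  return (r << 24) >> 24;
}

/* extendhisi2 via lsl #16 / asr #16 on the low halfword.  */
static int32_t sext_hi_shifts (uint32_t r)
{
  return (int32_t) (r << 16) >> 16;
}
```

Because each half of the sequence is an ordinary shift, later passes can often fold one or both shifts into neighbouring operations, which is why the patch biases rtx_costs slightly toward the shift form.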
Here are some examples of the code generation changes:

   strb r3, [r4, #4]
   ldrb r2, [r8, #4]	@ zero_extendqisi2
 - ldrb r3, [r4, #4]	@ zero_extendqisi2
 + and r3, r3, #255
 ====
 - ldrb r3, [sp, #36]	@ zero_extendqisi2
 - bic r3, r3, #176
 - orr r3, r3, #64
 - strb r3, [sp, #36]
 - ldrb r3, [sp, #36]	@ zero_extendqisi2
 - bic r3, r3, #10
 - orr r3, r3, #5
 + mov r3, #69
 ====
 - strb r2, [ip, #4]
 - ldrb r2, [ip, #4]	@ zero_extendqisi2
   bic r2, r2, #2
   strb r2, [ip, #4]
 ====
 - ldrh r3, [r5, #2]
 - lsl r3, r3, #16
 - asr r3, r3, #16
 - add r2, r3, #1
 + mov r0, #2
 + ldrsh r3, [r5, r0]
 + add r1, r3, #1

On the whole, things looked very good in the testcases I looked at. Unfortunately PR42172 isn't completely solved by this, but it gets a lot better.

Regression tested on:
Target is arm-none-linux-gnueabi
Host is i686-pc-linux-gnu
Schedule of variations:
    qemu-system-armv7/arch=armv7-a/thumb
    qemu-system-armv7/thumb
    qemu-system-armv7

Ok?

Bernd

	* config/arm/arm.c (thumb1_rtx_costs): Improve support for
	SIGN_EXTEND and ZERO_EXTEND.
	(arm_rtx_costs_1): Likewise.
	(arm_size_rtx_costs): Use arm_rtx_costs_1 for these codes.
	* config/arm/arm.md (is_arch6): New attribute.
	(zero_extendhisi2, zero_extendqisi2, extendhisi2, extendqisi2):
	Tighten the code somewhat, avoiding invalid RTL in the expander
	patterns.
	(thumb1_zero_extendhisi2): Merge with thumb1_zero_extendhisi2_v6.
	(thumb1_zero_extendhisi2_v6): Delete.
	(thumb1_extendhisi2): Merge with thumb1_extendhisi2_v6.
	(thumb1_extendhisi2_v6): Delete.
	(thumb1_extendqisi2): Merge with thumb1_extendqisi2_v6.
	(thumb1_extendqisi2_v6): Delete.
	(zero_extendhisi2 for register input splitter): New.
	(zero_extendqisi2 for register input splitter): New.
	(thumb1_extendhisi2 for register input splitter): New.
	(extendhisi2 for register input splitter): New.
	(extendqisi2 for register input splitter): New.
	(TARGET_THUMB1 extendqisi2 for memory input splitter): New.
(arm_zero_extendhisi2): Allow nonimmediate_operand for operand 1, and add support for a register alternative requiring a split. (thumb1_zero_extendqisi2): Likewise. (arm_zero_extendqisi2): Likewise. (arm_extendhisi2): Likewise. (arm_extendqisi2): Likewise. * gcc.target/arm/pr42172-1.c: New test. Index: config/arm/arm.c =================================================================== --- config/arm/arm.c (revision 160260) +++ config/arm/arm.c (working copy) @@ -6193,6 +6193,7 @@ static inline int thumb1_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer) { enum machine_mode mode = GET_MODE (x); + int total; switch (code) { @@ -6291,24 +6292,20 @@ thumb1_rtx_costs (rtx x, enum rtx_code c return 14; return 2; + case SIGN_EXTEND: case ZERO_EXTEND: - /* XXX still guessing. */ - switch (GET_MODE (XEXP (x, 0))) - { - case QImode: - return (1 + (mode == DImode ? 4 : 0) - + (GET_CODE (XEXP (x, 0)) == MEM ? 10 : 0)); + total = mode == DImode ? COSTS_N_INSNS (1) : 0; + total += thumb1_rtx_costs (XEXP (x, 0), GET_CODE (XEXP (x, 0)), code); - case HImode: - return (4 + (mode == DImode ? 4 : 0) - + (GET_CODE (XEXP (x, 0)) == MEM ? 10 : 0)); + if (mode == SImode) + return total; - case SImode: - return (1 + (GET_CODE (XEXP (x, 0)) == MEM ? 10 : 0)); + if (arm_arch6) + return total + COSTS_N_INSNS (1); - default: - return 99; - } + /* Assume a two-shift sequence. Increase the cost slightly so + we prefer actual shifts over an extend operation. 
*/ + return total + 1 + COSTS_N_INSNS (2); default: return 99; @@ -6794,44 +6791,39 @@ arm_rtx_costs_1 (rtx x, enum rtx_code ou return false; case SIGN_EXTEND: - if (GET_MODE_CLASS (mode) == MODE_INT) - { - *total = 0; - if (mode == DImode) - *total += COSTS_N_INSNS (1); - - if (GET_MODE (XEXP (x, 0)) != SImode) - { - if (arm_arch6) - { - if (GET_CODE (XEXP (x, 0)) != MEM) - *total += COSTS_N_INSNS (1); - } - else if (!arm_arch4 || GET_CODE (XEXP (x, 0)) != MEM) - *total += COSTS_N_INSNS (2); - } - - return false; - } - - /* Fall through */ case ZERO_EXTEND: *total = 0; if (GET_MODE_CLASS (mode) == MODE_INT) { + rtx op = XEXP (x, 0); + enum machine_mode opmode = GET_MODE (op); + if (mode == DImode) *total += COSTS_N_INSNS (1); - if (GET_MODE (XEXP (x, 0)) != SImode) + if (opmode != SImode) { - if (arm_arch6) + if (MEM_P (op)) { - if (GET_CODE (XEXP (x, 0)) != MEM) - *total += COSTS_N_INSNS (1); + /* If !arm_arch4, we use one of the extendhisi2_mem + or movhi_bytes patterns for HImode. For a QImode + sign extension, we first zero-extend from memory + and then perform a shift sequence. */ + if (!arm_arch4 && (opmode != QImode || code == SIGN_EXTEND)) + *total += COSTS_N_INSNS (2); } - else if (!arm_arch4 || GET_CODE (XEXP (x, 0)) != MEM) - *total += COSTS_N_INSNS (GET_MODE (XEXP (x, 0)) == QImode ? - 1 : 2); + else if (arm_arch6) + *total += COSTS_N_INSNS (1); + + /* We don't have the necessary insn, so we need to perform some + other operation. */ + else if (TARGET_ARM && code == ZERO_EXTEND && mode == QImode) + /* An and with constant 255. */ + *total += COSTS_N_INSNS (1); + else + /* A shift sequence. Increase costs slightly to avoid + combining two shifts into an extend operation. 
*/ + *total += COSTS_N_INSNS (2) + 1; } return false; @@ -7187,41 +7179,8 @@ arm_size_rtx_costs (rtx x, enum rtx_code return false; case SIGN_EXTEND: - *total = 0; - if (GET_MODE_SIZE (GET_MODE (XEXP (x, 0))) < 4) - { - if (!(arm_arch4 && MEM_P (XEXP (x, 0)))) - *total += COSTS_N_INSNS (arm_arch6 ? 1 : 2); - } - if (mode == DImode) - *total += COSTS_N_INSNS (1); - return false; - case ZERO_EXTEND: - *total = 0; - if (!(arm_arch4 && MEM_P (XEXP (x, 0)))) - { - switch (GET_MODE (XEXP (x, 0))) - { - case QImode: - *total += COSTS_N_INSNS (1); - break; - - case HImode: - *total += COSTS_N_INSNS (arm_arch6 ? 1 : 2); - - case SImode: - break; - - default: - *total += COSTS_N_INSNS (2); - } - } - - if (mode == DImode) - *total += COSTS_N_INSNS (1); - - return false; + return arm_rtx_costs_1 (x, outer_code, total, 0); case CONST_INT: if (const_ok_for_arm (INTVAL (x))) Index: config/arm/ldmstm.md =================================================================== Index: config/arm/arm-ldmstm.ml =================================================================== Index: config/arm/arm.md =================================================================== --- config/arm/arm.md (revision 160260) +++ config/arm/arm.md (working copy) @@ -148,6 +148,9 @@ (define_constants ; patterns that share the same RTL in both ARM and Thumb code. (define_attr "is_thumb" "no,yes" (const (symbol_ref "thumb_code"))) +; IS_ARCH6 is set to 'yes' when we are generating code for ARMv6. +(define_attr "is_arch6" "no,yes" (const (symbol_ref "arm_arch6"))) + ;; Operand number of an input operand that is shifted. Zero if the ;; given instruction does not shift one of its input operands.
(define_attr "shift" "" (const_int 0)) @@ -3958,93 +3961,46 @@ (define_insn "*arm_extendsidi2" ) (define_expand "zero_extendhisi2" - [(set (match_dup 2) - (ashift:SI (match_operand:HI 1 "nonimmediate_operand" "") - (const_int 16))) - (set (match_operand:SI 0 "s_register_operand" "") - (lshiftrt:SI (match_dup 2) (const_int 16)))] + [(set (match_operand:SI 0 "s_register_operand" "") + (zero_extend:SI (match_operand:HI 1 "nonimmediate_operand" "")))] "TARGET_EITHER" - " - { - if ((TARGET_THUMB1 || arm_arch4) && GET_CODE (operands[1]) == MEM) - { - emit_insn (gen_rtx_SET (VOIDmode, operands[0], - gen_rtx_ZERO_EXTEND (SImode, operands[1]))); - DONE; - } - - if (TARGET_ARM && GET_CODE (operands[1]) == MEM) - { - emit_insn (gen_movhi_bytes (operands[0], operands[1])); - DONE; - } - - if (!s_register_operand (operands[1], HImode)) - operands[1] = copy_to_mode_reg (HImode, operands[1]); - - if (arm_arch6) - { - emit_insn (gen_rtx_SET (VOIDmode, operands[0], - gen_rtx_ZERO_EXTEND (SImode, operands[1]))); - DONE; - } - - operands[1] = gen_lowpart (SImode, operands[1]); - operands[2] = gen_reg_rtx (SImode); - }" -) - -(define_insn "*thumb1_zero_extendhisi2" - [(set (match_operand:SI 0 "register_operand" "=l") - (zero_extend:SI (match_operand:HI 1 "memory_operand" "m")))] - "TARGET_THUMB1 && !arm_arch6" - "* - rtx mem = XEXP (operands[1], 0); - - if (GET_CODE (mem) == CONST) - mem = XEXP (mem, 0); - - if (GET_CODE (mem) == LABEL_REF) - return \"ldr\\t%0, %1\"; - - if (GET_CODE (mem) == PLUS) +{ + if (TARGET_ARM && !arm_arch4 && MEM_P (operands[1])) { - rtx a = XEXP (mem, 0); - rtx b = XEXP (mem, 1); - - /* This can happen due to bugs in reload. 
*/ - if (GET_CODE (a) == REG && REGNO (a) == SP_REGNUM) - { - rtx ops[2]; - ops[0] = operands[0]; - ops[1] = a; - - output_asm_insn (\"mov %0, %1\", ops); - - XEXP (mem, 0) = operands[0]; - } - - else if ( GET_CODE (a) == LABEL_REF - && GET_CODE (b) == CONST_INT) - return \"ldr\\t%0, %1\"; + emit_insn (gen_movhi_bytes (operands[0], operands[1])); + DONE; } - - return \"ldrh\\t%0, %1\"; - " - [(set_attr "length" "4") - (set_attr "type" "load_byte") - (set_attr "pool_range" "60")] -) + if (!arm_arch6 && !MEM_P (operands[1])) + { + rtx t = gen_lowpart (SImode, operands[1]); + rtx tmp = gen_reg_rtx (SImode); + emit_insn (gen_ashlsi3 (tmp, t, GEN_INT (16))); + emit_insn (gen_lshrsi3 (operands[0], tmp, GEN_INT (16))); + DONE; + } +}) -(define_insn "*thumb1_zero_extendhisi2_v6" +(define_split + [(set (match_operand:SI 0 "register_operand" "") + (zero_extend:SI (match_operand:HI 1 "register_operand" "l,m")))] + "!TARGET_THUMB2 && !arm_arch6" + [(set (match_dup 0) (ashift:SI (match_dup 2) (const_int 16))) + (set (match_dup 0) (lshiftrt:SI (match_dup 0) (const_int 16)))] +{ + operands[2] = gen_lowpart (SImode, operands[1]); +}) + +(define_insn "*thumb1_zero_extendhisi2" [(set (match_operand:SI 0 "register_operand" "=l,l") (zero_extend:SI (match_operand:HI 1 "nonimmediate_operand" "l,m")))] - "TARGET_THUMB1 && arm_arch6" + "TARGET_THUMB1" "* rtx mem; - if (which_alternative == 0) + if (which_alternative == 0 && arm_arch6) return \"uxth\\t%0, %1\"; + if (which_alternative == 0) + return \"#\"; mem = XEXP (operands[1], 0); @@ -4078,20 +4034,25 @@ (define_insn "*thumb1_zero_extendhisi2_v return \"ldrh\\t%0, %1\"; " - [(set_attr "length" "2,4") + [(set_attr_alternative "length" + [(if_then_else (eq_attr "is_arch6" "yes") + (const_int 2) (const_int 4)) + (const_int 4)]) (set_attr "type" "alu_shift,load_byte") (set_attr "pool_range" "*,60")] ) (define_insn "*arm_zero_extendhisi2" - [(set (match_operand:SI 0 "s_register_operand" "=r") - (zero_extend:SI (match_operand:HI 1 
"memory_operand" "m")))] + [(set (match_operand:SI 0 "s_register_operand" "=r,r") + (zero_extend:SI (match_operand:HI 1 "nonimmediate_operand" "r,m")))] "TARGET_ARM && arm_arch4 && !arm_arch6" - "ldr%(h%)\\t%0, %1" - [(set_attr "type" "load_byte") + "@ + # + ldr%(h%)\\t%0, %1" + [(set_attr "type" "alu_shift,load_byte") (set_attr "predicable" "yes") - (set_attr "pool_range" "256") - (set_attr "neg_pool_range" "244")] + (set_attr "pool_range" "*,256") + (set_attr "neg_pool_range" "*,244")] ) (define_insn "*arm_zero_extendhisi2_v6" @@ -4121,50 +4082,49 @@ (define_expand "zero_extendqisi2" [(set (match_operand:SI 0 "s_register_operand" "") (zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "")))] "TARGET_EITHER" - " - if (!arm_arch6 && GET_CODE (operands[1]) != MEM) +{ + if (TARGET_ARM && !arm_arch6 && GET_CODE (operands[1]) != MEM) { - if (TARGET_ARM) - { - emit_insn (gen_andsi3 (operands[0], - gen_lowpart (SImode, operands[1]), - GEN_INT (255))); - } - else /* TARGET_THUMB */ - { - rtx temp = gen_reg_rtx (SImode); - rtx ops[3]; - - operands[1] = copy_to_mode_reg (QImode, operands[1]); - operands[1] = gen_lowpart (SImode, operands[1]); - - ops[0] = temp; - ops[1] = operands[1]; - ops[2] = GEN_INT (24); - - emit_insn (gen_rtx_SET (VOIDmode, ops[0], - gen_rtx_ASHIFT (SImode, ops[1], ops[2]))); - - ops[0] = operands[0]; - ops[1] = temp; - ops[2] = GEN_INT (24); + emit_insn (gen_andsi3 (operands[0], + gen_lowpart (SImode, operands[1]), + GEN_INT (255))); + DONE; + } + if (!arm_arch6 && !MEM_P (operands[1])) + { + rtx t = gen_lowpart (SImode, operands[1]); + rtx tmp = gen_reg_rtx (SImode); + emit_insn (gen_ashlsi3 (tmp, t, GEN_INT (24))); + emit_insn (gen_lshrsi3 (operands[0], tmp, GEN_INT (24))); + DONE; + } +}) - emit_insn (gen_rtx_SET (VOIDmode, ops[0], - gen_rtx_LSHIFTRT (SImode, ops[1], ops[2]))); - } +(define_split + [(set (match_operand:SI 0 "register_operand" "") + (zero_extend:SI (match_operand:QI 1 "register_operand" "")))] + "!arm_arch6" + [(set 
(match_dup 0) (ashift:SI (match_dup 2) (const_int 24))) + (set (match_dup 0) (lshiftrt:SI (match_dup 0) (const_int 24)))] +{ + operands[2] = simplify_gen_subreg (SImode, operands[1], QImode, 0); + if (TARGET_ARM) + { + emit_insn (gen_andsi3 (operands[0], operands[2], GEN_INT (255))); DONE; } - " -) +}) (define_insn "*thumb1_zero_extendqisi2" - [(set (match_operand:SI 0 "register_operand" "=l") - (zero_extend:SI (match_operand:QI 1 "memory_operand" "m")))] + [(set (match_operand:SI 0 "register_operand" "=l,l") + (zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "l,m")))] "TARGET_THUMB1 && !arm_arch6" - "ldrb\\t%0, %1" - [(set_attr "length" "2") - (set_attr "type" "load_byte") - (set_attr "pool_range" "32")] + "@ + # + ldrb\\t%0, %1" + [(set_attr "length" "4,2") + (set_attr "type" "alu_shift,load_byte") + (set_attr "pool_range" "*,32")] ) (define_insn "*thumb1_zero_extendqisi2_v6" @@ -4180,14 +4140,17 @@ (define_insn "*thumb1_zero_extendqisi2_v ) (define_insn "*arm_zero_extendqisi2" - [(set (match_operand:SI 0 "s_register_operand" "=r") - (zero_extend:SI (match_operand:QI 1 "memory_operand" "m")))] + [(set (match_operand:SI 0 "s_register_operand" "=r,r") + (zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r,m")))] "TARGET_ARM && !arm_arch6" - "ldr%(b%)\\t%0, %1\\t%@ zero_extendqisi2" - [(set_attr "type" "load_byte") + "@ + # + ldr%(b%)\\t%0, %1\\t%@ zero_extendqisi2" + [(set_attr "length" "8,4") + (set_attr "type" "alu_shift,load_byte") (set_attr "predicable" "yes") - (set_attr "pool_range" "4096") - (set_attr "neg_pool_range" "4084")] + (set_attr "pool_range" "*,4096") + (set_attr "neg_pool_range" "*,4084")] ) (define_insn "*arm_zero_extendqisi2_v6" @@ -4266,108 +4229,42 @@ (define_insn "*compareqi_eq0" ) (define_expand "extendhisi2" - [(set (match_dup 2) - (ashift:SI (match_operand:HI 1 "nonimmediate_operand" "") - (const_int 16))) - (set (match_operand:SI 0 "s_register_operand" "") - (ashiftrt:SI (match_dup 2) - (const_int 16)))] + [(set 
(match_operand:SI 0 "s_register_operand" "") + (sign_extend:SI (match_operand:HI 1 "nonimmediate_operand" "")))] "TARGET_EITHER" - " - { - if (GET_CODE (operands[1]) == MEM) - { - if (TARGET_THUMB1) - { - emit_insn (gen_thumb1_extendhisi2 (operands[0], operands[1])); - DONE; - } - else if (arm_arch4) - { - emit_insn (gen_rtx_SET (VOIDmode, operands[0], - gen_rtx_SIGN_EXTEND (SImode, operands[1]))); - DONE; - } - } - - if (TARGET_ARM && GET_CODE (operands[1]) == MEM) - { - emit_insn (gen_extendhisi2_mem (operands[0], operands[1])); - DONE; - } - - if (!s_register_operand (operands[1], HImode)) - operands[1] = copy_to_mode_reg (HImode, operands[1]); - - if (arm_arch6) - { - if (TARGET_THUMB1) - emit_insn (gen_thumb1_extendhisi2 (operands[0], operands[1])); - else - emit_insn (gen_rtx_SET (VOIDmode, operands[0], - gen_rtx_SIGN_EXTEND (SImode, operands[1]))); - - DONE; - } - - operands[1] = gen_lowpart (SImode, operands[1]); - operands[2] = gen_reg_rtx (SImode); - }" -) - -(define_insn "thumb1_extendhisi2" - [(set (match_operand:SI 0 "register_operand" "=l") - (sign_extend:SI (match_operand:HI 1 "memory_operand" "m"))) - (clobber (match_scratch:SI 2 "=&l"))] - "TARGET_THUMB1 && !arm_arch6" - "* - { - rtx ops[4]; - rtx mem = XEXP (operands[1], 0); - - /* This code used to try to use 'V', and fix the address only if it was - offsettable, but this fails for e.g. REG+48 because 48 is outside the - range of QImode offsets, and offsettable_address_p does a QImode - address check. 
*/ - - if (GET_CODE (mem) == CONST) - mem = XEXP (mem, 0); - - if (GET_CODE (mem) == LABEL_REF) - return \"ldr\\t%0, %1\"; - - if (GET_CODE (mem) == PLUS) - { - rtx a = XEXP (mem, 0); - rtx b = XEXP (mem, 1); - - if (GET_CODE (a) == LABEL_REF - && GET_CODE (b) == CONST_INT) - return \"ldr\\t%0, %1\"; - - if (GET_CODE (b) == REG) - return \"ldrsh\\t%0, %1\"; - - ops[1] = a; - ops[2] = b; - } - else - { - ops[1] = mem; - ops[2] = const0_rtx; - } +{ + if (TARGET_THUMB1) + { + emit_insn (gen_thumb1_extendhisi2 (operands[0], operands[1])); + DONE; + } + if (MEM_P (operands[1]) && TARGET_ARM && !arm_arch4) + { + emit_insn (gen_extendhisi2_mem (operands[0], operands[1])); + DONE; + } - gcc_assert (GET_CODE (ops[1]) == REG); + if (!arm_arch6 && !MEM_P (operands[1])) + { + rtx t = gen_lowpart (SImode, operands[1]); + rtx tmp = gen_reg_rtx (SImode); + emit_insn (gen_ashlsi3 (tmp, t, GEN_INT (16))); + emit_insn (gen_ashrsi3 (operands[0], tmp, GEN_INT (16))); + DONE; + } +}) - ops[0] = operands[0]; - ops[3] = operands[2]; - output_asm_insn (\"mov\\t%3, %2\;ldrsh\\t%0, [%1, %3]\", ops); - return \"\"; - }" - [(set_attr "length" "4") - (set_attr "type" "load_byte") - (set_attr "pool_range" "1020")] -) +(define_split + [(parallel + [(set (match_operand:SI 0 "register_operand" "") + (sign_extend:SI (match_operand:HI 1 "register_operand" ""))) + (clobber (match_scratch:SI 2 ""))])] + "!arm_arch6" + [(set (match_dup 0) (ashift:SI (match_dup 2) (const_int 16))) + (set (match_dup 0) (ashiftrt:SI (match_dup 0) (const_int 16)))] +{ + operands[2] = simplify_gen_subreg (SImode, operands[1], HImode, 0); +}) ;; We used to have an early-clobber on the scratch register here. ;; However, there's a bug somewhere in reload which means that this @@ -4376,16 +4273,18 @@ (define_insn "thumb1_extendhisi2" ;; we try to verify the operands. Fortunately, we don't really need ;; the early-clobber: we can always use operand 0 if operand 2 ;; overlaps the address. 
-(define_insn "*thumb1_extendhisi2_insn_v6" +(define_insn "thumb1_extendhisi2" [(set (match_operand:SI 0 "register_operand" "=l,l") (sign_extend:SI (match_operand:HI 1 "nonimmediate_operand" "l,m"))) (clobber (match_scratch:SI 2 "=X,l"))] - "TARGET_THUMB1 && arm_arch6" + "TARGET_THUMB1" "* { rtx ops[4]; rtx mem; + if (which_alternative == 0 && !arm_arch6) + return \"#\"; if (which_alternative == 0) return \"sxth\\t%0, %1\"; @@ -4433,7 +4332,10 @@ (define_insn "*thumb1_extendhisi2_insn_v output_asm_insn (\"mov\\t%3, %2\;ldrsh\\t%0, [%1, %3]\", ops); return \"\"; }" - [(set_attr "length" "2,4") + [(set_attr_alternative "length" + [(if_then_else (eq_attr "is_arch6" "yes") + (const_int 2) (const_int 4)) + (const_int 4)]) (set_attr "type" "alu_shift,load_byte") (set_attr "pool_range" "*,1020")] ) @@ -4474,15 +4376,28 @@ (define_expand "extendhisi2_mem" }" ) +(define_split + [(set (match_operand:SI 0 "register_operand" "") + (sign_extend:SI (match_operand:HI 1 "register_operand" "")))] + "!arm_arch6" + [(set (match_dup 0) (ashift:SI (match_dup 2) (const_int 16))) + (set (match_dup 0) (ashiftrt:SI (match_dup 0) (const_int 16)))] +{ + operands[2] = simplify_gen_subreg (SImode, operands[1], HImode, 0); +}) + (define_insn "*arm_extendhisi2" - [(set (match_operand:SI 0 "s_register_operand" "=r") - (sign_extend:SI (match_operand:HI 1 "memory_operand" "m")))] + [(set (match_operand:SI 0 "s_register_operand" "=r,r") + (sign_extend:SI (match_operand:HI 1 "nonimmediate_operand" "r,m")))] "TARGET_ARM && arm_arch4 && !arm_arch6" - "ldr%(sh%)\\t%0, %1" - [(set_attr "type" "load_byte") + "@ + # + ldr%(sh%)\\t%0, %1" + [(set_attr "length" "8,4") + (set_attr "type" "alu_shift,load_byte") (set_attr "predicable" "yes") - (set_attr "pool_range" "256") - (set_attr "neg_pool_range" "244")] + (set_attr "pool_range" "*,256") + (set_attr "neg_pool_range" "*,244")] ) ;; ??? 
Check Thumb-2 pool range @@ -4544,46 +4459,45 @@ (define_insn "*arm_extendqihi_insn" ) (define_expand "extendqisi2" - [(set (match_dup 2) - (ashift:SI (match_operand:QI 1 "arm_reg_or_extendqisi_mem_op" "") - (const_int 24))) - (set (match_operand:SI 0 "s_register_operand" "") - (ashiftrt:SI (match_dup 2) - (const_int 24)))] + [(set (match_operand:SI 0 "s_register_operand" "") + (sign_extend:SI (match_operand:QI 1 "arm_reg_or_extendqisi_mem_op" "")))] "TARGET_EITHER" - " - { - if ((TARGET_THUMB || arm_arch4) && GET_CODE (operands[1]) == MEM) - { - emit_insn (gen_rtx_SET (VOIDmode, operands[0], - gen_rtx_SIGN_EXTEND (SImode, operands[1]))); - DONE; - } - - if (!s_register_operand (operands[1], QImode)) - operands[1] = copy_to_mode_reg (QImode, operands[1]); +{ + if (!arm_arch4 && MEM_P (operands[1])) + operands[1] = copy_to_mode_reg (QImode, operands[1]); - if (arm_arch6) - { - emit_insn (gen_rtx_SET (VOIDmode, operands[0], - gen_rtx_SIGN_EXTEND (SImode, operands[1]))); - DONE; - } + if (!arm_arch6 && !MEM_P (operands[1])) + { + rtx t = gen_lowpart (SImode, operands[1]); + rtx tmp = gen_reg_rtx (SImode); + emit_insn (gen_ashlsi3 (tmp, t, GEN_INT (24))); + emit_insn (gen_ashrsi3 (operands[0], tmp, GEN_INT (24))); + DONE; + } +}) - operands[1] = gen_lowpart (SImode, operands[1]); - operands[2] = gen_reg_rtx (SImode); - }" -) +(define_split + [(set (match_operand:SI 0 "register_operand" "") + (sign_extend:SI (match_operand:QI 1 "register_operand" "")))] + "!arm_arch6" + [(set (match_dup 0) (ashift:SI (match_dup 2) (const_int 24))) + (set (match_dup 0) (ashiftrt:SI (match_dup 0) (const_int 24)))] +{ + operands[2] = simplify_gen_subreg (SImode, operands[1], QImode, 0); +}) (define_insn "*arm_extendqisi" - [(set (match_operand:SI 0 "s_register_operand" "=r") - (sign_extend:SI (match_operand:QI 1 "arm_extendqisi_mem_op" "Uq")))] + [(set (match_operand:SI 0 "s_register_operand" "=r,r") + (sign_extend:SI (match_operand:QI 1 "arm_reg_or_extendqisi_mem_op" "r,Uq")))] 
"TARGET_ARM && arm_arch4 && !arm_arch6" - "ldr%(sb%)\\t%0, %1" - [(set_attr "type" "load_byte") + "@ + # + ldr%(sb%)\\t%0, %1" + [(set_attr "length" "8,4") + (set_attr "type" "alu_shift,load_byte") (set_attr "predicable" "yes") - (set_attr "pool_range" "256") - (set_attr "neg_pool_range" "244")] + (set_attr "pool_range" "*,256") + (set_attr "neg_pool_range" "*,244")] ) (define_insn "*arm_extendqisi_v6" @@ -4611,162 +4525,78 @@ (define_insn "*arm_extendqisi2addsi" (set_attr "predicable" "yes")] ) -(define_insn "*thumb1_extendqisi2" - [(set (match_operand:SI 0 "register_operand" "=l,l") - (sign_extend:SI (match_operand:QI 1 "memory_operand" "V,m")))] - "TARGET_THUMB1 && !arm_arch6" - "* - { - rtx ops[3]; - rtx mem = XEXP (operands[1], 0); - - if (GET_CODE (mem) == CONST) - mem = XEXP (mem, 0); - - if (GET_CODE (mem) == LABEL_REF) - return \"ldr\\t%0, %1\"; +(define_split + [(set (match_operand:SI 0 "register_operand" "") + (sign_extend:SI (match_operand:QI 1 "memory_operand" "")))] + "TARGET_THUMB1 && reload_completed" + [(set (match_dup 0) (match_dup 2)) + (set (match_dup 0) (sign_extend:SI (match_dup 3)))] +{ + rtx addr = XEXP (operands[1], 0); - if (GET_CODE (mem) == PLUS - && GET_CODE (XEXP (mem, 0)) == LABEL_REF) - return \"ldr\\t%0, %1\"; - - if (which_alternative == 0) - return \"ldrsb\\t%0, %1\"; - - ops[0] = operands[0]; - - if (GET_CODE (mem) == PLUS) - { - rtx a = XEXP (mem, 0); - rtx b = XEXP (mem, 1); - - ops[1] = a; - ops[2] = b; + if (GET_CODE (addr) == CONST) + addr = XEXP (addr, 0); + if (GET_CODE (addr) == PLUS + && REG_P (XEXP (addr, 0)) && REG_P (XEXP (addr, 1))) + /* No split necessary. 
*/ + FAIL; + if (GET_CODE (addr) == PLUS + && !REG_P (XEXP (addr, 0)) && !REG_P (XEXP (addr, 1))) + FAIL; - if (GET_CODE (a) == REG) - { - if (GET_CODE (b) == REG) - output_asm_insn (\"ldrsb\\t%0, [%1, %2]\", ops); - else if (REGNO (a) == REGNO (ops[0])) - { - output_asm_insn (\"ldrb\\t%0, [%1, %2]\", ops); - output_asm_insn (\"lsl\\t%0, %0, #24\", ops); - output_asm_insn (\"asr\\t%0, %0, #24\", ops); - } - else - output_asm_insn (\"mov\\t%0, %2\;ldrsb\\t%0, [%1, %0]\", ops); - } - else - { - gcc_assert (GET_CODE (b) == REG); - if (REGNO (b) == REGNO (ops[0])) - { - output_asm_insn (\"ldrb\\t%0, [%2, %1]\", ops); - output_asm_insn (\"lsl\\t%0, %0, #24\", ops); - output_asm_insn (\"asr\\t%0, %0, #24\", ops); - } - else - output_asm_insn (\"mov\\t%0, %2\;ldrsb\\t%0, [%1, %0]\", ops); - } - } - else if (GET_CODE (mem) == REG && REGNO (ops[0]) == REGNO (mem)) - { - output_asm_insn (\"ldrb\\t%0, [%0, #0]\", ops); - output_asm_insn (\"lsl\\t%0, %0, #24\", ops); - output_asm_insn (\"asr\\t%0, %0, #24\", ops); - } - else - { - ops[1] = mem; - ops[2] = const0_rtx; - - output_asm_insn (\"mov\\t%0, %2\;ldrsb\\t%0, [%1, %0]\", ops); - } - return \"\"; - }" - [(set_attr "length" "2,6") - (set_attr "type" "load_byte,load_byte") - (set_attr "pool_range" "32,32")] -) + if (reg_overlap_mentioned_p (operands[0], addr)) + { + rtx t = gen_lowpart (QImode, operands[0]); + emit_move_insn (t, operands[1]); + emit_insn (gen_thumb1_extendqisi2 (operands[0], t)); + DONE; + } + if (REG_P (addr)) + { + addr = gen_rtx_PLUS (Pmode, addr, operands[0]); + operands[2] = const0_rtx; + } + else if (GET_CODE (addr) != PLUS) + FAIL; + else if (REG_P (XEXP (addr, 0))) + { + operands[2] = XEXP (addr, 1); + addr = gen_rtx_PLUS (Pmode, XEXP (addr, 0), operands[0]); + } + else + { + operands[2] = XEXP (addr, 0); + addr = gen_rtx_PLUS (Pmode, XEXP (addr, 1), operands[0]); + } + operands[3] = change_address (operands[1], QImode, addr); +}) -(define_insn "*thumb1_extendqisi2_v6" +(define_insn 
"thumb1_extendqisi2" [(set (match_operand:SI 0 "register_operand" "=l,l,l") (sign_extend:SI (match_operand:QI 1 "nonimmediate_operand" "l,V,m")))] - "TARGET_THUMB1 && arm_arch6" - "* - { - rtx ops[3]; - rtx mem; - - if (which_alternative == 0) - return \"sxtb\\t%0, %1\"; + "TARGET_THUMB1" +{ + rtx addr; - mem = XEXP (operands[1], 0); - - if (GET_CODE (mem) == CONST) - mem = XEXP (mem, 0); - - if (GET_CODE (mem) == LABEL_REF) - return \"ldr\\t%0, %1\"; + if (which_alternative == 0 && arm_arch6) + return "sxtb\\t%0, %1"; + if (which_alternative == 0) + return "#"; - if (GET_CODE (mem) == PLUS - && GET_CODE (XEXP (mem, 0)) == LABEL_REF) - return \"ldr\\t%0, %1\"; - - if (which_alternative == 0) - return \"ldrsb\\t%0, %1\"; + addr = XEXP (operands[1], 0); + if (GET_CODE (addr) == PLUS + && REG_P (XEXP (addr, 0)) && REG_P (XEXP (addr, 1))) + return "ldrsb\\t%0, %1"; - ops[0] = operands[0]; - - if (GET_CODE (mem) == PLUS) - { - rtx a = XEXP (mem, 0); - rtx b = XEXP (mem, 1); - - ops[1] = a; - ops[2] = b; - - if (GET_CODE (a) == REG) - { - if (GET_CODE (b) == REG) - output_asm_insn (\"ldrsb\\t%0, [%1, %2]\", ops); - else if (REGNO (a) == REGNO (ops[0])) - { - output_asm_insn (\"ldrb\\t%0, [%1, %2]\", ops); - output_asm_insn (\"sxtb\\t%0, %0\", ops); - } - else - output_asm_insn (\"mov\\t%0, %2\;ldrsb\\t%0, [%1, %0]\", ops); - } - else - { - gcc_assert (GET_CODE (b) == REG); - if (REGNO (b) == REGNO (ops[0])) - { - output_asm_insn (\"ldrb\\t%0, [%2, %1]\", ops); - output_asm_insn (\"sxtb\\t%0, %0\", ops); - } - else - output_asm_insn (\"mov\\t%0, %2\;ldrsb\\t%0, [%1, %0]\", ops); - } - } - else if (GET_CODE (mem) == REG && REGNO (ops[0]) == REGNO (mem)) - { - output_asm_insn (\"ldrb\\t%0, [%0, #0]\", ops); - output_asm_insn (\"sxtb\\t%0, %0\", ops); - } - else - { - ops[1] = mem; - ops[2] = const0_rtx; - - output_asm_insn (\"mov\\t%0, %2\;ldrsb\\t%0, [%1, %0]\", ops); - } - return \"\"; - }" - [(set_attr "length" "2,2,4") - (set_attr "type" "alu_shift,load_byte,load_byte") 
- (set_attr "pool_range" "*,32,32")] + return "#"; +} + [(set_attr_alternative "length" + [(if_then_else (eq_attr "is_arch6" "yes") + (const_int 2) (const_int 4)) + (const_int 2) + (if_then_else (eq_attr "is_arch6" "yes") + (const_int 4) (const_int 6))]) + (set_attr "type" "alu_shift,load_byte,load_byte")] ) (define_expand "extendsfdf2" Index: testsuite/gcc.target/arm/pr42172-1.c =================================================================== --- testsuite/gcc.target/arm/pr42172-1.c (revision 0) +++ testsuite/gcc.target/arm/pr42172-1.c (revision 0) @@ -0,0 +1,19 @@ +/* { dg-options "-O2" } */ + +struct A { + unsigned int f1 : 3; + unsigned int f2 : 3; + unsigned int f3 : 1; + unsigned int f4 : 1; + +}; + +void init_A (struct A *this) +{ + this->f1 = 0; + this->f2 = 1; + this->f3 = 0; + this->f4 = 0; +} + +/* { dg-final { scan-assembler-times "ldr" 1 } } */