From patchwork Fri May 6 09:07:06 2011
X-Patchwork-Submitter: Carrot Wei
X-Patchwork-Id: 94347
In-Reply-To: <1304588525.19537.67.camel@e102346-lin.cambridge.arm.com>
References: <20110505065105.BBA6820A24@guozhiwei.sha.corp.google.com>
 <1304588525.19537.67.camel@e102346-lin.cambridge.arm.com>
Date: Fri, 6 May 2011 17:07:06 +0800
Subject: Re: [PATCH, ARM] PR47855 Compute attr length for thumb2 insns, 3/3 (issue4475042)
From: Carrot Wei
To: Richard Earnshaw
Cc: reply@codereview.appspotmail.com, gcc-patches@gcc.gnu.org

On Thu, May 5, 2011 at 5:42 PM, Richard Earnshaw wrote:
>
> On Thu, 2011-05-05 at 14:51 +0800, Guozhi Wei wrote:
> > Hi
> >
> > This is the third part of the fix for
> >
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855
> >
> > This patch contains the length computation/refinement for the insn patterns
> > "*thumb2_movsi_insn", "*thumb2_cbz" and "*thumb2_cbnz".
> >
> > At the same time this patch revealed two bugs.  The first is the maximum
> > offset of cbz/cbnz: it should be 126, but it is 128 in patterns
> > "*thumb2_cbz" and "*thumb2_cbnz".  The second is that only the 2-register
> > form of the shift instructions can be 16-bit, but the 3-register form is
> > allowed in "*thumb2_shiftsi3_short" and the related peephole2.  The fixes
> > are also contained in this patch.
> >
> > The patch has been tested on arm qemu.
> >
> > thanks
> > Carrot
> >
> > 2011-05-05  Guozhi Wei
> >
> >       PR target/47855
> >       * config/arm/thumb2.md (thumb2_movsi_insn): Add length attribute.
> >       (thumb2_shiftsi3_short and peephole2): Remove 3-register case.
> >       (thumb2_cbz): Refine length computation.
> >       (thumb2_cbnz): Likewise.
>
> Hmm, although these changes are all related to length calculations, they
> are really three patches that are unrelated to each other.  It would be
> easier to review this if they were kept separate.
>
> 1) thumb2_shiftsi3_short
> This appears to be a straight bug.  We are putting out a 32-bit
> instruction when we are claiming it to be only 16 bits.  This is OK.
>
> 2) thumb2_movsi_insn
> There are two things here.
> a) Thumb2 has a 16-bit move instruction for all core
> register-to-register transfers, so the separation of alternatives 1 and
> 2 is unnecessary -- just code these as "rk".

done.

> b) The ldm form does not support unaligned memory accesses.  I'm aware
> that work is being done to add unaligned support to GCC for ARM, so I
> need to find out whether this patch will interfere with those changes.
> I'll try to find out what the situation is here and get back to you.
>
> 3) thumb2_cbz and thumb2_cbnz
> The range calculations look wrong here.  Remember that the 'pc' as far
> as GCC is concerned is the address of the start of the insn.  So for a
> backwards branch you need to account for all the bytes in the insn
> pattern that occur before the branch instruction itself, and secondly
> you also have to remember that the 'pc' that the CPU uses is the address
> of the branch instruction plus 4.  All these conspire to reduce the
> backwards range of a short branch to several bytes less than the 256
> that you currently have coded.

The usage of 'pc' is more complex than I thought; I understood it only
after reading the comment in arm.md.  The description at
http://gcc.gnu.org/onlinedocs/gccint/Insn-Lengths.html#Insn-Lengths is
also not right for forward branches.  The ranges are now modified
accordingly.  The patch has been tested on arm qemu in thumb2 mode.
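
To make the new constants easier to check, here is the arithmetic as a
sketch (assuming the architectural branch ranges: cbz reaches 0..+126 and
a 16-bit conditional branch reaches -256..+254 from the hardware pc, which
is the branch instruction's address plus 4):

    alternative 0, out-of-cbz-range case:  16-bit cmp + 16-bit beq
        beq starts 2 bytes into the pattern, hardware pc = start + 6
        backward bound: -256 + 6 = -250

    alternative 1 (high register):         32-bit cmp + 16-bit beq
        beq starts 4 bytes into the pattern, hardware pc = start + 8
        backward bound: -256 + 8 = -248

For a backward branch GCC's (pc) is the start of the pattern, which gives
the -250/-248 lower bounds above.  For a forward branch (pc) is in effect
the address just past the pattern; since the branch is always the final
16-bit instruction, the hardware pc is then (pc) + 2, so the upper bounds
128 and 256 correspond exactly to the +126 reach of cbz and the +254 reach
of a 16-bit beq.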

thanks
Carrot

2011-05-06  Guozhi Wei

      PR target/47855
      * config/arm/thumb2.md (thumb2_movsi_insn): Add length attribute.
      (thumb2_shiftsi3_short and peephole2): Remove 3-register case.
      (thumb2_cbz): Refine length computation.
      (thumb2_cbnz): Likewise.
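
As background for the shift change below: of the shift instructions,
Thumb-2 has 16-bit encodings only for the immediate form and for the
2-register form that overwrites its first source.  An illustrative sketch
(using lsls; the same holds for lsrs/asrs/rors):

    lsls  r0, r1, #4    @ 16-bit: shift by immediate, low registers
    lsls  r0, r1        @ 16-bit: shift by register, but Rd must equal Rn
    lsl.w r0, r1, r2    @ three distinct registers: 32-bit encoding only

Hence the new constraints tie operand 1 to operand 0 ("0") when the shift
amount is a register, and the peephole2 now requires a constant amount or
matching source and destination.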

Index: config/arm/thumb2.md
===================================================================
--- config/arm/thumb2.md	(revision 173350)
+++ config/arm/thumb2.md	(working copy)
@@ -165,23 +165,46 @@
 ;; regs.  The high register alternatives are not taken into account when
 ;; choosing register preferences in order to reflect their expense.
 (define_insn "*thumb2_movsi_insn"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,r,r,l ,*hk,m,*m")
-	(match_operand:SI 1 "general_operand"	   "rk ,I,K,j,mi,*mi,l,*hk"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,r,r,l ,*rk,Uu,*m")
+	(match_operand:SI 1 "general_operand"	   "rk ,I,K,j,Uu,*mi,l ,*rk"))]
   "TARGET_THUMB2 && ! TARGET_IWMMXT
    && !(TARGET_HARD_FLOAT && TARGET_VFP)
    && (   register_operand (operands[0], SImode)
       || register_operand (operands[1], SImode))"
-  "@
-   mov%?\\t%0, %1
-   mov%?\\t%0, %1
-   mvn%?\\t%0, #%B1
-   movw%?\\t%0, %1
-   ldr%?\\t%0, %1
-   ldr%?\\t%0, %1
-   str%?\\t%1, %0
-   str%?\\t%1, %0"
+  "*
+  switch (which_alternative)
+    {
+    case 0: return \"mov%?\\t%0, %1\";
+    case 1: return \"mov%?\\t%0, %1\";
+    case 2: return \"mvn%?\\t%0, #%B1\";
+    case 3: return \"movw%?\\t%0, %1\";
+
+    case 4:
+      if (GET_CODE (XEXP (operands[1], 0)) == POST_INC)
+	{
+	  operands[1] = XEXP (XEXP (operands[1], 0), 0);
+	  return \"ldm%(ia%)\t%1!, {%0}\";
+	}
+      else
+	return \"ldr%?\\t%0, %1\";
+
+    case 5: return \"ldr%?\\t%0, %1\";
+
+    case 6:
+      if (GET_CODE (XEXP (operands[0], 0)) == POST_INC)
+	{
+	  operands[0] = XEXP (XEXP (operands[0], 0), 0);
+	  return \"stm%(ia%)\t%0!, {%1}\";
+	}
+      else
+	return \"str%?\\t%1, %0\";
+
+    case 7: return \"str%?\\t%1, %0\";
+    default: gcc_unreachable ();
+    }"
   [(set_attr "type" "*,*,*,*,load1,load1,store1,store1")
    (set_attr "predicable" "yes")
+   (set_attr "length" "2,4,4,4,2,4,2,4")
    (set_attr "pool_range" "*,*,*,*,1020,4096,*,*")
    (set_attr "neg_pool_range" "*,*,*,*,0,0,*,*")]
 )
@@ -685,7 +708,8 @@
   "TARGET_THUMB2
    && peep2_regno_dead_p(0, CC_REGNUM)
    && ((GET_CODE(operands[3]) != ROTATE && GET_CODE(operands[3]) != ROTATERT)
-       || REG_P(operands[2]))"
+       || REG_P(operands[2]))
+   && (CONSTANT_P (operands[2]) || (operands[0] == operands[1]))"
   [(parallel
     [(set (match_dup 0)
	   (match_op_dup 3
@@ -696,10 +720,10 @@
 )

 (define_insn "*thumb2_shiftsi3_short"
-  [(set (match_operand:SI 0 "low_register_operand" "=l")
+  [(set (match_operand:SI 0 "low_register_operand" "=l,l")
	(match_operator:SI 3 "shift_operator"
-	 [(match_operand:SI 1 "low_register_operand" "l")
-	  (match_operand:SI 2 "low_reg_or_int_operand" "lM")]))
+	 [(match_operand:SI 1 "low_register_operand" "0,l")
+	  (match_operand:SI 2 "low_reg_or_int_operand" "l,M")]))
   (clobber (reg:CC CC_REGNUM))]
   "TARGET_THUMB2 && reload_completed
    && ((GET_CODE(operands[3]) != ROTATE && GET_CODE(operands[3]) != ROTATERT)
@@ -707,7 +731,7 @@
   "* return arm_output_shift(operands, 2);"
   [(set_attr "predicable" "yes")
    (set_attr "shift" "1")
-   (set_attr "length" "2")
+   (set_attr "length" "2,2")
    (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "")
		      (const_string "alu_shift")
		      (const_string "alu_shift_reg")))]
@@ -965,13 +989,23 @@
   else
     return \"cmp\\t%0, #0\;beq\\t%l1\";
   "
-  [(set (attr "length")
-        (if_then_else
-	    (and (ge (minus (match_dup 1) (pc)) (const_int 2))
-		 (le (minus (match_dup 1) (pc)) (const_int 128))
-		 (eq (symbol_ref ("which_alternative")) (const_int 0)))
-	    (const_int 2)
-	    (const_int 8)))]
+  [(set (attr "length")
+	(if_then_else
+	    (eq (symbol_ref ("which_alternative")) (const_int 0))
+	    (if_then_else
+		(and (ge (minus (match_dup 1) (pc)) (const_int 2))
+		     (le (minus (match_dup 1) (pc)) (const_int 128)))
+		(const_int 2)
+		(if_then_else
+		    (and (ge (minus (match_dup 1) (pc)) (const_int -250))
+			 (le (minus (match_dup 1) (pc)) (const_int 256)))
+		    (const_int 4)
+		    (const_int 6)))
+	    (if_then_else
+		(and (ge (minus (match_dup 1) (pc)) (const_int -248))
+		     (le (minus (match_dup 1) (pc)) (const_int 256)))
+		(const_int 6)
+		(const_int 8))))]
 )

 (define_insn "*thumb2_cbnz"
@@ -988,13 +1022,23 @@
   else
     return \"cmp\\t%0, #0\;bne\\t%l1\";
   "
-  [(set (attr "length")
-        (if_then_else
-	    (and (ge (minus (match_dup 1) (pc)) (const_int 2))
-		 (le (minus (match_dup 1) (pc)) (const_int 128))
-		 (eq (symbol_ref ("which_alternative")) (const_int 0)))
-	    (const_int 2)
-	    (const_int 8)))]
+  [(set (attr "length")
+	(if_then_else
+	    (eq (symbol_ref ("which_alternative")) (const_int 0))
+	    (if_then_else
+		(and (ge (minus (match_dup 1) (pc)) (const_int 2))
+		     (le (minus (match_dup 1) (pc)) (const_int 128)))
+		(const_int 2)
+		(if_then_else
+		    (and (ge (minus (match_dup 1) (pc)) (const_int -250))
+			 (le (minus (match_dup 1) (pc)) (const_int 256)))
+		    (const_int 4)
+		    (const_int 6)))
+	    (if_then_else
+		(and (ge (minus (match_dup 1) (pc)) (const_int -248))
+		     (le (minus (match_dup 1) (pc)) (const_int 256)))
+		(const_int 6)
+		(const_int 8))))]
 )

 ;; 16-bit complement
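
One more note, on the movsi alternatives with length 2 (the Uu forms,
cases 4 and 6 in the switch above): a post-increment ldr/str only has a
32-bit encoding, but the equivalent single-register ldm/stm with
writeback is 16 bits, which is why POST_INC addresses are rewritten to
ldm/stm.  An illustrative sketch:

    ldr   r0, [r1], #4    @ post-increment load: 32-bit encoding only
    ldmia r1!, {r0}       @ same effect, 16-bit encoding
    stmia r1!, {r0}       @ likewise for the store side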