From: Eric Botcazou
To: gcc-patches@gcc.gnu.org
Subject: [IA-64] Fix latent problem in FP div code
Date: Thu, 19 Aug 2010 00:31:01 +0200
Message-Id: <201008190031.01724.ebotcazou@adacore.com>
X-Patchwork-Id: 62093

Hi,

a couple of years ago I
submitted a fix for a problem we ran into in the FP div code on IA-64:
  http://gcc.gnu.org/ml/gcc-patches/2008-06/msg00904.html

It didn't get much attention and the problem is still visible as of today on
the 4.3 branch with the testcase I provided.  The issue is mitigated by IRA
these days but I think that it is still latent.  Moreover the new FP sqrt code
suffers from the same problem because frsqrta, like frcpa, only sets one
predicate register.

Hence the attached patch, tested on ia64-suse-linux, OK for mainline?


2010-08-18  Eric Botcazou

	* config/ia64/ia64.h (HARD_REGNO_NREGS): Return 1 for CCImode in
	general purpose registers.
	(HARD_REGNO_MODE_OK): Accept CCImode in general purpose registers.
	* config/ia64/ia64.md (*movcci): Change to named pattern.  Deal with
	general purpose registers and memory operands.  Add associated
	CCImode post-reload splitter.
	* config/ia64/div.md: Change BImode to CCImode throughout.


Index: config/ia64/div.md
===================================================================
--- config/ia64/div.md	(revision 163220)
+++ config/ia64/div.md	(working copy)
@@ -37,7 +37,7 @@
 
 (define_insn "addrf3_cond"
   [(set (match_operand:RF 0 "fr_register_operand" "=f,f")
-        (if_then_else:RF (ne:RF (match_operand:BI 1 "register_operand" "c,c")
+        (if_then_else:RF (ne:RF (match_operand:CCI 1 "register_operand" "c,c")
                                 (const_int 0))
           (plus:RF
             (match_operand:RF 2 "fr_reg_or_fp01_operand" "fG,fG")
@@ -52,7 +52,7 @@ (define_insn "addrf3_cond"
 
 (define_insn "subrf3_cond"
   [(set (match_operand:RF 0 "fr_register_operand" "=f,f")
-        (if_then_else:RF (ne:RF (match_operand:BI 1 "register_operand" "c,c")
+        (if_then_else:RF (ne:RF (match_operand:CCI 1 "register_operand" "c,c")
                                 (const_int 0))
           (minus:RF
             (match_operand:RF 2 "fr_reg_or_fp01_operand" "fG,fG")
@@ -67,7 +67,7 @@ (define_insn "subrf3_cond"
 
 (define_insn "mulrf3_cond"
   [(set (match_operand:RF 0 "fr_register_operand" "=f,f")
-        (if_then_else:RF (ne:RF (match_operand:BI 1 "register_operand" "c,c")
+        (if_then_else:RF (ne:RF (match_operand:CCI 1 "register_operand" "c,c")
                                 (const_int 0))
           (mult:RF
             (match_operand:RF 2 "fr_reg_or_fp01_operand" "fG,fG")
@@ -84,7 +84,7 @@ (define_insn "mulrf3_cond"
 
 (define_insn "nmulrf3_cond"
   [(set (match_operand:RF 0 "fr_register_operand" "=f,f")
-        (if_then_else:RF (ne:RF (match_operand:BI 1 "register_operand" "c,c")
+        (if_then_else:RF (ne:RF (match_operand:CCI 1 "register_operand" "c,c")
                                 (const_int 0))
           (neg:RF (mult:RF
             (match_operand:RF 2 "fr_reg_or_fp01_operand" "fG,fG")
@@ -101,7 +101,7 @@ (define_insn "nmulrf3_cond"
 
 (define_insn "m1addrf4_cond"
   [(set (match_operand:RF 0 "fr_register_operand" "=f,f")
-        (if_then_else:RF (ne:RF (match_operand:BI 1 "register_operand" "c,c")
+        (if_then_else:RF (ne:RF (match_operand:CCI 1 "register_operand" "c,c")
                                 (const_int 0))
           (plus:RF
             (mult:RF
@@ -118,7 +118,7 @@ (define_insn "m1addrf4_cond"
 
 (define_insn "m1subrf4_cond"
   [(set (match_operand:RF 0 "fr_register_operand" "=f,f")
-        (if_then_else:RF (ne:RF (match_operand:BI 1 "register_operand" "c,c")
+        (if_then_else:RF (ne:RF (match_operand:CCI 1 "register_operand" "c,c")
                                 (const_int 0))
           (minus:RF
             (mult:RF
@@ -137,7 +137,7 @@ (define_insn "m1subrf4_cond"
 
 (define_insn "m2addrf4_cond"
   [(set (match_operand:RF 0 "fr_register_operand" "=f,f")
-        (if_then_else:RF (ne:RF (match_operand:BI 1 "register_operand" "c,c")
+        (if_then_else:RF (ne:RF (match_operand:CCI 1 "register_operand" "c,c")
                                 (const_int 0))
           (plus:RF
             (match_operand:RF 2 "fr_reg_or_fp01_operand" "fG,fG")
@@ -154,7 +154,7 @@ (define_insn "m2addrf4_cond"
 
 (define_insn "m2subrf4_cond"
   [(set (match_operand:RF 0 "fr_register_operand" "=f,f")
-        (if_then_else:RF (ne:RF (match_operand:BI 1 "register_operand" "c,c")
+        (if_then_else:RF (ne:RF (match_operand:CCI 1 "register_operand" "c,c")
                                 (const_int 0))
           (minus:RF
             (match_operand:RF 2 "fr_reg_or_fp01_operand" "fG,fG")
@@ -255,8 +255,8 @@ (define_insn "recip_approx_rf"
         (unspec:RF [(match_operand:RF 1 "fr_reg_or_fp01_operand" "fG")
                     (match_operand:RF 2 "fr_reg_or_fp01_operand" "fG")]
                    UNSPEC_FR_RECIP_APPROX_RES))
-   (set (match_operand:BI 3 "register_operand" "=c")
-        (unspec:BI [(match_dup 1) (match_dup 2)] UNSPEC_FR_RECIP_APPROX))
+   (set (match_operand:CCI 3 "register_operand" "=c")
+        (unspec:CCI [(match_dup 1) (match_dup 2)] UNSPEC_FR_RECIP_APPROX))
    (use (match_operand:SI 4 "const_int_operand" ""))]
   ""
   "frcpa.s%4 %0, %3 = %F1, %F2"
@@ -297,7 +297,7 @@ (define_expand "divsf3_internal_thr"
   rtx q = gen_reg_rtx (RFmode);
   rtx r = gen_reg_rtx (RFmode);
   rtx q_res = gen_reg_rtx (RFmode);
-  rtx cond = gen_reg_rtx (BImode);
+  rtx cond = gen_reg_rtx (CCImode);
   rtx zero = CONST0_RTX (RFmode);
   rtx one = CONST1_RTX (RFmode);
   rtx status0 = CONST0_RTX (SImode);
@@ -345,7 +345,7 @@ (define_expand "divsf3_internal_lat"
   rtx q1 = gen_reg_rtx (RFmode);
   rtx r = gen_reg_rtx (RFmode);
   rtx q_res = gen_reg_rtx (RFmode);
-  rtx cond = gen_reg_rtx (BImode);
+  rtx cond = gen_reg_rtx (CCImode);
   rtx zero = CONST0_RTX (RFmode);
   rtx one = CONST1_RTX (RFmode);
   rtx status0 = CONST0_RTX (SImode);
@@ -414,7 +414,7 @@ (define_expand "divdf3_internal_thr"
   rtx y3 = gen_reg_rtx (RFmode);
   rtx q = gen_reg_rtx (RFmode);
   rtx r = gen_reg_rtx (RFmode);
-  rtx cond = gen_reg_rtx (BImode);
+  rtx cond = gen_reg_rtx (CCImode);
   rtx zero = CONST0_RTX (RFmode);
   rtx one = CONST1_RTX (RFmode);
   rtx status0 = CONST0_RTX (SImode);
@@ -471,7 +471,7 @@ (define_expand "divdf3_internal_lat"
   rtx e3 = gen_reg_rtx (RFmode);
   rtx q = gen_reg_rtx (RFmode);
   rtx r1 = gen_reg_rtx (RFmode);
-  rtx cond = gen_reg_rtx (BImode);
+  rtx cond = gen_reg_rtx (CCImode);
   rtx zero = CONST0_RTX (RFmode);
   rtx one = CONST1_RTX (RFmode);
   rtx status0 = CONST0_RTX (SImode);
@@ -535,7 +535,7 @@ (define_expand "divxf3"
   rtx q = gen_reg_rtx (RFmode);
   rtx r = gen_reg_rtx (RFmode);
   rtx r1 = gen_reg_rtx (RFmode);
-  rtx cond = gen_reg_rtx (BImode);
+  rtx cond = gen_reg_rtx (CCImode);
   rtx zero = CONST0_RTX (RFmode);
   rtx one = CONST1_RTX (RFmode);
   rtx status0 = CONST0_RTX (SImode);
@@ -702,7 +702,7 @@ (define_expand "divsi3_internal"
   rtx e1 = gen_reg_rtx (RFmode);
   rtx q = gen_reg_rtx (RFmode);
   rtx q1 = gen_reg_rtx (RFmode);
-  rtx cond = gen_reg_rtx (BImode);
+  rtx cond = gen_reg_rtx (CCImode);
   rtx zero = CONST0_RTX (RFmode);
   rtx one = CONST1_RTX (RFmode);
   rtx status1 = CONST1_RTX (SImode);
@@ -844,7 +844,7 @@ (define_expand "divdi3_internal_lat"
   rtx q1 = gen_reg_rtx (RFmode);
   rtx q2 = gen_reg_rtx (RFmode);
   rtx r = gen_reg_rtx (RFmode);
-  rtx cond = gen_reg_rtx (BImode);
+  rtx cond = gen_reg_rtx (CCImode);
   rtx zero = CONST0_RTX (RFmode);
   rtx one = CONST1_RTX (RFmode);
   rtx status1 = CONST1_RTX (SImode);
@@ -888,7 +888,7 @@ (define_expand "divdi3_internal_thr"
   rtx e1 = gen_reg_rtx (RFmode);
   rtx q2 = gen_reg_rtx (RFmode);
   rtx r = gen_reg_rtx (RFmode);
-  rtx cond = gen_reg_rtx (BImode);
+  rtx cond = gen_reg_rtx (CCImode);
   rtx zero = CONST0_RTX (RFmode);
   rtx one = CONST1_RTX (RFmode);
   rtx status1 = CONST1_RTX (SImode);
@@ -920,8 +920,8 @@ (define_insn "sqrt_approx_rf"
   [(set (match_operand:RF 0 "fr_register_operand" "=f")
         (unspec:RF [(match_operand:RF 1 "fr_reg_or_fp01_operand" "fG")]
                    UNSPEC_FR_SQRT_RECIP_APPROX_RES))
-   (set (match_operand:BI 2 "register_operand" "=c")
-        (unspec:BI [(match_dup 1)] UNSPEC_FR_SQRT_RECIP_APPROX))
+   (set (match_operand:CCI 2 "register_operand" "=c")
+        (unspec:CCI [(match_dup 1)] UNSPEC_FR_SQRT_RECIP_APPROX))
    (use (match_operand:SI 3 "const_int_operand" ""))]
   ""
   "frsqrta.s%3 %0, %2 = %F1"
@@ -958,7 +958,7 @@ (define_expand "sqrtsf2_internal_thr"
   rtx h = gen_reg_rtx (RFmode);
   rtx d = gen_reg_rtx (RFmode);
   rtx g2 = gen_reg_rtx (RFmode);
-  rtx cond = gen_reg_rtx (BImode);
+  rtx cond = gen_reg_rtx (CCImode);
   rtx zero = CONST0_RTX (RFmode);
   rtx one = CONST1_RTX (RFmode);
   rtx c1 = ia64_dconst_0_5();
@@ -1021,7 +1021,7 @@ (define_expand "sqrtsf2_internal_lat"
   rtx h = gen_reg_rtx (RFmode);
   rtx h1 = gen_reg_rtx (RFmode);
   rtx d = gen_reg_rtx (RFmode);
-  rtx cond = gen_reg_rtx (BImode);
+  rtx cond = gen_reg_rtx (CCImode);
   rtx zero = CONST0_RTX (RFmode);
   rtx one = CONST1_RTX (RFmode);
   rtx c1 = ia64_dconst_0_5();
@@ -1104,7 +1104,7 @@ (define_expand "sqrtdf2_internal_thr"
   rtx h2 = gen_reg_rtx (RFmode);
   rtx d = gen_reg_rtx (RFmode);
   rtx d1 = gen_reg_rtx (RFmode);
-  rtx cond = gen_reg_rtx (BImode);
+  rtx cond = gen_reg_rtx (CCImode);
   rtx zero = CONST0_RTX (RFmode);
   rtx c1 = ia64_dconst_0_5();
   rtx reg_df_c1 = gen_reg_rtx (DFmode);
@@ -1171,7 +1171,7 @@ (define_expand "sqrtxf2"
   rtx h3 = gen_reg_rtx (RFmode);
   rtx d = gen_reg_rtx (RFmode);
   rtx d1 = gen_reg_rtx (RFmode);
-  rtx cond = gen_reg_rtx (BImode);
+  rtx cond = gen_reg_rtx (CCImode);
   rtx zero = CONST0_RTX (RFmode);
   rtx c1 = ia64_dconst_0_5();
   rtx reg_df_c1 = gen_reg_rtx (DFmode);
Index: config/ia64/ia64.h
===================================================================
--- config/ia64/ia64.h	(revision 163220)
+++ config/ia64/ia64.h	(working copy)
@@ -646,7 +646,7 @@ while (0)
 #define HARD_REGNO_NREGS(REGNO, MODE) \
   ((REGNO) == PR_REG (0) && (MODE) == DImode ? 64 \
    : PR_REGNO_P (REGNO) && (MODE) == BImode ? 2 \
-   : PR_REGNO_P (REGNO) && (MODE) == CCImode ? 1 \
+   : (PR_REGNO_P (REGNO) || GR_REGNO_P (REGNO)) && (MODE) == CCImode ? 1\
    : FR_REGNO_P (REGNO) && (MODE) == XFmode ? 1 \
    : FR_REGNO_P (REGNO) && (MODE) == RFmode ? 1 \
    : FR_REGNO_P (REGNO) && (MODE) == XCmode ? 2 \
@@ -664,7 +664,7 @@ while (0)
    : PR_REGNO_P (REGNO) ? \
      (MODE) == BImode || GET_MODE_CLASS (MODE) == MODE_CC \
    : GR_REGNO_P (REGNO) ? \
-     (MODE) != CCImode && (MODE) != XFmode && (MODE) != XCmode && (MODE) != RFmode \
+     (MODE) != XFmode && (MODE) != XCmode && (MODE) != RFmode \
    : AR_REGNO_P (REGNO) ? (MODE) == DImode \
    : BR_REGNO_P (REGNO) ? (MODE) == DImode \
    : 0)
Index: config/ia64/ia64.md
===================================================================
--- config/ia64/ia64.md	(revision 163220)
+++ config/ia64/ia64.md	(working copy)
@@ -217,17 +217,34 @@ (define_attr "speculable2" "no,yes" (con
 ;; Set of a single predicate register.  This is only used to implement
 ;; pr-to-pr move and complement.
 
-(define_insn "*movcci"
-  [(set (match_operand:CCI 0 "register_operand" "=c,c,c")
-        (match_operand:CCI 1 "nonmemory_operand" "O,n,c"))]
+(define_insn "movcci"
+  [(set (match_operand:CCI 0 "destination_operand" "=c,c,?c,?*r, c,*r,*m,*r")
+        (match_operand:CCI 1 "move_operand" " O,n, c, c,*r,*m,*r,*r"))]
   ""
   "@
    cmp.ne %0, p0 = r0, r0
    cmp.eq %0, p0 = r0, r0
-   (%1) cmp.eq.unc %0, p0 = r0, r0"
-  [(set_attr "itanium_class" "icmp")
+   (%1) cmp.eq.unc %0, p0 = r0, r0
+   #
+   tbit.nz %0, p0 = %1, 0
+   ld1%O1 %0 = %1%P1
+   st1%Q0 %0 = %1%P0
+   mov %0 = %1"
+  [(set_attr "itanium_class" "icmp,icmp,icmp,unknown,tbit,ld,st,ialu")
    (set_attr "predicable" "no")])
 
+(define_split
+  [(set (match_operand:CCI 0 "register_operand" "")
+        (match_operand:CCI 1 "register_operand" ""))]
+  "reload_completed
+   && GET_CODE (operands[0]) == REG && GR_REGNO_P (REGNO (operands[0]))
+   && GET_CODE (operands[1]) == REG && PR_REGNO_P (REGNO (operands[1]))"
+  [(set (match_dup 2) (const_int 0))
+   (cond_exec (ne (match_dup 3) (const_int 0))
+     (set (match_dup 2) (const_int 1)))]
+  "operands[2] = gen_rtx_REG (BImode, REGNO (operands[0]));
+   operands[3] = gen_rtx_REG (BImode, REGNO (operands[1]));")
+
 (define_insn "movbi"
   [(set (match_operand:BI 0 "destination_operand" "=c,c,?c,?*r, c,*r,*r,*m,*r")
         (match_operand:BI 1 "move_operand" " O,n, c, c,*r, n,*m,*r,*r"))]
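
Illustration (not part of the patch): the register accounting behind the ia64.h and ia64.md hunks can be modeled in a few lines of plain C. This is only a sketch under stated assumptions. The enums and the functions `nregs`, `gr_mode_ok_old`, `gr_mode_ok_new` and `split_pr_to_gr` are hypothetical stand-ins for HARD_REGNO_NREGS, the general-register arm of HARD_REGNO_MODE_OK before and after the change, and the new post-reload splitter. Only the BImode, CCImode and XFmode cases quoted in the diff are modeled. As far as the quoted macros show, a BImode value in a predicate register is counted as two hard registers while a CCImode value is counted as one, which appears to be why CCImode suits an instruction such as frcpa or frsqrta that sets a single predicate register.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the macros touched by the patch.
   The names below are illustrative and do not exist in GCC. */
enum mode { MODE_BI, MODE_CCI, MODE_XF };
enum reg_class { CLASS_GR, CLASS_PR };

/* Mirrors the BImode and CCImode arms of the patched HARD_REGNO_NREGS:
   BImode in a predicate register counts as 2 hard registers, CCImode
   counts as 1 in both predicate and general registers.  Other cases
   are not modeled and simply return 1. */
int nregs(enum reg_class cls, enum mode m)
{
    if (cls == CLASS_PR && m == MODE_BI)
        return 2;
    if ((cls == CLASS_PR || cls == CLASS_GR) && m == MODE_CCI)
        return 1;
    return 1;
}

/* General-register arm of HARD_REGNO_MODE_OK before the patch:
   CCImode was rejected (XCmode and RFmode are omitted here). */
bool gr_mode_ok_old(enum mode m)
{
    return m != MODE_CCI && m != MODE_XF;
}

/* The same arm after the patch: CCImode is now accepted. */
bool gr_mode_ok_new(enum mode m)
{
    return m != MODE_XF;
}

/* Effect of the post-reload splitter for a predicate-to-general move:
   clear the destination, then conditionally set it to 1 when the
   source predicate is true. */
int split_pr_to_gr(bool pred)
{
    int gr = 0;      /* (set (match_dup 2) (const_int 0)) */
    if (pred)        /* (cond_exec (ne (match_dup 3) (const_int 0)) ... */
        gr = 1;      /*   (set (match_dup 2) (const_int 1))) */
    return gr;
}
```

In this model, `gr_mode_ok_old(MODE_CCI)` is false and `gr_mode_ok_new(MODE_CCI)` is true, which corresponds to the patch letting a CCImode value live in a general register, with the `#` alternative of movcci left for the splitter to expand.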