Patchwork [i386] : FIX PR50875, -O3 and -mavx lead to internal compiler error in find_reloads

login
register
mail settings
Submitter Uros Bizjak
Date Oct. 27, 2011, 4:06 p.m.
Message ID <CAFULd4aBApo7e7QpvqhA8GX0vWpVqeVBF10PAGB+sw0hDUBN=A@mail.gmail.com>
Download mbox | patch
Permalink /patch/122175/
State New
Headers show

Comments

Uros Bizjak - Oct. 27, 2011, 4:06 p.m.
Hello!

Apparently, reload does not like matching register alternatives in
*avx_unpcklpd256. Attached patch removes this problematic alternative
and generates vunpcklpd insn instead, since the later instruction has
the same length, latency, uops, etc  as vmovddup while giving much
more freedom to reload.

I would kindly ask someone fluent in fortran to construct a
compile-time testcase from the PR [1], that will correctly exercise
the fix on x86 with "-O3 -mavx" options and immortalize the testcase
in gfortran testsuite.

2011-10-27  Uros Bizjak  <ubizjak@gmail.com>

	PR target/50875
	* config/i386/sse.md (*avx2_unpcklpd256): Remove extra insn
	constraints.  Change alternative 1 to "x,m,1".

Patch was bootstrapped and regresion tested on
x86_64-pc-linux-gnu{,-m32} with --with-fpmath=avx configure option.
Also, the fortran testcase from the PR was checked manually that it
compiles and works OK.

[1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50875

Thanks,
Uros.
Steve Kargl - Oct. 27, 2011, 4:45 p.m.
On Thu, Oct 27, 2011 at 06:06:09PM +0200, Uros Bizjak wrote:
> 
> I would kindly ask someone fluent in fortran to construct a
> compile-time testcase from the PR [1], that will correctly exercise
> the fix on x86 with "-O3 -mavx" options and immortalize the testcase
> in gfortran testsuite.
> 

How's the attached?  I don't know if the target part of
the dg-options is required.
Uros Bizjak - Oct. 27, 2011, 4:56 p.m.
On Thu, Oct 27, 2011 at 6:45 PM, Steve Kargl
<sgk@troutmask.apl.washington.edu> wrote:
> On Thu, Oct 27, 2011 at 06:06:09PM +0200, Uros Bizjak wrote:
>>
>> I would kindly ask someone fluent in fortran to construct a
>> compile-time testcase from the PR [1], that will correctly exercise
>> the fix on x86 with "-O3 -mavx" options and immortalize the testcase
>> in gfortran testsuite.
>>
>
> How's the attached?  I don't know if the target part of
> the dg-options is required.

Great!

Regarding the dg-options, I believe that tests that fail only on
certain target (i.e. pr48757.f) define dg directives as:

! { dg-do compile { target { i?86-*-* x86_64-*-* } }
! { dg-options "-O3 -mavx" }

This way, we won't annoy other innocent targets ;)

I will change your test accordingly and commit everything to SVN.

Thanks,
Uros.

Patch

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 180528)
+++ ChangeLog	(working copy)
@@ -1,5 +1,9 @@ 
 2011-10-26  Eric Botcazou  <ebotcazou@adacore.com>
 
+<<<<<<< .mine
+	* input.c (expand_location): Rewrite using linemap_resolve_location
+	and linemap_expand_location.  Add a comment.
+=======
 	* reload.c (reload_inner_reg_of_subreg): Change type of return value
 	and type of OUTPUT parameter to bool and adjust.  Document MODE and
 	OUTPUT parameters.  Use HARD_REGISTER_P.  Reorder final condition
@@ -197,6 +201,7 @@ 
 	* input.c (expand_location): Rewrite using
 	linemap_resolve_location and linemap_expand_location.  Add a
 	comment.
+>>>>>>> .r180528
 
 2011-10-25  Jakub Jelinek  <jakub@redhat.com>
 
@@ -315,7 +320,7 @@ 
 2011-10-24  Julian Brown  <julian@codesourcery.com>
 
 	* config/m68k/m68k.c (notice_update_cc): Tighten condition for
-	setting CC_REVERSED for FP comparisons.  
+	setting CC_REVERSED for FP comparisons.
 
 2011-10-24  Richard Guenther  <rguenther@suse.de>
 
@@ -368,14 +373,12 @@ 
 	float and integer regs.
 	(sparc_register_move_cost): Adjust to account for VIS3 moves.
 	(sparc_preferred_reload_class): On 32-bit with VIS3 when moving an
-	integer reg to a class containing EXTRA_FP_REGS, constrain to
-	FP_REGS.
+	integer reg to a class containing EXTRA_FP_REGS, constrain to FP_REGS.
 	(sparc_secondary_reload): On 32-bit with VIS3 when moving between
 	float and integer regs we sometimes need a FP_REGS class
 	intermediate move to satisfy the reload.  When this happens
 	specify an extra cost of 2.
-	(*movsi_insn): Rename to have "_novis3" suffix and add !VIS3
-	guard.
+	(*movsi_insn): Rename to have "_novis3" suffix and add !VIS3 guard.
 	(*movdi_insn_sp32_v9): Likewise.
 	(*movdi_insn_sp64): Likewise.
 	(*movsf_insn): Likewise.
@@ -399,10 +402,9 @@ 
 	(*mov<VM32:mode>_insn_vis3): New insn.
 	(*mov<VM64:mode>_insn_sp64_vis3): New insn.
 	(*mov<VM64:mode>_insn_sp32_vis3): New insn.
-	(VM64 reg<-->reg split): New spliiter for 32-bit.
+	(VM64 reg<-->reg split): New splitter for 32-bit.
 
-	* config/sparc/sparc.c (sparc_split_regreg_legitimate): New
-	function.
+	* config/sparc/sparc.c (sparc_split_regreg_legitimate): New function.
 	* config/sparc/sparc-protos.h (sparc_split_regreg_legitimate):
 	Declare it.
 	* config/sparc/sparc.md (DImode reg/reg split): Use it.
@@ -478,22 +480,23 @@ 
 2011-10-23  Tom de Vries  <tom@codesourcery.com>
 
 	PR tree-optimization/50763
-	* tree-ssa-tail-merge.c (same_succ_flush_bb): New function, factored out
-	of ...
+	* tree-ssa-tail-merge.c (same_succ_flush_bb): New function, factored
+	out of ...
 	(same_succ_flush_bbs): Use same_succ_flush_bb.
 	(purge_bbs): Remove argument.  Remove calls to same_succ_flush_bbs,
 	release_last_vdef and delete_basic_block.
 	(unlink_virtual_phi): New function.
 	(update_vuses): Add and use vuse1_phi_args argument.  Set var to
-	SSA_NAME_VAR of vuse1 or vuse2, and use var.  Handle case that def_stmt2
-	is NULL.  Use phi result as phi arg in case vuse1 or vuse2 is NULL_TREE.
-	Replace uses of vuse1 if vuse2 is NULL_TREE.  Fix code to limit
-	replacement of uses.  Propagate phi argument for phis with a single
-	argument.
+	SSA_NAME_VAR of vuse1 or vuse2, and use var.  Handle case that
+	def_stmt2 is NULL.  Use phi result as phi arg in case vuse1 or vuse2
+	is NULL_TREE.  Replace uses of vuse1 if vuse2 is NULL_TREE.  Fix code
+	to limit replacement of uses.  Propagate phi argument for phis with a
+	single argument.
 	(replace_block_by): Update vops if phi_vuse1 or phi_vuse2 is NULL_TREE.
-	Set vuse1_phi_args if vuse1 is a phi defined in bb1.  Add vuse1_phi_args
-	as argument to call to update_vuses.  Call release_last_vdef,
-	same_succ_flush_bb, delete_basic_block.  Update CDI_DOMINATORS info.
+	Set vuse1_phi_args if vuse1 is a phi defined in bb1.  Add
+	vuse1_phi_args as argument to call to update_vuses.  Call
+	release_last_vdef, same_succ_flush_bb, delete_basic_block.  Update
+	CDI_DOMINATORS info.
 	(tail_merge_optimize): Remove argument in call to purge_bbs.  Remove
 	call to free_dominance_info.  Only call calculate_dominance_info once.
 
Index: config/i386/sse.md
===================================================================
--- config/i386/sse.md	(revision 180546)
+++ config/i386/sse.md	(working copy)
@@ -4231,15 +4231,14 @@ 
   [(set (match_operand:V4DF 0 "register_operand"         "=x,x")
 	(vec_select:V4DF
 	  (vec_concat:V8DF
-	    (match_operand:V4DF 1 "nonimmediate_operand" "xm,x")
-	    (match_operand:V4DF 2 "nonimmediate_operand" " 1,xm"))
+	    (match_operand:V4DF 1 "nonimmediate_operand" " x,m")
+	    (match_operand:V4DF 2 "nonimmediate_operand" "xm,1"))
 	  (parallel [(const_int 0) (const_int 4)
 		     (const_int 2) (const_int 6)])))]
-  "TARGET_AVX
-   && (!MEM_P (operands[1]) || rtx_equal_p (operands[1], operands[2]))"
+  "TARGET_AVX"
   "@
-   vmovddup\t{%1, %0|%0, %1}
-   vunpcklpd\t{%2, %1, %0|%0, %1, %2}"
+   vunpcklpd\t{%2, %1, %0|%0, %1, %2}
+   vmovddup\t{%1, %0|%0, %1}"
   [(set_attr "type" "sselog")
    (set_attr "prefix" "vex")
    (set_attr "mode" "V4DF")])