diff mbox

[x32] PATCH: PR middle-end/47725: [x32] error: unable to find a register to spill in class DIREG

Message ID 20110214185738.GA13004@intel.com
State New
Headers show

Commit Message

H.J. Lu Feb. 14, 2011, 6:57 p.m. UTC
Hi,

cant_combine_insn_p won't check ZERO_EXTEND and SIGN_EXTEND.  This
patch helps cant_combine_insn_p by copying the arg in the incoming mode
and extending the copy instead of the arg.  OK for trunk when stage 1
is open?

Thanks.


H.J.
---

Comments

Jeff Law Feb. 14, 2011, 7:05 p.m. UTC | #1
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/14/11 11:57, H.J. Lu wrote:
> Hi,
> 
> cant_combine_insn_p won't check ZERO_EXTEND and SIGN_EXTEND.  This
> patch helps cant_combine_insn_p by copying the arg in the incoming mode
> and extending the copy instead of the arg.  OK for trunk when stage 1
> is open?
Why not fix the problem in combine?  IIRC the whole point behind the
code you're twiddling in function.c is to avoid useless copies due to
extending arguments.

jeff
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNWX0DAAoJEBRtltQi2kC7VhsH/0H+h943mRzeY4xmCOf5RSu0
NI2f0zTAewlfcpqR8eaXSy8A1AYipUnS0NcS2oN7NO5v5A65YdSkGK+SWoqdKls1
eSnvr98AIFTy+ih4CKNxh3xkWNiCdG0LAfZed8zVb4cLPMBDbN5U/kojMHvw2865
pbr/+7QkJ6XNCc2R8oTuVsvxrFn998HhSgFFA2fH2Cs+PCyyg4U3V9agmr4LSsKf
dDoUvg8cGmtsyU/+kDldJ/dA3nzkBRGBrPUWLwOL8B9FFK6e3kcwfi6CegXrZRC4
qaR3Z+bIMeUjM1OPn5vwsp7ks/whtHMY33flwmLYijklaNEdt9FoXYa4BrMRnUc=
=KOsD
-----END PGP SIGNATURE-----
H.J. Lu Feb. 14, 2011, 7:07 p.m. UTC | #2
On Mon, Feb 14, 2011 at 11:05 AM, Jeff Law <law@redhat.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 02/14/11 11:57, H.J. Lu wrote:
>> Hi,
>>
>> cant_combine_insn_p won't check ZERO_EXTEND and SIGN_EXTEND.  This
>> patch helps cant_combine_insn_p by copying the arg in the incoming mode
>> and extending the copy instead of the arg.  OK for trunk when stage 1
>> is open?
> Why not fix the problem in combine?  IIRC the whole point behind the
> code you're twiddling in function.c is to avoid useless copies due to
> extending arguments.

See:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47725#c4
Jeff Law Feb. 14, 2011, 7:11 p.m. UTC | #3
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/14/11 12:07, H.J. Lu wrote:
> On Mon, Feb 14, 2011 at 11:05 AM, Jeff Law <law@redhat.com> wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> On 02/14/11 11:57, H.J. Lu wrote:
>>> Hi,
>>>
>>> cant_combine_insn_p won't check ZERO_EXTEND and SIGN_EXTEND.  This
>>> patch helps cant_combine_insn_p by copying the arg in the incoming mode
>>> and extending the copy instead of the arg.  OK for trunk when stage 1
>>> is open?
>> Why not fix the problem in combine?  IIRC the whole point behind the
>> code you're twiddling in function.c is to avoid useless copies due to
>> extending arguments.
> 
> See:
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47725#c4
So all you're doing is trading one performance issue for another.

I still think you're better off tackling this in combine -- IMHO the
impact will be less there.

jeff
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNWX5yAAoJEBRtltQi2kC7+nkH/0LmUtL2A/ziaKzCLy70QnU8
GS9uTj3wPE/h9tCzDrM1c3JZQSv1po1+wgTmoxgfmWYLSNUHi4b6OLWHJtTquca2
qKuSdak9r/MgJ2k79lQO8UQ3EKyZiVOd+aLumag4pmLTNHRwxMZ+BeL7X8AX2BRt
wcPv8EqLAaiFyW19cWatm9/C3JA2kAf9ayBvMCpLS4Wyqp9qdRt/IaOmO9GHmw3n
Uj0uZ+8apTu4TJ/4ZWT0+SZsLQkXHnnV40VUK08RObuU5QQBTLJ2dznQCjyJpvdV
olmgmOQxox+grc5pQKYs9Tvq99brniO7/czHEi7v96iRmh2LHCkgXyQp3NmShqM=
=1+XC
-----END PGP SIGNATURE-----
Eric Botcazou Feb. 14, 2011, 7:21 p.m. UTC | #4
> So all you're doing is trading one performance issue for another.

If adding a copy to a new pseudo was really a performance issue, we would have 
many issues thoughout the compiler.  AFAIK it's a classical trick.
Bernd Schmidt Feb. 14, 2011, 7:44 p.m. UTC | #5
On 02/14/2011 08:21 PM, Eric Botcazou wrote:
>> So all you're doing is trading one performance issue for another.
> 
> If adding a copy to a new pseudo was really a performance issue, we would have 
> many issues thoughout the compiler.  AFAIK it's a classical trick.

The insns we're dealing with here can potentially get REG_EQUIV notes
attached to them, so they're a bit special. At least it needs to be
verified that the optimization in IRA which moves such insns before
their only use still triggers (PR42235).

I agree with Jeff that combine would be the correct place to fix this.
At least it takes class_likely_spilled_p into account, so it will
restrict only those machines where extending the lifetime of hard regs
is dangerous.


Bernd
Eric Botcazou Feb. 14, 2011, 7:46 p.m. UTC | #6
> I agree with Jeff that combine would be the correct place to fix this.
> At least it takes class_likely_spilled_p into account, so it will
> restrict only those machines where extending the lifetime of hard regs
> is dangerous.

OK, but I don't see how copying to a new pseudo would interfere with that.
H.J. Lu Feb. 14, 2011, 7:50 p.m. UTC | #7
On Mon, Feb 14, 2011 at 11:44 AM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 02/14/2011 08:21 PM, Eric Botcazou wrote:
>>> So all you're doing is trading one performance issue for another.
>>
>> If adding a copy to a new pseudo was really a performance issue, we would have
>> many issues thoughout the compiler.  AFAIK it's a classical trick.
>
> The insns we're dealing with here can potentially get REG_EQUIV notes
> attached to them, so they're a bit special. At least it needs to be
> verified that the optimization in IRA which moves such insns before
> their only use still triggers (PR42235).
>
> I agree with Jeff that combine would be the correct place to fix this.
> At least it takes class_likely_spilled_p into account, so it will
> restrict only those machines where extending the lifetime of hard regs
> is dangerous.
>

Hi Jeff, Bernd,

Does my patch at

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47725#c3

do what you suggested?
Bernd Schmidt Feb. 15, 2011, 3:50 p.m. UTC | #8
On 02/14/2011 08:50 PM, H.J. Lu wrote:
> 
> Does my patch at
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47725#c3
> 
> do what you suggested?

Yes. That's how I'd fix it. Concerns about optimization seem to be
misplaced as well; I've run this through my collection of input files
and did not find a case where code generation changed. So, OK to install.


Bernd
H.J. Lu Feb. 15, 2011, 4:56 p.m. UTC | #9
On Tue, Feb 15, 2011 at 7:50 AM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 02/14/2011 08:50 PM, H.J. Lu wrote:
>>
>> Does my patch at
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47725#c3
>>
>> do what you suggested?
>
> Yes. That's how I'd fix it. Concerns about optimization seem to be
> misplaced as well; I've run this through my collection of input files
> and did not find a case where code generation changed. So, OK to install.
>

I checked it into trunk and reverted the function.c change on x32 branch.

Thanks.
Eric Botcazou Feb. 15, 2011, 5:33 p.m. UTC | #10
> Yes. That's how I'd fix it. Concerns about optimization seem to be
> misplaced as well; I've run this through my collection of input files
> and did not find a case where code generation changed. So, OK to install.

I disagree, this isn't a regression so this isn't suitable for stage 4 as per
  http://gcc.gnu.org/ml/gcc/2011-02/msg00250.html

Did you look at the code generated on the x32 branch?
H.J. Lu Feb. 15, 2011, 6:08 p.m. UTC | #11
On Tue, Feb 15, 2011 at 9:33 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Yes. That's how I'd fix it. Concerns about optimization seem to be
>> misplaced as well; I've run this through my collection of input files
>> and did not find a case where code generation changed. So, OK to install.
>
> I disagree, this isn't a regression so this isn't suitable for stage 4 as per
>  http://gcc.gnu.org/ml/gcc/2011-02/msg00250.html

I can revert the change and recheck it in in stage 1.

> Did you look at the code generated on the x32 branch?
>

I didn't check the detail. But I think combine.c change is better.
Eric Botcazou Feb. 15, 2011, 6:55 p.m. UTC | #12
> I can revert the change and recheck it in in stage 1.

Yes, thanks in advance.

> I didn't check the detail. But I think combine.c change is better.

I still disagree, but let's discuss this again when stage 1 reopens.
Jeff Law Feb. 15, 2011, 9:08 p.m. UTC | #13
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/15/11 11:08, H.J. Lu wrote:
> On Tue, Feb 15, 2011 at 9:33 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>>> Yes. That's how I'd fix it. Concerns about optimization seem to be
>>> misplaced as well; I've run this through my collection of input files
>>> and did not find a case where code generation changed. So, OK to install.
>>
>> I disagree, this isn't a regression so this isn't suitable for stage 4 as per
>>  http://gcc.gnu.org/ml/gcc/2011-02/msg00250.html
> 
> I can revert the change and recheck it in in stage 1.
Probably wise since we're in stage4 right now.

jeff
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNWutXAAoJEBRtltQi2kC7u14H/10dBjwL3lReIbfH6f9pELBQ
2l/uK9W7djHYBAUUxhzA5UJ/Guy7JHUgJl/CCwVx5+XVGqGY8SfeZO3g4Rmz7Cd9
qjAIul4Vkr/lOiw1xGSq2UHTrVoPxllO8oNJY3m1cwDmz0VF8svLcsglAjbdqGwa
9mQl17gyAeQrlziIDqB7nA7MZvxesXwUEkCQKZJgT0uOQ/SrkpndyemHlYjqeiRd
aTqCg/b2Ihr/1BjljzQOQbwuiSMGg/Q17I9Ji0gEshTomjXs9eFTTgYlcCai2pc+
FWLS8yIUT8PVMDz2Lhj9k9pKo8nUpGBipORPEG3wEeWToMsVHkhghQsOpmFPrXM=
=QxMB
-----END PGP SIGNATURE-----
Bernd Schmidt Feb. 15, 2011, 9:30 p.m. UTC | #14
On 02/15/2011 06:33 PM, Eric Botcazou wrote:
>> Yes. That's how I'd fix it. Concerns about optimization seem to be
>> misplaced as well; I've run this through my collection of input files
>> and did not find a case where code generation changed. So, OK to install.
> 
> I disagree, this isn't a regression so this isn't suitable for stage 4 as per
>   http://gcc.gnu.org/ml/gcc/2011-02/msg00250.html

This type of extension from a hard register is new in 4.6.  There may
well be other ports that fail in the same way that HJ has discovered.


Bernd
Bernd Schmidt Feb. 15, 2011, 10:55 p.m. UTC | #15
On 02/14/2011 08:46 PM, Eric Botcazou wrote:
>> I agree with Jeff that combine would be the correct place to fix this.
>> At least it takes class_likely_spilled_p into account, so it will
>> restrict only those machines where extending the lifetime of hard regs
>> is dangerous.
> 
> OK, but I don't see how copying to a new pseudo would interfere with that.

For one thing, the set no longer matches the REG_EQUIV note we make
here, and that does seem to interfere with the optimization. I've tested
both patches on ARM, -march=armv7-a. The combiner patch produced no code
changes. The function.c patch produced regressions (increased register
pressure). Both results are as expected.

To put it another way: the combiner change is conservatively correct,
and necessary if we're going to have extends of hard registers in the
RTL. The function.c change is demonstrably incorrect as shown by the ARM
codegen regressions.


Bernd
H.J. Lu March 18, 2011, 12:30 a.m. UTC | #16
On Tue, Feb 15, 2011 at 2:55 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 02/14/2011 08:46 PM, Eric Botcazou wrote:
>>> I agree with Jeff that combine would be the correct place to fix this.
>>> At least it takes class_likely_spilled_p into account, so it will
>>> restrict only those machines where extending the lifetime of hard regs
>>> is dangerous.
>>
>> OK, but I don't see how copying to a new pseudo would interfere with that.
>
> For one thing, the set no longer matches the REG_EQUIV note we make
> here, and that does seem to interfere with the optimization. I've tested
> both patches on ARM, -march=armv7-a. The combiner patch produced no code
> changes. The function.c patch produced regressions (increased register
> pressure). Both results are as expected.
>
> To put it another way: the combiner change is conservatively correct,
> and necessary if we're going to have extends of hard registers in the
> RTL. The function.c change is demonstrably incorrect as shown by the ARM
> codegen regressions.
>

I checked in my patch into trunk.

Thanks.
H.J. Lu March 18, 2011, 4 a.m. UTC | #17
On Thu, Mar 17, 2011 at 5:30 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Feb 15, 2011 at 2:55 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>> On 02/14/2011 08:46 PM, Eric Botcazou wrote:
>>>> I agree with Jeff that combine would be the correct place to fix this.
>>>> At least it takes class_likely_spilled_p into account, so it will
>>>> restrict only those machines where extending the lifetime of hard regs
>>>> is dangerous.
>>>
>>> OK, but I don't see how copying to a new pseudo would interfere with that.
>>
>> For one thing, the set no longer matches the REG_EQUIV note we make
>> here, and that does seem to interfere with the optimization. I've tested
>> both patches on ARM, -march=armv7-a. The combiner patch produced no code
>> changes. The function.c patch produced regressions (increased register
>> pressure). Both results are as expected.
>>
>> To put it another way: the combiner change is conservatively correct,
>> and necessary if we're going to have extends of hard registers in the
>> RTL. The function.c change is demonstrably incorrect as shown by the ARM
>> codegen regressions.
>>
>
> I checked in my patch into trunk.
>

I noticed that for x32, all pointers passed in registers are zero-extended
by caller. If we can fix

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48085

by avoiding zero-extension in callee, this issue won't happen for x32.  I will
revert the combine change for now and try to implement this approach.

Thanks.
H.J. Lu March 21, 2011, 4:13 a.m. UTC | #18
On Thu, Mar 17, 2011 at 9:00 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Mar 17, 2011 at 5:30 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Tue, Feb 15, 2011 at 2:55 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>>> On 02/14/2011 08:46 PM, Eric Botcazou wrote:
>>>>> I agree with Jeff that combine would be the correct place to fix this.
>>>>> At least it takes class_likely_spilled_p into account, so it will
>>>>> restrict only those machines where extending the lifetime of hard regs
>>>>> is dangerous.
>>>>
>>>> OK, but I don't see how copying to a new pseudo would interfere with that.
>>>
>>> For one thing, the set no longer matches the REG_EQUIV note we make
>>> here, and that does seem to interfere with the optimization. I've tested
>>> both patches on ARM, -march=armv7-a. The combiner patch produced no code
>>> changes. The function.c patch produced regressions (increased register
>>> pressure). Both results are as expected.
>>>
>>> To put it another way: the combiner change is conservatively correct,
>>> and necessary if we're going to have extends of hard registers in the
>>> RTL. The function.c change is demonstrably incorrect as shown by the ARM
>>> codegen regressions.
>>>
>>
>> I checked in my patch into trunk.
>>
>
> I noticed that for x32, all pointers passed in registers are zero-extended
> by caller. If we can fix
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48085
>
> by avoiding zero-extension in callee, this issue won't happen for x32.  I will
> revert the combine change for now and try to implement this approach.
>

The issues are

1. When passing a 32bit integer parameter in a register, hardware
always zero-extends
it to 64bit.  I can update x32 psABI to document it, which costs
nothing to implement
in GCC.
2. assign_parm_setup_reg zero-extends 32bit pointer in register to
64bit, which is
redundant.

It leads 2 problems:

1. Redundant zero-extension at function entry.
2. combine doesn't check zero-extension on hard register and leads to
internal compiler error.

Is there a way to avoid redundant zero-extension at function entry to
solve both problems?

Thanks.
Eric Botcazou March 22, 2011, 8:05 p.m. UTC | #19
> It leads 2 problems:
>
> 1. Redundant zero-extension at function entry.
> 2. combine doesn't check zero-extension on hard register and leads to
> internal compiler error.
>
> Is there a way to avoid redundant zero-extension at function entry to
> solve both problems?

Eliminating the redundant extension in the callee seems indeed to be appealing.
You need to find out who decides that the incoming parameter needs to be zero- 
extended.  Is that the call to promote_function_mode in assign_parm_setup_reg?
H.J. Lu March 23, 2011, 3:29 a.m. UTC | #20
On Tue, Mar 22, 2011 at 1:05 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> It leads 2 problems:
>>
>> 1. Redundant zero-extension at function entry.
>> 2. combine doesn't check zero-extension on hard register and leads to
>> internal compiler error.
>>
>> Is there a way to avoid redundant zero-extension at function entry to
>> solve both problems?
>
> Eliminating the redundant extension in the callee seems indeed to be appealing.
> You need to find out who decides that the incoming parameter needs to be zero-
> extended.  Is that the call to promote_function_mode in assign_parm_setup_reg?
>

It is:

      op0 = parmreg;
      op1 = validated_mem;
      if (icode != CODE_FOR_nothing
          && insn_data[icode].operand[0].predicate (op0, promoted_nominal_mode)
          && insn_data[icode].operand[1].predicate (op1, data->passed_mode))
        {
          enum rtx_code code = unsignedp ? ZERO_EXTEND : SIGN_EXTEND;
          rtx insn, insns;
          HARD_REG_SET hardregs;

          start_sequence ();
          insn = gen_extend_insn (op0, op1, promoted_nominal_mode,
                                  data->passed_mode, unsignedp);
          emit_insn (insn);
          insns = get_insns ();

in assign_parm_setup_reg.
Eric Botcazou March 23, 2011, 8:22 a.m. UTC | #21
> It is:
>
>       op0 = parmreg;
>       op1 = validated_mem;
>       if (icode != CODE_FOR_nothing
>           && insn_data[icode].operand[0].predicate (op0,
> promoted_nominal_mode) && insn_data[icode].operand[1].predicate (op1,
> data->passed_mode)) {
>           enum rtx_code code = unsignedp ? ZERO_EXTEND : SIGN_EXTEND;
>           rtx insn, insns;
>           HARD_REG_SET hardregs;
>
>           start_sequence ();
>           insn = gen_extend_insn (op0, op1, promoted_nominal_mode,
>                                   data->passed_mode, unsignedp);
>           emit_insn (insn);
>           insns = get_insns ();

Sure, but why is need_conversion set to true?
H.J. Lu March 23, 2011, 10:57 a.m. UTC | #22
On Wed, Mar 23, 2011 at 1:22 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> It is:
>>
>>       op0 = parmreg;
>>       op1 = validated_mem;
>>       if (icode != CODE_FOR_nothing
>>           && insn_data[icode].operand[0].predicate (op0,
>> promoted_nominal_mode) && insn_data[icode].operand[1].predicate (op1,
>> data->passed_mode)) {
>>           enum rtx_code code = unsignedp ? ZERO_EXTEND : SIGN_EXTEND;
>>           rtx insn, insns;
>>           HARD_REG_SET hardregs;
>>
>>           start_sequence ();
>>           insn = gen_extend_insn (op0, op1, promoted_nominal_mode,
>>                                   data->passed_mode, unsignedp);
>>           emit_insn (insn);
>>           insns = get_insns ();
>
> Sure, but why is need_conversion set to true?
>

Pointer is promoted to Pmode from ptr_mode.
Eric Botcazou March 24, 2011, 3:51 p.m. UTC | #23
> Pointer is promoted to Pmode from ptr_mode.

Indeed.  However the problem is the 2 in assign_parm_setup_reg:

  /* Store the parm in a pseudoregister during the function, but we may
     need to do it in a wider mode.  Using 2 here makes the result
     consistent with promote_decl_mode and thus expand_expr_real_1.  */
 promoted_nominal_mode
   = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
			     TREE_TYPE (current_function_decl), 2);

which is supposed to match the 2 in promote_decl_mode:

  if (TREE_CODE (decl) == RESULT_DECL
      || TREE_CODE (decl) == PARM_DECL)
    pmode = promote_function_mode (type, mode, &unsignedp,
                                   TREE_TYPE (current_function_decl), 2);
  else
    pmode = promote_mode (type, mode, &unsignedp);

but doesn't match the 0 in assign_parm_find_data_types:

  promoted_mode = promote_function_mode (passed_type, passed_mode, &unsignedp,
				         TREE_TYPE (current_function_decl), 0);

so you get the redundant extension in the callee.  The solution is to define 
the promote_function_mode hook for x86 to something like:

static enum machine_mode
ix86_promote_function_mode (const_tree type,
                            enum machine_mode mode,
                            int *punsignedp,
                            const_tree fntype ATTRIBUTE_UNUSED,
                            int for_return ATTRIBUTE_UNUSED)
{
  if (POINTER_TYPE_P (type))
    {
      *punsignedp = POINTERS_EXTEND_UNSIGNED;
      return Pmode;
    }

  return mode;
}
diff mbox

Patch

Index: gcc/testsuite/ChangeLog.x32
===================================================================
--- gcc/testsuite/ChangeLog.x32	(revision 170144)
+++ gcc/testsuite/ChangeLog.x32	(working copy)
@@ -1,5 +1,10 @@ 
 2011-02-13  H.J. Lu  <hongjiu.lu@intel.com>
 
+	PR middle-end/47725
+	* gcc.dg/torture/pr47725.c: New.
+
+2011-02-13  H.J. Lu  <hongjiu.lu@intel.com>
+
 	PR target/47715
 	* gcc.target/i386/pr47715-3.c: New.
 
Index: gcc/testsuite/gcc.dg/torture/pr47725.c
===================================================================
--- gcc/testsuite/gcc.dg/torture/pr47725.c	(revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr47725.c	(revision 0)
@@ -0,0 +1,16 @@ 
+/* { dg-do compile } */
+
+struct _Unwind_Context
+{
+  void *reg[17];
+  void *ra;
+};
+extern void bar (struct _Unwind_Context *);
+void
+__frame_state_for (void *pc_target)
+{
+  struct _Unwind_Context context;
+  __builtin_memset (&context, 0, sizeof (struct _Unwind_Context));
+  context.ra = pc_target;
+  bar (&context);
+}
Index: gcc/function.c
===================================================================
--- gcc/function.c	(revision 170144)
+++ gcc/function.c	(working copy)
@@ -3004,6 +3004,14 @@  assign_parm_setup_reg (struct assign_par
 	  HARD_REG_SET hardregs;
 
 	  start_sequence ();
+	  if (REG_P (op1) && REGNO (op1) < FIRST_PSEUDO_REGISTER)
+	    {
+	      /* We must copy the hard register first before extending
+		 it.  Otherwise, combine won't see the hard register.  */
+	      rtx copy = gen_reg_rtx (data->passed_mode);
+	      emit_move_insn (copy, op1);
+	      op1 = copy;
+	    }
 	  insn = gen_extend_insn (op0, op1, promoted_nominal_mode,
 				  data->passed_mode, unsignedp);
 	  emit_insn (insn);
Index: gcc/ChangeLog.x32
===================================================================
--- gcc/ChangeLog.x32	(revision 170146)
+++ gcc/ChangeLog.x32	(working copy)
@@ -1,3 +1,9 @@ 
+2011-02-14  H.J. Lu  <hongjiu.lu@intel.com>
+
+	PR middle-end/47725
+	* function.c (assign_parm_setup_reg): Copy the hard register
+	first before extending it.
+
 2011-02-13  H.J. Lu  <hongjiu.lu@intel.com>
 
 	PR target/47715