diff mbox

Fix wrong code with VCE to bit-field type at -O

Message ID 1407954.7L4M0YkQRE@polaris
State New
Headers show

Commit Message

Eric Botcazou Feb. 11, 2014, 10:40 a.m. UTC
Hi,

this is an interesting regression from GCC 4.5 present on all active branches 
and triggered by recent SRA improvements.  For the attached testcase, we have 
an unchecked conversion of a 3-byte slice of an array of 4 bytes to a record 
type containing a 3-byte bit-field; an unchecked conversion in this case 
directly translates into a VIEW_CONVERT_EXPR.  Then SRA scalarizes the bit-
field and turns the original VCE into a VCE of the 3-byte slice to the bit-
field type, which is a 32-bit integer with precision 24.

But the expansion of VCE isn't prepared for this and generates a full 32-bit 
read, which thus reads 1 additional byte and doesn't mask it afterwards, thus 
resulting in a wrong value for the scalarized bit-field.

Proposed fix attached, tested on x86-64/Linux, OK for the mainline?


2014-02-11  Eric Botcazou  <ebotcazou@adacore.com>

	* expr.c (REDUCE_BIT_FIELD): Extend the scope of the macro.
	(expand_expr_real_1) <case VIEW_CONVERT_EXPR>: Deal with bit-field
	destination types.


2014-02-11  Eric Botcazou  <ebotcazou@adacore.com>

	* gnat.dg/opt31.adb: New test.

Comments

Richard Biener Feb. 11, 2014, 1:17 p.m. UTC | #1
On Tue, Feb 11, 2014 at 11:40 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
> Hi,
>
> this is an interesting regression from GCC 4.5 present on all active branches
> and triggered by recent SRA improvements.  For the attached testcase, we have
> an unchecked conversion of a 3-byte slice of an array of 4 bytes to a record
> type containing a 3-byte bit-field; an unchecked conversion in this case
> directly translates into a VIEW_CONVERT_EXPR.  Then SRA scalarizes the bit-
> field and turns the original VCE into a VCE of the 3-byte slice to the bit-
> field type, which is a 32-bit integer with precision 24.
>
> But the expansion of VCE isn't prepared for this and generates a full 32-bit
> read, which thus reads 1 additional byte and doesn't mask it afterwards, thus
> resulting in a wrong value for the scalarized bit-field.
>
> Proposed fix attached, tested on x86-64/Linux, OK for the mainline?

Hmm.  The intent was of course to only allow truly no-op converts via
VIEW_CONVERT_EXPR - that is, the size of the operand type and the
result type should be the same.  So, isn't SRA doing it wrong when
creating the VIEW_CONVERT_EXPR of a 3-byte type to a 4-byte type?

The verification we do in tree-cfg.c:verify_types_in_gimple_reference
hints at that we _do_ have even grosser mismatches - so reality may
trump desired design here.

Richard.

> 2014-02-11  Eric Botcazou  <ebotcazou@adacore.com>
>
>         * expr.c (REDUCE_BIT_FIELD): Extend the scope of the macro.
>         (expand_expr_real_1) <case VIEW_CONVERT_EXPR>: Deal with bit-field
>         destination types.
>
>
> 2014-02-11  Eric Botcazou  <ebotcazou@adacore.com>
>
>         * gnat.dg/opt31.adb: New test.
>
>
> --
> Eric Botcazou
Jakub Jelinek Feb. 11, 2014, 1:22 p.m. UTC | #2
On Tue, Feb 11, 2014 at 02:17:04PM +0100, Richard Biener wrote:
> > this is an interesting regression from GCC 4.5 present on all active branches
> > and triggered by recent SRA improvements.  For the attached testcase, we have
> > an unchecked conversion of a 3-byte slice of an array of 4 bytes to a record
> > type containing a 3-byte bit-field; an unchecked conversion in this case
> > directly translates into a VIEW_CONVERT_EXPR.  Then SRA scalarizes the bit-
> > field and turns the original VCE into a VCE of the 3-byte slice to the bit-
> > field type, which is a 32-bit integer with precision 24.
> >
> > But the expansion of VCE isn't prepared for this and generates a full 32-bit
> > read, which thus reads 1 additional byte and doesn't mask it afterwards, thus
> > resulting in a wrong value for the scalarized bit-field.
> >
> > Proposed fix attached, tested on x86-64/Linux, OK for the mainline?
> 
> Hmm.  The intent was of course to only allow truly no-op converts via
> VIEW_CONVERT_EXPR - that is, the size of the operand type and the
> result type should be the same.  So, isn't SRA doing it wrong when
> creating the VIEW_CONVERT_EXPR of a 3-byte type to a 4-byte type?
> 
> The verification we do in tree-cfg.c:verify_types_in_gimple_reference
> hints at that we _do_ have even grosser mismatches - so reality may
> trump desired design here.

I thought we only allow VCE if the bitsize of both types is the same.
If you have different bitsizes, then supposedly VCE to same bitsize integer
followed by zero/sign extension or truncation followed by another VCE should
be used.

	Jakub
Richard Biener Feb. 11, 2014, 1:39 p.m. UTC | #3
On Tue, Feb 11, 2014 at 2:22 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Feb 11, 2014 at 02:17:04PM +0100, Richard Biener wrote:
>> > this is an interesting regression from GCC 4.5 present on all active branches
>> > and triggered by recent SRA improvements.  For the attached testcase, we have
>> > an unchecked conversion of a 3-byte slice of an array of 4 bytes to a record
>> > type containing a 3-byte bit-field; an unchecked conversion in this case
>> > directly translates into a VIEW_CONVERT_EXPR.  Then SRA scalarizes the bit-
>> > field and turns the original VCE into a VCE of the 3-byte slice to the bit-
>> > field type, which is a 32-bit integer with precision 24.
>> >
>> > But the expansion of VCE isn't prepared for this and generates a full 32-bit
>> > read, which thus reads 1 additional byte and doesn't mask it afterwards, thus
>> > resulting in a wrong value for the scalarized bit-field.
>> >
>> > Proposed fix attached, tested on x86-64/Linux, OK for the mainline?
>>
>> Hmm.  The intent was of course to only allow truly no-op converts via
>> VIEW_CONVERT_EXPR - that is, the size of the operand type and the
>> result type should be the same.  So, isn't SRA doing it wrong when
>> creating the VIEW_CONVERT_EXPR of a 3-byte type to a 4-byte type?
>>
>> The verification we do in tree-cfg.c:verify_types_in_gimple_reference
>> hints at that we _do_ have even grosser mismatches - so reality may
>> trump desired design here.
>
> I thought we only allow VCE if the bitsize of both types is the same.
> If you have different bitsizes, then supposedly VCE to same bitsize integer
> followed by zero/sign extension or truncation followed by another VCE should
> be used.

Yeah, but the verification code is conveniently "crippled":

      if (TREE_CODE (expr) == VIEW_CONVERT_EXPR)
        {
          /* For VIEW_CONVERT_EXPRs which are allowed here too, we only check
             that their operand is not an SSA name or an invariant when
             requiring an lvalue (this usually means there is a SRA or IPA-SRA
             bug).  Otherwise there is nothing to verify, gross mismatches at
             most invoke undefined behavior.  */
          if (require_lvalue
              && (TREE_CODE (op) == SSA_NAME
                  || is_gimple_min_invariant (op)))
            {
              error ("conversion of an SSA_NAME on the left hand side");
              debug_generic_stmt (expr);
              return true;
            }
          else if (TREE_CODE (op) == SSA_NAME
                   && TYPE_SIZE (TREE_TYPE (expr)) != TYPE_SIZE
(TREE_TYPE (op)))
            {
              error ("conversion of register to a different size");
              debug_generic_stmt (expr);
              return true;
            }

thus that only applies to register VIEW_CONVERT_EXPRs.  But maybe
we two are missing sth here - it's an Ada testcase after all ;)

Richard.

>         Jakub
Eric Botcazou Feb. 11, 2014, 5:15 p.m. UTC | #4
> Hmm.  The intent was of course to only allow truly no-op converts via
> VIEW_CONVERT_EXPR - that is, the size of the operand type and the
> result type should be the same.  So, isn't SRA doing it wrong when
> creating the VIEW_CONVERT_EXPR of a 3-byte type to a 4-byte type?

That's debatable IMO if the 4-byte type has 3-byte precision, but I don't have 
a strong opinion so I can try to fix it in SRA, although it will be weird to 
do low-level fiddling because of precision and size at the Tree level.
Richard Biener Feb. 12, 2014, 10:54 a.m. UTC | #5
On Tue, Feb 11, 2014 at 6:15 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Hmm.  The intent was of course to only allow truly no-op converts via
>> VIEW_CONVERT_EXPR - that is, the size of the operand type and the
>> result type should be the same.  So, isn't SRA doing it wrong when
>> creating the VIEW_CONVERT_EXPR of a 3-byte type to a 4-byte type?
>
> That's debatable IMO if the 4-byte type has 3-byte precision, but I don't have
> a strong opinion so I can try to fix it in SRA, although it will be weird to
> do low-level fiddling because of precision and size at the Tree level.

That's true.  What I wonder is why the stmt checker doesn't trip.  Probably
because while SRA scalarizes the thing the result isn't rewritten into
SSA form?  So there may be a testcase where that happens and we'd even
ICE?

Richard.

> --
> Eric Botcazou
Eric Botcazou Feb. 12, 2014, 11:57 a.m. UTC | #6
> That's true.  What I wonder is why the stmt checker doesn't trip.  Probably
> because while SRA scalarizes the thing the result isn't rewritten into
> SSA form?

We have a slice of an array on the RHS so there is no SSA form, it's the VCE 
of an ARRAY_RANGE_REF to a 24-bit precision integer type.

> So there may be a testcase where that happens and we'd even ICE?

Probably not in Ada, we only generate a VCE when there is an aggregate type.
Martin Jambor Feb. 12, 2014, 5:10 p.m. UTC | #7
Hi,

On Tue, Feb 11, 2014 at 02:17:04PM +0100, Richard Biener wrote:
> Hmm.  The intent was of course to only allow truly no-op converts via
> VIEW_CONVERT_EXPR - that is, the size of the operand type and the
> result type should be the same.  So, isn't SRA doing it wrong when
> creating the VIEW_CONVERT_EXPR of a 3-byte type to a 4-byte type?
> 

Even though I cannot recall the details, I remember I had to make SRA
able to accept such size mismatches in between V_C_E operands and
results in its input, to be able to work on Ada FE generated IL.  I
believe that is still the case and so we would have these mismatches
even if we changed SRA not to create them.

The reason why SRA creates such a thing is that
get_ref_base_and_extent returns size 24 when run on the following, an
expression of type with type size 32:

 <component_ref 0x7ffc45212240
    type <integer_type 0x7ffc451fe930 opt31__time_t___XDLU_0__11059199
        type <integer_type 0x7ffc451dc5e8 opt31__Tunsigned_24B sizes-gimplified public unsigned SI
            size <integer_cst 0x7ffc450573a0 constant 32>
            unit size <integer_cst 0x7ffc450573c0 constant 4>
            align 32 symtab 0 alias set -1 canonical type 0x7ffc451dc5e8 precision 32 min <integer_cst 0x7ffc451f85a0 0> max <integer_cst 0x7ffc450a1cc0 4294967295> context <function_decl 0x7ffc451fb400 opt31> RM size <integer_cst 0x7ffc450573a0 32>
            chain <type_decl 0x7ffc45084170 opt31__Tunsigned_24B>>
        sizes-gimplified public unsigned SI
        size <integer_cst 0x7ffc450573a0 constant 32>
        unit size <integer_cst 0x7ffc450573c0 constant 4>
        align 32 symtab 0 alias set -1 canonical type 0x7ffc451fe930 precision 24 min <integer_cst 0x7ffc451f8ec0 0> max <integer_cst 0x7ffc451f8e60 16777215> RM size <integer_cst 0x7ffc45057800 24> RM min <integer_cst 0x7ffc451f85a0 0> RM max <integer_cst 0x7ffc451f8800 11059199>>
   
    arg 0 <component_ref 0x7ffc45212270
        type <record_type 0x7ffc451fe888 opt31__rec1 sizes-gimplified packed BLK
            size <integer_cst 0x7ffc45057800 constant 24>
            unit size <integer_cst 0x7ffc45083100 constant 3>
            align 8 symtab 0 alias set -1 canonical type 0x7ffc451fe888 fields <field_decl 0x7ffc4507de40 f> context <function_decl 0x7ffc451fb400 opt31>
            Ada size <integer_cst 0x7ffc45057800 constant 24>
            chain <type_decl 0x7ffc45202000 opt31__rec1>>
       
        arg 0 <var_decl 0x7ffc45204260 my_rec2 type <record_type 0x7ffc451fed20 opt31__rec2>
            BLK file opt31.adb line 31 col 5
            size <integer_cst 0x7ffc45203180 constant 160>
            unit size <integer_cst 0x7ffc452031a0 constant 20>
            align 32 context <function_decl 0x7ffc451fb900 opt31__decode>>
        arg 1 <field_decl 0x7ffc45204098 r1 type <record_type 0x7ffc451fe888 opt31__rec1>
            BLK file opt31.adb line 25 col 5
            size <integer_cst 0x7ffc45057800 constant 24>
            unit size <integer_cst 0x7ffc45083100 constant 3>
            align 8 offset_align 128
            offset <integer_cst 0x7ffc450570c0 constant 16>
            bit offset <integer_cst 0x7ffc450570e0 constant 0> context <record_type 0x7ffc451fed20 opt31__rec2>>>
    arg 1 <field_decl 0x7ffc4507de40 f
        type <integer_type 0x7ffc451fe930 opt31__time_t___XDLU_0__11059199 type <integer_type 0x7ffc451dc5e8 opt31__Tunsigned_24B>
            sizes-gimplified public unsigned SI
            size <integer_cst 0x7ffc450573a0 constant 32>
            unit size <integer_cst 0x7ffc450573c0 constant 4>
            align 32 symtab 0 alias set -1 canonical type 0x7ffc451fe930 precision 24 min <integer_cst 0x7ffc451f8ec0 0> max <integer_cst 0x7ffc451f8e60 16777215> RM size <integer_cst 0x7ffc45057800 24> RM min <integer_cst 0x7ffc451f85a0 0> RM max <integer_cst 0x7ffc451f8800 11059199>>
        unsigned external packed bit-field nonaddressable SI file opt31.adb line 16 col 5
        size <integer_cst 0x7ffc45057800 constant 24>
        unit size <integer_cst 0x7ffc45083100 constant 3>
        align 1 offset_align 128
        offset <integer_cst 0x7ffc45057060 constant 0>
        bit offset <integer_cst 0x7ffc450570e0 constant 0>
        bit_field_type <integer_type 0x7ffc451fe930 opt31__time_t___XDLU_0__11059199 type <integer_type 0x7ffc451dc5e8 opt31__Tunsigned_24B>
            sizes-gimplified public unsigned SI
            size <integer_cst 0x7ffc450573a0 constant 32>
            unit size <integer_cst 0x7ffc450573c0 constant 4>
            align 32 symtab 0 alias set -1 canonical type 0x7ffc451fe930 precision 24 min <integer_cst 0x7ffc451f8ec0 0> max <integer_cst 0x7ffc451f8e60 16777215> RM size <integer_cst 0x7ffc45057800 24> RM min <integer_cst 0x7ffc451f85a0 0> RM max <integer_cst 0x7ffc451f8800 11059199>> context <record_type 0x7ffc451fe888 opt31__rec1>>>

This type wins over an access to the same memory with only 24-bit
large type from another statement and when that statement is modified,
ensuing type mismatch is "fixed" by creating a VIEW_CONVERT_EXPR with
the mismatch.

I am not sure how to deal with this, given that we have mismatched
V_C_Es anyway, I'm inclined not to care and let the expander deal with
it.  But at the same I understand that it is ugly and will certainly
cause somebody more headache in the future.  I suppose that not
scalarizing here might hurt performance and would be frowned upon at
the very least.  If the fields bigger than the record approach is the
standard way of doing this, perhaps SRA can detect such cases and
produce these strange COMPONENT_REFs instead, but is it so?

Martin
Eric Botcazou Feb. 12, 2014, 5:51 p.m. UTC | #8
> I am not sure how to deal with this, given that we have mismatched
> V_C_Es anyway, I'm inclined not to care and let the expander deal with
> it.  But at the same I understand that it is ugly and will certainly
> cause somebody more headache in the future.  I suppose that not
> scalarizing here might hurt performance and would be frowned upon at
> the very least.  If the fields bigger than the record approach is the
> standard way of doing this, perhaps SRA can detect such cases and
> produce these strange COMPONENT_REFs instead, but is it so?

You may remember that we went that way before (building a COMPONENT_REF for 
bit-fields instead of fully lowering the access) so doing it again would be a 
step backwards.  Likewise if we refuses to scalarize.  So IMO it's either low-
level fiddling in SRA or in the expander (my preference too).
Richard Biener Feb. 13, 2014, 12:04 p.m. UTC | #9
On Wed, Feb 12, 2014 at 6:51 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> I am not sure how to deal with this, given that we have mismatched
>> V_C_Es anyway, I'm inclined not to care and let the expander deal with
>> it.  But at the same I understand that it is ugly and will certainly
>> cause somebody more headache in the future.  I suppose that not
>> scalarizing here might hurt performance and would be frowned upon at
>> the very least.  If the fields bigger than the record approach is the
>> standard way of doing this, perhaps SRA can detect such cases and
>> produce these strange COMPONENT_REFs instead, but is it so?
>
> You may remember that we went that way before (building a COMPONENT_REF for
> bit-fields instead of fully lowering the access) so doing it again would be a
> step backwards.  Likewise if we refuses to scalarize.  So IMO it's either low-
> level fiddling in SRA or in the expander (my preference too).

Ok, I've looked at the testcase and I suppose the following change is
what triggers the bug:

   <bb 11>:
   _56 = m.P_ARRAY;
-  my_rec2.r1 = VIEW_CONVERT_EXPR<struct opt31__rec1>(*_56[1 ...]{lb:
_3 sz: 1});
-  _58 = my_rec2.r1.f;
+  _51 = VIEW_CONVERT_EXPR<opt31__time_t___XDLU_0__11059199>(*_56[1
...]{lb: _3 sz: 1});
+  my_rec2$r1$f_43 = _51;
+  _58 = my_rec2$r1$f_43;
   if (_58 > 11059199)

I observe that SRA modifies an existing but not replaced memory reference
(something I always thought is asking for trouble).  It changes
VIEW_CONVERT_EXPR<struct opt31__rec1>(*_56[1 ...]{lb: _3 sz: 1});
to VIEW_CONVERT_EXPR<opt31__time_t___XDLU_0__11059199>(*_56[1 ...]{lb:
_3 sz: 1});.

Created a replacement for my_rec2 offset: 128, size: 24: my_rec2$r1$f

Access trees for my_rec2 (UID: 2659):
access { base = (2659)'my_rec2', offset = 128, size = 24, expr =
my_rec2.r1.f, type = opt31__time_t___XDLU_0__11059199, grp_read = 1,
grp_write = 1, grp_assignment_read = 1, grp_assignment_write = 1,
grp_scalar_read = 1, grp_scalar_write = 0, grp_total_scalarization =
0, grp_hint = 0, grp_covered = 1, grp_unscalarizable_region = 0,
grp_unscalarized_data = 0, grp_partial_lhs = 0, grp_to_be_replaced =
1, grp_to_be_debug_replaced = 0, grp_maybe_modified = 0,
grp_not_necessarilly_dereferenced = 0

but obviously 'type' doesn't agree with 'size' here.

In other places we disqualify exprs using VIEW_CONVERT_EXPRs but
appearantly only for the candidate itself, not for stuff assigned to it.
(though I never understood why disqualifying was necessary at all
for VIEW_CONVERT_EXPRs).

We are using the type of a bitfield field for the replacement which
we IMHO should avoid because the FIELD_DECLs size is 24
but the fields type TYPE_SIZE is 32 (it's precision is 24).  That's
all not an issue until you start to VIEW_CONVERT to such type
(VIEW_CONVERT being a reference op just cares for size not
precision).  Other ops are treated correctly by expansion.

Now - using a non-mode precision integer type as scalar replacement
isn't going to produce great code and, as we can see, has "issues"
when using VIEW_CONVERT_EXPRs.

SRA should either avoid this transform or fixup by VIEW_CONVERTing
memory reads only to mode-precision integer types and then inserting
a fixup cast.  The direct VIEW_CONVERsion it creates, from

  my_rec2.r1 = VIEW_CONVERT_EXPR<struct opt31__rec1>(*_56[1 ...]{lb: _3 sz: 1});
  _58 = my_rec2.r1.f;


to basically

  _58 = VIEW_CONVERT_EXPR<opt31__time_t___XDLU_0__11059199>(*_56[1
...]{lb: _3 sz: 1});

is simply wrong.

If you "fix" expansion then consider a nested VIEW_CONVERT_EXPR
that views back to the aggregate type - is that now supposed to
clear the upper 8 bits because of the VIEW_CONVERT_EXPR in the
middle?  Not so.  So fixing VIEW_CONVERT_EXPR sounds conceptually
wrong to me.

Not scalarizing a field to a DECL_BIT_FIELD FIELD_DECLs type looks like
the best fix to me.

Richard.
Eric Botcazou Feb. 13, 2014, 4 p.m. UTC | #10
> We are using the type of a bitfield field for the replacement which
> we IMHO should avoid because the FIELD_DECLs size is 24
> but the fields type TYPE_SIZE is 32 (it's precision is 24).  That's
> all not an issue until you start to VIEW_CONVERT to such type
> (VIEW_CONVERT being a reference op just cares for size not
> precision).  Other ops are treated correctly by expansion.
> 
> Now - using a non-mode precision integer type as scalar replacement
> isn't going to produce great code and, as we can see, has "issues"
> when using VIEW_CONVERT_EXPRs.
> 
> SRA should either avoid this transform or fixup by VIEW_CONVERTing
> memory reads only to mode-precision integer types and then inserting
> a fixup cast.  The direct VIEW_CONVERsion it creates, from
> 
>   my_rec2.r1 = VIEW_CONVERT_EXPR<struct opt31__rec1>(*_56[1 ...]{lb: _3 sz:
> 1}); _58 = my_rec2.r1.f;
> 
> 
> to basically
> 
>   _58 = VIEW_CONVERT_EXPR<opt31__time_t___XDLU_0__11059199>(*_56[1
> ...]{lb: _3 sz: 1});
> 
> is simply wrong.

There is nothing obvious I think, i.e. that's debatable.  I agree that a VCE 
from a 32-bit object to a 32-bit integer with 24-bit precision should not 
clear the upper 8 bits (so the REDUCE_BIT_FIELD part of my patch is wrong).
But here we have a VCE from a 24-bit object to a 32-bit integer with 24-bit 
precision which reads *more bits* than the size of the source type; that I 
think is plain wrong and is fixed by the bit-field extraction in the patch.

> If you "fix" expansion then consider a nested VIEW_CONVERT_EXPR
> that views back to the aggregate type - is that now supposed to
> clear the upper 8 bits because of the VIEW_CONVERT_EXPR in the
> middle?  Not so.  So fixing VIEW_CONVERT_EXPR sounds conceptually
> wrong to me.

I agree that we need not clear, but we need to prevent the expansion from 
reading more bits than what is contained in the source type.  And this is 
sufficient to fix the regression.

> Not scalarizing a field to a DECL_BIT_FIELD FIELD_DECLs type looks like
> the best fix to me.

That seems like a big hammer though.
diff mbox

Patch

Index: expr.c
===================================================================
--- expr.c	(revision 207685)
+++ expr.c	(working copy)
@@ -9221,7 +9221,6 @@  expand_expr_real_2 (sepops ops, rtx targ
     return temp;
   return REDUCE_BIT_FIELD (temp);
 }
-#undef REDUCE_BIT_FIELD
 
 
 /* Return TRUE if expression STMT is suitable for replacement.  
@@ -10444,15 +10443,21 @@  expand_expr_real_1 (tree exp, rtx target
 	  op0 = target;
 	}
 
-      /* At this point, OP0 is in the correct mode.  If the output type is
+      /* If it's a MEM with a different mode and we need to reduce it to a
+	 bit-field type, do an extraction.  Otherwise, we can read all bits
+	 of MODE but need to deal with the alignment.  If the output type is
 	 such that the operand is known to be aligned, indicate that it is.
-	 Otherwise, we need only be concerned about alignment for non-BLKmode
-	 results.  */
+	 Otherwise, we actually need only be concerned about alignment for
+	 non-BLKmode results.  */
       if (MEM_P (op0))
 	{
 	  enum insn_code icode;
 
-	  if (TYPE_ALIGN_OK (type))
+	  if (reduce_bit_field && GET_MODE (op0) != mode)
+	    return extract_bit_field (op0, TYPE_PRECISION (type), 0,
+				      TYPE_UNSIGNED (type), NULL_RTX,
+				      mode, mode);
+	  else if (TYPE_ALIGN_OK (type))
 	    {
 	      /* ??? Copying the MEM without substantially changing it might
 		 run afoul of the code handling volatile memory references in
@@ -10483,7 +10488,7 @@  expand_expr_real_1 (tree exp, rtx target
 		  /* Nor can the insn generator.  */
 		  insn = GEN_FCN (icode) (reg, op0);
 		  emit_insn (insn);
-		  return reg;
+		  return REDUCE_BIT_FIELD (reg);
 		}
 	      else if (STRICT_ALIGNMENT)
 		{
@@ -10513,7 +10518,7 @@  expand_expr_real_1 (tree exp, rtx target
 	  op0 = adjust_address (op0, mode, 0);
 	}
 
-      return op0;
+      return REDUCE_BIT_FIELD (op0);
 
     case MODIFY_EXPR:
       {
@@ -10613,6 +10618,7 @@  expand_expr_real_1 (tree exp, rtx target
       return expand_expr_real_2 (&ops, target, tmode, modifier);
     }
 }
+#undef REDUCE_BIT_FIELD
 
 /* Subroutine of above: reduce EXP to the precision of TYPE (in the
    signedness of TYPE), possibly returning the result in TARGET.  */