Patchwork attribute to reverse bitfield allocations

login
register
mail settings
Submitter DJ Delorie
Date July 21, 2011, 10:23 p.m.
Message ID <201107212223.p6LMNE07001515@greed.delorie.com>
Download mbox | patch
Permalink /patch/106180/
State New
Headers show

Comments

DJ Delorie - July 21, 2011, 10:23 p.m.
Seeing little opposition, I plod further...  now with documentation
and a test case.  OK yet?
DJ Delorie - July 28, 2011, 6:18 p.m.
> Seeing little opposition, I plod further...  now with documentation
> and a test case.  OK yet?

Ping?

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01889.html
DJ Delorie - Aug. 22, 2011, 5:36 p.m.
Ping 2 ?

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01889.html
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02555.html
DJ Delorie - Sept. 26, 2011, 2:55 p.m.
Ping 3 ?  I'd like to get this in before stage1 ends...  There's
RX-specific code that goes with it that can't go in until the core
functionality is approved.

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01889.html
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02555.html
http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01800.html
DJ Delorie - Jan. 11, 2012, 5:56 a.m.
Ping 3?  It's been months with no feedback...

> Ping 2 ?
> 
> http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01889.html
> http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02555.html
DJ Delorie - Jan. 24, 2012, 7:39 p.m.
Ping 4...

> Ping 3?  It's been months with no feedback...
> 
> > Ping 2 ?
> > 
> > http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01889.html
> > http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02555.html
http://gcc.gnu.org/ml/gcc-patches/2012-01/msg00529.html
DJ Delorie - Feb. 13, 2012, 7:43 p.m.
Ping 5...

> Ping 4...
> 
> > Ping 3?  It's been months with no feedback...
> > 
> > > Ping 2 ?
> > > 
> > > http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01889.html
> > > http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02555.html
> http://gcc.gnu.org/ml/gcc-patches/2012-01/msg00529.html
http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01246.html
DJ Delorie - April 2, 2012, 7:41 p.m.
Ping 6...

It's now been over EIGHT MONTHS since I posted the patch, back in
stage 1 for 4.7.  Can someone please review and/or approve this before
gcc 4.8's stage 1 is closed?  This is needed as a first step for ABI
compatibility for rx-elf.

> Ping 5...
> 
> > Ping 4...
> > 
> > > Ping 3?  It's been months with no feedback...
> > > 
> > > > Ping 2 ?
> > > > 
> > > > http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01889.html
> > > > http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02555.html
> > http://gcc.gnu.org/ml/gcc-patches/2012-01/msg00529.html
> http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01246.html
http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00636.html
Mike Stump - April 2, 2012, 10:30 p.m.
On Apr 2, 2012, at 12:41 PM, DJ Delorie wrote:
> Ping 6...
> 
> It's now been over EIGHT MONTHS since I posted the patch, back in
> stage 1 for 4.7.  Can someone please review and/or approve this before
> gcc 4.8's stage 1 is closed?  This is needed as a first step for ABI
> compatibility for rx-elf.

Set up a cron job to ping once a day.  :-)  Did you ever dig up the Apple test cases from the APPLE LOCAL work I pointed you at earlier?  They will be more comprehensive that any testing you've done, and, if you get them to all pass, the work should be closer to being complete.  The feature needed a ton of testcases, a few didn't cut it.
Richard Guenther - April 3, 2012, 8:40 a.m.
On Mon, Apr 2, 2012 at 9:41 PM, DJ Delorie <dj@redhat.com> wrote:
>
> Ping 6...
>
> It's now been over EIGHT MONTHS since I posted the patch, back in
> stage 1 for 4.7.  Can someone please review and/or approve this before
> gcc 4.8's stage 1 is closed?  This is needed as a first step for ABI
> compatibility for rx-elf.

If it's required for ABI compatibility why is this an attribute and not
a target hook?

>> Ping 5...
>>
>> > Ping 4...
>> >
>> > > Ping 3?  It's been months with no feedback...
>> > >
>> > > > Ping 2 ?
>> > > >
>> > > > http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01889.html
>> > > > http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02555.html
>> > http://gcc.gnu.org/ml/gcc-patches/2012-01/msg00529.html
>> http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01246.html
> http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00636.html
DJ Delorie - April 3, 2012, 7:04 p.m.
> If it's required for ABI compatibility why is this an attribute and not
> a target hook?

The ABI uses a #pragma; after this is in I'll do a target-specific
pragma handler to set the attribute.  Plus, when I originally proposed
the idea, I was told it was generic so make it an attribute ;-)
DJ Delorie - April 3, 2012, 7:57 p.m.
> Did you ever dig up the Apple test cases from the APPLE LOCAL work I
> pointed you at earlier?

Sorry, I read that as "the internal tree at Apple" not "the apple
branch at fsf".  I'll look at it, thanks!

> They will be more comprehensive that any testing you've done, and,
> if you get them to all pass, the work should be closer to being
> complete.  The feature needed a ton of testcases, a few didn't cut
> it.

Worse, the test cases need to be target-endian-specific, and
sizeof(int) agnostic.
Mike Stump - April 3, 2012, 9:13 p.m.
On Apr 3, 2012, at 12:57 PM, DJ Delorie wrote:
>> Did you ever dig up the Apple test cases from the APPLE LOCAL work I
>> pointed you at earlier?
> 
> Sorry, I read that as "the internal tree at Apple" not "the apple
> branch at fsf".  I'll look at it, thanks!

Oh, I just checked the llvm-gcc-4.2 tree, which should have all the work in it, no testcases...  Arg, never mind...  sorry.
DJ Delorie - April 30, 2012, 11:34 p.m.
> Set up a cron job to ping once a day.  :-) Did you ever dig up the
> Apple test cases from the APPLE LOCAL work I pointed you at earlier?
> They will be more comprehensive that any testing you've done, and,
> if you get them to all pass, the work should be closer to being
> complete.  The feature needed a ton of testcases, a few didn't cut
> it.

In going through the Apple test cases, I discovered one HUGE
disadvantage to using __attribute__ to tag structures for bit reversal
- it doesn't propogate.  I.e.:

	typedef __attribute__((reverse)) struct {
	  struct {
	    int x:4;
	    int y:4;
	  } a;
	} b;

The attribute seems to apply only to struct b, not to struct a.  For
PACKED, we handle the flag specially, propogating it at every step in
the layout.  The original Apple patch used a #pragma.

Suggestions on how to make a structure attribute apply to the whole
structure?

Or should I *also* add a #pragma to specify the default bit ordering?
This is what the Renesas ABIs want anyway, and what Apple did, but I
was told to use an attribute and have a target pragma set the
attribute, but I don't see how...
Mike Stump - May 1, 2012, 12:39 a.m.
I've been reading your patches...

On Apr 30, 2012, at 4:34 PM, DJ Delorie wrote:
> In going through the Apple test cases, I discovered one HUGE
> disadvantage to using __attribute__ to tag structures for bit reversal
> - it doesn't propogate.  I.e.:

This is why pragma exists.  :-)  Certainly, once you have the underlying support for an attribute, adding a pragma is easier.

Patch

Index: doc/extend.texi
===================================================================
--- doc/extend.texi	(revision 176586)
+++ doc/extend.texi	(working copy)
@@ -5089,12 +5089,74 @@  Note that the type visibility is applied
 associated with the class (vtable, typeinfo node, etc.).  In
 particular, if a class is thrown as an exception in one shared object
 and caught in another, the class must have default visibility.
 Otherwise the two shared objects will be unable to use the same
 typeinfo node and exception handling will break.
 
+@item bit_order
+Normally, GCC allocates bitfields from either the least significant or
+most significant bit in the underlying type, such that bitfields
+happen to be allocated from lowest address to highest address.
+Specifically, big-endian targets allocate the MSB first, where
+little-endian targets allocate the LSB first.  The @code{bit_order}
+attribute overrides this default, allowing you to force allocation to
+be MSB-first, LSB-first, or the opposite of whatever gcc defaults to.  The
+@code{bit_order} attribute takes an optional argument:
+
+@table @code
+
+@item native
+This is the default, and also the mode when no argument is given.  GCC
+allocates LSB-first on little endian targets, and MSB-first on big
+endian targets.
+
+@item swapped
+Bitfield allocation is the opposite of @code{native}.
+
+@item lsb
+Bits are allocated LSB-first.
+
+@item msb
+Bits are allocated MSB-first.
+
+@end table
+
+A short example demonstrates bitfield allocation:
+
+@example
+struct __attribute__((bit_order(msb))) @{
+  char a:3;
+  char b:3;
+@} foo = @{ 3, 5 @};
+@end example
+
+With LSB-first allocation, @code{foo.a} will be in the 3 least
+significant bits (mask 0x07) and @code{foo.b} will be in the next 3
+bits (mask 0x38).  With MSB-first allocation, @code{foo.a} will be in
+the 3 most significant bits (mask 0xE0) and @code{foo.b} will be in
+the next 3 bits (mask 0x1C).
+
+Note that it is entirely up to the programmer to define bitfields that
+make sense when swapped.  Consider:
+
+@example
+struct __attribute__((bit_order(msb))) @{
+  short a:7;
+  char b:6;
+@} foo = @{ 3, 5 @};
+@end example
+
+On some targets, or if the struct is @code{packed}, GCC may only use
+one byte of storage for A despite it being a @code{short} type.
+Swapping the bit order of A would cause it to overlap B.  Worse, the
+bitfield for B may span bytes, so ``swapping'' would no longer be
+defined as there is no ``char'' to swap within.  To avoid such
+problems, the programmer should either fully-define each underlying
+type, or ensure that their target's ABI allocates enough space for
+each underlying type regardless of how much of it is used.
+
 @end table
 
 @subsection ARM Type Attributes
 
 On those ARM targets that support @code{dllimport} (such as Symbian
 OS), you can use the @code{notshared} attribute to indicate that the
Index: c-family/c-common.c
===================================================================
--- c-family/c-common.c	(revision 176586)
+++ c-family/c-common.c	(working copy)
@@ -312,12 +312,13 @@  struct visibility_flags visibility_optio
 
 static tree c_fully_fold_internal (tree expr, bool, bool *, bool *);
 static tree check_case_value (tree);
 static bool check_case_bounds (tree, tree, tree *, tree *);
 
 static tree handle_packed_attribute (tree *, tree, tree, int, bool *);
+static tree handle_bitorder_attribute (tree *, tree, tree, int, bool *);
 static tree handle_nocommon_attribute (tree *, tree, tree, int, bool *);
 static tree handle_common_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
 static tree handle_hot_attribute (tree *, tree, tree, int, bool *);
 static tree handle_cold_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noinline_attribute (tree *, tree, tree, int, bool *);
@@ -589,12 +590,14 @@  const unsigned int num_c_common_reswords
 const struct attribute_spec c_common_attribute_table[] =
 {
   /* { name, min_len, max_len, decl_req, type_req, fn_type_req, handler,
        affects_type_identity } */
   { "packed",                 0, 0, false, false, false,
 			      handle_packed_attribute , false},
+  { "bit_order",              0, 1, false, true, false,
+			      handle_bitorder_attribute , false},
   { "nocommon",               0, 0, true,  false, false,
 			      handle_nocommon_attribute, false},
   { "common",                 0, 0, true,  false, false,
 			      handle_common_attribute, false },
   /* FIXME: logically, noreturn attributes should be listed as
      "false, true, true" and apply to function types.  But implementing this
@@ -5764,12 +5767,42 @@  handle_packed_attribute (tree *node, tre
       *no_add_attrs = true;
     }
 
   return NULL_TREE;
 }
 
+/* Handle a "bit_order" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_bitorder_attribute (tree *ARG_UNUSED (node), tree ARG_UNUSED (name),
+			   tree ARG_UNUSED (args),
+			   int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+  tree bmode;
+  const char *bname;
+
+  /* Allow no arguments to mean "native".  */
+  if (args == NULL_TREE)
+    return NULL_TREE;
+
+  bmode = TREE_VALUE (args);
+
+  bname = IDENTIFIER_POINTER (bmode);
+  if (strcmp (bname, "msb")
+      && strcmp (bname, "lsb")
+      && strcmp (bname, "swapped")
+      && strcmp (bname, "native"))
+    {
+      error ("%qE is not a valid bit_order - use lsb, msb, native, or swapped", bmode);
+      *no_add_attrs = true;
+    }
+
+  return NULL_TREE;
+}
+
 /* Handle a "nocommon" attribute; arguments as in
    struct attribute_spec.handler.  */
 
 static tree
 handle_nocommon_attribute (tree *node, tree name,
 			   tree ARG_UNUSED (args),
Index: testsuite/gcc.dg/attr-bitorder-1.c
===================================================================
--- testsuite/gcc.dg/attr-bitorder-1.c	(revision 0)
+++ testsuite/gcc.dg/attr-bitorder-1.c	(revision 0)
@@ -0,0 +1,139 @@ 
+/* { dg-options "-fstrict-volatile-bitfields" } */
+#include <stdio.h>
+#include <string.h>
+
+typedef unsigned char  QI __attribute__((mode(QI)));
+typedef unsigned short HI __attribute__((mode(HI)));
+typedef unsigned int   SI __attribute__((mode(SI)));
+
+#define U(AT) volatile struct __attribute((bit_order(AT))) { SI a:3; SI b:3; } t1_##AT = { 3, 5 };
+U(native)
+U(lsb)
+U(msb)
+U(swapped)
+QI t1_bytes_lsb [] = { 0x2b, 0x00, 0x00, 0x00 };
+QI t1_bytes_msb [] = { 0x00, 0x00, 0x00, 0x74 };
+
+#undef U
+#define U(AT) volatile struct __attribute((bit_order(AT))) { QI a:3; QI b:3; } t2_##AT = { 3, 5 };
+U(native)
+U(lsb)
+U(msb)
+U(swapped)
+QI t2_bytes_lsb [] = { 0x2b };
+QI t2_bytes_msb [] = { 0x74 };
+
+#undef U
+#define U(AT) volatile struct __attribute((bit_order(AT))) { QI a:5; QI b:5; } t3_##AT = { 31, 17 };
+U(native)
+U(lsb)
+U(msb)
+U(swapped)
+QI t3_bytes_lsb [] = { 0x1f, 0x11 };
+QI t3_bytes_msb [] = { 0xf8, 0x88 };
+
+#undef U
+#define U(AT) volatile struct __attribute((bit_order(AT))) { SI a:8; SI b:8; } t4_##AT = { 0x12, 0x78 };
+U(native)
+U(lsb)
+U(msb)
+U(swapped)
+QI t4_bytes_lsb [] = { 0x12, 0x78, 0x00, 0x00 };
+QI t4_bytes_msb [] = { 0x00, 0x00, 0x78, 0x12 };
+
+#undef U
+#define U(AT) volatile struct __attribute((bit_order(AT))) { SI a:16; SI b:8; } t5_##AT = { 0x1212, 0x78 };
+U(native)
+U(lsb)
+U(msb)
+U(swapped)
+QI t5_bytes_lsb [] = { 0x12, 0x12, 0x78, 0x00 };
+QI t5_bytes_msb [] = { 0x00, 0x78, 0x12, 0x12 };
+
+#undef U
+#define U(AT) volatile struct __attribute((bit_order(AT))) { HI a:4; QI b:4; } t6_##AT = { 5, 5 };
+U(native)
+U(lsb)
+U(msb)
+U(swapped)
+QI t6_bytes_lsb [] = { 0x05, 0x00, 0x05 };
+QI t6_bytes_msb [] = { 0x00, 0x50, 0x50 };
+
+static int be;
+static int printed_be = 0;
+
+static void
+dump_bytes (QI *b, int s, char *prefix)
+{
+  int bit, i;
+  printf("%s:", prefix);
+  for (i=0; i<s; i++)
+    {
+      QI byte = b[i];
+      putchar(' ');
+      for (bit=7; bit>=0; bit--)
+	putchar((byte & (1<<bit)) ? '1' : '0');
+    }
+  putchar('\n');
+}
+
+static int
+verify_bytes (int line, QI *st, QI *by, int sz, char *name, char *attr)
+{
+  int i, b;
+
+  if (memcmp (st, by, sz) == 0)
+    return 0;
+
+  if (!printed_be)
+    {
+      if (be)
+	printf("target is big endian (msb first)\n");
+      else
+	printf("target is little endian (lsb first)\n");
+      printed_be = 1;
+    }
+  printf("byte mismatch at line %d (%s %s):\n", line, name, attr);
+  dump_bytes (st, sz, "struct");
+  dump_bytes (by, sz, "expect");
+  return 1;
+}
+
+#define V1(N,AT,B) if (verify_bytes(__LINE__, (QI *)&N##_##AT, N##_bytes_##B, sizeof(N##_##AT), #N, #AT)) {  \
+  typeof(N##_##AT) tmp;							\
+  memset ((void *)&tmp, 0, sizeof(tmp)); tmp.a = ~0; dump_bytes((QI *)&tmp, sizeof(tmp), "tmp.a "); \
+  memset ((void *)&tmp, 0, sizeof(tmp)); tmp.b = ~0; dump_bytes((QI *)&tmp, sizeof(tmp), "tmp.b "); \
+  rv ++; }
+
+
+main()
+{
+  int rv = 0;
+  union {
+    long l;
+    char c;
+  } lc;
+
+  lc.l = 1;
+  if (lc.c == 1)
+    be = 0;
+  else
+    be = 1;
+
+#define V(n)						  \
+  if (be) { V1(n, native, msb) }			  \
+  else    { V1(n, native, lsb) }			  \
+  if (be) { V1(n, swapped, lsb) }			  \
+  else    { V1(n, swapped, msb) }			  \
+  V1(n, lsb, lsb)					  \
+  V1(n, msb, msb)
+
+  V(t1);
+  V(t2);
+  V(t3);
+  V(t4);
+  V(t5);
+  V(t6);
+
+  return rv;
+}
Index: stor-layout.c
===================================================================
--- stor-layout.c	(revision 176586)
+++ stor-layout.c	(working copy)
@@ -1716,24 +1716,107 @@  finalize_type_size (tree type)
 	  TYPE_ALIGN (variant) = align;
 	  TYPE_USER_ALIGN (variant) = user_align;
 	  SET_TYPE_MODE (variant, mode);
 	}
     }
 }
+  
+static void
+reverse_bitfield_layout (record_layout_info rli)
+{
+  tree field, oldtype, oldbtype;
+
+  for (field = TYPE_FIELDS (rli->t); field; field = TREE_CHAIN (field))
+    {
+      tree type = TREE_TYPE (field);
+      tree bit, byte, bmod, byte_offset;
+
+      if (TREE_CODE (field) != FIELD_DECL)
+	continue;
+      if (TREE_CODE (field) == ERROR_MARK || TREE_CODE (type) == ERROR_MARK)
+	return;
+
+      oldtype = TREE_TYPE (DECL_FIELD_BIT_OFFSET (field));
+      oldbtype = TREE_TYPE (DECL_FIELD_OFFSET (field));
+
+      bit = DECL_FIELD_BIT_OFFSET (field);
+      byte = DECL_FIELD_OFFSET (field);
+
+      /* Sometimes, the next field might be in the next type-size
+	 container.  We have to calculate which *container* it's in,
+	 and swap within that container.  Example: { char a:5; char
+	 b:5; } will put B in the next char, but the byte/bit numbers
+	 might show that as "bit 8 of byte 0".  */
+      bmod = size_binop (FLOOR_DIV_EXPR, bit, TYPE_SIZE (type));
+      bmod = size_binop (MULT_EXPR, bmod, TYPE_SIZE (type));
+      bit = size_binop (MINUS_EXPR, bit, bmod);
+
+      byte_offset = size_binop (FLOOR_DIV_EXPR, bmod, bitsize_int (BITS_PER_UNIT));
+      byte_offset = fold_convert (sizetype, byte_offset);
+      byte = size_binop (PLUS_EXPR, byte, byte_offset);
+
+      DECL_FIELD_BIT_OFFSET (field)
+	= size_binop (MINUS_EXPR,
+		      size_binop (MINUS_EXPR, TYPE_SIZE (type),
+				  DECL_SIZE (field)),
+		      bit);
+      DECL_FIELD_OFFSET (field) = byte;
+
+      TREE_TYPE (DECL_FIELD_BIT_OFFSET (field)) = oldtype;
+      TREE_TYPE (DECL_FIELD_OFFSET (field)) = oldbtype;
+    }
+}
+
+static int
+reverse_bitfields_p (record_layout_info rli)
+{
+  tree st, arg;
+  const char *mode;
+
+  st = rli->t;
+
+  arg = lookup_attribute ("bit_order", TYPE_ATTRIBUTES (st));
+
+  if (!arg)
+    return 0;
+  arg = TREE_VALUE (TREE_VALUE (arg));
+  if (!arg)
+    return 0;
+
+  mode = IDENTIFIER_POINTER (arg);
+
+  if (strcmp (mode, "swapped") == 0)
+    return 1;
+  if (BYTES_BIG_ENDIAN)
+    {
+      if (strcmp (mode, "lsb") == 0)
+	return 1;
+    }
+  else
+    {
+      if (strcmp (mode, "msb") == 0)
+	return 1;
+    }
+
+  return 0;
+}
 
 /* Do all of the work required to layout the type indicated by RLI,
    once the fields have been laid out.  This function will call `free'
    for RLI, unless FREE_P is false.  Passing a value other than false
    for FREE_P is bad practice; this option only exists to support the
    G++ 3.2 ABI.  */
 
 void
 finish_record_layout (record_layout_info rli, int free_p)
 {
   tree variant;
 
+  if (reverse_bitfields_p (rli))
+    reverse_bitfield_layout (rli);
+
   /* Compute the final size.  */
   finalize_record_size (rli);
 
   /* Compute the TYPE_MODE for the record.  */
   compute_record_mode (rli->t);
 
Index: varasm.c
===================================================================
--- varasm.c	(revision 176586)
+++ varasm.c	(working copy)
@@ -4720,37 +4720,46 @@  output_constructor_array_range (oc_local
 
       /* Count its size.  */
       local->total_bytes += fieldsize;
     }
 }
 
-/* Helper for output_constructor.  From the current LOCAL state, output a
-   field element that is not true bitfield or part of an outer one.  */
-
-static void
-output_constructor_regular_field (oc_local_state *local)
+/* Helper for output_constructor_regular_field */
+static HOST_WIDE_INT
+constructor_regular_field_bytepos (oc_local_state *local)
 {
-  /* Field size and position.  Since this structure is static, we know the
-     positions are constant.  */
-  unsigned HOST_WIDE_INT fieldsize;
   HOST_WIDE_INT fieldpos;
-
-  unsigned int align2;
-
   if (local->index != NULL_TREE)
     {
       double_int idx = double_int_sub (tree_to_double_int (local->index),
 				       tree_to_double_int (local->min_index));
       gcc_assert (double_int_fits_in_shwi_p (idx));
       fieldpos = (tree_low_cst (TYPE_SIZE_UNIT (TREE_TYPE (local->val)), 1)
 		  * idx.low);
     }
   else if (local->field != NULL_TREE)
     fieldpos = int_byte_position (local->field);
   else
     fieldpos = 0;
+  return fieldpos;
+}
+
+/* Helper for output_constructor.  From the current LOCAL state, output a
+   field element that is not true bitfield or part of an outer one.  */
+
+static void
+output_constructor_regular_field (oc_local_state *local)
+{
+  /* Field size and position.  Since this structure is static, we know the
+     positions are constant.  */
+  unsigned HOST_WIDE_INT fieldsize;
+  HOST_WIDE_INT fieldpos;
+
+  unsigned int align2;
+
+  fieldpos = constructor_regular_field_bytepos (local);
 
   /* Output any buffered-up bit-fields preceding this element.  */
   if (local->byte_buffer_in_use)
     {
       assemble_integer (GEN_INT (local->byte), 1, BITS_PER_UNIT, 1);
       local->total_bytes++;
@@ -5001,18 +5010,49 @@  output_constructor_bitfield (oc_local_st
 }
 
 /* Subroutine of output_constant, used for CONSTRUCTORs (aggregate constants).
    Generate at least SIZE bytes, padding if necessary.  OUTER designates the
    caller output state of relevance in recursive invocations.  */
 
+typedef struct {
+  unsigned HOST_WIDE_INT cnt;
+  tree val;
+  tree index;
+  tree field;
+  int what_to_do;
+} constructor_field_list;
+
+#define WHAT_ARRAY    1
+#define WHAT_REGULAR  2
+#define WHAT_BITFIELD 3
+
+static int
+constructor_field_sort (const void *va, const void *vb)
+{
+  const constructor_field_list *a = (const constructor_field_list *)va;
+  const constructor_field_list *b = (const constructor_field_list *)vb;
+  /* A field that's exactly a whole number of bytes might show up as a
+     "regular" type instead of a "field" byte.  We can tell the
+     difference here, because those will have FIELD set.  Just
+     preserve the original order for non-field components.  */
+  if (! a->field || ! b->field)
+    return a->cnt - b->cnt;
+  /* For two fields, compare byte offset first, then bit offset.  */
+  if (int_byte_position (a->field) != int_byte_position (b->field))
+    return int_byte_position (a->field) - int_byte_position (b->field);
+  return int_bit_position (a->field) - int_bit_position (b->field);
+}
+
 static unsigned HOST_WIDE_INT
 output_constructor (tree exp, unsigned HOST_WIDE_INT size,
 		    unsigned int align, oc_outer_state * outer)
 {
   unsigned HOST_WIDE_INT cnt;
   constructor_elt *ce;
+  constructor_field_list *constructor_fields;
+  unsigned HOST_WIDE_INT constructor_field_count;
 
   oc_local_state local;
 
   /* Setup our local state to communicate with helpers.  */
   local.exp = exp;
   local.size = size;
@@ -5043,12 +5083,15 @@  output_constructor (tree exp, unsigned H
      more one).  */
 
   local.field = NULL_TREE;
   if (TREE_CODE (local.type) == RECORD_TYPE)
     local.field = TYPE_FIELDS (local.type);
 
+  constructor_field_count = VEC_length (constructor_elt, CONSTRUCTOR_ELTS (exp));
+  constructor_fields = XNEWVEC (constructor_field_list, constructor_field_count);
+
   for (cnt = 0;
        VEC_iterate (constructor_elt, CONSTRUCTOR_ELTS (exp), cnt, ce);
        cnt++, local.field = local.field ? DECL_CHAIN (local.field) : 0)
     {
       local.val = ce->value;
       local.index = NULL_TREE;
@@ -5072,32 +5115,64 @@  output_constructor (tree exp, unsigned H
 		 : "<anonymous>");
 
       /* Eliminate the marker that makes a cast not be an lvalue.  */
       if (local.val != NULL_TREE)
 	STRIP_NOPS (local.val);
 
-      /* Output the current element, using the appropriate helper ...  */
+      constructor_fields[cnt].cnt = cnt;
+      constructor_fields[cnt].val = local.val;
+      constructor_fields[cnt].index = local.index;
+      constructor_fields[cnt].field = local.field;
 
       /* For an array slice not part of an outer bitfield.  */
       if (!outer
 	  && local.index != NULL_TREE
 	  && TREE_CODE (local.index) == RANGE_EXPR)
-	output_constructor_array_range (&local);
+	constructor_fields[cnt].what_to_do = WHAT_ARRAY;
 
       /* For a field that is neither a true bitfield nor part of an outer one,
 	 known to be at least byte aligned and multiple-of-bytes long.  */
       else if (!outer
 	       && (local.field == NULL_TREE
 		   || !CONSTRUCTOR_BITFIELD_P (local.field)))
-	output_constructor_regular_field (&local);
+	constructor_fields[cnt].what_to_do = WHAT_REGULAR;
 
       /* For a true bitfield or part of an outer one.  */
       else
-	output_constructor_bitfield (&local, outer);
+	constructor_fields[cnt].what_to_do = WHAT_BITFIELD;
+
     }
 
+  qsort (constructor_fields, constructor_field_count,
+	 sizeof(constructor_field_list), constructor_field_sort);
+
+  for (cnt = 0;
+       cnt < constructor_field_count;
+       cnt++)
+    {
+      /* Output the current element, using the appropriate helper ...  */
+      local.val = constructor_fields[cnt].val;
+      local.index = constructor_fields[cnt].index;
+      local.field = constructor_fields[cnt].field;
+
+      switch (constructor_fields[cnt].what_to_do)
+	{
+	case WHAT_ARRAY:
+	  output_constructor_array_range (&local);
+	  break;
+	case WHAT_REGULAR:
+	  output_constructor_regular_field (&local);
+	  break;
+	case WHAT_BITFIELD:
+	  output_constructor_bitfield (&local, outer);
+	  break;
+	}
+    }
+
+  XDELETEVEC (constructor_fields);
+
   /* If we are not at toplevel, save the pending data for our caller.
      Otherwise output the pending data and padding zeros as needed. */
   if (outer)
     outer->byte = local.byte;
   else
     {