Patchwork: Add a new option "-ftree-bitfield-merge" (patch / doc inside)

Submitter Zoran Jovanovic
Date Aug. 23, 2013, 2:05 p.m.
Message ID <386B40EC5E8DBF459FD11A754D868AD922E333D4@BADAG02.ba.imgtec.org>
Permalink /patch/269465/
State New

Comments

Zoran Jovanovic - Aug. 23, 2013, 2:05 p.m.
Hello,
This is the new patch version.
The optimization no longer uses BIT_FIELD_REF; instead it generates a new COMPONENT_REF and FIELD_DECL.
The existing bit-field representative is associated with the newly created field declaration.
During the analysis phase, the optimization uses bit-field representatives when deciding which bit-field accesses can be merged.
Instead of being a separate pass, the optimization is moved to tree-sra.c and executed with early SRA.
A new test case involving unions is added.
Also, some other comments received on the first patch are addressed in the new implementation.


Example:

Original code:
  <unnamed-unsigned:3> D.1351;
  <unnamed-unsigned:9> D.1350;
  <unnamed-unsigned:7> D.1349;
  D.1349_2 = p1_1(D)->f1;
  p2_3(D)->f1 = D.1349_2;
  D.1350_4 = p1_1(D)->f2;
  p2_3(D)->f2 = D.1350_4;
  D.1351_5 = p1_1(D)->f3;
  p2_3(D)->f3 = D.1351_5;

Optimized code:
  <unnamed-unsigned:19> D.1358;
  _16 = pr1_2(D)->_field0;
  pr2_4(D)->_field0 = _16;
  
The algorithm works at the basic-block level and consists of the following three major steps:
1. Go through the basic block's statement list. If there are statement pairs that implement a copy of bit-field content from one memory location to another, record the statement pointers and other necessary data in a corresponding data structure.
2. Identify records that represent adjacent bit-field accesses and mark them as merged.
3. Modify the trees accordingly.

A new command-line option, "-ftree-bitfield-merge", is introduced.

Tested: passes the GCC regression tests.

Changelog -

gcc/ChangeLog:
2013-08-22  Zoran Jovanovic  <zoran.jovanovic@imgtec.com>
  * Makefile.in (GTFILES): Add tree-sra.c.
  * common.opt (ftree-bitfield-merge): New option.
  * doc/invoke.texi: Document -ftree-bitfield-merge.
  * tree-sra.c (ssa_bitfield_merge): New function.
  Entry for (-ftree-bitfield-merge).
  (bitfield_stmt_access_pair_htab_hash): New function.
  (bitfield_stmt_access_pair_htab_eq): New function.
  (cmp_access): New function.
  (create_and_insert_access): New function.
  (get_bit_offset): New function.
  (get_merged_bit_field_size): New function.
  (add_stmt_access_pair): New function.
  * dwarf2out.c (simple_type_size_in_bits): Move to tree.c.
  (field_byte_offset): Remove static; move declaration to tree.h.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
  * tree-ssa-sccvn.c (expressions_equal_p): Move to tree.c.
  * tree-ssa-sccvn.h (expressions_equal_p): Move declaration to tree.h.
  * tree.c (expressions_equal_p): Move from tree-ssa-sccvn.c.
  (simple_type_size_in_bits): Move from dwarf2out.c.
  * tree.h (expressions_equal_p): Declare.
  (field_byte_offset): Declare.

Patch -



Regards,
Zoran Jovanovic
Joseph S. Myers - Aug. 23, 2013, 10:14 p.m.
On Fri, 23 Aug 2013, Zoran Jovanovic wrote:

> New test case involving unions is added.

Tests for unions should include *execution* tests that the relevant bits 
have the right values after the sequence of assignments.
aldot - Aug. 24, 2013, 9:41 p.m.
On 23 August 2013 16:05:32 Zoran Jovanovic <Zoran.Jovanovic@imgtec.com> wrote:
> [...]
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 6034046..dad9337 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -3831,6 +3831,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h 
> $(srcdir)/coretypes.h \
>    $(srcdir)/vtable-verify.c \
>    $(srcdir)/asan.c \
>    $(srcdir)/tsan.c $(srcdir)/ipa-devirt.c \
> +  $(srcdir)/tree-sra.c \
>    @all_gtfiles@
>
>  # Compute the list of GT header files from the corresponding C sources,
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 9082280..fe0ecd9 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2160,6 +2160,10 @@ ftree-sra
>  Common Report Var(flag_tree_sra) Optimization
>  Perform scalar replacement of aggregates
>
> +ftree-bitfield-merge
> +Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization
> +Enable bit-field merge on trees
> +
>  ftree-ter
>  Common Report Var(flag_tree_ter) Optimization
>  Replace temporary expressions in the SSA->normal pass
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index dae7605..7abe538 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -412,7 +412,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol
>  -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
>  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
> --ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> +-ftree-bitfield-merge -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
>  -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
>  -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
>  -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> @@ -7646,6 +7646,11 @@ pointer alignment information.
>  This pass only operates on local scalar variables and is enabled by default
>  at @option{-O} and higher.  It requires that @option{-ftree-ccp} is enabled.
>
> +@item -ftree-bitfield-merge
> +@opindex ftree-bitfield-merge
> +Combines several adjacent bit-field accesses that copy values
> +from one memory location to another into single bit-field access.

into one single
Would be easier to understand, IMHO. Same for the other occurrences in this 
patch.

> +
>  @item -ftree-ccp
>  @opindex ftree-ccp
>  Perform sparse conditional constant propagation (CCP) on trees.  This
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index fc1c3f2..f3530ef 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -3104,8 +3104,6 @@ static HOST_WIDE_INT ceiling (HOST_WIDE_INT, unsigned 
> int);
>  static tree field_type (const_tree);
>  static unsigned int simple_type_align_in_bits (const_tree);
>  static unsigned int simple_decl_align_in_bits (const_tree);
> -static unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
> -static HOST_WIDE_INT field_byte_offset (const_tree);
>  static void add_AT_location_description	(dw_die_ref, enum dwarf_attribute,
>  					 dw_loc_list_ref);
>  static void add_data_member_location_attribute (dw_die_ref, tree);
> @@ -10145,25 +10143,6 @@ is_base_type (tree type)
>    return 0;
>  }
>
> -/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
> -   node, return the size in bits for the type if it is a constant, or else
> -   return the alignment for the type if the type's size is not constant, or
> -   else return BITS_PER_WORD if the type actually turns out to be an
> -   ERROR_MARK node.  */
> -
> -static inline unsigned HOST_WIDE_INT
> -simple_type_size_in_bits (const_tree type)
> -{
> -  if (TREE_CODE (type) == ERROR_MARK)
> -    return BITS_PER_WORD;
> -  else if (TYPE_SIZE (type) == NULL_TREE)
> -    return 0;
> -  else if (host_integerp (TYPE_SIZE (type), 1))
> -    return tree_low_cst (TYPE_SIZE (type), 1);
> -  else
> -    return TYPE_ALIGN (type);
> -}
> -
>  /* Similarly, but return a double_int instead of UHWI.  */
>
>  static inline double_int
> @@ -14516,7 +14495,7 @@ round_up_to_align (double_int t, unsigned int align)
>     because the offset is actually variable.  (We can't handle the latter case
>     just yet).  */
>
> -static HOST_WIDE_INT
> +HOST_WIDE_INT
>  field_byte_offset (const_tree decl)
>  {
>    double_int object_offset_in_bits;
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
> new file mode 100644
> index 0000000..e9e96b7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
> @@ -0,0 +1,30 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftree-bitfield-merge -fdump-tree-esra" }  */
> +
> +struct S
> +{
> +  unsigned f1:7;
> +  unsigned f2:9;
> +  unsigned f3:3;
> +  unsigned f4:5;
> +  unsigned f5:1;
> +  unsigned f6:2;
> +};
> +
> +unsigned
> +foo (struct S *p1, struct S *p2, int *ptr)
> +{
> +  p2->f1 = p1->f1;
> +  p2->f2 = p1->f2;
> +  p2->f3 = p1->f3;
> +  *ptr = 7;
> +  p2->f4 = p1->f4;
> +  p2->f5 = p1->f5;
> +  p2->f6 = p1->f6;
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump "19" "esra" } } */
> +/* { dg-final { scan-tree-dump "8" "esra"} } */
> +/* { dg-final { cleanup-tree-dump "esra" } } */
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
> new file mode 100644
> index 0000000..c056a30
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
> @@ -0,0 +1,42 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftree-bitfield-merge -fdump-tree-esra" }  */
> +
> +struct proba1
> +{
> +  unsigned f1:7;
> +  unsigned f2:5;
> +  unsigned f3:1;
> +  unsigned f4:7;
> +};
> +
> +
> +struct proba2
> +{
> +  unsigned f0:1;
> +  unsigned f1:7;
> +  unsigned f2:5;
> +  unsigned f3:1;
> +  unsigned f4:7;
> +};
> +
> +union proba_un
> +{
> +  struct proba1 a;
> +  struct proba2 b;
> +};
> +
> +
> +int func (union proba_un *pr1, union proba_un *pr2)
> +{
> +  pr2->a.f1 = pr1->a.f1;
> +  pr2->a.f2 = pr1->a.f2;
> +  pr2->a.f3 = pr1->a.f3;
> +  pr2->b.f2 = pr1->b.f2;
> +  pr2->a.f4 = pr1->a.f4;
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump "13" "esra" } } */
> +/* { dg-final { scan-tree-dump "7" "esra"} } */
> +/* { dg-final { cleanup-tree-dump "esra" } } */
> +
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> old mode 100644
> new mode 100755
> index 8e3bb81..cf31463
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -91,6 +91,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-inline.h"
>  #include "gimple-pretty-print.h"
>  #include "ipa-inline.h"
> +#include "ggc.h"
>
>  /* Enumeration of all aggregate reductions we can do.  */
>  enum sra_mode { SRA_MODE_EARLY_IPA,   /* early call regularization */
> @@ -3424,12 +3425,469 @@ perform_intra_sra (void)
>    return ret;
>  }
>
> +/* This optimization combines several adjacent bit-field accesses that copy
> +   values from one memory location to another into single bit-field 
> access.  */

Ditto.
> +
> +/* Data for single bit-field read/write sequence.  */
> +struct GTY (()) bitfield_access_d {
> +  gimple load_stmt;		  /* Bit-field load statement.  */
> +  gimple store_stmt;		  /* Bit-field store statement.  */
> +  unsigned src_offset_words;	  /* Bit-field offset at src in words.  */
> +  unsigned src_bit_offset;	  /* Bit-field offset inside source word.  */
> +  unsigned src_bit_size;	  /* Size of bit-field in source word.  */
> +  unsigned dst_offset_words;	  /* Bit-field offset at dst in words.  */
> +  unsigned dst_bit_offset;	  /* Bit-field offset inside destination
> +				     word.  */
> +  unsigned dst_bit_size;	  /* Size of bit-field in destination word.  */
> +  tree src_addr;		  /* Address of source memory access.  */
> +  tree dst_addr;		  /* Address of destination memory access.  */
> +  bool merged;			  /* True if access is merged with another
> +				     one.  */
> +  bool modified;		  /* True if bit-field size is modified.  */
> +  bool is_barrier;		  /* True if access is barrier (call or mem
> +				     access).  */
> +  struct bitfield_access_d *next; /* Access with which this one is merged.  */
> +  tree bitfield_type;		  /* Field type.  */
> +  tree bitfield_representative;	  /* Bit field representative of original
> +				     declaration.  */
> +  tree field_decl_context;	  /* Context of original bit-field
> +				     declaration.  */
> +};
> +
> +typedef struct bitfield_access_d bitfield_access_o;
> +typedef struct bitfield_access_d *bitfield_access;
> +
> +/* Connecting register with bit-field access sequence that defines value in
> +   that register.  */
> +struct GTY (()) bitfield_stmt_access_pair_d
> +{
> +  gimple stmt;
> +  bitfield_access access;
> +};
> +
> +typedef struct bitfield_stmt_access_pair_d bitfield_stmt_access_pair_o;
> +typedef struct bitfield_stmt_access_pair_d *bitfield_stmt_access_pair;
> +
> +static GTY ((param_is (struct bitfield_stmt_access_pair_d)))
> +  htab_t bitfield_stmt_access_htab;
> +
> +/* Hash table callbacks for bitfield_stmt_access_htab.  */
> +
> +static hashval_t
> +bitfield_stmt_access_pair_htab_hash (const void *p)
> +{
> +  const struct bitfield_stmt_access_pair_d *entry =
> +    (const struct bitfield_stmt_access_pair_d *)p;

Perhaps shorter to say const bitfield_stmt_access_pair here?

> +  return (hashval_t) (uintptr_t) entry->stmt;
> +}
> +
> +static int
> +bitfield_stmt_access_pair_htab_eq (const void *p1, const void *p2)
> +{
> +  const struct bitfield_stmt_access_pair_d *entry1 =
> +    (const struct bitfield_stmt_access_pair_d *)p1;
> +  const struct bitfield_stmt_access_pair_d *entry2 =
> +    (const struct bitfield_stmt_access_pair_d *)p2;

Likewise const bitfield_stmt_access_pair ?
> +  return entry1->stmt == entry2->stmt;
> +}
> +
> +/* Counter used for generating unique names for new fields.  */
> +static unsigned new_field_no;
> +
> +/* Compare two bit-field access records.  */
> +
> +static int
> +cmp_access (const void *p1, const void *p2)
> +{
> +  const bitfield_access a1 = (*(const bitfield_access*)p1);
> +  const bitfield_access a2 = (*(const bitfield_access*)p2);
> +
> +  if (a1->bitfield_representative - a2->bitfield_representative)
> +    return a1->bitfield_representative - a2->bitfield_representative;
> +
> +  if (!expressions_equal_p (a1->src_addr, a1->src_addr))

The second parm has to be a2->src_addr to make sense to me.

> +    return a1 - a2;
> +
> +  if (!expressions_equal_p (a1->dst_addr, a1->dst_addr))

Ditto.

> +    return a1 - a2;
> +
> +  if (a1->src_offset_words - a2->src_offset_words)
> +    return a1->src_offset_words - a2->src_offset_words;
> +
> +  return a1->src_bit_offset - a2->src_bit_offset;
> +}
> +
> +/* Create new bit-field access structure and add it to given bitfield_accesses
> +   htab.  */
> +
> +static bitfield_access
> +create_and_insert_access (vec<bitfield_access>
> +		       *bitfield_accesses)
> +{
> +  bitfield_access access = ggc_alloc_bitfield_access_d ();
> +  memset (access, 0, sizeof (struct bitfield_access_d));
> +  bitfield_accesses->safe_push (access);
> +  return access;
> +}
> +
> +/* Slightly modified add_bit_offset_attribute from dwarf2out.c.  */
> +
> +static inline HOST_WIDE_INT
> +get_bit_offset (tree decl)
> +{
> +  HOST_WIDE_INT object_offset_in_bytes = field_byte_offset (decl);
> +  tree type = DECL_BIT_FIELD_TYPE (decl);
> +  HOST_WIDE_INT bitpos_int;
> +  HOST_WIDE_INT highest_order_object_bit_offset;
> +  HOST_WIDE_INT highest_order_field_bit_offset;
> +  HOST_WIDE_INT bit_offset;
> +
> +  /* Must be a field and a bit-field.  */
> +  gcc_assert (type && TREE_CODE (decl) == FIELD_DECL);
> +  if (! host_integerp (bit_position (decl), 0)
> +      || ! host_integerp (DECL_SIZE (decl), 1))

Didn't we have integer_zero and integer_onep for these, nowadays?

> +    return -1;
> +
> +  bitpos_int = int_bit_position (decl);
> +
> +  /* Note that the bit offset is always the distance (in bits) from the
> +     highest-order bit of the "containing object" to the highest-order bit of
> +     the bit-field itself.  Since the "high-order end" of any object or field
> +     is different on big-endian and little-endian machines, the computation
> +     below must take account of these differences.  */
> +  highest_order_object_bit_offset = object_offset_in_bytes * BITS_PER_UNIT;
> +  highest_order_field_bit_offset = bitpos_int;
> +
> +  if (! BYTES_BIG_ENDIAN)
> +    {
> +      highest_order_field_bit_offset += tree_low_cst (DECL_SIZE (decl), 0);
> +      highest_order_object_bit_offset += simple_type_size_in_bits (type);
> +    }
> +
> +  bit_offset
> +    = (! BYTES_BIG_ENDIAN
> +       ? highest_order_object_bit_offset - highest_order_field_bit_offset
> +       : highest_order_field_bit_offset - highest_order_object_bit_offset);

I'd manually move the LE store to the if above and do BE in an else to 
improve readability.

> +
> +  return bit_offset;
> +}
> +
> +/* Returns size of combined bitfields.  */
> +
> +static int
> +get_merged_bit_field_size (bitfield_access access)
> +{
> +  bitfield_access tmp_access = access;
> +  int size = 0;

I take it this won't overflow for insanely large bit-fields..

> +
> +  while (tmp_access)
> +  {
> +    size += tmp_access->src_bit_size;
> +    tmp_access = tmp_access->next;
> +  }
> +  return size;
> +}
> +
> +/* Adds new pair consisting of statement and bit-field access structure that
> +   contains it.  */

Params in upper case.

> +
> +static bool add_stmt_access_pair (bitfield_access access, gimple stmt)
> +{
> +  bitfield_stmt_access_pair new_pair;
> +  void **slot;
> +  new_pair = ggc_alloc_bitfield_stmt_access_pair_o ();
> +  new_pair->stmt = stmt;
> +  new_pair->access = access;
> +  slot = htab_find_slot (bitfield_stmt_access_htab, new_pair, INSERT);
> +  if (*slot == HTAB_EMPTY_ENTRY)
> +    {
> +      *slot = new_pair;
> +      return true;
> +    }
> +  return false;
> +}
> +
> +/* Main entry point for the bit-field merge optimization.  */
> +
> +static unsigned int
> +ssa_bitfield_merge (void)
> +{
> +  basic_block bb;
> +  unsigned int todoflags = 0;
> +  vec<bitfield_access> bitfield_accesses;
> +  int ix, iy;
> +  bitfield_access access;
> +  bool cfg_changed = false;
> +
> +  /* In the strict volatile bitfields case, doing code changes here may 
> prevent
> +     other optimizations, in particular in a SLOW_BYTE_ACCESS setting.  */
> +  if (flag_strict_volatile_bitfields> 0)

My mailer does not render a space before the ">", is it missing?

> +    return 0;
> +
> +  FOR_EACH_BB (bb)
> +    {
> +      gimple_stmt_iterator gsi;
> +      vec<bitfield_access> bitfield_accesses_merge = vNULL;
> +      tree prev_representative = NULL_TREE;
> +      bitfield_accesses.create (0);
> +
> +      bitfield_stmt_access_htab
> +	= htab_create_ggc (128, bitfield_stmt_access_pair_htab_hash,
> +			   bitfield_stmt_access_pair_htab_eq, NULL);

This sounds like it allocates htab even for BBs without a bit-field, no? 
Isn't that a bit wasteful?

> +
> +      /* Identify all bitfield copy sequences in the basic-block.  */
> +      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
> +	{
> +	  gimple stmt = gsi_stmt (gsi);
> +	  tree lhs, rhs;
> +	  void **slot;
> +	  struct bitfield_stmt_access_pair_d asdata;
> +
> +	  if (!is_gimple_assign (stmt))
> +	    {
> +	      gsi_next (&gsi);
> +	      continue;
> +	    }
> +
> +	  lhs = gimple_assign_lhs (stmt);
> +	  rhs = gimple_assign_rhs1 (stmt);
> +
> +	  if (TREE_CODE (rhs) == COMPONENT_REF)
> +	    {
> +	      use_operand_p use;
> +	      gimple use_stmt;
> +	      tree op0 = TREE_OPERAND (rhs, 0);
> +	      tree op1 = TREE_OPERAND (rhs, 1);
> +
> +	      if (TREE_CODE (op1) == FIELD_DECL && DECL_BIT_FIELD_TYPE (op1)
> +		  && !TREE_THIS_VOLATILE (op1))
> +		{
> +		  if (single_imm_use (lhs, &use, &use_stmt)
> +		       && is_gimple_assign (use_stmt))
> +		    {
> +		      tree use_lhs = gimple_assign_lhs (use_stmt);
> +		      if (TREE_CODE (use_lhs) == COMPONENT_REF)
> +			{
> +			  tree use_op0 = TREE_OPERAND (use_lhs, 0);
> +			  tree use_op1 = TREE_OPERAND (use_lhs, 1);
> +			  tree tmp_repr = DECL_BIT_FIELD_REPRESENTATIVE (op1);
> +			  if (TREE_CODE (use_op1) == FIELD_DECL
> +			      && DECL_BIT_FIELD_TYPE (use_op1)
> +			      && !TREE_THIS_VOLATILE (use_op1))
> +			    {
> +			      if (prev_representative
> +				  && (prev_representative != tmp_repr))
> +				{
> +				  /* If previous access has different
> +				     representative then barrier is needed
> +				     between it and new access.  */
> +				  access = create_and_insert_access
> +					     (&bitfield_accesses);
> +				  access->is_barrier = true;
> +				}
> +			      prev_representative = tmp_repr;
> +			      /* Create new bit-field access structure.  */
> +			      access = create_and_insert_access
> +					 (&bitfield_accesses);
> +			      /* Collect access data - load instruction.  */
> +			      access->src_bit_size = tree_low_cst
> +						      (DECL_SIZE (op1), 1);
> +			      access->src_bit_offset = get_bit_offset (op1);
> +			      access->src_offset_words =
> +				field_byte_offset (op1) / UNITS_PER_WORD;
> +			      access->src_addr = op0;
> +			      access->load_stmt = gsi_stmt (gsi);
> +			      /* Collect access data - store instruction.  */
> +			      access->dst_bit_size =
> +				tree_low_cst (DECL_SIZE (use_op1), 1);
> +			      access->dst_bit_offset =
> +				get_bit_offset (use_op1);
> +			      access->dst_offset_words =
> +				field_byte_offset (use_op1) / UNITS_PER_WORD;
> +			      access->dst_addr = use_op0;
> +			      access->store_stmt = use_stmt;
> +			      add_stmt_access_pair (access, stmt);
> +			      add_stmt_access_pair (access, use_stmt);
> +			      access->bitfield_type
> +				= DECL_BIT_FIELD_TYPE (use_op1);
> +			      access->bitfield_representative = tmp_repr;
> +			      access->field_decl_context =
> +				DECL_FIELD_CONTEXT (op1);
> +			    }
> +			}
> +		    }
> +		}
> +	    }
> +
> +	  /* Insert barrier for merging if statement is function call or memory
> +	     access.  */
> +	  asdata.stmt = stmt;
> +	  slot
> +	    = htab_find_slot (bitfield_stmt_access_htab, &asdata, NO_INSERT);
> +	  if (!slot
> +	      && ((gimple_code (stmt) == GIMPLE_CALL)
> +		  || (gimple_has_mem_ops (stmt))))
> +	    {
> +	      /* Create new bit-field access structure.  */
> +	      access = create_and_insert_access (&bitfield_accesses);
> +	      /* Mark it as barrier.  */
> +	      access->is_barrier = true;
> +	    }
> +
> +	  gsi_next (&gsi);
> +	}
> +
> +      /* If there are no at least two accesses go to the next basic block.  */
> +      if (bitfield_accesses.length () <= 1)
> +	{
> +	  bitfield_accesses.release ();
> +	  continue;
> +	}
> +
> +      /* Find bit-field accesses that can be merged.  */
> +      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
> +	{
> +	  bitfield_access head_access;
> +	  bitfield_access mrg_access;
> +	  bitfield_access prev_access;
> +	  if (!bitfield_accesses_merge.exists ())
> +	    bitfield_accesses_merge.create (0);
> +
> +	  bitfield_accesses_merge.safe_push (access);
> +
> +	  if (!access->is_barrier
> +	      && !(access == bitfield_accesses.last ()
> +	      && !bitfield_accesses_merge.is_empty ()))
> +	    continue;
> +
> +	  bitfield_accesses_merge.qsort (cmp_access);
> +
> +	  head_access = NULL;
> +	  for (iy = 0; bitfield_accesses_merge.iterate (iy, &mrg_access); iy++)
> +	    {
> +	      if (head_access
> +		  && expressions_equal_p (head_access->src_addr,
> +					  mrg_access->src_addr)
> +		  && expressions_equal_p (head_access->dst_addr,
> +					  mrg_access->dst_addr)
> +		  && prev_access->src_offset_words
> +		     == prev_access->src_offset_words

Really? How come?

> +		  && prev_access->dst_offset_words
> +		     == prev_access->dst_offset_words

Ditto. Did you mean == mrg_access?

> +		  && prev_access->src_bit_offset + prev_access->src_bit_size
> +		     == mrg_access->src_bit_offset
> +		  && prev_access->dst_bit_offset + prev_access->dst_bit_size
> +		     == mrg_access->dst_bit_offset
> +		  && prev_access->bitfield_representative
> +		     == mrg_access->bitfield_representative)
> +		{
> +		  /* Merge conditions are satisfied - merge accesses.  */
> +		  mrg_access->merged = true;
> +		  prev_access->next = mrg_access;
> +		  head_access->modified = true;
> +		  prev_access = mrg_access;
> +		}
> +	      else
> +		head_access = prev_access = mrg_access;
> +	    }
> +	  bitfield_accesses_merge.release ();
> +	  bitfield_accesses_merge = vNULL;
> +	}
> +
> +      /* Modify generated code.  */
> +      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
> +	{
> +	  if (access->merged)
> +	    {
> +	      /* Access merged - remove instructions.  */
> +	      gimple_stmt_iterator tmp_gsi;
> +	      tmp_gsi = gsi_for_stmt (access->load_stmt);
> +	      gsi_remove (&tmp_gsi, true);
> +	      tmp_gsi = gsi_for_stmt (access->store_stmt);
> +	      gsi_remove (&tmp_gsi, true);
> +	    }
> +	  else if (access->modified)
> +	    {
> +	      /* Access modified - modify generated code.  */
> +	      gimple_stmt_iterator tmp_gsi;
> +	      tree tmp_ssa;
> +	      tree itype = make_node (INTEGER_TYPE);
> +	      tree new_rhs;
> +	      tree new_lhs;
> +	      gimple new_stmt;
> +	      char new_field_name [15];
> +	      int decl_size;
> +
> +	      /* Bit-field size changed - modify load statement.  */
> +	      access->src_bit_size = get_merged_bit_field_size (access);
> +
> +	      TYPE_PRECISION (itype) = access->src_bit_size;
> +	      fixup_unsigned_type (itype);
> +
> +	      /* Create new declaration.  */
> +	      tree new_field = make_node (FIELD_DECL);
> +	      sprintf (new_field_name, "_field%d", new_field_no++);

Safe with huge fields?

> +	      DECL_NAME (new_field) = get_identifier (new_field_name);
> +	      TREE_TYPE (new_field) = itype;
> +	      DECL_BIT_FIELD (new_field) = 1;
> +	      DECL_BIT_FIELD_TYPE (new_field) = access->bitfield_type;
> +	      DECL_BIT_FIELD_REPRESENTATIVE (new_field) =
> +		access->bitfield_representative;
> +	      DECL_FIELD_CONTEXT (new_field) = access->field_decl_context;
> +	      DECL_NONADDRESSABLE_P (new_field) = 1;
> +	      DECL_FIELD_OFFSET (new_field) =
> +		build_int_cst (unsigned_type_node, access->src_offset_words);
> +	      DECL_FIELD_BIT_OFFSET (new_field) =
> +		build_int_cst (unsigned_type_node, access->src_bit_offset);
> +	      DECL_SIZE (new_field) = build_int_cst (unsigned_type_node,
> +						     access->src_bit_size);
> +	      decl_size = access->src_bit_size / BITS_PER_UNIT
> +		+ (access->src_bit_size % BITS_PER_UNIT ? 1 : 0);
> +	      DECL_SIZE_UNIT (new_field) =
> +		build_int_cst (unsigned_type_node, decl_size);
> +
> +	      tmp_ssa = make_ssa_name (create_tmp_var (itype, NULL), NULL);

Did Richi add a shorthand for that not too long ago or is this the current 
way already?

Thanks,
> +
> +	      /* Create new component ref.  */
> +	      new_rhs = build3 (COMPONENT_REF, itype, access->src_addr,
> +				new_field, NULL);
> +	      tmp_gsi = gsi_for_stmt (access->load_stmt);
> +	      new_stmt = gimple_build_assign (tmp_ssa, new_rhs);
> +	      gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
> +	      SSA_NAME_DEF_STMT (tmp_ssa) = new_stmt;
> +	      gsi_remove (&tmp_gsi, true);
> +
> +	      /* Bit-field size changed - modify store statement.  */
> +	      /* Create new component ref.  */
> +	      new_lhs = build3 (COMPONENT_REF, itype, access->dst_addr,
> +				new_field, NULL);
> +	      new_stmt = gimple_build_assign (new_lhs, tmp_ssa);
> +	      tmp_gsi = gsi_for_stmt (access->store_stmt);
> +	      gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
> +	      gsi_remove (&tmp_gsi, true);
> +	      cfg_changed = true;
> +	    }
> +	}
> +      /* Empty or delete data structures used for basic block.  */

Patch

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 6034046..dad9337 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3831,6 +3831,7 @@  GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/vtable-verify.c \
   $(srcdir)/asan.c \
   $(srcdir)/tsan.c $(srcdir)/ipa-devirt.c \
+  $(srcdir)/tree-sra.c \
   @all_gtfiles@
 
 # Compute the list of GT header files from the corresponding C sources,
diff --git a/gcc/common.opt b/gcc/common.opt
index 9082280..fe0ecd9 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2160,6 +2160,10 @@  ftree-sra
 Common Report Var(flag_tree_sra) Optimization
 Perform scalar replacement of aggregates
 
+ftree-bitfield-merge
+Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization
+Enable bit-field merge on trees
+
 ftree-ter
 Common Report Var(flag_tree_ter) Optimization
 Replace temporary expressions in the SSA->normal pass
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index dae7605..7abe538 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,7 +412,7 @@  Objective-C and Objective-C++ Dialects}.
 -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol
 -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
--ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
+-ftree-bitfield-merge -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
 -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
 -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
 -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
@@ -7646,6 +7646,11 @@  pointer alignment information.
 This pass only operates on local scalar variables and is enabled by default
 at @option{-O} and higher.  It requires that @option{-ftree-ccp} is enabled.
 
+@item -ftree-bitfield-merge
+@opindex ftree-bitfield-merge
+Combine several adjacent bit-field accesses that copy values
+from one memory location to another into a single bit-field access.
+
 @item -ftree-ccp
 @opindex ftree-ccp
 Perform sparse conditional constant propagation (CCP) on trees.  This
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index fc1c3f2..f3530ef 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -3104,8 +3104,6 @@  static HOST_WIDE_INT ceiling (HOST_WIDE_INT, unsigned int);
 static tree field_type (const_tree);
 static unsigned int simple_type_align_in_bits (const_tree);
 static unsigned int simple_decl_align_in_bits (const_tree);
-static unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
-static HOST_WIDE_INT field_byte_offset (const_tree);
 static void add_AT_location_description	(dw_die_ref, enum dwarf_attribute,
 					 dw_loc_list_ref);
 static void add_data_member_location_attribute (dw_die_ref, tree);
@@ -10145,25 +10143,6 @@  is_base_type (tree type)
   return 0;
 }
 
-/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
-   node, return the size in bits for the type if it is a constant, or else
-   return the alignment for the type if the type's size is not constant, or
-   else return BITS_PER_WORD if the type actually turns out to be an
-   ERROR_MARK node.  */
-
-static inline unsigned HOST_WIDE_INT
-simple_type_size_in_bits (const_tree type)
-{
-  if (TREE_CODE (type) == ERROR_MARK)
-    return BITS_PER_WORD;
-  else if (TYPE_SIZE (type) == NULL_TREE)
-    return 0;
-  else if (host_integerp (TYPE_SIZE (type), 1))
-    return tree_low_cst (TYPE_SIZE (type), 1);
-  else
-    return TYPE_ALIGN (type);
-}
-
 /* Similarly, but return a double_int instead of UHWI.  */
 
 static inline double_int
@@ -14516,7 +14495,7 @@  round_up_to_align (double_int t, unsigned int align)
    because the offset is actually variable.  (We can't handle the latter case
    just yet).  */
 
-static HOST_WIDE_INT
+HOST_WIDE_INT
 field_byte_offset (const_tree decl)
 {
   double_int object_offset_in_bits;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
new file mode 100644
index 0000000..e9e96b7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
@@ -0,0 +1,30 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-bitfield-merge -fdump-tree-esra" }  */
+
+struct S
+{
+  unsigned f1:7;
+  unsigned f2:9;
+  unsigned f3:3;
+  unsigned f4:5;
+  unsigned f5:1;
+  unsigned f6:2;
+};
+
+unsigned
+foo (struct S *p1, struct S *p2, int *ptr)
+{
+  p2->f1 = p1->f1;
+  p2->f2 = p1->f2;
+  p2->f3 = p1->f3;
+  *ptr = 7;
+  p2->f4 = p1->f4;
+  p2->f5 = p1->f5;
+  p2->f6 = p1->f6;
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "19" "esra" } } */
+/* { dg-final { scan-tree-dump "8" "esra"} } */
+/* { dg-final { cleanup-tree-dump "esra" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
new file mode 100644
index 0000000..c056a30
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
@@ -0,0 +1,42 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-bitfield-merge -fdump-tree-esra" }  */
+
+struct proba1
+{
+  unsigned f1:7;
+  unsigned f2:5;
+  unsigned f3:1;
+  unsigned f4:7;
+};
+
+
+struct proba2
+{
+  unsigned f0:1;
+  unsigned f1:7;
+  unsigned f2:5;
+  unsigned f3:1;
+  unsigned f4:7;
+};
+
+union proba_un
+{
+  struct proba1 a;
+  struct proba2 b;
+};
+
+
+int func (union proba_un *pr1, union proba_un *pr2)
+{
+  pr2->a.f1 = pr1->a.f1;
+  pr2->a.f2 = pr1->a.f2;
+  pr2->a.f3 = pr1->a.f3;
+  pr2->b.f2 = pr1->b.f2;
+  pr2->a.f4 = pr1->a.f4;
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "13" "esra" } } */
+/* { dg-final { scan-tree-dump "7" "esra"} } */
+/* { dg-final { cleanup-tree-dump "esra" } } */
+
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
old mode 100644
new mode 100755
index 8e3bb81..cf31463
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -91,6 +91,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree-inline.h"
 #include "gimple-pretty-print.h"
 #include "ipa-inline.h"
+#include "ggc.h"
 
 /* Enumeration of all aggregate reductions we can do.  */
 enum sra_mode { SRA_MODE_EARLY_IPA,   /* early call regularization */
@@ -3424,12 +3425,469 @@  perform_intra_sra (void)
   return ret;
 }
 
+/* This optimization combines several adjacent bit-field accesses that copy
+   values from one memory location to another into a single bit-field
+   access.  */
+
+/* Data for single bit-field read/write sequence.  */
+struct GTY (()) bitfield_access_d {
+  gimple load_stmt;		  /* Bit-field load statement.  */
+  gimple store_stmt;		  /* Bit-field store statement.  */
+  unsigned src_offset_words;	  /* Bit-field offset at src in words.  */
+  unsigned src_bit_offset;	  /* Bit-field offset inside source word.  */
+  unsigned src_bit_size;	  /* Size of bit-field in source word.  */
+  unsigned dst_offset_words;	  /* Bit-field offset at dst in words.  */
+  unsigned dst_bit_offset;	  /* Bit-field offset inside destination
+				     word.  */
+  unsigned dst_bit_size;	  /* Size of bit-field in destination word.  */
+  tree src_addr;		  /* Address of source memory access.  */
+  tree dst_addr;		  /* Address of destination memory access.  */
+  bool merged;			  /* True if access is merged with another
+				     one.  */
+  bool modified;		  /* True if bit-field size is modified.  */
+  bool is_barrier;		  /* True if access is barrier (call or mem
+				     access).  */
+  struct bitfield_access_d *next; /* Access with which this one is merged.  */
+  tree bitfield_type;		  /* Field type.  */
+  tree bitfield_representative;	  /* Bit field representative of original
+				     declaration.  */
+  tree field_decl_context;	  /* Context of original bit-field
+				     declaration.  */
+};
+
+typedef struct bitfield_access_d bitfield_access_o;
+typedef struct bitfield_access_d *bitfield_access;
+
+/* Pair associating a statement with the bit-field access structure that
+   contains it; used as an entry in bitfield_stmt_access_htab.  */
+struct GTY (()) bitfield_stmt_access_pair_d
+{
+  gimple stmt;
+  bitfield_access access;
+};
+
+typedef struct bitfield_stmt_access_pair_d bitfield_stmt_access_pair_o;
+typedef struct bitfield_stmt_access_pair_d *bitfield_stmt_access_pair;
+
+static GTY ((param_is (struct bitfield_stmt_access_pair_d)))
+  htab_t bitfield_stmt_access_htab;
+
+/* Hash table callbacks for bitfield_stmt_access_htab.  */
+
+static hashval_t
+bitfield_stmt_access_pair_htab_hash (const void *p)
+{
+  const struct bitfield_stmt_access_pair_d *entry =
+    (const struct bitfield_stmt_access_pair_d *)p;
+  return (hashval_t) (uintptr_t) entry->stmt;
+}
+
+static int
+bitfield_stmt_access_pair_htab_eq (const void *p1, const void *p2)
+{
+  const struct bitfield_stmt_access_pair_d *entry1 =
+    (const struct bitfield_stmt_access_pair_d *)p1;
+  const struct bitfield_stmt_access_pair_d *entry2 =
+    (const struct bitfield_stmt_access_pair_d *)p2;
+  return entry1->stmt == entry2->stmt;
+}
+
+/* Counter used for generating unique names for new fields.  */
+static unsigned new_field_no;
+
+/* Compare two bit-field access records.  */
+
+static int
+cmp_access (const void *p1, const void *p2)
+{
+  const bitfield_access a1 = (*(const bitfield_access*)p1);
+  const bitfield_access a2 = (*(const bitfield_access*)p2);
+
+  if (a1->bitfield_representative - a2->bitfield_representative)
+    return a1->bitfield_representative - a2->bitfield_representative;
+
+  if (!expressions_equal_p (a1->src_addr, a2->src_addr))
+    return a1 - a2;
+
+  if (!expressions_equal_p (a1->dst_addr, a2->dst_addr))
+    return a1 - a2;
+
+  if (a1->src_offset_words - a2->src_offset_words)
+    return a1->src_offset_words - a2->src_offset_words;
+
+  return a1->src_bit_offset - a2->src_bit_offset;
+}
+
+/* Create a new bit-field access structure and add it to the given
+   bitfield_accesses vector.  */
+
+static bitfield_access
+create_and_insert_access (vec<bitfield_access>
+		       *bitfield_accesses)
+{
+  bitfield_access access = ggc_alloc_bitfield_access_d ();
+  memset (access, 0, sizeof (struct bitfield_access_d));
+  bitfield_accesses->safe_push (access);
+  return access;
+}
+
+/* Slightly modified add_bit_offset_attribute from dwarf2out.c.  */
+
+static inline HOST_WIDE_INT
+get_bit_offset (tree decl)
+{
+  HOST_WIDE_INT object_offset_in_bytes = field_byte_offset (decl);
+  tree type = DECL_BIT_FIELD_TYPE (decl);
+  HOST_WIDE_INT bitpos_int;
+  HOST_WIDE_INT highest_order_object_bit_offset;
+  HOST_WIDE_INT highest_order_field_bit_offset;
+  HOST_WIDE_INT bit_offset;
+
+  /* Must be a field and a bit-field.  */
+  gcc_assert (type && TREE_CODE (decl) == FIELD_DECL);
+  if (! host_integerp (bit_position (decl), 0)
+      || ! host_integerp (DECL_SIZE (decl), 1))
+    return -1;
+
+  bitpos_int = int_bit_position (decl);
+
+  /* Note that the bit offset is always the distance (in bits) from the
+     highest-order bit of the "containing object" to the highest-order bit of
+     the bit-field itself.  Since the "high-order end" of any object or field
+     is different on big-endian and little-endian machines, the computation
+     below must take account of these differences.  */
+  highest_order_object_bit_offset = object_offset_in_bytes * BITS_PER_UNIT;
+  highest_order_field_bit_offset = bitpos_int;
+
+  if (! BYTES_BIG_ENDIAN)
+    {
+      highest_order_field_bit_offset += tree_low_cst (DECL_SIZE (decl), 0);
+      highest_order_object_bit_offset += simple_type_size_in_bits (type);
+    }
+
+  bit_offset
+    = (! BYTES_BIG_ENDIAN
+       ? highest_order_object_bit_offset - highest_order_field_bit_offset
+       : highest_order_field_bit_offset - highest_order_object_bit_offset);
+
+  return bit_offset;
+}
+
+/* Return the combined size in bits of the merged bit-field accesses.  */
+
+static int
+get_merged_bit_field_size (bitfield_access access)
+{
+  bitfield_access tmp_access = access;
+  int size = 0;
+
+  while (tmp_access)
+  {
+    size += tmp_access->src_bit_size;
+    tmp_access = tmp_access->next;
+  }
+  return size;
+}
+
+/* Add a new pair consisting of a statement and the bit-field access
+   structure that contains it.  */
+
+static bool add_stmt_access_pair (bitfield_access access, gimple stmt)
+{
+  bitfield_stmt_access_pair new_pair;
+  void **slot;
+  new_pair = ggc_alloc_bitfield_stmt_access_pair_o ();
+  new_pair->stmt = stmt;
+  new_pair->access = access;
+  slot = htab_find_slot (bitfield_stmt_access_htab, new_pair, INSERT);
+  if (*slot == HTAB_EMPTY_ENTRY)
+    {
+      *slot = new_pair;
+      return true;
+    }
+  return false;
+}
+
+/* Main entry point for the bit-field merge optimization.  */
+
+static unsigned int
+ssa_bitfield_merge (void)
+{
+  basic_block bb;
+  unsigned int todoflags = 0;
+  vec<bitfield_access> bitfield_accesses;
+  int ix, iy;
+  bitfield_access access;
+  bool cfg_changed = false;
+
+  /* In the strict volatile bitfields case, doing code changes here may prevent
+     other optimizations, in particular in a SLOW_BYTE_ACCESS setting.  */
+  if (flag_strict_volatile_bitfields > 0)
+    return 0;
+
+  FOR_EACH_BB (bb)
+    {
+      gimple_stmt_iterator gsi;
+      vec<bitfield_access> bitfield_accesses_merge = vNULL;
+      tree prev_representative = NULL_TREE;
+      bitfield_accesses.create (0);
+
+      bitfield_stmt_access_htab
+	= htab_create_ggc (128, bitfield_stmt_access_pair_htab_hash,
+			   bitfield_stmt_access_pair_htab_eq, NULL);
+
+      /* Identify all bitfield copy sequences in the basic-block.  */
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
+	{
+	  gimple stmt = gsi_stmt (gsi);
+	  tree lhs, rhs;
+	  void **slot;
+	  struct bitfield_stmt_access_pair_d asdata;
+
+	  if (!is_gimple_assign (stmt))
+	    {
+	      gsi_next (&gsi);
+	      continue;
+	    }
+
+	  lhs = gimple_assign_lhs (stmt);
+	  rhs = gimple_assign_rhs1 (stmt);
+
+	  if (TREE_CODE (rhs) == COMPONENT_REF)
+	    {
+	      use_operand_p use;
+	      gimple use_stmt;
+	      tree op0 = TREE_OPERAND (rhs, 0);
+	      tree op1 = TREE_OPERAND (rhs, 1);
+
+	      if (TREE_CODE (op1) == FIELD_DECL && DECL_BIT_FIELD_TYPE (op1)
+		  && !TREE_THIS_VOLATILE (op1))
+		{
+		  if (single_imm_use (lhs, &use, &use_stmt)
+		       && is_gimple_assign (use_stmt))
+		    {
+		      tree use_lhs = gimple_assign_lhs (use_stmt);
+		      if (TREE_CODE (use_lhs) == COMPONENT_REF)
+			{
+			  tree use_op0 = TREE_OPERAND (use_lhs, 0);
+			  tree use_op1 = TREE_OPERAND (use_lhs, 1);
+			  tree tmp_repr = DECL_BIT_FIELD_REPRESENTATIVE (op1);
+			  if (TREE_CODE (use_op1) == FIELD_DECL
+			      && DECL_BIT_FIELD_TYPE (use_op1)
+			      && !TREE_THIS_VOLATILE (use_op1))
+			    {
+			      if (prev_representative
+				  && (prev_representative != tmp_repr))
+				{
+				  /* If previous access has different
+				     representative then barrier is needed
+				     between it and new access.  */
+				  access = create_and_insert_access
+					     (&bitfield_accesses);
+				  access->is_barrier = true;
+				}
+			      prev_representative = tmp_repr;
+			      /* Create new bit-field access structure.  */
+			      access = create_and_insert_access
+					 (&bitfield_accesses);
+			      /* Collect access data - load instruction.  */
+			      access->src_bit_size = tree_low_cst
+						      (DECL_SIZE (op1), 1);
+			      access->src_bit_offset = get_bit_offset (op1);
+			      access->src_offset_words =
+				field_byte_offset (op1) / UNITS_PER_WORD;
+			      access->src_addr = op0;
+			      access->load_stmt = gsi_stmt (gsi);
+			      /* Collect access data - store instruction.  */
+			      access->dst_bit_size =
+				tree_low_cst (DECL_SIZE (use_op1), 1);
+			      access->dst_bit_offset =
+				get_bit_offset (use_op1);
+			      access->dst_offset_words =
+				field_byte_offset (use_op1) / UNITS_PER_WORD;
+			      access->dst_addr = use_op0;
+			      access->store_stmt = use_stmt;
+			      add_stmt_access_pair (access, stmt);
+			      add_stmt_access_pair (access, use_stmt);
+			      access->bitfield_type
+				= DECL_BIT_FIELD_TYPE (use_op1);
+			      access->bitfield_representative = tmp_repr;
+			      access->field_decl_context =
+				DECL_FIELD_CONTEXT (op1);
+			    }
+			}
+		    }
+		}
+	    }
+
+	  /* Insert barrier for merging if statement is function call or memory
+	     access.  */
+	  asdata.stmt = stmt;
+	  slot
+	    = htab_find_slot (bitfield_stmt_access_htab, &asdata, NO_INSERT);
+	  if (!slot
+	      && ((gimple_code (stmt) == GIMPLE_CALL)
+		  || (gimple_has_mem_ops (stmt))))
+	    {
+	      /* Create new bit-field access structure.  */
+	      access = create_and_insert_access (&bitfield_accesses);
+	      /* Mark it as barrier.  */
+	      access->is_barrier = true;
+	    }
+
+	  gsi_next (&gsi);
+	}
+
+      /* If there are fewer than two accesses, go to the next basic block.  */
+      if (bitfield_accesses.length () <= 1)
+	{
+	  bitfield_accesses.release ();
+	  continue;
+	}
+
+      /* Find bit-field accesses that can be merged.  */
+      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
+	{
+	  bitfield_access head_access;
+	  bitfield_access mrg_access;
+	  bitfield_access prev_access;
+	  if (!bitfield_accesses_merge.exists ())
+	    bitfield_accesses_merge.create (0);
+
+	  bitfield_accesses_merge.safe_push (access);
+
+	  if (!access->is_barrier
+	      && !(access == bitfield_accesses.last ()
+	      && !bitfield_accesses_merge.is_empty ()))
+	    continue;
+
+	  bitfield_accesses_merge.qsort (cmp_access);
+
+	  head_access = NULL;
+	  for (iy = 0; bitfield_accesses_merge.iterate (iy, &mrg_access); iy++)
+	    {
+	      if (head_access
+		  && expressions_equal_p (head_access->src_addr,
+					  mrg_access->src_addr)
+		  && expressions_equal_p (head_access->dst_addr,
+					  mrg_access->dst_addr)
+		  && prev_access->src_offset_words
+		     == mrg_access->src_offset_words
+		  && prev_access->dst_offset_words
+		     == mrg_access->dst_offset_words
+		  && prev_access->src_bit_offset + prev_access->src_bit_size
+		     == mrg_access->src_bit_offset
+		  && prev_access->dst_bit_offset + prev_access->dst_bit_size
+		     == mrg_access->dst_bit_offset
+		  && prev_access->bitfield_representative
+		     == mrg_access->bitfield_representative)
+		{
+		  /* Merge conditions are satisfied - merge accesses.  */
+		  mrg_access->merged = true;
+		  prev_access->next = mrg_access;
+		  head_access->modified = true;
+		  prev_access = mrg_access;
+		}
+	      else
+		head_access = prev_access = mrg_access;
+	    }
+	  bitfield_accesses_merge.release ();
+	  bitfield_accesses_merge = vNULL;
+	}
+
+      /* Modify generated code.  */
+      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
+	{
+	  if (access->merged)
+	    {
+	      /* Access merged - remove instructions.  */
+	      gimple_stmt_iterator tmp_gsi;
+	      tmp_gsi = gsi_for_stmt (access->load_stmt);
+	      gsi_remove (&tmp_gsi, true);
+	      tmp_gsi = gsi_for_stmt (access->store_stmt);
+	      gsi_remove (&tmp_gsi, true);
+	    }
+	  else if (access->modified)
+	    {
+	      /* Access modified - modify generated code.  */
+	      gimple_stmt_iterator tmp_gsi;
+	      tree tmp_ssa;
+	      tree itype = make_node (INTEGER_TYPE);
+	      tree new_rhs;
+	      tree new_lhs;
+	      gimple new_stmt;
+	      char new_field_name [15];
+	      int decl_size;
+
+	      /* Bit-field size changed - modify load statement.  */
+	      access->src_bit_size = get_merged_bit_field_size (access);
+
+	      TYPE_PRECISION (itype) = access->src_bit_size;
+	      fixup_unsigned_type (itype);
+
+	      /* Create new declaration.  */
+	      tree new_field = make_node (FIELD_DECL);
+	      sprintf (new_field_name, "_field%d", new_field_no++);
+	      DECL_NAME (new_field) = get_identifier (new_field_name);
+	      TREE_TYPE (new_field) = itype;
+	      DECL_BIT_FIELD (new_field) = 1;
+	      DECL_BIT_FIELD_TYPE (new_field) = access->bitfield_type;
+	      DECL_BIT_FIELD_REPRESENTATIVE (new_field) =
+		access->bitfield_representative;
+	      DECL_FIELD_CONTEXT (new_field) = access->field_decl_context;
+	      DECL_NONADDRESSABLE_P (new_field) = 1;
+	      DECL_FIELD_OFFSET (new_field) =
+		build_int_cst (unsigned_type_node, access->src_offset_words);
+	      DECL_FIELD_BIT_OFFSET (new_field) =
+		build_int_cst (unsigned_type_node, access->src_bit_offset);
+	      DECL_SIZE (new_field) = build_int_cst (unsigned_type_node,
+						     access->src_bit_size);
+	      decl_size = access->src_bit_size / BITS_PER_UNIT
+		+ (access->src_bit_size % BITS_PER_UNIT ? 1 : 0);
+	      DECL_SIZE_UNIT (new_field) =
+		build_int_cst (unsigned_type_node, decl_size);
+
+	      tmp_ssa = make_ssa_name (create_tmp_var (itype, NULL), NULL);
+
+	      /* Create new component ref.  */
+	      new_rhs = build3 (COMPONENT_REF, itype, access->src_addr,
+				new_field, NULL);
+	      tmp_gsi = gsi_for_stmt (access->load_stmt);
+	      new_stmt = gimple_build_assign (tmp_ssa, new_rhs);
+	      gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
+	      SSA_NAME_DEF_STMT (tmp_ssa) = new_stmt;
+	      gsi_remove (&tmp_gsi, true);
+
+	      /* Bit-field size changed - modify store statement.  */
+	      /* Create new component ref.  */
+	      new_lhs = build3 (COMPONENT_REF, itype, access->dst_addr,
+				new_field, NULL);
+	      new_stmt = gimple_build_assign (new_lhs, tmp_ssa);
+	      tmp_gsi = gsi_for_stmt (access->store_stmt);
+	      gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
+	      gsi_remove (&tmp_gsi, true);
+	      cfg_changed = true;
+	    }
+	}
+      /* Empty or delete data structures used for basic block.  */
+      htab_empty (bitfield_stmt_access_htab);
+      bitfield_accesses.release ();
+    }
+
+  if (cfg_changed)
+    todoflags |= TODO_cleanup_cfg;
+
+  return todoflags;
+}
+
 /* Perform early intraprocedural SRA.  */
 static unsigned int
 early_intra_sra (void)
 {
+  unsigned int todoflags = 0;
   sra_mode = SRA_MODE_EARLY_INTRA;
-  return perform_intra_sra ();
+  if (flag_tree_bitfield_merge)
+    todoflags = ssa_bitfield_merge ();
+  return todoflags | perform_intra_sra ();
 }
 
 /* Perform "late" intraprocedural SRA.  */
@@ -5095,3 +5553,5 @@  make_pass_early_ipa_sra (gcc::context *ctxt)
 {
   return new pass_early_ipa_sra (ctxt);
 }
+
+#include "gt-tree-sra.h"
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 6886efb..afc73a6 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -4176,29 +4176,6 @@  get_next_value_id (void)
   return next_value_id++;
 }
 
-
-/* Compare two expressions E1 and E2 and return true if they are equal.  */
-
-bool
-expressions_equal_p (tree e1, tree e2)
-{
-  /* The obvious case.  */
-  if (e1 == e2)
-    return true;
-
-  /* If only one of them is null, they cannot be equal.  */
-  if (!e1 || !e2)
-    return false;
-
-  /* Now perform the actual comparison.  */
-  if (TREE_CODE (e1) == TREE_CODE (e2)
-      && operand_equal_p (e1, e2, OEP_PURE_SAME))
-    return true;
-
-  return false;
-}
-
-
 /* Return true if the nary operation NARY may trap.  This is a copy
    of stmt_could_throw_1_p adjusted to the SCCVN IL.  */
 
diff --git a/gcc/tree-ssa-sccvn.h b/gcc/tree-ssa-sccvn.h
index 94e3603..707b18c 100644
--- a/gcc/tree-ssa-sccvn.h
+++ b/gcc/tree-ssa-sccvn.h
@@ -21,10 +21,6 @@ 
 #ifndef TREE_SSA_SCCVN_H
 #define TREE_SSA_SCCVN_H
 
-/* In tree-ssa-sccvn.c  */
-bool expressions_equal_p (tree, tree);
-
-
 /* TOP of the VN lattice.  */
 extern tree VN_TOP;
 
diff --git a/gcc/tree.c b/gcc/tree.c
index 1947105..7e8d4a8 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -12108,4 +12108,44 @@  contains_bitfld_component_ref_p (const_tree ref)
   return false;
 }
 
+/* Compare two expressions E1 and E2 and return true if they are equal.  */
+
+bool
+expressions_equal_p (tree e1, tree e2)
+{
+  /* The obvious case.  */
+  if (e1 == e2)
+    return true;
+
+  /* If only one of them is null, they cannot be equal.  */
+  if (!e1 || !e2)
+    return false;
+
+  /* Now perform the actual comparison.  */
+  if (TREE_CODE (e1) == TREE_CODE (e2)
+      && operand_equal_p (e1, e2, OEP_PURE_SAME))
+    return true;
+
+  return false;
+}
+
+/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
+   node, return the size in bits for the type if it is a constant, or else
+   return the alignment for the type if the type's size is not constant, or
+   else return BITS_PER_WORD if the type actually turns out to be an
+   ERROR_MARK node.  */
+
+unsigned HOST_WIDE_INT
+simple_type_size_in_bits (const_tree type)
+{
+  if (TREE_CODE (type) == ERROR_MARK)
+    return BITS_PER_WORD;
+  else if (TYPE_SIZE (type) == NULL_TREE)
+    return 0;
+  else if (host_integerp (TYPE_SIZE (type), 1))
+    return tree_low_cst (TYPE_SIZE (type), 1);
+  else
+    return TYPE_ALIGN (type);
+}
+
 #include "gt-tree.h"
diff --git a/gcc/tree.h b/gcc/tree.h
index 84bd699..0ea2203 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5981,6 +5981,7 @@  extern tree get_ref_base_and_extent (tree, HOST_WIDE_INT *,
 				     HOST_WIDE_INT *, HOST_WIDE_INT *);
 extern bool contains_bitfld_component_ref_p (const_tree);
 extern bool type_in_anonymous_namespace_p (tree);
+extern bool expressions_equal_p (tree e1, tree e2);
 
 /* In tree-nested.c */
 extern tree build_addr (tree, tree);
@@ -6502,6 +6503,10 @@  extern bool block_may_fallthru (const_tree);
 /* In vtable-verify.c.  */
 extern void save_vtable_map_decl (tree);
 
+/* In dwarf2out.c.  */
+HOST_WIDE_INT
+field_byte_offset (const_tree decl);
+
 
 /* Functional interface to the builtin functions.  */
 
@@ -6613,5 +6618,6 @@  builtin_decl_implicit_p (enum built_in_function fncode)
 #endif	/* NO_DOLLAR_IN_LABEL */
 #endif	/* NO_DOT_IN_LABEL */
 
+extern unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree type);
 
 #endif  /* GCC_TREE_H  */
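
For readers skimming the archive, here is a minimal standalone sketch (not part of the patch) of the copy sequence the pass targets, shaped like the bitfldmrg1.c testcase above:

```c
#include <assert.h>

/* Three adjacent bit-field fields totalling 19 bits (7 + 9 + 3).
   When built with -O2 -ftree-bitfield-merge, the three copies in
   copy3 are expected to collapse into a single 19-bit load/store
   pair; the observable behavior is identical either way, which is
   why the testcase only scans the esra dump for "19".  */
struct S
{
  unsigned f1:7;   /* bits 0..6   */
  unsigned f2:9;   /* bits 7..15  */
  unsigned f3:3;   /* bits 16..18 */
};

void
copy3 (struct S *src, struct S *dst)
{
  dst->f1 = src->f1;
  dst->f2 = src->f2;
  dst->f3 = src->f3;
}
```

Note that the scan-tree-dump pattern "19" in bitfldmrg1.c corresponds exactly to this merged width, and the "8" to the remaining f4/f5/f6 group (5 + 1 + 2) after the `*ptr = 7` barrier.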