From patchwork Mon Jul 25 17:07:06 2011
X-Patchwork-Submitter: Aldy Hernandez
X-Patchwork-Id: 106720
Message-ID: <4E2DA2BA.1010003@redhat.com>
Date: Mon, 25 Jul 2011 12:07:06 -0500
From: Aldy Hernandez
To: Jason Merrill
CC: Jeff Law, gcc-patches, Jakub Jelinek
Subject: Re: [C++0x] contiguous bitfields race implementation
In-Reply-To: <4E29C502.8020100@redhat.com>

On 07/22/11 13:44, Jason Merrill wrote:
> On 07/18/2011 08:02 AM, Aldy Hernandez wrote:
>> +  /* If other threads can't see this value, no need to restrict
>> stores.  */
>> +  if (ALLOW_STORE_DATA_RACES
>> +      || !DECL_THREAD_VISIBLE_P (innerdecl))
>> +    {
>> +      *bitstart = *bitend = 0;
>> +      return;
>> +    }
>
> What if get_inner_reference returns something that isn't a DECL, such as
> an INDIRECT_REF?

I had changed this already to take into account aliasing, so if we get
an INDIRECT_REF, ptr_deref_may_alias_global_p() returns true, and we
proceed with the restriction:

+  /* If other threads can't see this value, no need to restrict stores.  */
+  if (ALLOW_STORE_DATA_RACES
+      || (!ptr_deref_may_alias_global_p (innerdecl)
+          && (DECL_THREAD_LOCAL_P (innerdecl)
+              || !TREE_STATIC (innerdecl))))
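To make the intent of that condition concrete, here is a small
illustrative example (for discussion only, not part of the patch; the
type and function names are invented, loosely modeled on the new
tests).  Under the C++0x memory model, a store into a bit field of an
object that other threads could observe must not touch memory outside
the contiguous bit-field region, while an automatic object that never
escapes needs no such restriction:

/* Illustrative example only, not from the patch.  */

struct bits
{
  unsigned int a : 4;   /* A and B, plus the padding up to C, form one
                           memory location in the C++0x model.  */
  unsigned int b : 1;
  unsigned char c;      /* Separate memory location; a store into A or B
                           must not clobber it.  */
};

struct bits global_bits;  /* Potentially visible to other threads, so
                             stores into global_bits.a must stay within
                             the bit region.  */

void
set_through_pointer (struct bits *p)
{
  /* *P may refer to a thread-visible object, so the same restriction
     applies here.  */
  p->a = 7;
}

void
set_local (void)
{
  /* Automatic and never escapes: no other thread can see it, so the
     memory model does not require restricting this store.  */
  struct bits local_bits;
  local_bits.a = 7;
  (void) local_bits;
}

The new gcc.dg/20110509*.c tests below exercise variations of the
thread-visible case on x86 by scanning for (or against) a wide movl
store.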
>> +  if (fld)
>> +    {
>> +      /* We found the end of the bit field sequence.  Include the
>> +         padding up to the next field and be done.  */
>> +      *bitend = bitpos - 1;
>> +    }
>
> bitpos is the position of "field", and it seems to me we want the
> position of "fld" here.

Notice that bitpos gets recalculated at each iteration by
get_inner_reference, so bitpos is actually the position of fld.

>> +  /* If unset, no restriction.  */
>> +  if (!bitregion_end)
>> +    maxbits = 0;
>> +  else
>> +    maxbits = (bitregion_end - bitregion_start) % align;
>
> Maybe use MAX_FIXED_MODE_SIZE so you don't have to test it against 0?

Fixed everywhere.

>> +  if (!bitregion_end)
>> +    maxbits = 0;
>> +  else if (1||bitpos + offset * BITS_PER_UNIT < bitregion_start)
>> +    maxbits = bitregion_end - bitregion_start;
>> +  else
>> +    maxbits = bitregion_end - (bitpos + offset * BITS_PER_UNIT) + 1;
>
> I assume the 1|| was there for debugging?

Fixed, plus I adjusted the calculation of maxbits everywhere because I
found an off-by-one error.

I have also overhauled store_bit_field() so that it adjusts the address
to point to the beginning of the bit region.  This fixed a myriad of
corner cases pointed out by a test Hans Boehm was kind enough to
provide.

I have added more tests.

How does this look?  (Pending tests.)

	* params.h (ALLOW_STORE_DATA_RACES): New.
	* params.def (PARAM_ALLOW_STORE_DATA_RACES): New.
	* Makefile.in (expr.o): Depend on PARAMS_H.
	* machmode.h (get_best_mode): Add argument.
	* fold-const.c (optimize_bit_field_compare): Add argument to
	get_best_mode.
	(fold_truthop): Same.
	* ifcvt.c (noce_emit_move_insn): Add argument to store_bit_field.
	* expr.c (emit_group_store): Same.
	(copy_blkmode_from_reg): Same.
	(write_complex_part): Same.
	(optimize_bitfield_assignment_op): Add argument.  Add argument to
	get_best_mode.
	(get_bit_range): New.
	(expand_assignment): Calculate maxbits and pass it down accordingly.
	(store_field): New argument.
	(expand_expr_real_2): New argument to store_field.
	Include params.h.
	* expr.h (store_bit_field): New argument.
	* stor-layout.c (get_best_mode): Restrict mode expansion by taking
	into account maxbits.
	* calls.c (store_unaligned_arguments_into_pseudos): New argument
	to store_bit_field.
	* expmed.c (store_bit_field_1): New argument.  Use it.
	(store_bit_field): Same.
	(store_fixed_bit_field): Same.
	(store_split_bit_field): Same.
	(extract_bit_field_1): Pass new argument to get_best_mode.
	(extract_bit_field): Same.
	* stmt.c (expand_return): Pass new argument to store_bit_field.
	* doc/invoke.texi: Document parameter allow-store-data-races.

Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 176280)
+++ doc/invoke.texi	(working copy)
@@ -9027,6 +9027,11 @@ The maximum number of conditional stores
 if either vectorization (@option{-ftree-vectorize}) or if-conversion
 (@option{-ftree-loop-if-convert}) is disabled.  The default is 2.
 
+@item allow-store-data-races
+Allow optimizers to introduce new data races on stores.
+Set to 1 to allow, otherwise to 0.  This option is enabled by default
+unless implicitly set by the @option{-fmemory-model=} option.
+
 @item case-values-threshold
 The smallest number of different values for which it is best to use a
 jump-table instead of a tree of conditional branches.  If the value is
Index: machmode.h
===================================================================
--- machmode.h	(revision 176280)
+++ machmode.h	(working copy)
@@ -248,7 +248,10 @@ extern enum machine_mode mode_for_vector
 /* Find the best mode to use to access a bit field.
*/ -extern enum machine_mode get_best_mode (int, int, unsigned int, +extern enum machine_mode get_best_mode (int, int, + unsigned HOST_WIDE_INT, + unsigned HOST_WIDE_INT, + unsigned int, enum machine_mode, int); /* Determine alignment, 1<=result<=BIGGEST_ALIGNMENT. */ Index: fold-const.c =================================================================== --- fold-const.c (revision 176280) +++ fold-const.c (working copy) @@ -3394,7 +3394,7 @@ optimize_bit_field_compare (location_t l && flag_strict_volatile_bitfields > 0) nmode = lmode; else - nmode = get_best_mode (lbitsize, lbitpos, + nmode = get_best_mode (lbitsize, lbitpos, 0, 0, const_p ? TYPE_ALIGN (TREE_TYPE (linner)) : MIN (TYPE_ALIGN (TREE_TYPE (linner)), TYPE_ALIGN (TREE_TYPE (rinner))), @@ -5222,7 +5222,7 @@ fold_truthop (location_t loc, enum tree_ to be relative to a field of that size. */ first_bit = MIN (ll_bitpos, rl_bitpos); end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize); - lnmode = get_best_mode (end_bit - first_bit, first_bit, + lnmode = get_best_mode (end_bit - first_bit, first_bit, 0, 0, TYPE_ALIGN (TREE_TYPE (ll_inner)), word_mode, volatilep); if (lnmode == VOIDmode) @@ -5287,7 +5287,7 @@ fold_truthop (location_t loc, enum tree_ first_bit = MIN (lr_bitpos, rr_bitpos); end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize); - rnmode = get_best_mode (end_bit - first_bit, first_bit, + rnmode = get_best_mode (end_bit - first_bit, first_bit, 0, 0, TYPE_ALIGN (TREE_TYPE (lr_inner)), word_mode, volatilep); if (rnmode == VOIDmode) Index: params.h =================================================================== --- params.h (revision 176280) +++ params.h (working copy) @@ -211,4 +211,6 @@ extern void init_param_values (int *para PARAM_VALUE (PARAM_MIN_NONDEBUG_INSN_UID) #define MAX_STORES_TO_SINK \ PARAM_VALUE (PARAM_MAX_STORES_TO_SINK) +#define ALLOW_STORE_DATA_RACES \ + PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES) #endif /* ! GCC_PARAMS_H */ Index: testsuite/gcc.dg/20110509-4.c =================================================================== --- testsuite/gcc.dg/20110509-4.c (revision 0) +++ testsuite/gcc.dg/20110509-4.c (revision 0) @@ -0,0 +1,18 @@ +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */ +/* { dg-options "-O2 --param allow-store-data-races=0" } */ + +struct bits +{ + char a; + int b:7; + int c:9; + unsigned char d; +} x; + +/* Store into should not clobber . */ +void update_c(struct bits *p, int val) +{ + p -> c = val; +} + +/* { dg-final { scan-assembler-not "movl" } } */ Index: testsuite/gcc.dg/20110509.c =================================================================== --- testsuite/gcc.dg/20110509.c (revision 0) +++ testsuite/gcc.dg/20110509.c (revision 0) @@ -0,0 +1,18 @@ +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */ +/* { dg-options "-O2 --param allow-store-data-races=0" } */ + +/* Test that we don't store past VAR.A. */ + +struct S +{ + volatile unsigned int a : 4; + unsigned char b; + unsigned int c : 6; +} var; + +void set_a() +{ + var.a = 12; +} + +/* { dg-final { scan-assembler-not "movl.*, var" } } */ Index: testsuite/gcc.dg/20110509-2.c =================================================================== --- testsuite/gcc.dg/20110509-2.c (revision 0) +++ testsuite/gcc.dg/20110509-2.c (revision 0) @@ -0,0 +1,19 @@ +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */ +/* { dg-options "-O2 --param allow-store-data-races=0" } */ + +/* Test that we don't store past VAR.K. 
*/ + +struct S +{ + volatile int i; + volatile int j: 32; + volatile int k: 15; + volatile char c[2]; +} var; + +void setit() +{ + var.k = 13; +} + +/* { dg-final { scan-assembler-not "movl.*, var" } } */ Index: testsuite/gcc.dg/20110509-3.c =================================================================== --- testsuite/gcc.dg/20110509-3.c (revision 0) +++ testsuite/gcc.dg/20110509-3.c (revision 0) @@ -0,0 +1,21 @@ +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */ +/* { dg-options "-O2 --param allow-store-data-races=0" } */ + +/* Make sure we don't narrow down to a QI or HI to store into VAR.J, + but instead use an SI. */ + +struct S +{ + volatile int i: 4; + volatile int j: 4; + volatile int k: 8; + volatile int l: 8; + volatile int m: 8; +} var; + +void setit() +{ + var.j = 5; +} + +/* { dg-final { scan-assembler "movl.*, var" } } */ Index: ifcvt.c =================================================================== --- ifcvt.c (revision 176280) +++ ifcvt.c (working copy) @@ -885,7 +885,7 @@ noce_emit_move_insn (rtx x, rtx y) } gcc_assert (start < (MEM_P (op) ? BITS_PER_UNIT : BITS_PER_WORD)); - store_bit_field (op, size, start, GET_MODE (x), y); + store_bit_field (op, size, start, 0, 0, GET_MODE (x), y); return; } @@ -939,7 +939,8 @@ noce_emit_move_insn (rtx x, rtx y) inner = XEXP (outer, 0); outmode = GET_MODE (outer); bitpos = SUBREG_BYTE (outer) * BITS_PER_UNIT; - store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos, outmode, y); + store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos, + 0, 0, outmode, y); } /* Return sequence of instructions generated by if conversion. This Index: expr.c =================================================================== --- expr.c (revision 176280) +++ expr.c (working copy) @@ -55,6 +55,7 @@ along with GCC; see the file COPYING3. #include "diagnostic.h" #include "ssaexpand.h" #include "target-globals.h" +#include "params.h" /* Decide whether a function's arguments should be processed from first to last or from last to first. @@ -143,7 +144,9 @@ static void store_constructor_field (rtx HOST_WIDE_INT, enum machine_mode, tree, tree, int, alias_set_type); static void store_constructor (tree, rtx, int, HOST_WIDE_INT); -static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT, enum machine_mode, +static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT, + unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT, + enum machine_mode, tree, tree, alias_set_type, bool); static unsigned HOST_WIDE_INT highest_pow2_factor_for_target (const_tree, const_tree); @@ -2074,7 +2077,7 @@ emit_group_store (rtx orig_dst, rtx src, emit_move_insn (adjust_address (dest, mode, bytepos), tmps[i]); else store_bit_field (dest, bytelen * BITS_PER_UNIT, bytepos * BITS_PER_UNIT, - mode, tmps[i]); + 0, 0, mode, tmps[i]); } /* Copy from the pseudo into the (probable) hard reg. */ @@ -2168,7 +2171,7 @@ copy_blkmode_from_reg (rtx tgtblk, rtx s /* Use xbitpos for the source extraction (right justified) and bitpos for the destination store (left justified). */ - store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, copy_mode, + store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, 0, 0, copy_mode, extract_bit_field (src, bitsize, xbitpos % BITS_PER_WORD, 1, false, NULL_RTX, copy_mode, copy_mode)); @@ -2805,7 +2808,7 @@ write_complex_part (rtx cplx, rtx val, b gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD); } - store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, imode, val); + store_bit_field (cplx, ibitsize, imag_p ? 
ibitsize : 0, 0, 0, imode, val); } /* Extract one of the components of the complex value CPLX. Extract the @@ -3940,6 +3943,8 @@ get_subtarget (rtx x) static bool optimize_bitfield_assignment_op (unsigned HOST_WIDE_INT bitsize, unsigned HOST_WIDE_INT bitpos, + unsigned HOST_WIDE_INT bitregion_start, + unsigned HOST_WIDE_INT bitregion_end, enum machine_mode mode1, rtx str_rtx, tree to, tree src) { @@ -4001,6 +4006,7 @@ optimize_bitfield_assignment_op (unsigne if (str_bitsize == 0 || str_bitsize > BITS_PER_WORD) str_mode = word_mode; str_mode = get_best_mode (bitsize, bitpos, + bitregion_start, bitregion_end, MEM_ALIGN (str_rtx), str_mode, 0); if (str_mode == VOIDmode) return false; @@ -4109,6 +4115,113 @@ optimize_bitfield_assignment_op (unsigne return false; } +/* In the C++ memory model, consecutive bit fields in a structure are + considered one memory location. + + Given a COMPONENT_REF, this function returns the bit range of + consecutive bits in which this COMPONENT_REF belongs in. The + values are returned in *BITSTART and *BITEND. If either the C++ + memory model is not activated, or this memory access is not thread + visible, 0 is returned in *BITSTART and *BITEND. + + EXP is the COMPONENT_REF. + INNERDECL is the actual object being referenced. + BITPOS is the position in bits where the bit starts within the structure. + BITSIZE is size in bits of the field being referenced in EXP. + + For example, while storing into FOO.A here... + + struct { + BIT 0: + unsigned int a : 4; + unsigned int b : 1; + BIT 8: + unsigned char c; + unsigned int d : 6; + } foo; + + ...we are not allowed to store past , so for the layout above, a + range of 0..7 (because no one cares if we store into the + padding). */ + +static void +get_bit_range (unsigned HOST_WIDE_INT *bitstart, + unsigned HOST_WIDE_INT *bitend, + tree exp, tree innerdecl, + HOST_WIDE_INT bitpos, HOST_WIDE_INT bitsize) +{ + tree field, record_type, fld; + bool found_field = false; + bool prev_field_is_bitfield; + + gcc_assert (TREE_CODE (exp) == COMPONENT_REF); + + /* If other threads can't see this value, no need to restrict stores. */ + if (ALLOW_STORE_DATA_RACES + || (!ptr_deref_may_alias_global_p (innerdecl) + && (DECL_THREAD_LOCAL_P (innerdecl) + || !TREE_STATIC (innerdecl)))) + { + *bitstart = *bitend = 0; + return; + } + + /* Bit field we're storing into. */ + field = TREE_OPERAND (exp, 1); + record_type = DECL_FIELD_CONTEXT (field); + + /* Count the contiguous bitfields for the memory location that + contains FIELD. */ + *bitstart = 0; + prev_field_is_bitfield = true; + for (fld = TYPE_FIELDS (record_type); fld; fld = DECL_CHAIN (fld)) + { + tree t, offset; + enum machine_mode mode; + int unsignedp, volatilep; + + if (TREE_CODE (fld) != FIELD_DECL) + continue; + + t = build3 (COMPONENT_REF, TREE_TYPE (exp), + unshare_expr (TREE_OPERAND (exp, 0)), + fld, NULL_TREE); + get_inner_reference (t, &bitsize, &bitpos, &offset, + &mode, &unsignedp, &volatilep, true); + + if (field == fld) + found_field = true; + + if (DECL_BIT_FIELD_TYPE (fld) && bitsize > 0) + { + if (prev_field_is_bitfield == false) + { + *bitstart = bitpos; + prev_field_is_bitfield = true; + } + } + else + { + prev_field_is_bitfield = false; + if (found_field) + break; + } + } + gcc_assert (found_field); + + if (fld) + { + /* We found the end of the bit field sequence. Include the + padding up to the next field and be done. */ + *bitend = bitpos - 1; + } + else + { + /* If this is the last element in the structure, include the padding + at the end of structure. 
*/ + *bitend = TREE_INT_CST_LOW (TYPE_SIZE (record_type)) - 1; + } +} /* Expand an assignment that stores the value of FROM into TO. If NONTEMPORAL is true, try generating a nontemporal store. */ @@ -4208,6 +4321,8 @@ expand_assignment (tree to, tree from, b { enum machine_mode mode1; HOST_WIDE_INT bitsize, bitpos; + unsigned HOST_WIDE_INT bitregion_start = 0; + unsigned HOST_WIDE_INT bitregion_end = 0; tree offset; int unsignedp; int volatilep = 0; @@ -4217,6 +4332,11 @@ expand_assignment (tree to, tree from, b tem = get_inner_reference (to, &bitsize, &bitpos, &offset, &mode1, &unsignedp, &volatilep, true); + if (TREE_CODE (to) == COMPONENT_REF + && DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1))) + get_bit_range (&bitregion_start, &bitregion_end, + to, tem, bitpos, bitsize); + /* If we are going to use store_bit_field and extract_bit_field, make sure to_rtx will be safe for multiple use. */ @@ -4298,11 +4418,14 @@ expand_assignment (tree to, tree from, b nontemporal); else if (bitpos + bitsize <= mode_bitsize / 2) result = store_field (XEXP (to_rtx, 0), bitsize, bitpos, + bitregion_start, bitregion_end, mode1, from, TREE_TYPE (tem), get_alias_set (to), nontemporal); else if (bitpos >= mode_bitsize / 2) result = store_field (XEXP (to_rtx, 1), bitsize, - bitpos - mode_bitsize / 2, mode1, from, + bitpos - mode_bitsize / 2, + bitregion_start, bitregion_end, + mode1, from, TREE_TYPE (tem), get_alias_set (to), nontemporal); else if (bitpos == 0 && bitsize == mode_bitsize) @@ -4323,7 +4446,9 @@ expand_assignment (tree to, tree from, b 0); write_complex_part (temp, XEXP (to_rtx, 0), false); write_complex_part (temp, XEXP (to_rtx, 1), true); - result = store_field (temp, bitsize, bitpos, mode1, from, + result = store_field (temp, bitsize, bitpos, + bitregion_start, bitregion_end, + mode1, from, TREE_TYPE (tem), get_alias_set (to), nontemporal); emit_move_insn (XEXP (to_rtx, 0), read_complex_part (temp, false)); @@ -4348,11 +4473,15 @@ expand_assignment (tree to, tree from, b MEM_KEEP_ALIAS_SET_P (to_rtx) = 1; } - if (optimize_bitfield_assignment_op (bitsize, bitpos, mode1, + if (optimize_bitfield_assignment_op (bitsize, bitpos, + bitregion_start, bitregion_end, + mode1, to_rtx, to, from)) result = NULL; else - result = store_field (to_rtx, bitsize, bitpos, mode1, from, + result = store_field (to_rtx, bitsize, bitpos, + bitregion_start, bitregion_end, + mode1, from, TREE_TYPE (tem), get_alias_set (to), nontemporal); } @@ -4745,7 +4874,7 @@ store_expr (tree exp, rtx target, int ca : BLOCK_OP_NORMAL)); else if (GET_MODE (target) == BLKmode) store_bit_field (target, INTVAL (expr_size (exp)) * BITS_PER_UNIT, - 0, GET_MODE (temp), temp); + 0, 0, 0, GET_MODE (temp), temp); else convert_move (target, temp, unsignedp); } @@ -5210,7 +5339,8 @@ store_constructor_field (rtx target, uns store_constructor (exp, target, cleared, bitsize / BITS_PER_UNIT); } else - store_field (target, bitsize, bitpos, mode, exp, type, alias_set, false); + store_field (target, bitsize, bitpos, 0, 0, mode, exp, type, alias_set, + false); } /* Store the value of constructor EXP into the rtx TARGET. @@ -5784,6 +5914,11 @@ store_constructor (tree exp, rtx target, BITSIZE bits, starting BITPOS bits from the start of TARGET. If MODE is VOIDmode, it means that we are storing into a bit-field. + BITREGION_START is bitpos of the first bitfield in this region. + BITREGION_END is the bitpos of the ending bitfield in this region. 
+ These two fields are 0, if the C++ memory model does not apply, + or we are not interested in keeping track of bitfield regions. + Always return const0_rtx unless we have something particular to return. @@ -5797,6 +5932,8 @@ store_constructor (tree exp, rtx target, static rtx store_field (rtx target, HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos, + unsigned HOST_WIDE_INT bitregion_start, + unsigned HOST_WIDE_INT bitregion_end, enum machine_mode mode, tree exp, tree type, alias_set_type alias_set, bool nontemporal) { @@ -5829,8 +5966,9 @@ store_field (rtx target, HOST_WIDE_INT b if (bitsize != (HOST_WIDE_INT) GET_MODE_BITSIZE (GET_MODE (target))) emit_move_insn (object, target); - store_field (blk_object, bitsize, bitpos, mode, exp, type, alias_set, - nontemporal); + store_field (blk_object, bitsize, bitpos, + bitregion_start, bitregion_end, + mode, exp, type, alias_set, nontemporal); emit_move_insn (target, object); @@ -5944,7 +6082,9 @@ store_field (rtx target, HOST_WIDE_INT b } /* Store the value in the bitfield. */ - store_bit_field (target, bitsize, bitpos, mode, temp); + store_bit_field (target, bitsize, bitpos, + bitregion_start, bitregion_end, + mode, temp); return const0_rtx; } @@ -7354,7 +7494,7 @@ expand_expr_real_2 (sepops ops, rtx targ (treeop0)) * BITS_PER_UNIT), (HOST_WIDE_INT) GET_MODE_BITSIZE (mode)), - 0, TYPE_MODE (valtype), treeop0, + 0, 0, 0, TYPE_MODE (valtype), treeop0, type, 0, false); } Index: expr.h =================================================================== --- expr.h (revision 176280) +++ expr.h (working copy) @@ -665,7 +665,10 @@ extern enum machine_mode mode_for_extraction (enum extraction_pattern, int); extern void store_bit_field (rtx, unsigned HOST_WIDE_INT, - unsigned HOST_WIDE_INT, enum machine_mode, rtx); + unsigned HOST_WIDE_INT, + unsigned HOST_WIDE_INT, + unsigned HOST_WIDE_INT, + enum machine_mode, rtx); extern rtx extract_bit_field (rtx, unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT, int, bool, rtx, enum machine_mode, enum machine_mode); Index: stor-layout.c =================================================================== --- stor-layout.c (revision 176280) +++ stor-layout.c (working copy) @@ -2361,6 +2361,13 @@ fixup_unsigned_type (tree type) /* Find the best machine mode to use when referencing a bit field of length BITSIZE bits starting at BITPOS. + BITREGION_START is the bit position of the first bit in this + sequence of bit fields. BITREGION_END is the last bit in this + sequence. If these two fields are non-zero, we should restrict the + memory access to a maximum sized chunk of + BITREGION_END - BITREGION_START + 1. Otherwise, we are allowed to touch + any adjacent non bit-fields. + The underlying object is known to be aligned to a boundary of ALIGN bits. If LARGEST_MODE is not VOIDmode, it means that we should not use a mode larger than LARGEST_MODE (usually SImode). @@ -2378,11 +2385,21 @@ fixup_unsigned_type (tree type) decide which of the above modes should be used. */ enum machine_mode -get_best_mode (int bitsize, int bitpos, unsigned int align, +get_best_mode (int bitsize, int bitpos, + unsigned HOST_WIDE_INT bitregion_start, + unsigned HOST_WIDE_INT bitregion_end, + unsigned int align, enum machine_mode largest_mode, int volatilep) { enum machine_mode mode; unsigned int unit = 0; + unsigned HOST_WIDE_INT maxbits; + + /* If unset, no restriction. 
*/ + if (!bitregion_end) + maxbits = MAX_FIXED_MODE_SIZE; + else + maxbits = (bitregion_end - bitregion_start) % align + 1; /* Find the narrowest integer mode that contains the bit field. */ for (mode = GET_CLASS_NARROWEST_MODE (MODE_INT); mode != VOIDmode; @@ -2419,6 +2436,7 @@ get_best_mode (int bitsize, int bitpos, && bitpos / unit == (bitpos + bitsize - 1) / unit && unit <= BITS_PER_WORD && unit <= MIN (align, BIGGEST_ALIGNMENT) + && unit <= maxbits && (largest_mode == VOIDmode || unit <= GET_MODE_BITSIZE (largest_mode))) wide_mode = tmode; Index: calls.c =================================================================== --- calls.c (revision 176280) +++ calls.c (working copy) @@ -924,8 +924,8 @@ store_unaligned_arguments_into_pseudos ( emit_move_insn (reg, const0_rtx); bytes -= bitsize / BITS_PER_UNIT; - store_bit_field (reg, bitsize, endian_correction, word_mode, - word); + store_bit_field (reg, bitsize, endian_correction, 0, 0, + word_mode, word); } } } Index: expmed.c =================================================================== --- expmed.c (revision 176280) +++ expmed.c (working copy) @@ -47,9 +47,15 @@ struct target_expmed *this_target_expmed static void store_fixed_bit_field (rtx, unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT, - unsigned HOST_WIDE_INT, rtx); + unsigned HOST_WIDE_INT, + unsigned HOST_WIDE_INT, + unsigned HOST_WIDE_INT, + rtx); static void store_split_bit_field (rtx, unsigned HOST_WIDE_INT, - unsigned HOST_WIDE_INT, rtx); + unsigned HOST_WIDE_INT, + unsigned HOST_WIDE_INT, + unsigned HOST_WIDE_INT, + rtx); static rtx extract_fixed_bit_field (enum machine_mode, rtx, unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT, @@ -333,7 +339,10 @@ mode_for_extraction (enum extraction_pat static bool store_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT bitsize, - unsigned HOST_WIDE_INT bitnum, enum machine_mode fieldmode, + unsigned HOST_WIDE_INT bitnum, + unsigned HOST_WIDE_INT bitregion_start, + unsigned HOST_WIDE_INT bitregion_end, + enum machine_mode fieldmode, rtx value, bool fallback_p) { unsigned int unit @@ -455,6 +464,7 @@ store_bit_field_1 (rtx str_rtx, unsigned /* We may be accessing data outside the field, which means we can alias adjacent data. */ + /* ?? not always for C++0x memory model ?? */ if (MEM_P (op0)) { op0 = shallow_copy_rtx (op0); @@ -547,7 +557,9 @@ store_bit_field_1 (rtx str_rtx, unsigned if (!store_bit_field_1 (op0, MIN (BITS_PER_WORD, bitsize - i * BITS_PER_WORD), - bitnum + bit_offset, word_mode, + bitnum + bit_offset, + bitregion_start, bitregion_end, + word_mode, value_word, fallback_p)) { delete_insns_since (last); @@ -710,6 +722,10 @@ store_bit_field_1 (rtx str_rtx, unsigned if (HAVE_insv && MEM_P (op0)) { enum machine_mode bestmode; + unsigned HOST_WIDE_INT maxbits = MAX_FIXED_MODE_SIZE; + + if (bitregion_end) + maxbits = bitregion_end - bitregion_start + 1; /* Get the mode to use for inserting into this field. If OP0 is BLKmode, get the smallest mode consistent with the alignment. If @@ -717,9 +733,12 @@ store_bit_field_1 (rtx str_rtx, unsigned mode. Otherwise, use the smallest mode containing the field. */ if (GET_MODE (op0) == BLKmode + || GET_MODE_BITSIZE (GET_MODE (op0)) > maxbits || (op_mode != MAX_MACHINE_MODE && GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (op_mode))) - bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0), + bestmode = get_best_mode (bitsize, bitnum, + bitregion_start, bitregion_end, + MEM_ALIGN (op0), (op_mode == MAX_MACHINE_MODE ? 
VOIDmode : op_mode), MEM_VOLATILE_P (op0)); @@ -748,6 +767,7 @@ store_bit_field_1 (rtx str_rtx, unsigned the unit. */ tempreg = copy_to_reg (xop0); if (store_bit_field_1 (tempreg, bitsize, xbitpos, + bitregion_start, bitregion_end, fieldmode, orig_value, false)) { emit_move_insn (xop0, tempreg); @@ -760,21 +780,59 @@ store_bit_field_1 (rtx str_rtx, unsigned if (!fallback_p) return false; - store_fixed_bit_field (op0, offset, bitsize, bitpos, value); + store_fixed_bit_field (op0, offset, bitsize, bitpos, + bitregion_start, bitregion_end, value); return true; } /* Generate code to store value from rtx VALUE into a bit-field within structure STR_RTX containing BITSIZE bits starting at bit BITNUM. + + BITREGION_START is bitpos of the first bitfield in this region. + BITREGION_END is the bitpos of the ending bitfield in this region. + These two fields are 0, if the C++ memory model does not apply, + or we are not interested in keeping track of bitfield regions. + FIELDMODE is the machine-mode of the FIELD_DECL node for this field. */ void store_bit_field (rtx str_rtx, unsigned HOST_WIDE_INT bitsize, - unsigned HOST_WIDE_INT bitnum, enum machine_mode fieldmode, + unsigned HOST_WIDE_INT bitnum, + unsigned HOST_WIDE_INT bitregion_start, + unsigned HOST_WIDE_INT bitregion_end, + enum machine_mode fieldmode, rtx value) { - if (!store_bit_field_1 (str_rtx, bitsize, bitnum, fieldmode, value, true)) + /* Under the C++0x memory model, we must not touch bits outside the + bit region. Adjust the address to start at the beginning of the + bit region. */ + if (MEM_P (str_rtx) + && bitregion_start > 0) + { + enum machine_mode bestmode; + enum machine_mode op_mode; + unsigned HOST_WIDE_INT offset; + + op_mode = mode_for_extraction (EP_insv, 3); + if (op_mode == MAX_MACHINE_MODE) + op_mode = VOIDmode; + + offset = bitregion_start / BITS_PER_UNIT; + bitnum -= bitregion_start; + bitregion_end -= bitregion_start; + bitregion_start = 0; + bestmode = get_best_mode (bitsize, bitnum, + bitregion_start, bitregion_end, + MEM_ALIGN (str_rtx), + op_mode, + MEM_VOLATILE_P (str_rtx)); + str_rtx = adjust_address (str_rtx, bestmode, offset); + } + + if (!store_bit_field_1 (str_rtx, bitsize, bitnum, + bitregion_start, bitregion_end, + fieldmode, value, true)) gcc_unreachable (); } @@ -790,7 +848,10 @@ store_bit_field (rtx str_rtx, unsigned H static void store_fixed_bit_field (rtx op0, unsigned HOST_WIDE_INT offset, unsigned HOST_WIDE_INT bitsize, - unsigned HOST_WIDE_INT bitpos, rtx value) + unsigned HOST_WIDE_INT bitpos, + unsigned HOST_WIDE_INT bitregion_start, + unsigned HOST_WIDE_INT bitregion_end, + rtx value) { enum machine_mode mode; unsigned int total_bits = BITS_PER_WORD; @@ -811,12 +872,19 @@ store_fixed_bit_field (rtx op0, unsigned /* Special treatment for a bit field split across two registers. */ if (bitsize + bitpos > BITS_PER_WORD) { - store_split_bit_field (op0, bitsize, bitpos, value); + store_split_bit_field (op0, bitsize, bitpos, + bitregion_start, bitregion_end, + value); return; } } else { + unsigned HOST_WIDE_INT maxbits = MAX_FIXED_MODE_SIZE; + + if (bitregion_end) + maxbits = bitregion_end - bitregion_start + 1; + /* Get the proper mode to use for this field. We want a mode that includes the entire field. If such a mode would be larger than a word, we won't be doing the extraction the normal way. 
@@ -829,10 +897,12 @@ store_fixed_bit_field (rtx op0, unsigned if (MEM_VOLATILE_P (op0) && GET_MODE_BITSIZE (GET_MODE (op0)) > 0 + && GET_MODE_BITSIZE (GET_MODE (op0)) <= maxbits && flag_strict_volatile_bitfields > 0) mode = GET_MODE (op0); else mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT, + bitregion_start, bitregion_end, MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0)); if (mode == VOIDmode) @@ -840,7 +910,7 @@ store_fixed_bit_field (rtx op0, unsigned /* The only way this should occur is if the field spans word boundaries. */ store_split_bit_field (op0, bitsize, bitpos + offset * BITS_PER_UNIT, - value); + bitregion_start, bitregion_end, value); return; } @@ -960,7 +1030,10 @@ store_fixed_bit_field (rtx op0, unsigned static void store_split_bit_field (rtx op0, unsigned HOST_WIDE_INT bitsize, - unsigned HOST_WIDE_INT bitpos, rtx value) + unsigned HOST_WIDE_INT bitpos, + unsigned HOST_WIDE_INT bitregion_start, + unsigned HOST_WIDE_INT bitregion_end, + rtx value) { unsigned int unit; unsigned int bitsdone = 0; @@ -1075,7 +1148,7 @@ store_split_bit_field (rtx op0, unsigned it is just an out-of-bounds access. Ignore it. */ if (word != const0_rtx) store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize, - thispos, part); + thispos, bitregion_start, bitregion_end, part); bitsdone += thissize; } } @@ -1515,7 +1588,7 @@ extract_bit_field_1 (rtx str_rtx, unsign if (GET_MODE (op0) == BLKmode || (ext_mode != MAX_MACHINE_MODE && GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (ext_mode))) - bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0), + bestmode = get_best_mode (bitsize, bitnum, 0, 0, MEM_ALIGN (op0), (ext_mode == MAX_MACHINE_MODE ? VOIDmode : ext_mode), MEM_VOLATILE_P (op0)); @@ -1641,7 +1714,7 @@ extract_fixed_bit_field (enum machine_mo mode = tmode; } else - mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT, + mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT, 0, 0, MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0)); if (mode == VOIDmode) Index: Makefile.in =================================================================== --- Makefile.in (revision 176280) +++ Makefile.in (working copy) @@ -2908,7 +2908,7 @@ expr.o : expr.c $(CONFIG_H) $(SYSTEM_H) reload.h langhooks.h intl.h $(TM_P_H) $(TARGET_H) \ tree-iterator.h gt-expr.h $(MACHMODE_H) $(TIMEVAR_H) $(TREE_FLOW_H) \ $(TREE_PASS_H) $(DF_H) $(DIAGNOSTIC_H) vecprim.h $(SSAEXPAND_H) \ - $(COMMON_TARGET_H) + $(PARAMS_H) $(COMMON_TARGET_H) dojump.o : dojump.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \ $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(OPTABS_H) $(INSN_ATTR_H) insn-config.h \ langhooks.h $(GGC_H) gt-dojump.h vecprim.h $(BASIC_BLOCK_H) output.h Index: stmt.c =================================================================== --- stmt.c (revision 176280) +++ stmt.c (working copy) @@ -1759,7 +1759,8 @@ expand_return (tree retval) /* Use bitpos for the source extraction (left justified) and xbitpos for the destination store (right justified). 
*/ - store_bit_field (dst, bitsize, xbitpos % BITS_PER_WORD, word_mode, + store_bit_field (dst, bitsize, xbitpos % BITS_PER_WORD, + 0, 0, word_mode, extract_bit_field (src, bitsize, bitpos % BITS_PER_WORD, 1, false, NULL_RTX, word_mode, word_mode)); Index: params.def =================================================================== --- params.def (revision 176280) +++ params.def (working copy) @@ -902,6 +902,12 @@ DEFPARAM (PARAM_CASE_VALUES_THRESHOLD, "if 0, use the default for the machine", 0, 0, 0) +/* Data race flags for C++0x memory model compliance. */ +DEFPARAM (PARAM_ALLOW_STORE_DATA_RACES, + "allow-store-data-races", + "Allow new data races on stores to be introduced", + 1, 0, 1) + /* Local variables: