Message ID | 20210113191122.GO1034503@tucnak |
---|---|
State | New |
Headers | show |
Series | c-family: Improve MEM_REF printing for diagnostics [PR98597] | expand |
On Wed, 13 Jan 2021, Jakub Jelinek wrote: > Hi! > > The following patch doesn't actually fix the print_mem_ref bugs, I've kept > it for now as broken as it was, but at least improves the cases where > we can unambiguously map back MEM[&something + off] into some particular > reference (e.g. something.foo[1].bar etc.). > In the distant past I think we were folding such MEM_REFs back to > COMPONENT_REFs and ARRAY_REFs, but we've stopped doing that. Yeah, because it has different semantics - *(((int *)t + 3) accesses an int object while t.u.b accesses a 't' object from the TBAA perspective. > But for > diagnostics that is what the user actually want to see IMHO. > So on the attached testcase, instead of printing what is in left column > it prints what is in right column: > ((int*)t) + 3 t.u.b > ((int*)t) + 6 t.u.e.i > ((int*)t) + 8 t.v > s + 1 s[1] so while that's "nice" in general, for TBAA diagnostics it might actually be misleading. I wonder whether we absolutely need to print a C expression here. We could print, instead of *((int *)t + 3), "access to a memory object of type 'int' at offset 12 bytes from 't'", thus explain in plain english. That said, *((int *)t + 3) is exactly what the access is, semantically. There's the case of a mismatch of the access type and the TBAA type which we cannot write down in C terms but maybe we want to have a builtin for this? __builtin_access (ptr, lvalue-type, tbaa-type)? > Of course, print_mem_ref needs to be also fixed to avoid printing the > nonsense it is printing right now, t is a structure type, so it can't be > cast to int* in C and in C++ only using some user operator, and > the result of what it printed is a pointer, while the uninitialized reads > are int. > > I was hoping Martin would fix that, but given his comment in the PR I think > I'll fix it myself tomorrow. > > Anyway, this patch is useful on its own. Bootstrapped/regtested on > x86_64-linux and i686-linux, ok for trunk? In the light of Martins patch this is probably reasonable but still the general direction is wrong (which is why I didn't approve Martins original patch). I'm also somewhat disappointed we're breaking this so late in the cycle. c_fold_indirect_ref_for_warn doesn't look like it is especially careful about error recovery issues (error_mark_node in random places of the trees). Maybe that never happens. Thanks, Richard. > 2021-01-13 Jakub Jelinek <jakub@redhat.com> > > PR tree-optimization/98597 > * c-pretty-print.c (c_fold_indirect_ref_for_warn): New function. > (print_mem_ref): Call it. > > * gcc.dg/uninit-40.c: New test. > > --- gcc/c-family/c-pretty-print.c.jj 2021-01-13 08:02:09.425498954 +0100 > +++ gcc/c-family/c-pretty-print.c 2021-01-13 15:02:57.860329998 +0100 > @@ -1809,6 +1809,113 @@ pp_c_call_argument_list (c_pretty_printe > pp_c_right_paren (pp); > } > > +/* Try to fold *(type *)&op into op.fld.fld2[1] if possible. > + Only used for printing expressions. Should punt if ambiguous > + (e.g. in unions). */ > + > +static tree > +c_fold_indirect_ref_for_warn (location_t loc, tree type, tree op, > + offset_int &off) > +{ > + tree optype = TREE_TYPE (op); > + if (off == 0) > + { > + if (lang_hooks.types_compatible_p (optype, type)) > + return op; > + /* *(foo *)&complexfoo => __real__ complexfoo */ > + else if (TREE_CODE (optype) == COMPLEX_TYPE > + && lang_hooks.types_compatible_p (type, TREE_TYPE (optype))) > + return build1_loc (loc, REALPART_EXPR, type, op); > + } > + /* ((foo*)&complexfoo)[1] => __imag__ complexfoo */ > + else if (TREE_CODE (optype) == COMPLEX_TYPE > + && lang_hooks.types_compatible_p (type, TREE_TYPE (optype)) > + && tree_to_uhwi (TYPE_SIZE_UNIT (type)) == off) > + { > + off = 0; > + return build1_loc (loc, IMAGPART_EXPR, type, op); > + } > + /* ((foo *)&fooarray)[x] => fooarray[x] */ > + if (TREE_CODE (optype) == ARRAY_TYPE > + && TYPE_SIZE_UNIT (TREE_TYPE (optype)) > + && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (optype))) == INTEGER_CST > + && !integer_zerop (TYPE_SIZE_UNIT (TREE_TYPE (optype)))) > + { > + tree type_domain = TYPE_DOMAIN (optype); > + tree min_val = size_zero_node; > + if (type_domain && TYPE_MIN_VALUE (type_domain)) > + min_val = TYPE_MIN_VALUE (type_domain); > + offset_int el_sz = wi::to_offset (TYPE_SIZE_UNIT (TREE_TYPE (optype))); > + offset_int idx = off / el_sz; > + offset_int rem = off % el_sz; > + if (TREE_CODE (min_val) == INTEGER_CST) > + { > + tree index > + = wide_int_to_tree (sizetype, idx + wi::to_offset (min_val)); > + op = build4_loc (loc, ARRAY_REF, TREE_TYPE (optype), op, index, > + NULL_TREE, NULL_TREE); > + off = rem; > + if (tree ret = c_fold_indirect_ref_for_warn (loc, type, op, off)) > + return ret; > + return op; > + } > + } > + /* ((foo *)&struct_with_foo_field)[x] => COMPONENT_REF */ > + else if (TREE_CODE (optype) == RECORD_TYPE) > + { > + for (tree field = TYPE_FIELDS (optype); > + field; field = DECL_CHAIN (field)) > + if (TREE_CODE (field) == FIELD_DECL > + && TREE_TYPE (field) != error_mark_node > + && TYPE_SIZE_UNIT (TREE_TYPE (field)) > + && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (field))) == INTEGER_CST) > + { > + tree pos = byte_position (field); > + if (TREE_CODE (pos) != INTEGER_CST) > + continue; > + offset_int upos = wi::to_offset (pos); > + offset_int el_sz > + = wi::to_offset (TYPE_SIZE_UNIT (TREE_TYPE (field))); > + if (upos <= off && off < upos + el_sz) > + { > + tree cop = build3_loc (loc, COMPONENT_REF, TREE_TYPE (field), > + op, field, NULL_TREE); > + off = off - upos; > + if (tree ret = c_fold_indirect_ref_for_warn (loc, type, cop, > + off)) > + return ret; > + return cop; > + } > + } > + } > + /* Similarly for unions, but in this case try to be very conservative, > + only match if some field has type compatible with type and it is the > + only such field. */ > + else if (TREE_CODE (optype) == UNION_TYPE) > + { > + tree fld = NULL_TREE; > + for (tree field = TYPE_FIELDS (optype); > + field; field = DECL_CHAIN (field)) > + if (TREE_CODE (field) == FIELD_DECL > + && TREE_TYPE (field) != error_mark_node > + && lang_hooks.types_compatible_p (TREE_TYPE (field), type)) > + { > + if (fld) > + return NULL_TREE; > + else > + fld = field; > + } > + if (fld) > + { > + off = 0; > + return build3_loc (loc, COMPONENT_REF, TREE_TYPE (fld), op, fld, > + NULL_TREE); > + } > + } > + > + return NULL_TREE; > +} > + > /* Print the MEM_REF expression REF, including its type and offset. > Apply casts as necessary if the type of the access is different > from the type of the accessed object. Produce compact output > @@ -1836,6 +1943,17 @@ print_mem_ref (c_pretty_printer *pp, tre > const bool addr = TREE_CODE (arg) == ADDR_EXPR; > if (addr) > { > + tree op > + = c_fold_indirect_ref_for_warn (EXPR_LOCATION (e), TREE_TYPE (e), > + TREE_OPERAND (arg, 0), byte_off); > + if (op > + && byte_off == 0 > + && lang_hooks.types_compatible_p (TREE_TYPE (e), TREE_TYPE (op))) > + { > + pp->expression (op); > + return; > + } > + byte_off = wi::to_offset (TREE_OPERAND (e, 1)); > arg = TREE_OPERAND (arg, 0); > if (byte_off == 0) > { > --- gcc/testsuite/gcc.dg/uninit-40.c.jj 2021-01-13 15:23:47.366134905 +0100 > +++ gcc/testsuite/gcc.dg/uninit-40.c 2021-01-13 15:23:15.208500279 +0100 > @@ -0,0 +1,31 @@ > +/* PR tree-optimization/98597 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -Wuninitialized" } */ > + > +union U { double d; int i; float f; }; > +struct S { char a; int b; char c; unsigned d; union U e; }; > +struct T { char t; struct S u; int v; }; > +typedef short U[2][2]; > +void baz (U *); > + > +static inline int > +bar (char *p) > +{ > + return *(int *) p; > +} > + > +void > +foo (int *q) > +{ > + struct T t; > + t.t = 1; > + t.u.c = 2; > + char *pt = (char *) &t; > + q[0] = bar (pt + __builtin_offsetof (struct T, u.b)); /* { dg-warning "'t\\.u\\.b' is used uninitialized" } */ > + q[1] = bar (pt + __builtin_offsetof (struct T, u.e)); /* { dg-warning "'t\\.u\\.e\\.i' is used uninitialized" } */ > + q[2] = bar (pt + __builtin_offsetof (struct T, v)); /* { dg-warning "'t\\.v' is used uninitialized" } */ > + int s[3]; > + s[0] = 1; > + char *ps = (char *) s; > + q[3] = bar (ps + sizeof (int)); /* { dg-warning "'s\\\[1\\\]' is used uninitialized" } */ > +} > > Jakub > >
On Thu, Jan 14, 2021 at 08:43:54AM +0100, Richard Biener wrote: > > But for > > diagnostics that is what the user actually want to see IMHO. > > So on the attached testcase, instead of printing what is in left column > > it prints what is in right column: > > ((int*)t) + 3 t.u.b > > ((int*)t) + 6 t.u.e.i > > ((int*)t) + 8 t.v > > s + 1 s[1] > > so while that's "nice" in general, for TBAA diagnostics it might actually > be misleading. > > I wonder whether we absolutely need to print a C expression here. I'm afraid yes, because it is not a toplevel routine, but something called from the c-family pretty-printers, so it can be in the middle of arbitrary C/C++ expressions. And printing (3 * (access to a memory object of type 'int' at offset 12 bytes from 't') + 31) * 42 would be just weird. > We could print, instead of *((int *)t + 3), "access to a memory > object of type 'int' at offset 12 bytes from 't'", thus explain > in plain english. > > That said, *((int *)t + 3) is exactly what the access is, *((int *)&t + 3) actually, the code I haven't touched has multiple bugs. The user generally doesn't know the exact layout of the structures, and especially with C++ templates it is extremely hard to figure that out, so even when we could print verbose text it would be helpful to give a hint (in your text something like (which falls into 't.u.b')). I don't see how we can print both the MEM_REF type and TBAA type in a way that would be understandable to the user. Could we print t.u.b if the TBAA type is compatible with the type of the reference and perhaps *(int*)&t.u.b if it is incompatible? From the aliasing perspective that is still different, but we don't print the TBAA type anyway. > In the light of Martins patch this is probably reasonable but still > the general direction is wrong (which is why I didn't approve Martins > original patch). I'm also somewhat disappointed we're breaking this > so late in the cycle. I'm too. > c_fold_indirect_ref_for_warn doesn't look like it is especially > careful about error recovery issues (error_mark_node in random > places of the trees). Maybe that never happens. I've created it by copying and adjusting the C++ cxx_fold_indirect_ref_1 which had those error_mark_node checks in there (haven't verified if they are strictly necessary or not), but as the diagnostic code isn't used solely during middle-end, but also in the FEs and I remember several cases where the types had error marks within the types in there. The function seemed to be too short and after the changes too different from cxx_fold_indirect_ref_1, which contains some very C++ specific parts, handling of the active union member or empty bases (there is a pending PR for it) etc. Jakub
On Thu, Jan 14, 2021 at 09:28:31AM +0100, Jakub Jelinek via Gcc-patches wrote: > I'm afraid yes, because it is not a toplevel routine, but something called > from the c-family pretty-printers, so it can be in the middle of arbitrary > C/C++ expressions. And printing > (3 * (access to a memory object of type 'int' at offset 12 bytes from 't') + 31) * 42 > would be just weird. > > > We could print, instead of *((int *)t + 3), "access to a memory > > object of type 'int' at offset 12 bytes from 't'", thus explain > > in plain english. > > > > That said, *((int *)t + 3) is exactly what the access is, > > *((int *)&t + 3) actually, the code I haven't touched has multiple bugs. > > The user generally doesn't know the exact layout of the structures, > and especially with C++ templates it is extremely hard to figure that out, > so even when we could print verbose text it would be helpful to give a hint > (in your text something like (which falls into 't.u.b')). > I don't see how we can print both the MEM_REF type and TBAA type in a way > that would be understandable to the user. > > Could we print > t.u.b > if the TBAA type is compatible with the type of the reference and perhaps > *(int*)&t.u.b > if it is incompatible? > >From the aliasing perspective that is still different, but we don't print > the TBAA type anyway. There is another option I forgot about, but perhaps it is too verbose. Print *(int*)((char*)&t + offsetof (struct T, u.b)) so like *(int*)((char*)&t + 12) but print the offset in a more user-friendly way. Jakub
On Thu, 14 Jan 2021, Jakub Jelinek wrote: > On Thu, Jan 14, 2021 at 09:28:31AM +0100, Jakub Jelinek via Gcc-patches wrote: > > I'm afraid yes, because it is not a toplevel routine, but something called > > from the c-family pretty-printers, so it can be in the middle of arbitrary > > C/C++ expressions. And printing > > (3 * (access to a memory object of type 'int' at offset 12 bytes from 't') + 31) * 42 > > would be just weird. > > > > > We could print, instead of *((int *)t + 3), "access to a memory > > > object of type 'int' at offset 12 bytes from 't'", thus explain > > > in plain english. > > > > > > That said, *((int *)t + 3) is exactly what the access is, > > > > *((int *)&t + 3) actually, the code I haven't touched has multiple bugs. > > > > The user generally doesn't know the exact layout of the structures, > > and especially with C++ templates it is extremely hard to figure that out, > > so even when we could print verbose text it would be helpful to give a hint > > (in your text something like (which falls into 't.u.b')). > > I don't see how we can print both the MEM_REF type and TBAA type in a way > > that would be understandable to the user. > > > > Could we print > > t.u.b > > if the TBAA type is compatible with the type of the reference and perhaps > > *(int*)&t.u.b > > if it is incompatible? > > >From the aliasing perspective that is still different, but we don't print > > the TBAA type anyway. True. As said we could simply add a GCC extension to write a MEM_REF in source and print that syntax ... then it would be valid (GCC) C/C++. > There is another option I forgot about, but perhaps it is too verbose. > Print > *(int*)((char*)&t + offsetof (struct T, u.b)) or rather offsetof (struct T, u) to not single out a specific union member? Richard. > so like > *(int*)((char*)&t + 12) > but print the offset in a more user-friendly way.
On Thu, Jan 14, 2021 at 11:05:40AM +0100, Richard Biener wrote: > > > Could we print > > > t.u.b > > > if the TBAA type is compatible with the type of the reference and perhaps > > > *(int*)&t.u.b > > > if it is incompatible? > > > >From the aliasing perspective that is still different, but we don't print > > > the TBAA type anyway. > > True. As said we could simply add a GCC extension to write a MEM_REF > in source and print that syntax ... then it would be valid (GCC) C/C++. But even if we do that unless people are familiar with that extension they wouldn't know what it means (and they didn't write it in that way in their source). > > There is another option I forgot about, but perhaps it is too verbose. > > Print > > *(int*)((char*)&t + offsetof (struct T, u.b)) > > or rather offsetof (struct T, u) to not single out a specific union > member? Sure, I can just get rid of the UNION_TYPE handling from the function, or use it only if the TBAA access type is compatible. Jakub
On Thu, 14 Jan 2021, Jakub Jelinek wrote: > On Thu, Jan 14, 2021 at 11:05:40AM +0100, Richard Biener wrote: > > > > Could we print > > > > t.u.b > > > > if the TBAA type is compatible with the type of the reference and perhaps > > > > *(int*)&t.u.b > > > > if it is incompatible? > > > > >From the aliasing perspective that is still different, but we don't print > > > > the TBAA type anyway. > > > > True. As said we could simply add a GCC extension to write a MEM_REF > > in source and print that syntax ... then it would be valid (GCC) C/C++. > > But even if we do that unless people are familiar with that extension they > wouldn't know what it means (and they didn't write it in that way in their > source). I'd just use it in case we can't express it in the C/C++ way (thus when the TBAA type differs). We already print {ref-all} in type dumping for example, which isn't C/C++ either. > > > There is another option I forgot about, but perhaps it is too verbose. > > > Print > > > *(int*)((char*)&t + offsetof (struct T, u.b)) > > > > or rather offsetof (struct T, u) to not single out a specific union > > member? > > Sure, I can just get rid of the UNION_TYPE handling from the function, > or use it only if the TBAA access type is compatible. Sure. I wonder if we can use some pp flags to tell whether we're being called from FE or middle-end diagnostics and thus whether we should try to produce source-like expressions or can expect weird GIMPLE IL derived expressions. Richard.
On 1/14/21 12:43 AM, Richard Biener wrote: > On Wed, 13 Jan 2021, Jakub Jelinek wrote: > >> Hi! >> >> The following patch doesn't actually fix the print_mem_ref bugs, I've kept >> it for now as broken as it was, but at least improves the cases where >> we can unambiguously map back MEM[&something + off] into some particular >> reference (e.g. something.foo[1].bar etc.). >> In the distant past I think we were folding such MEM_REFs back to >> COMPONENT_REFs and ARRAY_REFs, but we've stopped doing that. > > Yeah, because it has different semantics - *(((int *)t + 3) > accesses an int object while t.u.b accesses a 't' object from the TBAA > perspective. > >> But for >> diagnostics that is what the user actually want to see IMHO. >> So on the attached testcase, instead of printing what is in left column >> it prints what is in right column: >> ((int*)t) + 3 t.u.b >> ((int*)t) + 6 t.u.e.i >> ((int*)t) + 8 t.v >> s + 1 s[1] > > so while that's "nice" in general, for TBAA diagnostics it might actually > be misleading. > > I wonder whether we absolutely need to print a C expression here. > We could print, instead of *((int *)t + 3), "access to a memory > object of type 'int' at offset 12 bytes from 't'", thus explain > in plain english. > > That said, *((int *)t + 3) is exactly what the access is, > semantically. There's the case of a mismatch of the access type > and the TBAA type which we cannot write down in C terms but maybe > we want to have a builtin for this? __builtin_access (ptr, lvalue-type, > tbaa-type)? > >> Of course, print_mem_ref needs to be also fixed to avoid printing the >> nonsense it is printing right now, t is a structure type, so it can't be >> cast to int* in C and in C++ only using some user operator, and >> the result of what it printed is a pointer, while the uninitialized reads >> are int. >> >> I was hoping Martin would fix that, but given his comment in the PR I think >> I'll fix it myself tomorrow. >> >> Anyway, this patch is useful on its own. Bootstrapped/regtested on >> x86_64-linux and i686-linux, ok for trunk? > > In the light of Martins patch this is probably reasonable but still > the general direction is wrong (which is why I didn't approve Martins > original patch). I'm also somewhat disappointed we're breaking this > so late in the cycle. So am I. I didn't test this change as exhaustively as I could and (in light of the poor test coverage) should have. That's my bad. FWIW, I did do it for the first patch (by instrumenting GCC and formatting every MEM_REF it came across), but it didn't occur to me to do it this time around. I have now completed this testing (it found one more ICE elsewhere that I'll fix soon). That said, as I mentioned in the patch submission, most middle end warnings don't format MEM_REFs. -Wuninitialized is an outlier here. Most middle end warnings about invalid accesses print the decl or allocation call with either a type or size of the access, followed by either an array index or a byte offset in to the target of the access. For consistency I'd like to converge most middle end warnings on the same style (formatted by the same code). When that happens, the MEM_REF format should become much less important, so I wouldn't invest too much effort into perfecting it. Martin
--- gcc/c-family/c-pretty-print.c.jj 2021-01-13 08:02:09.425498954 +0100 +++ gcc/c-family/c-pretty-print.c 2021-01-13 15:02:57.860329998 +0100 @@ -1809,6 +1809,113 @@ pp_c_call_argument_list (c_pretty_printe pp_c_right_paren (pp); } +/* Try to fold *(type *)&op into op.fld.fld2[1] if possible. + Only used for printing expressions. Should punt if ambiguous + (e.g. in unions). */ + +static tree +c_fold_indirect_ref_for_warn (location_t loc, tree type, tree op, + offset_int &off) +{ + tree optype = TREE_TYPE (op); + if (off == 0) + { + if (lang_hooks.types_compatible_p (optype, type)) + return op; + /* *(foo *)&complexfoo => __real__ complexfoo */ + else if (TREE_CODE (optype) == COMPLEX_TYPE + && lang_hooks.types_compatible_p (type, TREE_TYPE (optype))) + return build1_loc (loc, REALPART_EXPR, type, op); + } + /* ((foo*)&complexfoo)[1] => __imag__ complexfoo */ + else if (TREE_CODE (optype) == COMPLEX_TYPE + && lang_hooks.types_compatible_p (type, TREE_TYPE (optype)) + && tree_to_uhwi (TYPE_SIZE_UNIT (type)) == off) + { + off = 0; + return build1_loc (loc, IMAGPART_EXPR, type, op); + } + /* ((foo *)&fooarray)[x] => fooarray[x] */ + if (TREE_CODE (optype) == ARRAY_TYPE + && TYPE_SIZE_UNIT (TREE_TYPE (optype)) + && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (optype))) == INTEGER_CST + && !integer_zerop (TYPE_SIZE_UNIT (TREE_TYPE (optype)))) + { + tree type_domain = TYPE_DOMAIN (optype); + tree min_val = size_zero_node; + if (type_domain && TYPE_MIN_VALUE (type_domain)) + min_val = TYPE_MIN_VALUE (type_domain); + offset_int el_sz = wi::to_offset (TYPE_SIZE_UNIT (TREE_TYPE (optype))); + offset_int idx = off / el_sz; + offset_int rem = off % el_sz; + if (TREE_CODE (min_val) == INTEGER_CST) + { + tree index + = wide_int_to_tree (sizetype, idx + wi::to_offset (min_val)); + op = build4_loc (loc, ARRAY_REF, TREE_TYPE (optype), op, index, + NULL_TREE, NULL_TREE); + off = rem; + if (tree ret = c_fold_indirect_ref_for_warn (loc, type, op, off)) + return ret; + return op; + } + } + /* ((foo *)&struct_with_foo_field)[x] => COMPONENT_REF */ + else if (TREE_CODE (optype) == RECORD_TYPE) + { + for (tree field = TYPE_FIELDS (optype); + field; field = DECL_CHAIN (field)) + if (TREE_CODE (field) == FIELD_DECL + && TREE_TYPE (field) != error_mark_node + && TYPE_SIZE_UNIT (TREE_TYPE (field)) + && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (field))) == INTEGER_CST) + { + tree pos = byte_position (field); + if (TREE_CODE (pos) != INTEGER_CST) + continue; + offset_int upos = wi::to_offset (pos); + offset_int el_sz + = wi::to_offset (TYPE_SIZE_UNIT (TREE_TYPE (field))); + if (upos <= off && off < upos + el_sz) + { + tree cop = build3_loc (loc, COMPONENT_REF, TREE_TYPE (field), + op, field, NULL_TREE); + off = off - upos; + if (tree ret = c_fold_indirect_ref_for_warn (loc, type, cop, + off)) + return ret; + return cop; + } + } + } + /* Similarly for unions, but in this case try to be very conservative, + only match if some field has type compatible with type and it is the + only such field. */ + else if (TREE_CODE (optype) == UNION_TYPE) + { + tree fld = NULL_TREE; + for (tree field = TYPE_FIELDS (optype); + field; field = DECL_CHAIN (field)) + if (TREE_CODE (field) == FIELD_DECL + && TREE_TYPE (field) != error_mark_node + && lang_hooks.types_compatible_p (TREE_TYPE (field), type)) + { + if (fld) + return NULL_TREE; + else + fld = field; + } + if (fld) + { + off = 0; + return build3_loc (loc, COMPONENT_REF, TREE_TYPE (fld), op, fld, + NULL_TREE); + } + } + + return NULL_TREE; +} + /* Print the MEM_REF expression REF, including its type and offset. Apply casts as necessary if the type of the access is different from the type of the accessed object. Produce compact output @@ -1836,6 +1943,17 @@ print_mem_ref (c_pretty_printer *pp, tre const bool addr = TREE_CODE (arg) == ADDR_EXPR; if (addr) { + tree op + = c_fold_indirect_ref_for_warn (EXPR_LOCATION (e), TREE_TYPE (e), + TREE_OPERAND (arg, 0), byte_off); + if (op + && byte_off == 0 + && lang_hooks.types_compatible_p (TREE_TYPE (e), TREE_TYPE (op))) + { + pp->expression (op); + return; + } + byte_off = wi::to_offset (TREE_OPERAND (e, 1)); arg = TREE_OPERAND (arg, 0); if (byte_off == 0) { --- gcc/testsuite/gcc.dg/uninit-40.c.jj 2021-01-13 15:23:47.366134905 +0100 +++ gcc/testsuite/gcc.dg/uninit-40.c 2021-01-13 15:23:15.208500279 +0100 @@ -0,0 +1,31 @@ +/* PR tree-optimization/98597 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -Wuninitialized" } */ + +union U { double d; int i; float f; }; +struct S { char a; int b; char c; unsigned d; union U e; }; +struct T { char t; struct S u; int v; }; +typedef short U[2][2]; +void baz (U *); + +static inline int +bar (char *p) +{ + return *(int *) p; +} + +void +foo (int *q) +{ + struct T t; + t.t = 1; + t.u.c = 2; + char *pt = (char *) &t; + q[0] = bar (pt + __builtin_offsetof (struct T, u.b)); /* { dg-warning "'t\\.u\\.b' is used uninitialized" } */ + q[1] = bar (pt + __builtin_offsetof (struct T, u.e)); /* { dg-warning "'t\\.u\\.e\\.i' is used uninitialized" } */ + q[2] = bar (pt + __builtin_offsetof (struct T, v)); /* { dg-warning "'t\\.v' is used uninitialized" } */ + int s[3]; + s[0] = 1; + char *ps = (char *) s; + q[3] = bar (ps + sizeof (int)); /* { dg-warning "'s\\\[1\\\]' is used uninitialized" } */ +}