diff mbox series

C++ warning on vexing parse

Message ID 2305655f-3433-a199-e49a-76bfc12986c4@acm.org
State New
Headers show
Series C++ warning on vexing parse | expand

Commit Message

Nathan Sidwell Oct. 3, 2017, 5:08 p.m. UTC
[non-c++ people on CC, there's a reason ...]

This patch adds a new warning, concerning unnecessary parentheses on a 
declaration.  For instance:
    prettyprinter (pp);
That's a declaration of a pp variable of type prettyprinter -- not a 
call of a prettyprinter function.  The reason is that in C++, anything 
that is syntactically a declaration, is a declaration.  This can lead to 
surprising unexpected code generation, and the documentation I added 
provides motivation:
   std::unique_lock<std::mutex> (mymutex);
That is NOT a lock of mymutex.  It's a declaration of a variable called 
mymutex locking no mutex.  If we're in a member function and 'mymutex' 
is a field, Wshadow-parm doesn't trigger either.

This patch notes when a direct-declarator is parenthesized, and then 
warns about it in grokdeclarator.  Obviously, parens are sometimes 
necessary -- when declaring pointers to arrays or functions, and we 
don't warn then.  The simplest way to do that was during the parsing, 
not in grokdeclarator, where the cp_declarator object is constant.  We'd 
either have to change that, or have some other local state.

I added this to -Wparentheses (enabled by -Wall).  It triggered in a few 
cases during bootstrap:

1) in toplev.c we have the above prettyprinter example.  I think that's 
clearly poorly formatted code.

2) caller-save.c: insert_save has 'HARD_REG_SET (*to_save)', which also 
seems poorly formatted to me -- other cases of ptr-to-HRS don't do this.

3) A couple of places do:
    T (name  // line break to avoid wrap
       [LONGEXPR][OTHERLONGEXPR]);

The parens aid the editor's formatter. The patch removes the parens, but 
I can see there may be disagreement.  I suppose we could permit parens 
at the outermost declarator for an array?
Affects ira-int.h (target_ira_int::x_ira_reg_mode_hard_regset)
	 reload.h (target_reload::x_regno_save_mode)

3) A set of typedef'd function types (NOT ptr-to-fn).  The name being 
typedef'd is parenthesized and not an idiom I recall seeing before. 
AFAICT these are the only cases of typedefing a plain fn.  We could, I 
suppose, ignore parens on a typedef'd fntype, but that seems a little 
random.
Affects gengtype.c (frul_actionrout_t)
	lto-streamer.h (lto_get_section_data_f & to_free_section_data_f)
	tree-ssa-threadedge.c (pfn_simplify)

If you've objections to the changes I've made in instances of #3 & #4 
let me know.

Jason, do you have concerns about the C++ patch itself?

nathan

Comments

Jason Merrill Oct. 4, 2017, 2:44 p.m. UTC | #1
On Tue, Oct 3, 2017 at 1:08 PM, Nathan Sidwell <nathan@acm.org> wrote:
> [non-c++ people on CC, there's a reason ...]
>
> This patch adds a new warning, concerning unnecessary parentheses on a
> declaration.  For instance:
>    prettyprinter (pp);
> That's a declaration of a pp variable of type prettyprinter -- not a call of
> a prettyprinter function.  The reason is that in C++, anything that is
> syntactically a declaration, is a declaration.  This can lead to surprising
> unexpected code generation, and the documentation I added provides
> motivation:
>   std::unique_lock<std::mutex> (mymutex);
> That is NOT a lock of mymutex.  It's a declaration of a variable called
> mymutex locking no mutex.  If we're in a member function and 'mymutex' is a
> field, Wshadow-parm doesn't trigger either.
>
> This patch notes when a direct-declarator is parenthesized, and then warns
> about it in grokdeclarator.  Obviously, parens are sometimes necessary --
> when declaring pointers to arrays or functions, and we don't warn then.  The
> simplest way to do that was during the parsing, not in grokdeclarator, where
> the cp_declarator object is constant.  We'd either have to change that, or
> have some other local state.
>
> I added this to -Wparentheses (enabled by -Wall).  It triggered in a few
> cases during bootstrap:
>
> 1) in toplev.c we have the above prettyprinter example.  I think that's
> clearly poorly formatted code.

Definitely.

> 2) caller-save.c: insert_save has 'HARD_REG_SET (*to_save)', which also
> seems poorly formatted to me -- other cases of ptr-to-HRS don't do this.

Probably autopilot from someone used to parenthesizing declarators for
pointer-to-function or pointer-to-array types.

> 3) A couple of places do:
>    T (name  // line break to avoid wrap
>       [LONGEXPR][OTHERLONGEXPR]);
>
> The parens aid the editor's formatter. The patch removes the parens, but I
> can see there may be disagreement.  I suppose we could permit parens at the
> outermost declarator for an array?
> Affects ira-int.h (target_ira_int::x_ira_reg_mode_hard_regset)
>          reload.h (target_reload::x_regno_save_mode)

We could also check whether the declarator spans lines to detect this case.

> 4) A set of typedef'd function types (NOT ptr-to-fn).  The name being
> typedef'd is parenthesized and not an idiom I recall seeing before. AFAICT
> these are the only cases of typedefing a plain fn.  We could, I suppose,
> ignore parens on a typedef'd fntype, but that seems a little random.
> Affects gengtype.c (frul_actionrout_t)
>         lto-streamer.h (lto_get_section_data_f & to_free_section_data_f)
>         tree-ssa-threadedge.c (pfn_simplify)

Hmm, I don't think we want to diagnose these; if a parameter-list
follows the parenthesized declarator, it isn't ambiguous.

3 and 4 seem like false positives.  The problematic cases are all ones
where the parenthesized name is at the end of the declarator; if the
declarator continues after the name, either within or without the
parens, this

> Jason, do you have concerns about the C++ patch itself?

Nope.

Jason
Jakub Jelinek Oct. 4, 2017, 2:49 p.m. UTC | #2
On Wed, Oct 04, 2017 at 10:44:11AM -0400, Jason Merrill wrote:
> > 3) A couple of places do:
> >    T (name  // line break to avoid wrap
> >       [LONGEXPR][OTHERLONGEXPR]);
> >
> > The parens aid the editor's formatter. The patch removes the parens, but I
> > can see there may be disagreement.  I suppose we could permit parens at the
> > outermost declarator for an array?
> > Affects ira-int.h (target_ira_int::x_ira_reg_mode_hard_regset)
> >          reload.h (target_reload::x_regno_save_mode)
> 
> We could also check whether the declarator spans lines to detect this case.

I think it is better to just fix it up so that no line break is needed
there.  E.g. define a macro, enum or const int with the sizes and use that
in the declaration.

> > 4) A set of typedef'd function types (NOT ptr-to-fn).  The name being
> > typedef'd is parenthesized and not an idiom I recall seeing before. AFAICT
> > these are the only cases of typedefing a plain fn.  We could, I suppose,
> > ignore parens on a typedef'd fntype, but that seems a little random.
> > Affects gengtype.c (frul_actionrout_t)
> >         lto-streamer.h (lto_get_section_data_f & to_free_section_data_f)
> >         tree-ssa-threadedge.c (pfn_simplify)
> 
> Hmm, I don't think we want to diagnose these; if a parameter-list
> follows the parenthesized declarator, it isn't ambiguous.
> 
> 3 and 4 seem like false positives.  The problematic cases are all ones
> where the parenthesized name is at the end of the declarator; if the
> declarator continues after the name, either within or without the
> parens, this
> 
> > Jason, do you have concerns about the C++ patch itself?
> 
> Nope.

	Jakub
Eric Gallager Oct. 5, 2017, 12:17 a.m. UTC | #3
On Tue, Oct 3, 2017 at 1:08 PM, Nathan Sidwell <nathan@acm.org> wrote:
> [non-c++ people on CC, there's a reason ...]
>
> This patch adds a new warning, concerning unnecessary parentheses on a
> declaration.  For instance:
>    prettyprinter (pp);
> That's a declaration of a pp variable of type prettyprinter -- not a call of
> a prettyprinter function.  The reason is that in C++, anything that is
> syntactically a declaration, is a declaration.  This can lead to surprising
> unexpected code generation, and the documentation I added provides
> motivation:
>   std::unique_lock<std::mutex> (mymutex);
> That is NOT a lock of mymutex.  It's a declaration of a variable called
> mymutex locking no mutex.  If we're in a member function and 'mymutex' is a
> field, Wshadow-parm doesn't trigger either.
>
> This patch notes when a direct-declarator is parenthesized, and then warns
> about it in grokdeclarator.  Obviously, parens are sometimes necessary --
> when declaring pointers to arrays or functions, and we don't warn then.  The
> simplest way to do that was during the parsing, not in grokdeclarator, where
> the cp_declarator object is constant.  We'd either have to change that, or
> have some other local state.
>
> I added this to -Wparentheses (enabled by -Wall).  It triggered in a few
> cases during bootstrap:
>
> 1) in toplev.c we have the above prettyprinter example.  I think that's
> clearly poorly formatted code.
>
> 2) caller-save.c: insert_save has 'HARD_REG_SET (*to_save)', which also
> seems poorly formatted to me -- other cases of ptr-to-HRS don't do this.
>
> 3) A couple of places do:
>    T (name  // line break to avoid wrap
>       [LONGEXPR][OTHERLONGEXPR]);
>
> The parens aid the editor's formatter. The patch removes the parens, but I
> can see there may be disagreement.  I suppose we could permit parens at the
> outermost declarator for an array?
> Affects ira-int.h (target_ira_int::x_ira_reg_mode_hard_regset)
>          reload.h (target_reload::x_regno_save_mode)
>
> 3) A set of typedef'd function types (NOT ptr-to-fn).  The name being
> typedef'd is parenthesized and not an idiom I recall seeing before. AFAICT
> these are the only cases of typedefing a plain fn.  We could, I suppose,
> ignore parens on a typedef'd fntype, but that seems a little random.
> Affects gengtype.c (frul_actionrout_t)
>         lto-streamer.h (lto_get_section_data_f & to_free_section_data_f)
>         tree-ssa-threadedge.c (pfn_simplify)
>
> If you've objections to the changes I've made in instances of #3 & #4 let me
> know.
>
> Jason, do you have concerns about the C++ patch itself?
>
> nathan
> --
> Nathan Sidwell

Could you check and see if this fixes any of the preexisting
currently-open bugs related to most-vexing-parse (or similar) warnings
on Bugzilla? For example:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64679
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25814
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69855
Nathan Sidwell Oct. 5, 2017, 12:23 p.m. UTC | #4
On 10/04/2017 08:17 PM, Eric Gallager wrote:
> On Tue, Oct 3, 2017 at 1:08 PM, Nathan Sidwell <nathan@acm.org> wrote:
>> [non-c++ people on CC, there's a reason ...]
>>
>> This patch adds a new warning, concerning unnecessary parentheses on a
>> declaration.  For instance:
>>     prettyprinter (pp);

> 
> Could you check and see if this fixes any of the preexisting
> currently-open bugs related to most-vexing-parse (or similar) warnings
> on Bugzilla? For example:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64679
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25814
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69855

Sadly no.  But, 69855 appears to already have been fixed -- perhaps by 
my namespace lookup changes earlier this year?

nathan
Nathan Sidwell Oct. 5, 2017, 1:26 p.m. UTC | #5
On 10/04/2017 10:44 AM, Jason Merrill wrote:

> Hmm, I don't think we want to diagnose these; if a parameter-list
> follows the parenthesized declarator, it isn't ambiguous.

True.  I went with that approach

> 3 and 4 seem like false positives.  The problematic cases are all ones
> where the parenthesized name is at the end of the declarator; if the
> declarator continues after the name, either within or without the
> parens, this

I went with this approach.  For an array declarator of the form:
   T (ary...[X]);
we only warn when the '(' and ')' are on the same line.

Thus no triggers on GCC's source base (other than the earlier two I 
already fixed).

Applying the attached

nathan
diff mbox series

Patch

2017-10-03  Nathan Sidwell  <nathan@acm.org>

	* caller-save.c (insert_save): Remove parens around TO_SAVE arg.
	* gengtype.c (frul_actionrout_t): Remove parens.
	* ira-int.h (target_ira_int::x_ira_reg_mode_hard_regset): Likewise.
	* lto-streamer.h (lto_get_section_data_f,
	lto_free_section_data_f): Remove parens.
	* reload.h (target_reload::x_regno_save_mode): Remove parens.
	* toplev.c (toplev::main): Remove parens on pp declaration.
	* tree-ssa-threadedge.c (pfn_simplify): Remove parens.
	* doc/invoke.texi (Wparentheses): Document C++ MVP behaviour.

	cp/
	* cp-tree.h (struct cp_declarator): Add parenthesized field.
	* decl.c (grokdeclarator): Warn about unnecessary parens.
	* parser.c (make_declarator): Init parenthesized field.
	(make_call_declarator, make_array_declarator): Conditionally clear
	parenthesized field.
	(token_pair::open_loc): New accessor.
	(cp_parser_direct_declarator): Set parenthesized field.

	testsuite/
	* g++.dg/warn/mvp.C: New.

Index: caller-save.c
===================================================================
--- caller-save.c	(revision 253381)
+++ caller-save.c	(working copy)
@@ -1265,7 +1265,7 @@  insert_restore (struct insn_chain *chain
 
 static int
 insert_save (struct insn_chain *chain, int before_p, int regno,
-	     HARD_REG_SET (*to_save), machine_mode *save_mode)
+	     HARD_REG_SET *to_save, machine_mode *save_mode)
 {
   int i;
   unsigned int k;
Index: gengtype.c
===================================================================
--- gengtype.c	(revision 253381)
+++ gengtype.c	(working copy)
@@ -1894,7 +1894,7 @@  get_file_gtfilename (const input_file *i
  */
 
 /* Signature of actions in file rules.  */
-typedef outf_p (frul_actionrout_t) (input_file*, char**, char**);
+typedef outf_p frul_actionrout_t (input_file*, char**, char**);
 
 
 struct file_rule_st {
Index: ira-int.h
===================================================================
--- ira-int.h	(revision 253381)
+++ ira-int.h	(working copy)
@@ -801,8 +801,8 @@  struct target_ira_int {
 
   /* Map: hard regs X modes -> set of hard registers for storing value
      of given mode starting with given hard register.  */
-  HARD_REG_SET (x_ira_reg_mode_hard_regset
-		[FIRST_PSEUDO_REGISTER][NUM_MACHINE_MODES]);
+  HARD_REG_SET x_ira_reg_mode_hard_regset
+		[FIRST_PSEUDO_REGISTER][NUM_MACHINE_MODES];
 
   /* Maximum cost of moving from a register in one class to a register
      in another class.  Based on TARGET_REGISTER_MOVE_COST.  */
Index: lto-streamer.h
===================================================================
--- lto-streamer.h	(revision 253381)
+++ lto-streamer.h	(working copy)
@@ -278,19 +278,19 @@  lto_file_decl_data_num_ ## name ## s (st
    to be obtained.  The third parameter is the name of the function
    and is only used when finding a function body; otherwise it is
    NULL.  The fourth parameter is the length of the data returned.  */
-typedef const char* (lto_get_section_data_f) (struct lto_file_decl_data *,
-					      enum lto_section_type,
-					      const char *,
-					      size_t *);
+typedef const char* lto_get_section_data_f (struct lto_file_decl_data *,
+					    enum lto_section_type,
+					    const char *,
+					    size_t *);
 
 /* Return the data found from the above call.  The first three
    parameters are the same as above.  The fourth parameter is the data
    itself and the fifth is the length of the data. */
-typedef void (lto_free_section_data_f) (struct lto_file_decl_data *,
-					enum lto_section_type,
-					const char *,
-					const char *,
-					size_t);
+typedef void lto_free_section_data_f (struct lto_file_decl_data *,
+				      enum lto_section_type,
+				      const char *,
+				      const char *,
+				      size_t);
 
 /* The location cache holds expanded locations for streamed in trees.
    This is done to reduce memory usage of libcpp linemap that strongly preffers
Index: reload.h
===================================================================
--- reload.h	(revision 253381)
+++ reload.h	(working copy)
@@ -174,9 +174,8 @@  struct target_reload {
      enough to save the entire contents of the register.  When saving the
      register because it is live we first try to save in multi-register modes.
      If that is not possible the save is done one register at a time.  */
-  machine_mode (x_regno_save_mode
-		     [FIRST_PSEUDO_REGISTER]
-		     [MAX_MOVE_MAX / MIN_UNITS_PER_WORD + 1]);
+  machine_mode x_regno_save_mode[FIRST_PSEUDO_REGISTER]
+			[MAX_MOVE_MAX / MIN_UNITS_PER_WORD + 1];
 
   /* Nonzero if an address (plus (reg frame_pointer) (reg ...)) is valid
      in the given mode.  */
Index: toplev.c
===================================================================
--- toplev.c	(revision 253381)
+++ toplev.c	(working copy)
@@ -2186,7 +2186,7 @@  toplev::main (int argc, char **argv)
     {
       gcc_assert (global_dc->edit_context_ptr);
 
-      pretty_printer (pp);
+      pretty_printer pp;
       pp_show_color (&pp) = pp_show_color (global_dc->printer);
       global_dc->edit_context_ptr->print_diff (&pp, true);
       pp_flush (&pp);
Index: tree-ssa-threadedge.c
===================================================================
--- tree-ssa-threadedge.c	(revision 253381)
+++ tree-ssa-threadedge.c	(working copy)
@@ -47,9 +47,8 @@  static int stmt_count;
 /* Array to record value-handles per SSA_NAME.  */
 vec<tree> ssa_name_values;
 
-typedef tree (pfn_simplify) (gimple *, gimple *,
-			     class avail_exprs_stack *,
-			     basic_block);
+typedef tree pfn_simplify (gimple *, gimple *, class avail_exprs_stack *,
+			   basic_block);
 
 /* Set the value for the SSA name NAME to VALUE.  */
 
Index: cp/cp-tree.h
===================================================================
--- cp/cp-tree.h	(revision 253381)
+++ cp/cp-tree.h	(working copy)
@@ -5676,6 +5676,10 @@  struct cp_declarator {
   /* Whether we parsed an ellipsis (`...') just before the declarator,
      to indicate this is a parameter pack.  */
   BOOL_BITFIELD parameter_pack_p : 1;
+  /* If this declaration is parethesize, this the open-paren.  It is
+     UNKNOWN_LOCATION when not parethesized.  */
+  location_t parenthesized;
+
   location_t id_loc; /* Currently only set for cdk_id, cdk_decomp and
 			cdk_function. */
   /* GNU Attributes that apply to this declarator.  If the declarator
Index: cp/decl.c
===================================================================
--- cp/decl.c	(revision 253381)
+++ cp/decl.c	(working copy)
@@ -10807,6 +10807,9 @@  grokdeclarator (const cp_declarator *dec
 					    attr_flags);
 	}
 
+      if (declarator->parenthesized != UNKNOWN_LOCATION)
+	warning_at (declarator->parenthesized, OPT_Wparentheses,
+		    "unnecessary parentheses in declaration of %qs", name);
       if (declarator->kind == cdk_id || declarator->kind == cdk_decomp)
 	break;
 
Index: cp/parser.c
===================================================================
--- cp/parser.c	(revision 253381)
+++ cp/parser.c	(working copy)
@@ -1451,6 +1451,7 @@  make_declarator (cp_declarator_kind kind
 
   declarator = (cp_declarator *) alloc_declarator (sizeof (cp_declarator));
   declarator->kind = kind;
+  declarator->parenthesized = UNKNOWN_LOCATION;
   declarator->attributes = NULL_TREE;
   declarator->std_attributes = NULL_TREE;
   declarator->declarator = NULL;
@@ -1603,6 +1604,11 @@  make_call_declarator (cp_declarator *tar
 
   declarator = make_declarator (cdk_function);
   declarator->declarator = target;
+  /* Although parens are also superfluous if target is an array,
+     mixing arrays and functions is sufficiently confusing, let's not
+     check that.  */
+  if (target && target->kind != cdk_id && target->kind != cdk_function)
+    target->parenthesized = UNKNOWN_LOCATION;
   declarator->u.function.parameters = parms;
   declarator->u.function.qualifiers = cv_qualifiers;
   declarator->u.function.virt_specifiers = virt_specifiers;
@@ -1633,6 +1639,11 @@  make_array_declarator (cp_declarator *el
 
   declarator = make_declarator (cdk_array);
   declarator->declarator = element;
+  /* Although parens are also superfluous if target is a function,
+     mixing arrays and functions is sufficiently confusing, let's not
+     check that.  */
+  if (element && element->kind != cdk_id && element->kind != cdk_array)
+    element->parenthesized = UNKNOWN_LOCATION;
   declarator->u.array.bounds = bounds;
   if (element)
     {
@@ -4533,6 +4544,11 @@  class token_pair
 			      traits_t::required_token_open);
   }
 
+  location_t open_loc () const
+  {
+    return m_open_loc;
+  }
+
   /* Consume the next token from PARSER, recording its location as
      that of the opening token within the pair.  */
 
@@ -19986,6 +20002,8 @@  cp_parser_direct_declarator (cp_parser*
 		= cp_parser_declarator (parser, dcl_kind, ctor_dtor_or_conv_p,
 					/*parenthesized_p=*/NULL,
 					member_p, friend_p);
+	      if (declarator)
+		declarator->parenthesized = parens.open_loc ();
 	      parser->in_type_id_in_expr_p = saved_in_type_id_in_expr_p;
 	      first = false;
 	      /* Expect a `)'.  */
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 253381)
+++ doc/invoke.texi	(working copy)
@@ -4579,6 +4579,16 @@  in the @code{?}: operator is a boolean e
 always 1.  Often programmers expect it to be a value computed
 inside the conditional expression instead.
 
+For C++ this also warns on unnecessary parentheses in declarations, which
+can indicate an attempt at a function call instead of a declaration:
+@smallexample
+@{
+  // Declares a local variable called mymutex.
+  std::unique_lock<std::mutex> (mymutex);
+  // User meant std::unique_lock<std::mutex> lock (mymutex);
+@}
+@end smallexample
+
 This warning is enabled by @option{-Wall}.
 
 @item -Wsequence-point
Index: testsuite/g++.dg/warn/mvp.C
===================================================================
--- testsuite/g++.dg/warn/mvp.C	(revision 0)
+++ testsuite/g++.dg/warn/mvp.C	(working copy)
@@ -0,0 +1,68 @@ 
+// { dg-additional-options -Wparentheses }
+
+// Most Vexing Parse warnings
+// in C++ anythig that syntactically looks like a decl IS a decl, this
+// can lead to confused users, but worse silent unexpectedly unsafe
+// code generation.
+
+int (a); // { dg-warning "" }
+int (*b);  // { dg-warning "" }
+extern int (&c);  // { dg-warning "" }
+
+namespace fns 
+{
+  int (*a) ();
+  int (b) (); // { dg-warning "" }
+  int (*c ()) ();
+  int (d ()); // { dg-warning "" }
+  int (e) (int); // { dg-warning "" }
+  int g (int (a)); // { dg-warning "in declaration of 'a'" }
+  int (h) (int (a)); // { dg-warning "in declaration of 'a'" }
+  // { dg-warning "in declaration of 'h'" "" { target *-*-* } .-1 }
+}
+
+namespace arys
+{
+  int (*a)[1];
+  int (b)[1]; // { dg-warning "" }
+  int (*c[1])[1];
+  int (d[1]); // { dg-warning "" }
+  int (e[1])[1]; // { dg-warning "" }
+}
+
+namespace complex
+{
+  int (*a())[1];
+  int (*b[1])();
+  int ((*c())[1]); // { dg-warning "" }
+  int ((*d[1])()); // { dg-warning "" }
+}
+
+namespace motivation
+{
+  typedef int shared_mutex; // for exposition
+  struct locker
+  {
+    locker ();
+    locker (int &r);
+    ~locker ();
+  };
+  class protected_state 
+  {
+    shared_mutex mutex; // not a real mutex type
+    int state;
+
+  public:
+    void not_thread_safe ()
+    {
+      locker (mutex); // { dg-warning "" }
+      state++; // oops
+    }
+    
+    void thread_safe ()
+    {
+      locker lock (mutex);
+      state++; // ok;
+    }
+  };
+}