diff mbox

[libstdc++/65420] Use constexpr variables as regex_constans flags

Message ID CAG4ZjN=F=AsqN_wBGY-+Acz_UNBL2e+k_jF8zx3dY5GKaBLNyA@mail.gmail.com
State New
Headers show

Commit Message

Tim Shen March 16, 2015, 6:27 a.m. UTC
On Sun, Mar 15, 2015 at 5:59 AM, Daniel Krügler
<daniel.kruegler@gmail.com> wrote:

(I switched to a less interesting but practical change, so all class
stuff is gone. But I'd like to reply to your comments.)

> Your implementation choice is an interesting approach. I believe that
> a strict reading of the library specification does not allow that
> form, because you are using as bitmask type a class type that is not
> std::bitset:

You are too kind :)
I simply forgot the exact requirement of bitmask, and didn't know how,
thought that it should be "a concept, with several operators defined".

As you mentioned, this isn't actually a bad idea; we are using
concepts everywhere, then why don't make it a concept?

> What I would consider as more controversial are the following things:
>
> a) The class takes advantage of a deprecated rule to implicitly
> declare a copy-assignment operator (because it has a user-declared
> copy-constructor). I suggest to fix that and add a defaulted one
> before at some future point this bitmask type is no longer
> copy-assignable.

This would be a easy fix. I admit that I can never remenber in which
case a special member function will be implicitly defined or deleted
:(

> b) Basically no operation is marked as noexcept. This is
> user-observable and seems like an unfortunate outcome.

Obviously all of them should be marked as noexcept.

> c) The presence of a non-explicit conversion to bool allows users to
> unintentionally write code like
>
> std::regex_constants::syntax_option_type a = ..;
> [..]
> int x = a + 3; // Oops
>
> The presence of this function also *seemingly* allows to implicitly
> convert syntax_option_type to any integer type:
>
> int x = a; // Oops
>
> I suggest to make this conversion function explicit.

I did so just because I used this implicit conversion (since they are
originally enums) elsewhere in the code.

I didn't want to change all those code; besides, standard didn't
specify exact behavior of these two types. So it is crappy, yes, but
was more like an engineering trade off to me (when I'm overengineering
things ;).

> d) One could consider to mark the mutable operations
> (syntax_option_type& operator&=(syntax_option_type)) as constexpr in
> C++14 mode, but that is just an idea, not a required functionality.

I did noticed that in other functions (operator bool, etc), but ignored these.


On Sun, Mar 15, 2015 at 6:13 AM, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> On 15 March 2015 at 08:09, Tim Shen wrote:
>> Did a little bit refectoring on regex_constants, so that users won't
>> be polluted too much (they are not enum types anymore).
>
> I think this is overengineered and unnecessary.
>
> All it needs is something like:
>
> enum syntax_option_type : unsigned { };
> constexpr syntax_option_type icase = ...;

Acknowledged.

Here's the simple version of it.

Thanks!

Comments

Tim Shen March 24, 2015, 8:05 p.m. UTC | #1
On Sun, Mar 15, 2015 at 11:27 PM, Tim Shen <timshen@google.com> wrote:
> Here's the simple version of it.

Ping?
Jonathan Wakely March 26, 2015, 11:43 a.m. UTC | #2
On 15/03/15 23:27 -0700, Tim Shen wrote:
>> All it needs is something like:
>>
>> enum syntax_option_type : unsigned { };
>> constexpr syntax_option_type icase = ...;
>
>Acknowledged.
>
>Here's the simple version of it.

OK for trunk, thanks.
Jonathan Wakely March 27, 2015, 12:44 p.m. UTC | #3
On 15/03/15 23:27 -0700, Tim Shen wrote:
>+#include <regex>
>+#include <testsuite_hooks.h>
>+
>+// libstdc++/65420
>+void
>+test01()
>+{
>+  bool test __attribute__((unused)) = true;

I forgot to say that for a { dg-do compile } test you don't need a
'test' variable and don't need to #include <testsuite_hooks.h>,
diff mbox

Patch

diff --git a/libstdc++-v3/include/bits/regex_constants.h b/libstdc++-v3/include/bits/regex_constants.h
index 1ef5a43..e2c7631 100644
--- a/libstdc++-v3/include/bits/regex_constants.h
+++ b/libstdc++-v3/include/bits/regex_constants.h
@@ -77,88 +77,97 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
    * elements @c ECMAScript, @c basic, @c extended, @c awk, @c grep, @c egrep
    * %set.
    */
-  enum syntax_option_type : unsigned int
-  {
-    /**
-     * Specifies that the matching of regular expressions against a character
-     * sequence shall be performed without regard to case.
-     */
-    icase      = 1 << _S_icase,
-
-    /**
-     * Specifies that when a regular expression is matched against a character
-     * container sequence, no sub-expression matches are to be stored in the
-     * supplied match_results structure.
-     */
-    nosubs     = 1 << _S_nosubs,
-
-    /**
-     * Specifies that the regular expression engine should pay more attention to
-     * the speed with which regular expressions are matched, and less to the
-     * speed with which regular expression objects are constructed. Otherwise
-     * it has no detectable effect on the program output.
-     */
-    optimize   = 1 << _S_optimize,
-
-    /**
-     * Specifies that character ranges of the form [a-b] should be locale
-     * sensitive.
-     */
-    collate    = 1 << _S_collate,
-
-    /**
-     * Specifies that the grammar recognized by the regular expression engine is
-     * that used by ECMAScript in ECMA-262 [Ecma International, ECMAScript
-     * Language Specification, Standard Ecma-262, third edition, 1999], as
-     * modified in section [28.13].  This grammar is similar to that defined
-     * in the PERL scripting language but extended with elements found in the
-     * POSIX regular expression grammar.
-     */
-    ECMAScript = 1 << _S_ECMAScript,
-
-    /**
-     * Specifies that the grammar recognized by the regular expression engine is
-     * that used by POSIX basic regular expressions in IEEE Std 1003.1-2001,
-     * Portable Operating System Interface (POSIX), Base Definitions and
-     * Headers, Section 9, Regular Expressions [IEEE, Information Technology --
-     * Portable Operating System Interface (POSIX), IEEE Standard 1003.1-2001].
-     */
-    basic      = 1 << _S_basic,
-
-    /**
-     * Specifies that the grammar recognized by the regular expression engine is
-     * that used by POSIX extended regular expressions in IEEE Std 1003.1-2001,
-     * Portable Operating System Interface (POSIX), Base Definitions and
-     * Headers, Section 9, Regular Expressions.
-     */
-    extended   = 1 << _S_extended,
-
-    /**
-     * Specifies that the grammar recognized by the regular expression engine is
-     * that used by POSIX utility awk in IEEE Std 1003.1-2001.  This option is
-     * identical to syntax_option_type extended, except that C-style escape
-     * sequences are supported.  These sequences are:
-     * \\\\, \\a, \\b, \\f, \\n, \\r, \\t , \\v, \\&apos,, &apos,,
-     * and \\ddd (where ddd is one, two, or three octal digits).
-     */
-    awk        = 1 << _S_awk,
-
-    /**
-     * Specifies that the grammar recognized by the regular expression engine is
-     * that used by POSIX utility grep in IEEE Std 1003.1-2001.  This option is
-     * identical to syntax_option_type basic, except that newlines are treated
-     * as whitespace.
-     */
-    grep       = 1 << _S_grep,
-
-    /**
-     * Specifies that the grammar recognized by the regular expression engine is
-     * that used by POSIX utility grep when given the -E option in
-     * IEEE Std 1003.1-2001.  This option is identical to syntax_option_type
-     * extended, except that newlines are treated as whitespace.
-     */
-    egrep      = 1 << _S_egrep,
-  };
+  enum syntax_option_type : unsigned int { };
+
+  /**
+   * Specifies that the matching of regular expressions against a character
+   * sequence shall be performed without regard to case.
+   */
+  constexpr syntax_option_type icase =
+    static_cast<syntax_option_type>(1 << _S_icase);
+
+  /**
+   * Specifies that when a regular expression is matched against a character
+   * container sequence, no sub-expression matches are to be stored in the
+   * supplied match_results structure.
+   */
+  constexpr syntax_option_type nosubs =
+    static_cast<syntax_option_type>(1 << _S_nosubs);
+
+  /**
+   * Specifies that the regular expression engine should pay more attention to
+   * the speed with which regular expressions are matched, and less to the
+   * speed with which regular expression objects are constructed. Otherwise
+   * it has no detectable effect on the program output.
+   */
+  constexpr syntax_option_type optimize =
+    static_cast<syntax_option_type>(1 << _S_optimize);
+
+  /**
+   * Specifies that character ranges of the form [a-b] should be locale
+   * sensitive.
+   */
+  constexpr syntax_option_type collate =
+    static_cast<syntax_option_type>(1 << _S_collate);
+
+  /**
+   * Specifies that the grammar recognized by the regular expression engine is
+   * that used by ECMAScript in ECMA-262 [Ecma International, ECMAScript
+   * Language Specification, Standard Ecma-262, third edition, 1999], as
+   * modified in section [28.13].  This grammar is similar to that defined
+   * in the PERL scripting language but extended with elements found in the
+   * POSIX regular expression grammar.
+   */
+  constexpr syntax_option_type ECMAScript =
+    static_cast<syntax_option_type>(1 << _S_ECMAScript);
+
+  /**
+   * Specifies that the grammar recognized by the regular expression engine is
+   * that used by POSIX basic regular expressions in IEEE Std 1003.1-2001,
+   * Portable Operating System Interface (POSIX), Base Definitions and
+   * Headers, Section 9, Regular Expressions [IEEE, Information Technology --
+   * Portable Operating System Interface (POSIX), IEEE Standard 1003.1-2001].
+   */
+  constexpr syntax_option_type basic =
+    static_cast<syntax_option_type>(1 << _S_basic);
+
+  /**
+   * Specifies that the grammar recognized by the regular expression engine is
+   * that used by POSIX extended regular expressions in IEEE Std 1003.1-2001,
+   * Portable Operating System Interface (POSIX), Base Definitions and
+   * Headers, Section 9, Regular Expressions.
+   */
+  constexpr syntax_option_type extended =
+    static_cast<syntax_option_type>(1 << _S_extended);
+
+  /**
+   * Specifies that the grammar recognized by the regular expression engine is
+   * that used by POSIX utility awk in IEEE Std 1003.1-2001.  This option is
+   * identical to syntax_option_type extended, except that C-style escape
+   * sequences are supported.  These sequences are:
+   * \\\\, \\a, \\b, \\f, \\n, \\r, \\t , \\v, \\&apos,, &apos,,
+   * and \\ddd (where ddd is one, two, or three octal digits).
+   */
+  constexpr syntax_option_type awk =
+    static_cast<syntax_option_type>(1 << _S_awk);
+
+  /**
+   * Specifies that the grammar recognized by the regular expression engine is
+   * that used by POSIX utility grep in IEEE Std 1003.1-2001.  This option is
+   * identical to syntax_option_type basic, except that newlines are treated
+   * as whitespace.
+   */
+  constexpr syntax_option_type grep =
+    static_cast<syntax_option_type>(1 << _S_grep);
+
+  /**
+   * Specifies that the grammar recognized by the regular expression engine is
+   * that used by POSIX utility grep when given the -E option in
+   * IEEE Std 1003.1-2001.  This option is identical to syntax_option_type
+   * extended, except that newlines are treated as whitespace.
+   */
+  constexpr syntax_option_type egrep =
+    static_cast<syntax_option_type>(1 << _S_egrep);
 
   constexpr inline syntax_option_type
   operator&(syntax_option_type __a, syntax_option_type __b)
@@ -233,111 +242,121 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
    * perform bitwise operations on these values and expect the right thing to
    * happen.
    */
-  enum match_flag_type : unsigned int
-  {
-    /**
-     * The default matching rules.
-     */
-    match_default     = 0,
-
-    /**
-     * The first character in the sequence [first, last) is treated as though it
-     * is not at the beginning of a line, so the character (^) in the regular
-     * expression shall not match [first, first).
-     */
-    match_not_bol     = 1 << _S_not_bol,
-
-    /**
-     * The last character in the sequence [first, last) is treated as though it
-     * is not at the end of a line, so the character ($) in the regular
-     * expression shall not match [last, last).
-     */
-    match_not_eol     = 1 << _S_not_eol,
-
-    /**
-     * The expression \\b is not matched against the sub-sequence
-     * [first,first).
-     */
-    match_not_bow     = 1 << _S_not_bow,
-
-    /**
-     * The expression \\b should not be matched against the sub-sequence
-     * [last,last).
-     */
-    match_not_eow     = 1 << _S_not_eow,
-
-    /**
-     * If more than one match is possible then any match is an acceptable
-     * result.
-     */
-    match_any         = 1 << _S_any,
-
-    /**
-     * The expression does not match an empty sequence.
-     */
-    match_not_null    = 1 << _S_not_null,
-
-    /**
-     * The expression only matches a sub-sequence that begins at first .
-     */
-    match_continuous  = 1 << _S_continuous,
-
-    /**
-     * --first is a valid iterator position.  When this flag is set then the
-     * flags match_not_bol and match_not_bow are ignored by the regular
-     * expression algorithms 28.11 and iterators 28.12.
-     */
-    match_prev_avail  = 1 << _S_prev_avail,
-
-    /**
-     * When a regular expression match is to be replaced by a new string, the
-     * new string is constructed using the rules used by the ECMAScript replace
-     * function in ECMA- 262 [Ecma International, ECMAScript Language
-     * Specification, Standard Ecma-262, third edition, 1999], part 15.5.4.11
-     * String.prototype.replace. In addition, during search and replace
-     * operations all non-overlapping occurrences of the regular expression
-     * are located and replaced, and sections of the input that did not match
-     * the expression are copied unchanged to the output string.
-     *
-     * Format strings (from ECMA-262 [15.5.4.11]):
-     * @li $$  The dollar-sign itself ($)
-     * @li $&  The matched substring.
-     * @li $`  The portion of @a string that precedes the matched substring.
-     *         This would be match_results::prefix().
-     * @li $'  The portion of @a string that follows the matched substring.
-     *         This would be match_results::suffix().
-     * @li $n  The nth capture, where n is in [1,9] and $n is not followed by a
-     *         decimal digit.  If n <= match_results::size() and the nth capture
-     *         is undefined, use the empty string instead.  If n >
-     *         match_results::size(), the result is implementation-defined.
-     * @li $nn The nnth capture, where nn is a two-digit decimal number on
-     *         [01, 99].  If nn <= match_results::size() and the nth capture is
-     *         undefined, use the empty string instead. If
-     *         nn > match_results::size(), the result is implementation-defined.
-     */
-    format_default    = 0,
-
-    /**
-     * When a regular expression match is to be replaced by a new string, the
-     * new string is constructed using the rules used by the POSIX sed utility
-     * in IEEE Std 1003.1- 2001 [IEEE, Information Technology -- Portable
-     * Operating System Interface (POSIX), IEEE Standard 1003.1-2001].
-     */
-    format_sed        = 1 << _S_sed,
-
-    /**
-     * During a search and replace operation, sections of the character
-     * container sequence being searched that do not match the regular
-     * expression shall not be copied to the output string.
-     */
-    format_no_copy    = 1 << _S_no_copy,
-
-    /**
-     * When specified during a search and replace operation, only the first
-     * occurrence of the regular expression shall be replaced.
-     */
-    format_first_only = 1 << _S_first_only,
-  };
+  enum match_flag_type : unsigned int { };
+
+  /**
+   * The default matching rules.
+   */
+  constexpr match_flag_type match_default = static_cast<match_flag_type>(0);
+
+  /**
+   * The first character in the sequence [first, last) is treated as though it
+   * is not at the beginning of a line, so the character (^) in the regular
+   * expression shall not match [first, first).
+   */
+  constexpr match_flag_type match_not_bol =
+    static_cast<match_flag_type>(1 << _S_not_bol);
+
+  /**
+   * The last character in the sequence [first, last) is treated as though it
+   * is not at the end of a line, so the character ($) in the regular
+   * expression shall not match [last, last).
+   */
+  constexpr match_flag_type match_not_eol =
+    static_cast<match_flag_type>(1 << _S_not_eol);
+
+  /**
+   * The expression \\b is not matched against the sub-sequence
+   * [first,first).
+   */
+  constexpr match_flag_type match_not_bow =
+    static_cast<match_flag_type>(1 << _S_not_bow);
+
+  /**
+   * The expression \\b should not be matched against the sub-sequence
+   * [last,last).
+   */
+  constexpr match_flag_type match_not_eow =
+    static_cast<match_flag_type>(1 << _S_not_eow);
+
+  /**
+   * If more than one match is possible then any match is an acceptable
+   * result.
+   */
+  constexpr match_flag_type match_any =
+    static_cast<match_flag_type>(1 << _S_any);
+
+  /**
+   * The expression does not match an empty sequence.
+   */
+  constexpr match_flag_type match_not_null =
+    static_cast<match_flag_type>(1 << _S_not_null);
+
+  /**
+   * The expression only matches a sub-sequence that begins at first .
+   */
+  constexpr match_flag_type match_continuous =
+    static_cast<match_flag_type>(1 << _S_continuous);
+
+  /**
+   * --first is a valid iterator position.  When this flag is set then the
+   * flags match_not_bol and match_not_bow are ignored by the regular
+   * expression algorithms 28.11 and iterators 28.12.
+   */
+  constexpr match_flag_type match_prev_avail =
+    static_cast<match_flag_type>(1 << _S_prev_avail);
+
+  /**
+   * When a regular expression match is to be replaced by a new string, the
+   * new string is constructed using the rules used by the ECMAScript replace
+   * function in ECMA- 262 [Ecma International, ECMAScript Language
+   * Specification, Standard Ecma-262, third edition, 1999], part 15.5.4.11
+   * String.prototype.replace. In addition, during search and replace
+   * operations all non-overlapping occurrences of the regular expression
+   * are located and replaced, and sections of the input that did not match
+   * the expression are copied unchanged to the output string.
+   *
+   * Format strings (from ECMA-262 [15.5.4.11]):
+   * @li $$  The dollar-sign itself ($)
+   * @li $&  The matched substring.
+   * @li $`  The portion of @a string that precedes the matched substring.
+   *         This would be match_results::prefix().
+   * @li $'  The portion of @a string that follows the matched substring.
+   *         This would be match_results::suffix().
+   * @li $n  The nth capture, where n is in [1,9] and $n is not followed by a
+   *         decimal digit.  If n <= match_results::size() and the nth capture
+   *         is undefined, use the empty string instead.  If n >
+   *         match_results::size(), the result is implementation-defined.
+   * @li $nn The nnth capture, where nn is a two-digit decimal number on
+   *         [01, 99].  If nn <= match_results::size() and the nth capture is
+   *         undefined, use the empty string instead. If
+   *         nn > match_results::size(), the result is implementation-defined.
+   */
+  constexpr match_flag_type format_default = static_cast<match_flag_type>(0);
+
+  /**
+   * When a regular expression match is to be replaced by a new string, the
+   * new string is constructed using the rules used by the POSIX sed utility
+   * in IEEE Std 1003.1- 2001 [IEEE, Information Technology -- Portable
+   * Operating System Interface (POSIX), IEEE Standard 1003.1-2001].
+   */
+  constexpr match_flag_type format_sed =
+    static_cast<match_flag_type>(1 << _S_sed);
+
+  /**
+   * During a search and replace operation, sections of the character
+   * container sequence being searched that do not match the regular
+   * expression shall not be copied to the output string.
+   */
+  constexpr match_flag_type format_no_copy =
+    static_cast<match_flag_type>(1 << _S_no_copy);
+
+  /**
+   * When specified during a search and replace operation, only the first
+   * occurrence of the regular expression shall be replaced.
+   */
+  constexpr match_flag_type format_first_only =
+    static_cast<match_flag_type>(1 << _S_first_only);
 
   constexpr inline match_flag_type
   operator&(match_flag_type __a, match_flag_type __b)
diff --git a/libstdc++-v3/testsuite/28_regex/constants/constexpr.cc b/libstdc++-v3/testsuite/28_regex/constants/constexpr.cc
new file mode 100644
index 0000000..80ab857
--- /dev/null
+++ b/libstdc++-v3/testsuite/28_regex/constants/constexpr.cc
@@ -0,0 +1,63 @@ 
+// { dg-options "-std=gnu++11" }
+// { dg-do compile }
+//
+// Copyright (C) 2015 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// 28.5.4
+
+#include <regex>
+#include <testsuite_hooks.h>
+
+// libstdc++/65420
+void
+test01()
+{
+  bool test __attribute__((unused)) = true;
+
+  const std::regex_constants::syntax_option_type* option __attribute__((unused));
+  option = &std::regex_constants::icase;
+  option = &std::regex_constants::nosubs;
+  option = &std::regex_constants::optimize;
+  option = &std::regex_constants::collate;
+  option = &std::regex_constants::ECMAScript;
+  option = &std::regex_constants::basic;
+  option = &std::regex_constants::extended;
+  option = &std::regex_constants::awk;
+  option = &std::regex_constants::grep;
+  option = &std::regex_constants::egrep;
+
+  const std::regex_constants::match_flag_type* flag __attribute__((unused));
+  flag = &std::regex_constants::match_not_bol;
+  flag = &std::regex_constants::match_not_eol;
+  flag = &std::regex_constants::match_not_bow;
+  flag = &std::regex_constants::match_not_eow;
+  flag = &std::regex_constants::match_any;
+  flag = &std::regex_constants::match_not_null;
+  flag = &std::regex_constants::match_continuous;
+  flag = &std::regex_constants::match_prev_avail;
+  flag = &std::regex_constants::format_default;
+  flag = &std::regex_constants::format_sed;
+  flag = &std::regex_constants::format_no_copy;
+  flag = &std::regex_constants::format_first_only;
+}
+
+int main()
+{
+  test01();
+  return 0;
+}