
[v2] preprocessor: -Wbidi-chars and UCNs [PR104030]

Message ID Ye8qCnvhp0uN+2I9@redhat.com
State New

Commit Message

Marek Polacek Jan. 24, 2022, 10:36 p.m. UTC
Here's an updated version of the patch which uses EnumSet (great
to have it, thanks Jakub!) rather than hardcoding strings.

With this patch we accept -Wbidi-chars=none,ucn as well as -Wbidi-chars=ucn.
I think that's OK.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Stephan Bergmann reported that our -Wbidi-chars breaks the build
of LibreOffice because we warn about UCNs even when their usage
is correct: LibreOffice constructs strings piecewise, as in:

  aText = u"\u202D" + aText;

and warning about that is overzealous.  Since no editor (AFAIK)
interprets UCNs to show them as Unicode characters, there's less
risk in misinterpreting them, and so perhaps we shouldn't warn
about them by default.  However, identifiers containing UCNs or
programs generating other programs could still cause confusion,
so I'm keeping the UCN checking.  To turn it on, you just need
to use -Wbidi-chars=unpaired,ucn or -Wbidi-chars=any,ucn.
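As an illustration of the case described above, here is a small sketch modeled on the new tests in this patch (Wbidi-chars-18.c through Wbidi-chars-23.c). The name `kText` is made up for the example, and the per-option behavior listed in the comment is simply what those tests expect.

```cpp
// U+202D (LEFT-TO-RIGHT OVERRIDE) written as a UCN and not closed within
// this literal, in the style of the LibreOffice code quoted above.
//
// What the tests added by this patch expect for such a line:
//   (default)                   no warning
//   -Wbidi-chars=unpaired,ucn   warning ("unpaired")
//   -Wbidi-chars=ucn            warning ("unpaired")
//   -Wbidi-chars=any            no warning about U+202D
//   -Wbidi-chars=ucn,any        warning mentioning U+202D
//   -Wbidi-chars=none,ucn       no warning
constexpr char16_t kText[] = u"\u202D" u"abc";
```

The literal is a `char16_t` array only to mirror the `u"..."` form used in the LibreOffice snippet; the tests in the patch use plain `"..."` literals.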

The implementation is done by using the new EnumSet feature.
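A minimal standalone sketch of the resulting bitmask logic may help when reading the libcpp changes. The enum values are the ones this patch sets in cpplib.h; the helper names `checking_enabled` and `warn_on_close` are hypothetical and only approximate the conditions in `warn_bidi_p` and `maybe_warn_bidi_on_close`.

```cpp
// Values as set explicitly by this patch in libcpp/include/cpplib.h.
enum cpp_bidirectional_level {
  bidirectional_none = 0,
  bidirectional_unpaired = 1,
  bidirectional_any = 2,
  bidirectional_ucn = 4
};

// Hypothetical helper: is any bidi checking requested at all?
// "ucn" alone in the mask (as with =none,ucn) does not count.
constexpr bool checking_enabled (unsigned level)
{
  return (level & (bidirectional_unpaired | bidirectional_any)) != 0;
}

// Hypothetical helper: should a still-open bidi context be reported when
// the enclosing string, comment, or identifier ends?  ctx_is_ucn says
// whether the open context was started by a UCN.
constexpr bool warn_on_close (unsigned level, bool ctx_open, bool ctx_is_ucn)
{
  return ctx_open
         && (level & bidirectional_unpaired) != 0
         && (!ctx_is_ucn || (level & bidirectional_ucn) != 0);
}
```

Under this model, `warn_on_close (bidirectional_unpaired, true, true)` is false while the same call with `bidirectional_unpaired | bidirectional_ucn` is true, which corresponds to the difference between Wbidi-chars-18.c and Wbidi-chars-19.c.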

	PR preprocessor/104030

gcc/c-family/ChangeLog:

	* c.opt (Wbidi-chars): Mark as EnumSet.  Also accept =ucn.

gcc/ChangeLog:

	* doc/invoke.texi: Update documentation for -Wbidi-chars.

libcpp/ChangeLog:

	* include/cpplib.h (enum cpp_bidirectional_level): Add
	bidirectional_ucn.  Set values explicitly.
	* internal.h (cpp_reader): Adjust warn_bidi_p.
	* lex.cc (maybe_warn_bidi_on_close): Don't warn about UCNs
	unless UCN checking is on.
	(maybe_warn_bidi_on_char): Likewise.

gcc/testsuite/ChangeLog:

	* c-c++-common/Wbidi-chars-10.c: Turn on UCN checking.
	* c-c++-common/Wbidi-chars-11.c: Likewise.
	* c-c++-common/Wbidi-chars-14.c: Likewise.
	* c-c++-common/Wbidi-chars-16.c: Likewise.
	* c-c++-common/Wbidi-chars-17.c: Likewise.
	* c-c++-common/Wbidi-chars-4.c: Likewise.
	* c-c++-common/Wbidi-chars-5.c: Likewise.
	* c-c++-common/Wbidi-chars-6.c: Likewise.
	* c-c++-common/Wbidi-chars-7.c: Likewise.
	* c-c++-common/Wbidi-chars-8.c: Likewise.
	* c-c++-common/Wbidi-chars-9.c: Likewise.
	* c-c++-common/Wbidi-chars-ranges.c: Likewise.
	* c-c++-common/Wbidi-chars-18.c: New test.
	* c-c++-common/Wbidi-chars-19.c: New test.
	* c-c++-common/Wbidi-chars-20.c: New test.
	* c-c++-common/Wbidi-chars-21.c: New test.
	* c-c++-common/Wbidi-chars-22.c: New test.
	* c-c++-common/Wbidi-chars-23.c: New test.
---
 gcc/c-family/c.opt                              | 13 ++++++++-----
 gcc/doc/invoke.texi                             |  8 ++++++--
 gcc/testsuite/c-c++-common/Wbidi-chars-10.c     |  2 +-
 gcc/testsuite/c-c++-common/Wbidi-chars-11.c     |  2 +-
 gcc/testsuite/c-c++-common/Wbidi-chars-14.c     |  2 +-
 gcc/testsuite/c-c++-common/Wbidi-chars-16.c     |  2 +-
 gcc/testsuite/c-c++-common/Wbidi-chars-17.c     |  2 +-
 gcc/testsuite/c-c++-common/Wbidi-chars-18.c     | 11 +++++++++++
 gcc/testsuite/c-c++-common/Wbidi-chars-19.c     | 11 +++++++++++
 gcc/testsuite/c-c++-common/Wbidi-chars-20.c     | 11 +++++++++++
 gcc/testsuite/c-c++-common/Wbidi-chars-21.c     | 11 +++++++++++
 gcc/testsuite/c-c++-common/Wbidi-chars-22.c     | 11 +++++++++++
 gcc/testsuite/c-c++-common/Wbidi-chars-23.c     | 11 +++++++++++
 gcc/testsuite/c-c++-common/Wbidi-chars-4.c      |  2 +-
 gcc/testsuite/c-c++-common/Wbidi-chars-5.c      |  2 +-
 gcc/testsuite/c-c++-common/Wbidi-chars-6.c      |  2 +-
 gcc/testsuite/c-c++-common/Wbidi-chars-7.c      |  2 +-
 gcc/testsuite/c-c++-common/Wbidi-chars-8.c      |  2 +-
 gcc/testsuite/c-c++-common/Wbidi-chars-9.c      |  2 +-
 gcc/testsuite/c-c++-common/Wbidi-chars-ranges.c |  2 +-
 libcpp/include/cpplib.h                         | 11 ++++++-----
 libcpp/internal.h                               |  3 ++-
 libcpp/lex.cc                                   | 16 ++++++++++------
 23 files changed, 110 insertions(+), 31 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/Wbidi-chars-18.c
 create mode 100644 gcc/testsuite/c-c++-common/Wbidi-chars-19.c
 create mode 100644 gcc/testsuite/c-c++-common/Wbidi-chars-20.c
 create mode 100644 gcc/testsuite/c-c++-common/Wbidi-chars-21.c
 create mode 100644 gcc/testsuite/c-c++-common/Wbidi-chars-22.c
 create mode 100644 gcc/testsuite/c-c++-common/Wbidi-chars-23.c


base-commit: 4343f5e256791a5abaaef29fe1f831a03bab129e

Comments

Jakub Jelinek Jan. 24, 2022, 10:44 p.m. UTC | #1
On Mon, Jan 24, 2022 at 05:36:58PM -0500, Marek Polacek wrote:
> The implementation is done by using the new EnumSet feature.
> 
> 	PR preprocessor/104030
> 
> gcc/c-family/ChangeLog:
> 
> 	* c.opt (Wbidi-chars): Mark as EnumSet.  Also accept =ucn.
> 
> gcc/ChangeLog:
> 
> 	* doc/invoke.texi: Update documentation for -Wbidi-chars.
> 
> libcpp/ChangeLog:
> 
> 	* include/cpplib.h (enum cpp_bidirectional_level): Add
> 	bidirectional_ucn.  Set values explicitly.
> 	* internal.h (cpp_reader): Adjust warn_bidi_p.
> 	* lex.cc (maybe_warn_bidi_on_close): Don't warn about UCNs
> 	unless UCN checking is on.
> 	(maybe_warn_bidi_on_char): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* c-c++-common/Wbidi-chars-10.c: Turn on UCN checking.
> 	* c-c++-common/Wbidi-chars-11.c: Likewise.
> 	* c-c++-common/Wbidi-chars-14.c: Likewise.
> 	* c-c++-common/Wbidi-chars-16.c: Likewise.
> 	* c-c++-common/Wbidi-chars-17.c: Likewise.
> 	* c-c++-common/Wbidi-chars-4.c: Likewise.
> 	* c-c++-common/Wbidi-chars-5.c: Likewise.
> 	* c-c++-common/Wbidi-chars-6.c: Likewise.
> 	* c-c++-common/Wbidi-chars-7.c: Likewise.
> 	* c-c++-common/Wbidi-chars-8.c: Likewise.
> 	* c-c++-common/Wbidi-chars-9.c: Likewise.
> 	* c-c++-common/Wbidi-chars-ranges.c: Likewise.
> 	* c-c++-common/Wbidi-chars-18.c: New test.
> 	* c-c++-common/Wbidi-chars-19.c: New test.
> 	* c-c++-common/Wbidi-chars-20.c: New test.
> 	* c-c++-common/Wbidi-chars-21.c: New test.
> 	* c-c++-common/Wbidi-chars-22.c: New test.
> 	* c-c++-common/Wbidi-chars-23.c: New test.

LGTM, thanks.

	Jakub
Martin Liška Jan. 28, 2022, 1:53 p.m. UTC | #2
On 1/24/22 23:36, Marek Polacek via Gcc-patches wrote:
> @@ -7820,6 +7820,10 @@ bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
>  @option{-Wbidi-chars=any} warns about any use of bidirectional control
>  characters.
>
> +By default, this warning does not warn about UCNs.  It is, however, possible
> +to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
> +@option{-Wbidi-chars=any,ucn}.

Hello.

Can you please extend the documentation entry and explain what 'ucn' actually means?

'''
There are three levels of warning supported by GCC@.  The default is
@option{-Wbidi-chars=unpaired}, which warns about improperly terminated
bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
@option{-Wbidi-chars=any} warns about any use of bidirectional control
characters.
'''

Right now we have 4 levels and 'ucn' is not defined in the paragraph.

Thanks,
Martin
Marek Polacek Jan. 28, 2022, 2:59 p.m. UTC | #3
On Fri, Jan 28, 2022 at 02:53:16PM +0100, Martin Liška wrote:
> On 1/24/22 23:36, Marek Polacek via Gcc-patches wrote:
> > @@ -7820,6 +7820,10 @@ bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
> >  @option{-Wbidi-chars=any} warns about any use of bidirectional control
> >  characters.
> >
> > +By default, this warning does not warn about UCNs.  It is, however, possible
> > +to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
> > +@option{-Wbidi-chars=any,ucn}.
> 
> Hello.
> 
> Can you please extend the documentation entry and explain what 'ucn' actually means?
> 
> '''
> There are three levels of warning supported by GCC@.  The default is
> @option{-Wbidi-chars=unpaired}, which warns about improperly terminated
> bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
> @option{-Wbidi-chars=any} warns about any use of bidirectional control
> characters.
> '''
> 
> Right now we have 4 levels and 'ucn' is not defined in the paragraph.

The following paragraph says

By default, this warning does not warn about UCNs.  It is, however, possible
to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
@option{-Wbidi-chars=any,ucn}.

Is that not enough?

Marek
Martin Liška Jan. 28, 2022, 3:08 p.m. UTC | #4
On 1/28/22 15:59, Marek Polacek wrote:
> On Fri, Jan 28, 2022 at 02:53:16PM +0100, Martin Liška wrote:
>> On 1/24/22 23:36, Marek Polacek via Gcc-patches wrote:
>>> @@ -7820,6 +7820,10 @@ bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
>>>  @option{-Wbidi-chars=any} warns about any use of bidirectional control
>>>  characters.
>>>
>>> +By default, this warning does not warn about UCNs.  It is, however, possible
>>> +to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
>>> +@option{-Wbidi-chars=any,ucn}.
>>
>> Hello.
>>
>> Can you please extend the documentation entry and explain what 'ucn' actually means?
>>
>> '''
>> There are three levels of warning supported by GCC@.  The default is
>> @option{-Wbidi-chars=unpaired}, which warns about improperly terminated
>> bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
>> @option{-Wbidi-chars=any} warns about any use of bidirectional control
>> characters.
>> '''
>>
>> Right now we have 4 levels and 'ucn' is not defined in the paragraph.
> 
> The following paragraph says
> 
> By default, this warning does not warn about UCNs.  It is, however, possible
> to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
> @option{-Wbidi-chars=any,ucn}.
> 
> Is that not enough?

Yeah, makes sense. Do I understand it correctly that one can't use -Wbidi-chars=ucn?

Thanks,
Martin

> 
> Marek
>
Marek Polacek Jan. 28, 2022, 3:26 p.m. UTC | #5
On Fri, Jan 28, 2022 at 04:08:18PM +0100, Martin Liška wrote:
> On 1/28/22 15:59, Marek Polacek wrote:
> > On Fri, Jan 28, 2022 at 02:53:16PM +0100, Martin Liška wrote:
> > > On 1/24/22 23:36, Marek Polacek via Gcc-patches wrote:
> > > > @@ -7820,6 +7820,10 @@ bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
> > > >  @option{-Wbidi-chars=any} warns about any use of bidirectional control
> > > >  characters.
> > > >
> > > > +By default, this warning does not warn about UCNs.  It is, however, possible
> > > > +to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
> > > > +@option{-Wbidi-chars=any,ucn}.
> > > 
> > > Hello.
> > > 
> > > Can you please extend the documentation entry and explain what 'ucn' actually means?
> > > 
> > > '''
> > > There are three levels of warning supported by GCC@.  The default is
> > > @option{-Wbidi-chars=unpaired}, which warns about improperly terminated
> > > bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
> > > @option{-Wbidi-chars=any} warns about any use of bidirectional control
> > > characters.
> > > '''
> > > 
> > > Right now we have 4 levels and 'ucn' is not defined in the paragraph.
> > 
> > The following paragraph says
> > 
> > By default, this warning does not warn about UCNs.  It is, however, possible
> > to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
> > @option{-Wbidi-chars=any,ucn}.
> > 
> > Is that not enough?
> 
> Yeah, makes sense. Do I understand it correctly that one can't use -Wbidi-chars=ucn?

You could, it just means use the default (=unpaired) with UCN checking enabled.
Do you want me to make a note about that in the manual?
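To make the combination rule concrete, here is a rough model of how a comma-separated argument could be folded into the option variable. It assumes the clear-then-OR behavior described for `EnumSet`/`Set` in GCC's option-file documentation; `apply_option` and the `k...` constants are hypothetical names, not anything in GCC.

```cpp
// Bit values from this patch: none = 0, unpaired = 1, any = 2, ucn = 4.
// Set(1) covers none/unpaired/any; Set(2) covers ucn.
constexpr unsigned kLevelMask = 1u | 2u;  // all Set(1) values OR'ed together
constexpr unsigned kUcnMask = 4u;         // all Set(2) values OR'ed together

// Hypothetical model of one -Wbidi-chars=... occurrence: clear the bits of
// every set named in the argument, then OR in the chosen values.
constexpr unsigned apply_option (unsigned previous, unsigned sets_named,
                                 unsigned values)
{
  return (previous & ~sets_named) | values;
}

constexpr unsigned kDefault = 1u;  // Init(bidirectional_unpaired)

// -Wbidi-chars=ucn: only Set(2) is named, so "unpaired" survives.
constexpr unsigned kUcnOnly = apply_option (kDefault, kUcnMask, 4u);

// -Wbidi-chars=any followed by -Wbidi-chars=ucn: "any" survives instead.
constexpr unsigned kAnyThenUcn
  = apply_option (apply_option (kDefault, kLevelMask, 2u), kUcnMask, 4u);

// -Wbidi-chars=none,ucn: both sets are named; the level bits end up clear.
constexpr unsigned kNoneUcn
  = apply_option (kDefault, kLevelMask | kUcnMask, 0u | 4u);
```

In this model `kUcnOnly` comes out as unpaired|ucn and `kAnyThenUcn` as any|ucn, which is why =ucn matches =unpaired,ucn only when no earlier =any is in effect.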

Marek
Martin Liška Jan. 28, 2022, 4:12 p.m. UTC | #6
On 1/28/22 16:26, Marek Polacek wrote:
> On Fri, Jan 28, 2022 at 04:08:18PM +0100, Martin Liška wrote:
>> On 1/28/22 15:59, Marek Polacek wrote:
>>> On Fri, Jan 28, 2022 at 02:53:16PM +0100, Martin Liška wrote:
>>>> On 1/24/22 23:36, Marek Polacek via Gcc-patches wrote:
>>>>> @@ -7820,6 +7820,10 @@ bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
>>>>>  @option{-Wbidi-chars=any} warns about any use of bidirectional control
>>>>>  characters.
>>>>>
>>>>> +By default, this warning does not warn about UCNs.  It is, however, possible
>>>>> +to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
>>>>> +@option{-Wbidi-chars=any,ucn}.
>>>>
>>>> Hello.
>>>>
>>>> Can you please extend the documentation entry and explain what 'ucn' actually means?
>>>>
>>>> '''
>>>> There are three levels of warning supported by GCC@.  The default is
>>>> @option{-Wbidi-chars=unpaired}, which warns about improperly terminated
>>>> bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
>>>> @option{-Wbidi-chars=any} warns about any use of bidirectional control
>>>> characters.
>>>> '''
>>>>
>>>> Right now we have 4 levels and 'ucn' is not defined in the paragraph.
>>>
>>> The following paragraph says
>>>
>>> By default, this warning does not warn about UCNs.  It is, however, possible
>>> to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
>>> @option{-Wbidi-chars=any,ucn}.
>>>
>>> Is that not enough?
>>
>> Yeah, makes sense. Do I understand it correctly that one can't use -Wbidi-chars=ucn?
> 
> You could, it just means use the default (=unpaired) with UCN checking enabled.
> Do you want me to make a note about that in the manual?

Yes, please do so.

Martin

> 
> Marek
>
Marek Polacek Jan. 28, 2022, 8:57 p.m. UTC | #7
On Fri, Jan 28, 2022 at 05:12:41PM +0100, Martin Liška wrote:
> On 1/28/22 16:26, Marek Polacek wrote:
> > On Fri, Jan 28, 2022 at 04:08:18PM +0100, Martin Liška wrote:
> > > On 1/28/22 15:59, Marek Polacek wrote:
> > > > On Fri, Jan 28, 2022 at 02:53:16PM +0100, Martin Liška wrote:
> > > > > On 1/24/22 23:36, Marek Polacek via Gcc-patches wrote:
> > > > > > @@ -7820,6 +7820,10 @@ bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
> > > > > >  @option{-Wbidi-chars=any} warns about any use of bidirectional control
> > > > > >  characters.
> > > > > >
> > > > > > +By default, this warning does not warn about UCNs.  It is, however, possible
> > > > > > +to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
> > > > > > +@option{-Wbidi-chars=any,ucn}.
> > > > > 
> > > > > Hello.
> > > > > 
> > > > > Can you please extend the documentation entry and explain what 'ucn' actually means?
> > > > > 
> > > > > '''
> > > > > There are three levels of warning supported by GCC@.  The default is
> > > > > @option{-Wbidi-chars=unpaired}, which warns about improperly terminated
> > > > > bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
> > > > > @option{-Wbidi-chars=any} warns about any use of bidirectional control
> > > > > characters.
> > > > > '''
> > > > > 
> > > > > Right now we have 4 levels and 'ucn' is not defined in the paragraph.
> > > > 
> > > > The following paragraph says
> > > > 
> > > > By default, this warning does not warn about UCNs.  It is, however, possible
> > > > to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
> > > > @option{-Wbidi-chars=any,ucn}.
> > > > 
> > > > Is that not enough?
> > > 
> > > Yeah, makes sense. Do I understand it correctly that one can't use -Wbidi-chars=ucn?
> > 
> > You could, it just means use the default (=unpaired) with UCN checking enabled.
> > Do you want me to make a note about that in the manual?
> 
> Yes, please do so.

Done:

    doc: Update -Wbidi-chars documentation

    gcc/ChangeLog:

            * doc/invoke.texi: Update -Wbidi-chars documentation.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9e588db4fce..cfd415110cd 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -7822,7 +7822,9 @@ characters.

 By default, this warning does not warn about UCNs.  It is, however, possible
 to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
-@option{-Wbidi-chars=any,ucn}.
+@option{-Wbidi-chars=any,ucn}.  Using @option{-Wbidi-chars=ucn} is valid,
+and is equivalent to @option{-Wbidi-chars=unpaired,ucn}, if no previous
+@option{-Wbidi-chars=any} was specified.

 @item -Wbool-compare
 @opindex Wno-bool-compare


Marek

Patch

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index db65c14a7a5..9cfd2a6bc4e 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -379,8 +379,8 @@  C ObjC C++ ObjC++ Warning Alias(Wbidi-chars=,any,none)
 ;
 
 Wbidi-chars=
-C ObjC C++ ObjC++ RejectNegative Joined Warning CPP(cpp_warn_bidirectional) CppReason(CPP_W_BIDIRECTIONAL) Var(warn_bidirectional) Init(bidirectional_unpaired) Enum(cpp_bidirectional_level)
--Wbidi-chars=[none|unpaired|any] Warn about UTF-8 bidirectional control characters.
+C ObjC C++ ObjC++ RejectNegative Joined Warning CPP(cpp_warn_bidirectional) CppReason(CPP_W_BIDIRECTIONAL) Var(warn_bidirectional) Init(bidirectional_unpaired) Enum(cpp_bidirectional_level) EnumSet
+-Wbidi-chars=[none|unpaired|any|ucn] Warn about UTF-8 bidirectional control characters.
 
 ; Required for these enum values.
 SourceInclude
@@ -390,13 +390,16 @@  Enum
 Name(cpp_bidirectional_level) Type(int) UnknownError(argument %qs to %<-Wbidi-chars%> not recognized)
 
 EnumValue
-Enum(cpp_bidirectional_level) String(none) Value(bidirectional_none)
+Enum(cpp_bidirectional_level) String(none) Value(bidirectional_none) Set(1)
 
 EnumValue
-Enum(cpp_bidirectional_level) String(unpaired) Value(bidirectional_unpaired)
+Enum(cpp_bidirectional_level) String(unpaired) Value(bidirectional_unpaired) Set(1)
 
 EnumValue
-Enum(cpp_bidirectional_level) String(any) Value(bidirectional_any)
+Enum(cpp_bidirectional_level) String(any) Value(bidirectional_any) Set(1)
+
+EnumValue
+Enum(cpp_bidirectional_level) String(ucn) Value(bidirectional_ucn) Set(2)
 
 Wbool-compare
 C ObjC C++ ObjC++ Var(warn_bool_compare) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 309f5e38a85..9e588db4fce 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -328,7 +328,7 @@  Objective-C and Objective-C++ Dialects}.
 -Warray-bounds  -Warray-bounds=@var{n}  -Warray-compare @gol
 -Wno-attributes  -Wattribute-alias=@var{n} -Wno-attribute-alias @gol
 -Wno-attribute-warning  @gol
--Wbidi-chars=@r{[}none@r{|}unpaired@r{|}any@r{]} @gol
+-Wbidi-chars=@r{[}none@r{|}unpaired@r{|}any@r{|}ucn@r{]} @gol
 -Wbool-compare  -Wbool-operation @gol
 -Wno-builtin-declaration-mismatch @gol
 -Wno-builtin-macro-redefined  -Wc90-c99-compat  -Wc99-c11-compat @gol
@@ -7803,7 +7803,7 @@  Attributes considered include @code{alloc_align}, @code{alloc_size},
 This is the default.  You can disable these warnings with either
 @option{-Wno-attribute-alias} or @option{-Wattribute-alias=0}.
 
-@item -Wbidi-chars=@r{[}none@r{|}unpaired@r{|}any@r{]}
+@item -Wbidi-chars=@r{[}none@r{|}unpaired@r{|}any@r{|}ucn@r{]}
 @opindex Wbidi-chars=
 @opindex Wbidi-chars
 @opindex Wno-bidi-chars
@@ -7820,6 +7820,10 @@  bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
 @option{-Wbidi-chars=any} warns about any use of bidirectional control
 characters.
 
+By default, this warning does not warn about UCNs.  It is, however, possible
+to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
+@option{-Wbidi-chars=any,ucn}.
+
 @item -Wbool-compare
 @opindex Wno-bool-compare
 @opindex Wbool-compare
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-10.c b/gcc/testsuite/c-c++-common/Wbidi-chars-10.c
index 3f851b69e65..cdcdce2be08 100644
--- a/gcc/testsuite/c-c++-common/Wbidi-chars-10.c
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-10.c
@@ -1,6 +1,6 @@ 
 /* PR preprocessor/103026 */
 /* { dg-do compile } */
-/* { dg-options "-Wbidi-chars=unpaired" } */
+/* { dg-options "-Wbidi-chars=unpaired,ucn" } */
 /* More nesting testing.  */
 
 /* RLE‫ LRI⁦ PDF‬ PDI⁩*/
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-11.c b/gcc/testsuite/c-c++-common/Wbidi-chars-11.c
index 270ce2368a9..ea83029d6b9 100644
--- a/gcc/testsuite/c-c++-common/Wbidi-chars-11.c
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-11.c
@@ -1,6 +1,6 @@ 
 /* PR preprocessor/103026 */
 /* { dg-do compile } */
-/* { dg-options "-Wbidi-chars=unpaired" } */
+/* { dg-options "-Wbidi-chars=unpaired,ucn" } */
 /* Test that we warn when mixing UCN and UTF-8.  */
 
 int LRE_‪_PDF_\u202c;
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-14.c b/gcc/testsuite/c-c++-common/Wbidi-chars-14.c
index ba5f75d9553..cb6b05efac1 100644
--- a/gcc/testsuite/c-c++-common/Wbidi-chars-14.c
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-14.c
@@ -1,6 +1,6 @@ 
 /* PR preprocessor/103026 */
 /* { dg-do compile } */
-/* { dg-options "-Wbidi-chars=unpaired" } */
+/* { dg-options "-Wbidi-chars=unpaired,ucn" } */
 /* Test PDI handling, which also pops any subsequent LREs, RLEs, LROs,
    or RLOs.  */
 
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-16.c b/gcc/testsuite/c-c++-common/Wbidi-chars-16.c
index baa0159861c..eaf0ec9a777 100644
--- a/gcc/testsuite/c-c++-common/Wbidi-chars-16.c
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-16.c
@@ -1,6 +1,6 @@ 
 /* PR preprocessor/103026 */
 /* { dg-do compile } */
-/* { dg-options "-Wbidi-chars=any" } */
+/* { dg-options "-Wbidi-chars=any,ucn" } */
 /* Test LTR/RTL chars.  */
 
 /* LTR<‎> */
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-17.c b/gcc/testsuite/c-c++-common/Wbidi-chars-17.c
index 07cb4321f96..341922146a7 100644
--- a/gcc/testsuite/c-c++-common/Wbidi-chars-17.c
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-17.c
@@ -1,6 +1,6 @@ 
 /* PR preprocessor/103026 */
 /* { dg-do compile } */
-/* { dg-options "-Wbidi-chars=unpaired" } */
+/* { dg-options "-Wbidi-chars=unpaired,ucn" } */
 /* Test LTR/RTL chars.  */
 
 /* LTR<‎> */
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-18.c b/gcc/testsuite/c-c++-common/Wbidi-chars-18.c
new file mode 100644
index 00000000000..ae586d5e08c
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-18.c
@@ -0,0 +1,11 @@ 
+/* PR preprocessor/104030 */
+/* { dg-do compile } */
+/* By default, don't warn about UCNs.  */
+
+const char *
+fn ()
+{
+  const char *aText = "\u202D" "abc";
+/* { dg-bogus "unpaired" "" { target *-*-* } .-1 } */
+  return aText;
+}
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-19.c b/gcc/testsuite/c-c++-common/Wbidi-chars-19.c
new file mode 100644
index 00000000000..9985c3be7a5
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-19.c
@@ -0,0 +1,11 @@ 
+/* PR preprocessor/104030 */
+/* { dg-do compile } */
+/* { dg-options "-Wbidi-chars=unpaired,ucn" } */
+
+const char *
+fn ()
+{
+  const char *aText = "\u202D" "abc";
+/* { dg-warning "unpaired" "" { target *-*-* } .-1 } */
+  return aText;
+}
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-20.c b/gcc/testsuite/c-c++-common/Wbidi-chars-20.c
new file mode 100644
index 00000000000..859f3d53779
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-20.c
@@ -0,0 +1,11 @@ 
+/* PR preprocessor/104030 */
+/* { dg-do compile } */
+/* { dg-options "-Wbidi-chars=any" } */
+
+const char *
+fn ()
+{
+  const char *aText = "\u202D" "abc";
+/* { dg-bogus "U\\+202D" "" { target *-*-* } .-1 } */
+  return aText;
+}
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-21.c b/gcc/testsuite/c-c++-common/Wbidi-chars-21.c
new file mode 100644
index 00000000000..2720b8a883e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-21.c
@@ -0,0 +1,11 @@ 
+/* PR preprocessor/104030 */
+/* { dg-do compile } */
+/* { dg-options "-Wbidi-chars=ucn,any" } */
+
+const char *
+fn ()
+{
+  const char *aText = "\u202D" "abc";
+/* { dg-warning "U\\+202D" "" { target *-*-* } .-1 } */
+  return aText;
+}
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-22.c b/gcc/testsuite/c-c++-common/Wbidi-chars-22.c
new file mode 100644
index 00000000000..f960e597c59
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-22.c
@@ -0,0 +1,11 @@ 
+/* PR preprocessor/104030 */
+/* { dg-do compile } */
+/* { dg-options "-Wbidi-chars=none,ucn" } */
+
+const char *
+fn ()
+{
+  const char *aText = "\u202D" "abc";
+/* { dg-bogus "" "" { target *-*-* } .-1 } */
+  return aText;
+}
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-23.c b/gcc/testsuite/c-c++-common/Wbidi-chars-23.c
new file mode 100644
index 00000000000..7de0a11070a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-23.c
@@ -0,0 +1,11 @@ 
+/* PR preprocessor/104030 */
+/* { dg-do compile } */
+/* { dg-options "-Wbidi-chars=ucn" } */
+
+const char *
+fn ()
+{
+  const char *aText = "\u202D" "abc";
+/* { dg-warning "unpaired" "" { target *-*-* } .-1 } */
+  return aText;
+}
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-4.c b/gcc/testsuite/c-c++-common/Wbidi-chars-4.c
index 639e5c62e88..d2f0739dae0 100644
--- a/gcc/testsuite/c-c++-common/Wbidi-chars-4.c
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-4.c
@@ -1,6 +1,6 @@ 
 /* PR preprocessor/103026 */
 /* { dg-do compile } */
-/* { dg-options "-Wbidi-chars=any -Wno-multichar -Wno-overflow" } */
+/* { dg-options "-Wbidi-chars=any,ucn -Wno-multichar -Wno-overflow" } */
 /* Test all bidi chars in various contexts (identifiers, comments,
    string literals, character constants), both UCN and UTF-8.  The bidi
    chars here are properly terminated, except for the character constants.  */
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-5.c b/gcc/testsuite/c-c++-common/Wbidi-chars-5.c
index 68cb053144b..ad49498fe23 100644
--- a/gcc/testsuite/c-c++-common/Wbidi-chars-5.c
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-5.c
@@ -1,6 +1,6 @@ 
 /* PR preprocessor/103026 */
 /* { dg-do compile } */
-/* { dg-options "-Wbidi-chars=unpaired -Wno-multichar -Wno-overflow" } */
+/* { dg-options "-Wbidi-chars=unpaired,ucn -Wno-multichar -Wno-overflow" } */
 /* Test all bidi chars in various contexts (identifiers, comments,
    string literals, character constants), both UCN and UTF-8.  The bidi
    chars here are properly terminated, except for the character constants.  */
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-6.c b/gcc/testsuite/c-c++-common/Wbidi-chars-6.c
index 0ce6fff2dee..8c1c1b2a270 100644
--- a/gcc/testsuite/c-c++-common/Wbidi-chars-6.c
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-6.c
@@ -1,6 +1,6 @@ 
 /* PR preprocessor/103026 */
 /* { dg-do compile } */
-/* { dg-options "-Wbidi-chars=unpaired" } */
+/* { dg-options "-Wbidi-chars=ucn,unpaired" } */
 /* Test nesting of bidi chars in various contexts.  */
 
 /* Terminated by the wrong char:  */
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-7.c b/gcc/testsuite/c-c++-common/Wbidi-chars-7.c
index d012d420ec0..3270952a09a 100644
--- a/gcc/testsuite/c-c++-common/Wbidi-chars-7.c
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-7.c
@@ -1,6 +1,6 @@ 
 /* PR preprocessor/103026 */
 /* { dg-do compile } */
-/* { dg-options "-Wbidi-chars=any" } */
+/* { dg-options "-Wbidi-chars=any,ucn" } */
 /* Test we ignore UCNs in comments.  */
 
 // a b c \u202a 1 2 3
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-8.c b/gcc/testsuite/c-c++-common/Wbidi-chars-8.c
index 4f54c5092ec..3983168c9f1 100644
--- a/gcc/testsuite/c-c++-common/Wbidi-chars-8.c
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-8.c
@@ -1,6 +1,6 @@ 
 /* PR preprocessor/103026 */
 /* { dg-do compile } */
-/* { dg-options "-Wbidi-chars=any" } */
+/* { dg-options "-Wbidi-chars=any,ucn" } */
 /* Test \u vs \U.  */
 
 int a_\u202A;
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-9.c b/gcc/testsuite/c-c++-common/Wbidi-chars-9.c
index e2af1b1ca97..0ddb0d93108 100644
--- a/gcc/testsuite/c-c++-common/Wbidi-chars-9.c
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-9.c
@@ -1,6 +1,6 @@ 
 /* PR preprocessor/103026 */
 /* { dg-do compile } */
-/* { dg-options "-Wbidi-chars=unpaired" } */
+/* { dg-options "-Wbidi-chars=unpaired,ucn" } */
 /* Test that we properly separate bidi contexts (comment/identifier/character
    constant/string literal).  */
 
diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-ranges.c b/gcc/testsuite/c-c++-common/Wbidi-chars-ranges.c
index 298750a2a64..0c71f306dbc 100644
--- a/gcc/testsuite/c-c++-common/Wbidi-chars-ranges.c
+++ b/gcc/testsuite/c-c++-common/Wbidi-chars-ranges.c
@@ -1,6 +1,6 @@ 
 /* PR preprocessor/103026 */
 /* { dg-do compile } */
-/* { dg-options "-Wbidi-chars=unpaired -fdiagnostics-show-caret" } */
+/* { dg-options "-Wbidi-chars=unpaired,ucn -fdiagnostics-show-caret" } */
 /* Verify that we escape and underline pertinent bidirectional
    control characters when quoting the source.  */
 
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 940c79f98c1..3eba6f74b57 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -319,15 +319,16 @@  enum cpp_main_search
   CMS_system,  /* Search the system INCLUDE path.  */
 };
 
-/* The possible bidirectional control characters checking levels, from least
-   restrictive to most.  */
+/* The possible bidirectional control characters checking levels.  */
 enum cpp_bidirectional_level {
   /* No checking.  */
-  bidirectional_none,
+  bidirectional_none = 0,
   /* Only detect unpaired uses of bidirectional control characters.  */
-  bidirectional_unpaired,
+  bidirectional_unpaired = 1,
   /* Detect any use of bidirectional control characters.  */
-  bidirectional_any
+  bidirectional_any = 2,
+  /* Also warn about UCNs.  */
+  bidirectional_ucn = 4
 };
 
 /* This structure is nested inside struct cpp_reader, and
diff --git a/libcpp/internal.h b/libcpp/internal.h
index 364c41c8149..badfd1b40da 100644
--- a/libcpp/internal.h
+++ b/libcpp/internal.h
@@ -605,7 +605,8 @@  struct cpp_reader
      characters.  */
   bool warn_bidi_p () const
   {
-    return CPP_OPTION (this, cpp_warn_bidirectional) != bidirectional_none;
+    return (CPP_OPTION (this, cpp_warn_bidirectional)
+	    & (bidirectional_unpaired|bidirectional_any));
   }
 };
 
diff --git a/libcpp/lex.cc b/libcpp/lex.cc
index 4d736576cc1..fb1dfabb7af 100644
--- a/libcpp/lex.cc
+++ b/libcpp/lex.cc
@@ -1560,8 +1560,11 @@  class unpaired_bidi_rich_location : public rich_location
 static void
 maybe_warn_bidi_on_close (cpp_reader *pfile, const uchar *p)
 {
-  if (CPP_OPTION (pfile, cpp_warn_bidirectional) == bidirectional_unpaired
-      && bidi::vec.count () > 0)
+  const auto warn_bidi = CPP_OPTION (pfile, cpp_warn_bidirectional);
+  if (bidi::vec.count () > 0
+      && (warn_bidi & bidirectional_unpaired
+	  && (!bidi::current_ctx_ucn_p ()
+	      || (warn_bidi & bidirectional_ucn))))
     {
       const location_t loc
 	= linemap_position_for_column (pfile->line_table,
@@ -1597,7 +1600,7 @@  maybe_warn_bidi_on_char (cpp_reader *pfile, bidi::kind kind,
 
   const auto warn_bidi = CPP_OPTION (pfile, cpp_warn_bidirectional);
 
-  if (warn_bidi != bidirectional_none)
+  if (warn_bidi & (bidirectional_unpaired|bidirectional_any))
     {
       rich_location rich_loc (pfile->line_table, loc);
       rich_loc.set_escape_on_output (true);
@@ -1605,10 +1608,10 @@  maybe_warn_bidi_on_char (cpp_reader *pfile, bidi::kind kind,
       /* It seems excessive to warn about a PDI/PDF that is closing
 	 an opened context because we've already warned about the
 	 opening character.  Except warn when we have a UCN x UTF-8
-	 mismatch.  */
+	 mismatch, if UCN checking is enabled.  */
       if (kind == bidi::current_ctx ())
 	{
-	  if (warn_bidi == bidirectional_unpaired
+	  if (warn_bidi == (bidirectional_unpaired|bidirectional_ucn)
 	      && bidi::current_ctx_ucn_p () != ucn_p)
 	    {
 	      rich_loc.add_range (bidi::current_ctx_loc ());
@@ -1617,7 +1620,8 @@  maybe_warn_bidi_on_char (cpp_reader *pfile, bidi::kind kind,
 			      "a context by \"%s\"", bidi::to_str (kind));
 	    }
 	}
-      else if (warn_bidi == bidirectional_any)
+      else if (warn_bidi & bidirectional_any
+	       && (!ucn_p || (warn_bidi & bidirectional_ucn)))
 	{
 	  if (kind == bidi::kind::PDF || kind == bidi::kind::PDI)
 	    cpp_warning_at (pfile, CPP_W_BIDIRECTIONAL, &rich_loc,