diff mbox series

[v2] C, ObjC: Add -Wunterminated-string-initialization

Message ID 20230324133928.14753-1-alx@kernel.org
State New
Headers show
Series [v2] C, ObjC: Add -Wunterminated-string-initialization | expand

Commit Message

Alejandro Colomar March 24, 2023, 1:39 p.m. UTC
Warn about the following:

    char  s[3] = "foo";

Initializing a char array with a string literal of the same length as
the size of the array is usually a mistake.  Rarely is the case where
one wants to create a non-terminated character sequence from a string
literal.

In some cases, for writing faster code, one may want to use arrays
instead of pointers, since that removes the need for storing an array of
pointers apart from the strings themselves.

    char  *log_levels[]   = { "info", "warning", "err" };
vs.
    char  log_levels[][7] = { "info", "warning", "err" };

This forces the programmer to specify a size, which might change if a
new entry is later added.  Having no way to enforce null termination is
very dangerous, however, so it is useful to have a warning for this, so
that the compiler can make sure that the programmer didn't make any
mistakes.  This warning catches the bug above, so that the programmer
will be able to fix it and write:

    char  log_levels[][8] = { "info", "warning", "err" };

This warning already existed as part of -Wc++-compat, but this patch
allows enabling it separately.  It is also included in -Wextra, since
it may not always be desired (when unterminated character sequences are
wanted), but it's likely to be desired in most cases.

Link: <https://lists.gnu.org/archive/html/groff/2022-11/msg00059.html>
Link: <https://lists.gnu.org/archive/html/groff/2022-11/msg00063.html>
Link: <https://inbox.sourceware.org/gcc/36da94eb-1cac-5ae8-7fea-ec66160cf413@gmail.com/T/>
Acked-by: Doug McIlroy <douglas.mcilroy@dartmouth.edu>
Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Ralph Corderoy <ralph@inputplus.co.uk>
Cc: Dave Kemper <saint.snit@gmail.com>
Cc: Larry McVoy <lm@mcvoy.com>
Cc: Andrew Pinski <pinskia@gmail.com>
Cc: Jonathan Wakely <jwakely.gcc@gmail.com>
Cc: Andrew Clayton <andrew@digital-domain.net>
Cc: Martin Uecker <muecker@gwdg.de>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---

Hi,

I sent v1 to the wrong list.  This time I've made sure to write to
gcc-patches@.

v2 adds some draft of a test, as suggested by Martin.  However, I don't
know yet how to write those, so the test is just a draft.  But I did
test the feature, by compiling GCC and compiling some small program with
it.

Cheers,
Alex


Range-diff against v1:
1:  61ddf1eb816 ! 1:  e40d8f54942 C, ObjC: Add -Wunterminated-string-initialization
    @@ Commit message
         Cc: Andrew Pinski <pinskia@gmail.com>
         Cc: Jonathan Wakely <jwakely.gcc@gmail.com>
         Cc: Andrew Clayton <andrew@digital-domain.net>
    +    Cc: Martin Uecker <muecker@gwdg.de>
         Signed-off-by: Alejandro Colomar <alx@kernel.org>
     
      ## gcc/c-family/c.opt ##
    @@ gcc/c/c-typeck.cc: digest_init (location_t init_loc, tree type, tree init, tree
      	      if (compare_tree_int (TYPE_SIZE_UNIT (type), len) < 0)
      		{
      		  unsigned HOST_WIDE_INT size
    +
    + ## gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c (new) ##
    +@@
    ++/* { dg-do compile } */
    ++/* { dg-options "-Wunterminated-string-initialization" } */
    ++
    ++char a1[] = "a";
    ++char a2[1] = "a";	/* { dg-warning "unterminated char sequence" } */
    ++char a3[2] = "a";

 gcc/c-family/c.opt                                         | 4 ++++
 gcc/c/c-typeck.cc                                          | 6 +++---
 gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c | 6 ++++++
 3 files changed, 13 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c

Comments

David Malcolm March 24, 2023, 2:53 p.m. UTC | #1
On Fri, 2023-03-24 at 14:39 +0100, Alejandro Colomar via Gcc-patches
wrote:
> Warn about the following:
> 
>     char  s[3] = "foo";
> 
> Initializing a char array with a string literal of the same length as
> the size of the array is usually a mistake.  Rarely is the case where
> one wants to create a non-terminated character sequence from a string
> literal.
> 
> In some cases, for writing faster code, one may want to use arrays
> instead of pointers, since that removes the need for storing an array
> of
> pointers apart from the strings themselves.
> 
>     char  *log_levels[]   = { "info", "warning", "err" };
> vs.
>     char  log_levels[][7] = { "info", "warning", "err" };
> 
> This forces the programmer to specify a size, which might change if a
> new entry is later added.  Having no way to enforce null termination
> is
> very dangerous, however, so it is useful to have a warning for this,
> so
> that the compiler can make sure that the programmer didn't make any
> mistakes.  This warning catches the bug above, so that the programmer
> will be able to fix it and write:
> 
>     char  log_levels[][8] = { "info", "warning", "err" };
> 
> This warning already existed as part of -Wc++-compat, but this patch
> allows enabling it separately.  It is also included in -Wextra, since
> it may not always be desired (when unterminated character sequences
> are
> wanted), but it's likely to be desired in most cases.
> 
> Link:
> <https://lists.gnu.org/archive/html/groff/2022-11/msg00059.html>
> Link:
> <https://lists.gnu.org/archive/html/groff/2022-11/msg00063.html>
> Link:
> <https://inbox.sourceware.org/gcc/36da94eb-1cac-5ae8-7fea-ec66160cf41
> 3@gmail.com/T/>
> Acked-by: Doug McIlroy <douglas.mcilroy@dartmouth.edu>
> Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
> Cc: Ralph Corderoy <ralph@inputplus.co.uk>
> Cc: Dave Kemper <saint.snit@gmail.com>
> Cc: Larry McVoy <lm@mcvoy.com>
> Cc: Andrew Pinski <pinskia@gmail.com>
> Cc: Jonathan Wakely <jwakely.gcc@gmail.com>
> Cc: Andrew Clayton <andrew@digital-domain.net>
> Cc: Martin Uecker <muecker@gwdg.de>
> Signed-off-by: Alejandro Colomar <alx@kernel.org>
> ---
> 
> Hi,

Hi Alex, thanks for the patch.

> 
> I sent v1 to the wrong list.  This time I've made sure to write to
> gcc-patches@.

Note that right now we're deep in bug-fixing/stabilization for GCC 13
(and trunk is still tracking that effort), so your patch might be more
suitable for GCC 14.

> 
> v2 adds some draft of a test, as suggested by Martin.  However, I
> don't
> know yet how to write those, so the test is just a draft.  But I did
> test the feature, by compiling GCC and compiling some small program
> with
> it.

Unfortunately the answer to the question "how do I run just one
testcase in GCC's testsuite" is rather non-trivial; FWIW I've written
up some notes on working with the GCC testsuite here:
https://gcc-newbies-guide.readthedocs.io/en/latest/working-with-the-testsuite.html

Hope this is helpful
Dave


[...snip...]
Alejandro Colomar March 24, 2023, 5:45 p.m. UTC | #2
Hi David,

On 3/24/23 15:53, David Malcolm wrote:
> On Fri, 2023-03-24 at 14:39 +0100, Alejandro Colomar via Gcc-patches
> wrote:
>> Warn about the following:
>>
>>     char  s[3] = "foo";
>>
[...]

>> ---
>>
>> Hi,
> 
> Hi Alex, thanks for the patch.

:)

> 
>>
>> I sent v1 to the wrong list.  This time I've made sure to write to
>> gcc-patches@.
> 
> Note that right now we're deep in bug-fixing/stabilization for GCC 13
> (and trunk is still tracking that effort), so your patch might be more
> suitable for GCC 14.

Sure, no problem.  Do you have a "next" branch where you pick patches
for after the release, or should I resend after the release?  Is
discussion of a patch reasonable now, or is it too much distracting
from your stabilization efforts?

> 
>>
>> v2 adds some draft of a test, as suggested by Martin.  However, I
>> don't
>> know yet how to write those, so the test is just a draft.  But I did
>> test the feature, by compiling GCC and compiling some small program
>> with
>> it.
> 
> Unfortunately the answer to the question "how do I run just one
> testcase in GCC's testsuite" is rather non-trivial; FWIW I've written
> up some notes on working with the GCC testsuite here:
> https://gcc-newbies-guide.readthedocs.io/en/latest/working-with-the-testsuite.html

Hmm, I'll try following that; thanks!  Is there anything obvious that
I might have missed, at first glance?

> 
> Hope this is helpful

Yup.  Thanks!

> Dave

Cheers,
Alex
David Malcolm March 24, 2023, 5:58 p.m. UTC | #3
On Fri, 2023-03-24 at 18:45 +0100, Alejandro Colomar wrote:
> Hi David,
> 
> On 3/24/23 15:53, David Malcolm wrote:
> > On Fri, 2023-03-24 at 14:39 +0100, Alejandro Colomar via Gcc-
> > patches
> > wrote:
> > > Warn about the following:
> > > 
> > >     char  s[3] = "foo";
> > > 
> [...]
> 
> > > ---
> > > 
> > > Hi,
> > 
> > Hi Alex, thanks for the patch.
> 
> :)
> 
> > 
> > > 
> > > I sent v1 to the wrong list.  This time I've made sure to write
> > > to
> > > gcc-patches@.
> > 
> > Note that right now we're deep in bug-fixing/stabilization for GCC
> > 13
> > (and trunk is still tracking that effort), so your patch might be
> > more
> > suitable for GCC 14.
> 
> Sure, no problem.  Do you have a "next" branch where you pick patches
> for after the release, or should I resend after the release?  

We don't; resending it after release is probably best.

> Is
> discussion of a patch reasonable now, or is it too much distracting
> from your stabilization efforts?

FWIW I'd prefer to postpone the discussion until after we branch for
the release.

> 
> > 
> > > 
> > > v2 adds some draft of a test, as suggested by Martin.  However, I
> > > don't
> > > know yet how to write those, so the test is just a draft.  But I
> > > did
> > > test the feature, by compiling GCC and compiling some small
> > > program
> > > with
> > > it.
> > 
> > Unfortunately the answer to the question "how do I run just one
> > testcase in GCC's testsuite" is rather non-trivial; FWIW I've
> > written
> > up some notes on working with the GCC testsuite here:
> > https://gcc-newbies-guide.readthedocs.io/en/latest/working-with-the-testsuite.html
> 
> Hmm, I'll try following that; thanks!  Is there anything obvious that
> I might have missed, at first glance?

The main thing is that there's a difference between compiling the test
case "by hand", versus doing it through the test harness - the latter
sets up the environment in a particular way, injects a particular set
of flags, etc etc.

Dave
Alejandro Colomar April 20, 2023, 5:17 p.m. UTC | #4
Hi David,

On 3/24/23 18:58, David Malcolm wrote:
> On Fri, 2023-03-24 at 18:45 +0100, Alejandro Colomar wrote:
>> Hi David,
>>
>> On 3/24/23 15:53, David Malcolm wrote:
>>> On Fri, 2023-03-24 at 14:39 +0100, Alejandro Colomar via Gcc-
>>> patches
>>> wrote:
>>>> Warn about the following:
>>>>
>>>>     char  s[3] = "foo";
>>>>
>> [...]
>>
>>>> ---
>>>>
>>>> Hi,
>>>
>>> Hi Alex, thanks for the patch.
>>
>> :)
>>
>>>
>>>>
>>>> I sent v1 to the wrong list.  This time I've made sure to write
>>>> to
>>>> gcc-patches@.
>>>
>>> Note that right now we're deep in bug-fixing/stabilization for GCC
>>> 13
>>> (and trunk is still tracking that effort), so your patch might be
>>> more
>>> suitable for GCC 14.
>>
>> Sure, no problem.  Do you have a "next" branch where you pick patches
>> for after the release, or should I resend after the release?  
> 
> We don't; resending it after release is probably best.
> 
>> Is
>> discussion of a patch reasonable now, or is it too much distracting
>> from your stabilization efforts?
> 
> FWIW I'd prefer to postpone the discussion until after we branch for
> the release.

Sure.  AFAIK it's fair game already to propose patches to GCC 14,
right?  Would you please have a look into this?  Thanks!

> 
>>
>>>
>>>>
>>>> v2 adds some draft of a test, as suggested by Martin.  However, I
>>>> don't
>>>> know yet how to write those, so the test is just a draft.  But I
>>>> did
>>>> test the feature, by compiling GCC and compiling some small
>>>> program
>>>> with
>>>> it.
>>>
>>> Unfortunately the answer to the question "how do I run just one
>>> testcase in GCC's testsuite" is rather non-trivial; FWIW I've
>>> written
>>> up some notes on working with the GCC testsuite here:
>>> https://gcc-newbies-guide.readthedocs.io/en/latest/working-with-the-testsuite.html
>>
>> Hmm, I'll try following that; thanks!  Is there anything obvious that
>> I might have missed, at first glance?
> 
> The main thing is that there's a difference between compiling the test
> case "by hand", versus doing it through the test harness - the latter
> sets up the environment in a particular way, injects a particular set
> of flags, etc etc.

I forgot about this; I'll have a look into it when I find some time.

Cheers,
Alex

> 
> Dave
>
Alejandro Colomar Oct. 1, 2023, 12:55 a.m. UTC | #5
Hi David,

Sorry for the half-year delay!  I have some update.  :)

On Fri, Mar 24, 2023 at 10:53:22AM -0400, David Malcolm wrote:
> On Fri, 2023-03-24 at 14:39 +0100, Alejandro Colomar via Gcc-patches
> wrote:
> > Warn about the following:
> > 
> >     char  s[3] = "foo";
> > 
> > Initializing a char array with a string literal of the same length as
> > the size of the array is usually a mistake.  Rarely is the case where
> > one wants to create a non-terminated character sequence from a string
> > literal.
> > 
> > In some cases, for writing faster code, one may want to use arrays
> > instead of pointers, since that removes the need for storing an array
> > of
> > pointers apart from the strings themselves.
> > 
> >     char  *log_levels[]   = { "info", "warning", "err" };
> > vs.
> >     char  log_levels[][7] = { "info", "warning", "err" };
> > 
> > This forces the programmer to specify a size, which might change if a
> > new entry is later added.  Having no way to enforce null termination
> > is
> > very dangerous, however, so it is useful to have a warning for this,
> > so
> > that the compiler can make sure that the programmer didn't make any
> > mistakes.  This warning catches the bug above, so that the programmer
> > will be able to fix it and write:
> > 
> >     char  log_levels[][8] = { "info", "warning", "err" };
> > 
> > This warning already existed as part of -Wc++-compat, but this patch
> > allows enabling it separately.  It is also included in -Wextra, since
> > it may not always be desired (when unterminated character sequences
> > are
> > wanted), but it's likely to be desired in most cases.
> > 
> > Link:
> > <https://lists.gnu.org/archive/html/groff/2022-11/msg00059.html>
> > Link:
> > <https://lists.gnu.org/archive/html/groff/2022-11/msg00063.html>
> > Link:
> > <https://inbox.sourceware.org/gcc/36da94eb-1cac-5ae8-7fea-ec66160cf41
> > 3@gmail.com/T/>
> > Acked-by: Doug McIlroy <douglas.mcilroy@dartmouth.edu>
> > Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
> > Cc: Ralph Corderoy <ralph@inputplus.co.uk>
> > Cc: Dave Kemper <saint.snit@gmail.com>
> > Cc: Larry McVoy <lm@mcvoy.com>
> > Cc: Andrew Pinski <pinskia@gmail.com>
> > Cc: Jonathan Wakely <jwakely.gcc@gmail.com>
> > Cc: Andrew Clayton <andrew@digital-domain.net>
> > Cc: Martin Uecker <muecker@gwdg.de>
> > Signed-off-by: Alejandro Colomar <alx@kernel.org>
> > ---
> > 
> > Hi,
> 
> Hi Alex, thanks for the patch.
> 
> > 
> > I sent v1 to the wrong list.  This time I've made sure to write to
> > gcc-patches@.
> 
> Note that right now we're deep in bug-fixing/stabilization for GCC 13
> (and trunk is still tracking that effort), so your patch might be more
> suitable for GCC 14.
> 
> > 
> > v2 adds some draft of a test, as suggested by Martin.  However, I
> > don't
> > know yet how to write those, so the test is just a draft.  But I did
> > test the feature, by compiling GCC and compiling some small program
> > with
> > it.
> 
> Unfortunately the answer to the question "how do I run just one
> testcase in GCC's testsuite" is rather non-trivial; FWIW I've written
> up some notes on working with the GCC testsuite here:
> https://gcc-newbies-guide.readthedocs.io/en/latest/working-with-the-testsuite.html
> 
> Hope this is helpful
> Dave
> 
> 
> [...snip...]
> 

I ran the tests, and get some unexpected failure.  I used dg-warning,
but maybe I used it wrong?  Here's the output:

```
output is:
/home/alx/src/gnu/gcc/wustr/gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c:5:14: warning: initializer-string for array of 'char' is too long [-Wunterminated-string-initi
alization]

FAIL: gcc.dg/Wunterminated-string-initialization.c  (test for warnings, line 5)
```

And here's the test:

```
/* { dg-do compile } */
/* { dg-options "-Wunterminated-string-initialization" } */

char a1[] = "a";
char a2[1] = "a";	/* { dg-warning "unterminated char sequence" } */
char a3[2] = "a";
```

Why isn't it expecting the warning?

Thanks,
Alex
Martin Uecker Oct. 1, 2023, 7:37 a.m. UTC | #6
(I shortened the recipient list)

Am Sonntag, dem 01.10.2023 um 02:55 +0200 schrieb Alejandro Colomar:

> > 
...
> I ran the tests, and get some unexpected failure.  I used dg-warning,
> but maybe I used it wrong?  Here's the output:
> 
> ```
> output is:
> /home/alx/src/gnu/gcc/wustr/gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c:5:14: warning: initializer-string for array of 'char' is too long [-Wunterminated-string-initi
> alization]
> 
> FAIL: gcc.dg/Wunterminated-string-initialization.c  (test for warnings, line 5)
> ```
> 
> And here's the test:
> 
> ```
> /* { dg-do compile } */
> /* { dg-options "-Wunterminated-string-initialization" } */
> 
> char a1[] = "a";
> char a2[1] = "a";	/* { dg-warning "unterminated char sequence" } */
> char a3[2] = "a";
> ```
> 
> Why isn't it expecting the warning?

Because the text does not match the actual output above?
You should see an additional FAIL for an excess warning.

Martin
Alejandro Colomar Oct. 1, 2023, 11:35 a.m. UTC | #7
On Sun, Oct 01, 2023 at 09:37:59AM +0200, Martin Uecker wrote:
> 
> (I shortened the recipient list)
> 
> Am Sonntag, dem 01.10.2023 um 02:55 +0200 schrieb Alejandro Colomar:
> 
> > > 
> ...
> > I ran the tests, and get some unexpected failure.  I used dg-warning,
> > but maybe I used it wrong?  Here's the output:
> > 
> > ```
> > output is:
> > /home/alx/src/gnu/gcc/wustr/gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c:5:14: warning: initializer-string for array of 'char' is too long [-Wunterminated-string-initi
> > alization]
> > 
> > FAIL: gcc.dg/Wunterminated-string-initialization.c  (test for warnings, line 5)
> > ```
> > 
> > And here's the test:
> > 
> > ```
> > /* { dg-do compile } */
> > /* { dg-options "-Wunterminated-string-initialization" } */
> > 
> > char a1[] = "a";
> > char a2[1] = "a";	/* { dg-warning "unterminated char sequence" } */
> > char a3[2] = "a";
> > ```
> > 
> > Why isn't it expecting the warning?
> 
> Because the text does not match the actual output above?

Makes sense.

Here's the output after a fix (which I'll send soon as a new revision):

```
output is:
/home/alx/src/gnu/gcc/wustr/gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c:5:14: warning: initializer-string for array of 'char' is too long [-Wunterminated-string-initialization]

```
and

```
                === gcc Summary ===

# of expected passes            2
```

The only thing that concerns me a little bit is that I don't see any
PASS (or XFAIL).

> You should see an additional FAIL for an excess warning.

Yep, I did.

Cheers,
Alex

> 
> Martin
> 
> 
> 
>
diff mbox series

Patch

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 3333cddeece..7f1fccfe02b 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1382,6 +1382,10 @@  Wunsuffixed-float-constants
 C ObjC Var(warn_unsuffixed_float_constants) Warning
 Warn about unsuffixed float constants.
 
+Wunterminated-string-initialization
+C ObjC Var(warn_unterminated_string_initialization) Warning LangEnabledBy(C ObjC,Wextra || Wc++-compat)
+Warn about character arrays initialized as unterminated character sequences by a string literal.
+
 Wunused
 C ObjC C++ ObjC++ LangEnabledBy(C ObjC C++ ObjC++,Wall)
 ; documented in common.opt
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 45bacc06c47..ce2750f98bb 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -8420,11 +8420,11 @@  digest_init (location_t init_loc, tree type, tree init, tree origtype,
 		pedwarn_init (init_loc, 0,
 			      ("initializer-string for array of %qT "
 			       "is too long"), typ1);
-	      else if (warn_cxx_compat
+	      else if (warn_unterminated_string_initialization
 		       && compare_tree_int (TYPE_SIZE_UNIT (type), len) < 0)
-		warning_at (init_loc, OPT_Wc___compat,
+		warning_at (init_loc, OPT_Wunterminated_string_initialization,
 			    ("initializer-string for array of %qT "
-			     "is too long for C++"), typ1);
+			     "is too long"), typ1);
 	      if (compare_tree_int (TYPE_SIZE_UNIT (type), len) < 0)
 		{
 		  unsigned HOST_WIDE_INT size
diff --git a/gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c b/gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c
new file mode 100644
index 00000000000..c6517702d51
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c
@@ -0,0 +1,6 @@ 
+/* { dg-do compile } */
+/* { dg-options "-Wunterminated-string-initialization" } */
+
+char a1[] = "a";
+char a2[1] = "a";	/* { dg-warning "unterminated char sequence" } */
+char a3[2] = "a";