diff mbox series

strlen: Punt on UB reads past end of string literal [PR94187]

Message ID 20200317083950.GW2156@tucnak
State New
Headers show
Series strlen: Punt on UB reads past end of string literal [PR94187] | expand

Commit Message

Li, Pan2 via Gcc-patches March 17, 2020, 8:39 a.m. UTC
Hi!

The gcc.dg/pr68785.c test which contains:
int
foo (void)
{
  return *(int *) "";
}
has UB in the program if it is ever called, but causes UB in the compiler
as well as at least in theory non-reproduceable code generation.
The problem is that nbytes is in this case 4, prep is the
TREE_STRING_POINTER of a "" string literal with TREE_STRING_LENGTH of 1 and
we do:
4890		  for (const char *p = prep; p != prep + nbytes; ++p)
4891		    if (*p)
4892		      {
4893			*allnul = false;
4894			break;
4895		      }
and so read the bytes after the STRING_CST payload, which can be random.
I think we should just punt in this case.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-03-16  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/94187
	* tree-ssa-strlen.c (count_nonzero_bytes): Punt if
	nchars - offset < nbytes.



	Jakub

Comments

Li, Pan2 via Gcc-patches March 17, 2020, 9:05 a.m. UTC | #1
On Tue, Mar 17, 2020 at 9:40 AM Jakub Jelinek via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi!
>
> The gcc.dg/pr68785.c test which contains:
> int
> foo (void)
> {
>   return *(int *) "";
> }
> has UB in the program if it is ever called, but causes UB in the compiler
> as well as at least in theory non-reproduceable code generation.
> The problem is that nbytes is in this case 4, prep is the
> TREE_STRING_POINTER of a "" string literal with TREE_STRING_LENGTH of 1 and
> we do:
> 4890              for (const char *p = prep; p != prep + nbytes; ++p)
> 4891                if (*p)
> 4892                  {
> 4893                    *allnul = false;
> 4894                    break;
> 4895                  }
> and so read the bytes after the STRING_CST payload, which can be random.
> I think we should just punt in this case.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

> 2020-03-16  Jakub Jelinek  <jakub@redhat.com>
>
>         PR tree-optimization/94187
>         * tree-ssa-strlen.c (count_nonzero_bytes): Punt if
>         nchars - offset < nbytes.
>
> --- gcc/tree-ssa-strlen.c.jj    2020-03-14 08:14:47.034742349 +0100
> +++ gcc/tree-ssa-strlen.c       2020-03-16 12:23:57.523534887 +0100
> @@ -4822,6 +4822,8 @@ count_nonzero_bytes (tree exp, unsigned
>            of the access), set it here to the size of the string, including
>            all internal and trailing nuls if the string has any.  */
>         nbytes = nchars - offset;
> +      else if (nchars - offset < nbytes)
> +       return false;
>
>        prep = TREE_STRING_POINTER (exp) + offset;
>      }
>
>
>         Jakub
>
Li, Pan2 via Gcc-patches March 17, 2020, 3:01 p.m. UTC | #2
On 3/17/20 2:39 AM, Jakub Jelinek wrote:
> Hi!
> 
> The gcc.dg/pr68785.c test which contains:
> int
> foo (void)
> {
>    return *(int *) "";
> }
> has UB in the program if it is ever called, but causes UB in the compiler
> as well as at least in theory non-reproduceable code generation.
> The problem is that nbytes is in this case 4, prep is the
> TREE_STRING_POINTER of a "" string literal with TREE_STRING_LENGTH of 1

At least as important as avoiding the Valgrind error is detecting
the bug in the code.  -Warray-bounds has all it needs to diagnose
it, but, regrettably, it doesn't.  I raised PR 94195 to make sure
it does.

Martin


  and
> we do:
> 4890		  for (const char *p = prep; p != prep + nbytes; ++p)
> 4891		    if (*p)
> 4892		      {
> 4893			*allnul = false;
> 4894			break;
> 4895		      }
> and so read the bytes after the STRING_CST payload, which can be random.
> I think we should just punt in this case.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2020-03-16  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/94187
> 	* tree-ssa-strlen.c (count_nonzero_bytes): Punt if
> 	nchars - offset < nbytes.
> 
> --- gcc/tree-ssa-strlen.c.jj	2020-03-14 08:14:47.034742349 +0100
> +++ gcc/tree-ssa-strlen.c	2020-03-16 12:23:57.523534887 +0100
> @@ -4822,6 +4822,8 @@ count_nonzero_bytes (tree exp, unsigned
>   	   of the access), set it here to the size of the string, including
>   	   all internal and trailing nuls if the string has any.  */
>   	nbytes = nchars - offset;
> +      else if (nchars - offset < nbytes)
> +	return false;
>   
>         prep = TREE_STRING_POINTER (exp) + offset;
>       }
> 
> 
> 	Jakub
>
diff mbox series

Patch

--- gcc/tree-ssa-strlen.c.jj	2020-03-14 08:14:47.034742349 +0100
+++ gcc/tree-ssa-strlen.c	2020-03-16 12:23:57.523534887 +0100
@@ -4822,6 +4822,8 @@  count_nonzero_bytes (tree exp, unsigned
 	   of the access), set it here to the size of the string, including
 	   all internal and trailing nuls if the string has any.  */
 	nbytes = nchars - offset;
+      else if (nchars - offset < nbytes)
+	return false;
 
       prep = TREE_STRING_POINTER (exp) + offset;
     }