From patchwork Sat Sep 29 16:06:49 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Law X-Patchwork-Id: 976641 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-486664-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="DwD4GSHP"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 42Mtj55nV3z9sCD for ; Sun, 30 Sep 2018 02:07:03 +1000 (AEST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:message-id:date:mime-version:content-type; q=dns; s= default; b=MqKU18PDuEfmD4+UoNoN/WfLUSPKGq7IKjrts8YD8OHp3+9Wjq5y9 rlPIgyhW/EBexTYrgIQaE73iCtpDOoXCRvQpRvt6Thipel7kgxkeOMK4CT7vTYIj XzM6bXPS10wSNhqmJI+fFb+Ah8ez6Cxa5nEIltOmSkHeFprC9hDPpI= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:message-id:date:mime-version:content-type; s= default; bh=Nxg0qfVs+g217wgwjqCttnGtQbc=; b=DwD4GSHP21WRgeWQ9ihR MSU8G6A3hD/34JaqBdKuMNJ7QJG8WqzT6lX3zmhUYrHbLooevuD5LZZugSkz8P5E 0GQteX+ESimqWUuvYT1BTYUHtyBnT+2GYFPB/UIBsEFL2Xn9Vbtnv5oI03OtIxCp kLnaxtKchuyO0tBRLEbL6oA= Received: (qmail 15787 invoked by alias); 29 Sep 2018 16:06:56 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 15763 invoked by uid 89); 29 Sep 2018 16:06:55 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=compiletime, compile-time, accumulating, H*M:985c X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Sat, 29 Sep 2018 16:06:53 +0000 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id CEC0186669 for ; Sat, 29 Sep 2018 16:06:51 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-4.rdu2.redhat.com [10.10.112.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id F10F02CFC0 for ; Sat, 29 Sep 2018 16:06:50 +0000 (UTC) From: Jeff Law Openpgp: preference=signencrypt To: gcc-patches Subject: [committed] Use structure to bubble up information about unterminated strings from c_strlen Message-ID: <5bac7adf-fad3-bd4a-985c-7f9d30bcebbd@redhat.com> Date: Sat, 29 Sep 2018 10:06:49 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.0 MIME-Version: 1.0 X-IsSubscribed: yes This patch changes the NONSTR argument to c_strlen to instead be a little data structure c_strlen can populate with nuggets of information about the string. There's clearly a need for the decl related to the non-string argument. I see an immediate need for the length of a non-terminated string (c_strlen returns NULL for non-terminated strings). I also see a need for the offset within the non-terminated strong as well. We only populate the structure when c_strlen encounters a non-terminated string. One could argue we should always fill in the members. Right now I think filling it in for unterminated cases makes the most sense, but I could be convinced otherwise. I won't be surprised if subsequent warnings from Martin need additional information about the string. The idea here is we can add more elements to the structure without continually adding arguments to c_strlen. Bootstrapped in isolation as well as with Martin's patches for strnlen and sprintf checking. Installing on the trunk. Jeff * builtins.c (unterminated_array): Pass in c_strlen_data * to c_strlen rather than just a tree *. (c_strlen): Change NONSTR argument to a c_strlen_data pointer. Update recursive calls appropriately. If caller did not provide a suitable data pointer, create a local one. When a non-terminated string is discovered, bubble up information about the string via the c_strlen_data object. * builtins.h (c_strlen): Update prototype. (c_strlen_data): New structure. * gimple-fold.c (get_range_strlen): Update calls to c_strlen. For a type 2 call, if c_strlen indicates a non-terminated string use the length of the non-terminated string. (gimple_fold_builtin_stpcpy): Update calls to c_strlen. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 7be6ceb3d62..23e0ec7b34d 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,19 @@ +2018-09-29 Jeff Law + + * builtins.c (unterminated_array): Pass in c_strlen_data * to + c_strlen rather than just a tree *. + (c_strlen): Change NONSTR argument to a c_strlen_data pointer. + Update recursive calls appropriately. If caller did not provide a + suitable data pointer, create a local one. When a non-terminated + string is discovered, bubble up information about the string via the + c_strlen_data object. + * builtins.h (c_strlen): Update prototype. + (c_strlen_data): New structure. + * gimple-fold.c (get_range_strlen): Update calls to c_strlen. + For a type 2 call, if c_strlen indicates a non-terminated string + use the length of the non-terminated string. + (gimple_fold_builtin_stpcpy): Update calls to c_strlen. + 2018-09-28 John David Anglin * match.pd (simple_comparison): Don't optimize if either operand is diff --git a/gcc/builtins.c b/gcc/builtins.c index e655623febd..fe411efd9a9 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -570,9 +570,10 @@ warn_string_no_nul (location_t loc, const char *fn, tree arg, tree decl) tree unterminated_array (tree exp) { - tree nonstr = NULL; - c_strlen (exp, 1, &nonstr); - return nonstr; + c_strlen_data data; + memset (&data, 0, sizeof (c_strlen_data)); + c_strlen (exp, 1, &data); + return data.decl; } /* Compute the length of a null-terminated character string or wide @@ -592,10 +593,12 @@ unterminated_array (tree exp) accesses. Note that this implies the result is not going to be emitted into the instruction stream. - If a not zero-terminated string value is encountered and NONSTR is - non-zero, the declaration of the string value is assigned to *NONSTR. - *NONSTR is accumulating, thus not cleared on success, therefore it has - to be initialized to NULL_TREE by the caller. + Additional information about the string accessed may be recorded + in DATA. For example, if SRC references an unterminated string, + then the declaration will be stored in the DECL field. If the + length of the unterminated string can be determined, it'll be + stored in the LEN field. Note this length could well be different + than what a C strlen call would return. ELTSIZE is 1 for normal single byte character strings, and 2 or 4 for wide characer strings. ELTSIZE is by default 1. @@ -603,8 +606,16 @@ unterminated_array (tree exp) The value returned is of type `ssizetype'. */ tree -c_strlen (tree src, int only_value, tree *nonstr, unsigned eltsize) +c_strlen (tree src, int only_value, c_strlen_data *data, unsigned eltsize) { + /* If we were not passed a DATA pointer, then get one to a local + structure. That avoids having to check DATA for NULL before + each time we want to use it. */ + c_strlen_data local_strlen_data; + memset (&local_strlen_data, 0, sizeof (c_strlen_data)); + if (!data) + data = &local_strlen_data; + gcc_checking_assert (eltsize == 1 || eltsize == 2 || eltsize == 4); STRIP_NOPS (src); if (TREE_CODE (src) == COND_EXPR @@ -612,15 +623,15 @@ c_strlen (tree src, int only_value, tree *nonstr, unsigned eltsize) { tree len1, len2; - len1 = c_strlen (TREE_OPERAND (src, 1), only_value, nonstr, eltsize); - len2 = c_strlen (TREE_OPERAND (src, 2), only_value, nonstr, eltsize); + len1 = c_strlen (TREE_OPERAND (src, 1), only_value, data, eltsize); + len2 = c_strlen (TREE_OPERAND (src, 2), only_value, data, eltsize); if (tree_int_cst_equal (len1, len2)) return len1; } if (TREE_CODE (src) == COMPOUND_EXPR && (only_value || !TREE_SIDE_EFFECTS (TREE_OPERAND (src, 0)))) - return c_strlen (TREE_OPERAND (src, 1), only_value, nonstr, eltsize); + return c_strlen (TREE_OPERAND (src, 1), only_value, data, eltsize); location_t loc = EXPR_LOC_OR_LOC (src, input_location); @@ -666,13 +677,15 @@ c_strlen (tree src, int only_value, tree *nonstr, unsigned eltsize) start searching for it. */ unsigned len = string_length (ptr, eltsize, strelts); - /* Return when an embedded null character is found or none at all. */ + /* Return when an embedded null character is found or none at all. + In the latter case, set the DECL/LEN field in the DATA structure + so that callers may examine them. */ if (len + 1 < strelts) return NULL_TREE; else if (len >= maxelts) { - if (nonstr && decl) - *nonstr = decl; + data->decl = decl; + data->len = ssize_int (len); return NULL_TREE; } @@ -737,11 +750,12 @@ c_strlen (tree src, int only_value, tree *nonstr, unsigned eltsize) strelts - eltoff); /* Don't know what to return if there was no zero termination. - Ideally this would turn into a gcc_checking_assert over time. */ + Ideally this would turn into a gcc_checking_assert over time. + Set DECL/LEN so callers can examine them. */ if (len >= maxelts - eltoff) { - if (nonstr && decl) - *nonstr = decl; + data->decl = decl; + data->len = ssize_int (len); return NULL_TREE; } @@ -3965,13 +3979,14 @@ expand_builtin_stpcpy_1 (tree exp, rtx target, machine_mode mode) compile-time, not an expression containing a string. This is because the latter will potentially produce pessimized code when used to produce the return value. */ - tree nonstr = NULL_TREE; + c_strlen_data data; + memset (&data, 0, sizeof (c_strlen_data)); if (!c_getstr (src, NULL) - || !(len = c_strlen (src, 0, &nonstr, 1))) + || !(len = c_strlen (src, 0, &data, 1))) return expand_movstr (dst, src, target, /*endp=*/2); - if (nonstr && !TREE_NO_WARNING (exp)) - warn_string_no_nul (EXPR_LOCATION (exp), "stpcpy", src, nonstr); + if (data.decl && !TREE_NO_WARNING (exp)) + warn_string_no_nul (EXPR_LOCATION (exp), "stpcpy", src, data.decl); lenp1 = size_binop_loc (loc, PLUS_EXPR, len, ssize_int (1)); ret = expand_builtin_mempcpy_args (dst, src, lenp1, @@ -8444,22 +8459,23 @@ fold_builtin_strlen (location_t loc, tree type, tree arg) return NULL_TREE; else { - tree nonstr = NULL_TREE; - tree len = c_strlen (arg, 0, &nonstr); + c_strlen_data data; + memset (&data, 0, sizeof (c_strlen_data)); + tree len = c_strlen (arg, 0, &data); if (len) return fold_convert_loc (loc, type, len); - if (!nonstr) - c_strlen (arg, 1, &nonstr); + if (!data.decl) + c_strlen (arg, 1, &data); - if (nonstr) + if (data.decl) { if (EXPR_HAS_LOCATION (arg)) loc = EXPR_LOCATION (arg); else if (loc == UNKNOWN_LOCATION) loc = input_location; - warn_string_no_nul (loc, "strlen", arg, nonstr); + warn_string_no_nul (loc, "strlen", arg, data.decl); } return NULL_TREE; diff --git a/gcc/builtins.h b/gcc/builtins.h index 45ad684cb52..3801251f372 100644 --- a/gcc/builtins.h +++ b/gcc/builtins.h @@ -57,7 +57,14 @@ extern bool get_pointer_alignment_1 (tree, unsigned int *, unsigned HOST_WIDE_INT *); extern unsigned int get_pointer_alignment (tree); extern unsigned string_length (const void*, unsigned, unsigned); -extern tree c_strlen (tree, int, tree * = NULL, unsigned = 1); +struct c_strlen_data +{ + tree decl; + tree len; + tree off; +}; + +extern tree c_strlen (tree, int, c_strlen_data * = NULL, unsigned = 1); extern void expand_builtin_setjmp_setup (rtx, rtx); extern void expand_builtin_setjmp_receiver (rtx); extern void expand_builtin_update_setjmp_buf (rtx); diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c index 1e84722d22d..cf04c92180b 100644 --- a/gcc/gimple-fold.c +++ b/gcc/gimple-fold.c @@ -1337,7 +1337,23 @@ get_range_strlen (tree arg, tree length[2], bitmap *visited, int type, return false; } else - val = c_strlen (arg, 1, nonstr, eltsize); + { + c_strlen_data data; + memset (&data, 0, sizeof (c_strlen_data)); + val = c_strlen (arg, 1, &data, eltsize); + + /* If we potentially had a non-terminated string, then + bubble that information up to the caller. */ + if (!val) + { + *nonstr = data.decl; + /* If TYPE is asking for a maximum, then use any + length (including the length of an unterminated + string) for VAL. */ + if (type == 2) + val = data.len; + } + } if (!val && fuzzy) { @@ -2812,21 +2828,22 @@ gimple_fold_builtin_stpcpy (gimple_stmt_iterator *gsi) } /* Set to non-null if ARG refers to an unterminated array. */ - tree nonstr = NULL; - tree len = c_strlen (src, 1, &nonstr, 1); + c_strlen_data data; + memset (&data, 0, sizeof (c_strlen_data)); + tree len = c_strlen (src, 1, &data, 1); if (!len || TREE_CODE (len) != INTEGER_CST) { - nonstr = unterminated_array (src); - if (!nonstr) + data.decl = unterminated_array (src); + if (!data.decl) return false; } - if (nonstr) + if (data.decl) { /* Avoid folding calls with unterminated arrays. */ if (!gimple_no_warning_p (stmt)) - warn_string_no_nul (loc, "stpcpy", src, nonstr); + warn_string_no_nul (loc, "stpcpy", src, data.decl); gimple_set_no_warning (stmt, true); return false; }