diff mbox series

Fix wrong code with truncated string literals (PR 86711/86714)

Message ID DB6PR0701MB266438272BCFD277C6600399E4280@DB6PR0701MB2664.eurprd07.prod.outlook.com
State New
Headers show
Series Fix wrong code with truncated string literals (PR 86711/86714) | expand

Commit Message

Bernd Edlinger July 29, 2018, 10:56 a.m. UTC
Hi!

This fixes two wrong code bugs where string_constant
returns over length string constants.  Initializers
like that are rejected in C++, but valid in C.

I have xfailed strlenopt-49.c, which tests this feature.
Personally I don't think that it is worth the effort to
optimize something that is per se invalid in C++.

The hunk in builtins.c is unrelated, but I would like
to use tree_to_shwi here, to be in line with the
tree_fits_shwi_p check above, and the rest of that
function which uses signed HWI throughout.


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.
gcc:
2018-07-29  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	PR middle-end/86711
	PR middle-end/86714
	* builtins.c (c_strlen): Use tree_to_shwi because maxelts is signed.
	* expr.c (string_constant): Don't return truncated string literals.
	* fold-const.c (c_getstr): Fix function comment.  Remove unused third
	argument.  Fix range checks.
	* fold-const.c (c_getstr): Adjust protoype.

testsuite:
2018-07-29  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	PR middle-end/86711
	PR middle-end/86714
	* gcc.c-torture/execute/pr86711.c: New test.
	* gcc.c-torture/execute/pr86714.c: New test.
	* gcc.dg/strlenopt-49.c: Add xfail.

Comments

Martin Sebor July 29, 2018, 11:05 p.m. UTC | #1
On 07/29/2018 04:56 AM, Bernd Edlinger wrote:
> Hi!
>
> This fixes two wrong code bugs where string_constant
> returns over length string constants.  Initializers
> like that are rejected in C++, but valid in C.

If by valid you are referring to declarations like the one in
the added test:

     const char a[2][3] = { "1234", "xyz" };

then (as I explained), the excess elements in "1234" make
the char[3] initialization and thus the test case undefined.
I have resolved bug 86711 as invalid on those grounds.

Bug 86711 has a valid test case that needs to be fixed, along
with bug 86688 that I raised for the same underlying problem:
considering the excess nul as part of the string.  As has been
discussed in a separate bug, rather than working around
the excessively long strings in the middle-end, it would be
preferable to avoid creating them to begin with.

I'm already working on a fix for bug 86688, in part because
I introduced the code change and also because I'm making other
changes in this area -- bug 86552.  Both of these in response
to your comments.

I would normally welcome someone else helping with my work
but (as I already made clear last week) it's counteproductive
to have two people working in the very same area at the same
time, especially when they are working at cross purposes as
you seem to be hell-bent on doing.

> I have xfailed strlenopt-49.c, which tests this feature.

That's not appropriate.  The purpose of the test is to verify
the fix for bug 86428: namely, that a call to strlen() on
an array initialized with a string of the same length is
folded, such as in:

     const char b[4] = "123\0";

That's a valid initializer that can and should continue to be
folded.  The test needs to continue to exercise that feature.

The test also happens to exercise invalid/overlong initializers.
This is because that, in my view, it's safer to fold the result
of such calls to a constant than than to call the library
function and have it either return an unpredictable value or
perhaps even crash.

> Personally I don't think that it is worth the effort to
> optimize something that is per se invalid in C++.

This is a C test, not C++.  (I don't suppose you are actually
saying that only the common subset between C and C++ is worth
optimizing.)

Just in case it isn't clear from the above: the point of
the test exercising the behavior for overlong strings isn't
optimizing undefined behavior but rather avoiding the worst
consequences of it.  I have already explained this once
before so I'm starting to wonder if I'm being insufficiently
clear or if you are not receiving or reading (or understanding)
my responses.  We can have a broader discussion about whether
this is the best approach for GCC to take either in this instance
or in general, but in the meantime I would appreciate it if you
refrained from undoing my changes just because you don't agree
with or don't understand the motivation behind them.

Martin

PS I continue to wonder about your motivation and ethics.  It's
rare to have someone so openly, blatantly and persistently try
to undermine someone else's work.
Richard Biener July 30, 2018, 6:57 a.m. UTC | #2
On Sun, 29 Jul 2018, Martin Sebor wrote:

> On 07/29/2018 04:56 AM, Bernd Edlinger wrote:
> > Hi!
> > 
> > This fixes two wrong code bugs where string_constant
> > returns over length string constants.  Initializers
> > like that are rejected in C++, but valid in C.
> 
> If by valid you are referring to declarations like the one in
> the added test:
> 
>     const char a[2][3] = { "1234", "xyz" };
> 
> then (as I explained), the excess elements in "1234" make
> the char[3] initialization and thus the test case undefined.
> I have resolved bug 86711 as invalid on those grounds.
> 
> Bug 86711 has a valid test case that needs to be fixed, along
> with bug 86688 that I raised for the same underlying problem:
> considering the excess nul as part of the string.  As has been
> discussed in a separate bug, rather than working around
> the excessively long strings in the middle-end, it would be
> preferable to avoid creating them to begin with.
> 
> I'm already working on a fix for bug 86688, in part because
> I introduced the code change and also because I'm making other
> changes in this area -- bug 86552.  Both of these in response
> to your comments.
> 
> I would normally welcome someone else helping with my work
> but (as I already made clear last week) it's counteproductive
> to have two people working in the very same area at the same
> time, especially when they are working at cross purposes as
> you seem to be hell-bent on doing.
> 
> > I have xfailed strlenopt-49.c, which tests this feature.
> 
> That's not appropriate.  The purpose of the test is to verify
> the fix for bug 86428: namely, that a call to strlen() on
> an array initialized with a string of the same length is
> folded, such as in:
> 
>     const char b[4] = "123\0";
> 
> That's a valid initializer that can and should continue to be
> folded.  The test needs to continue to exercise that feature.
> 
> The test also happens to exercise invalid/overlong initializers.
> This is because that, in my view, it's safer to fold the result
> of such calls to a constant than than to call the library
> function and have it either return an unpredictable value or
> perhaps even crash.
> 
> > Personally I don't think that it is worth the effort to
> > optimize something that is per se invalid in C++.
> 
> This is a C test, not C++.  (I don't suppose you are actually
> saying that only the common subset between C and C++ is worth
> optimizing.)
> 
> Just in case it isn't clear from the above: the point of
> the test exercising the behavior for overlong strings isn't
> optimizing undefined behavior but rather avoiding the worst
> consequences of it.  I have already explained this once
> before so I'm starting to wonder if I'm being insufficiently
> clear or if you are not receiving or reading (or understanding)
> my responses.  We can have a broader discussion about whether
> this is the best approach for GCC to take either in this instance
> or in general, but in the meantime I would appreciate it if you
> refrained from undoing my changes just because you don't agree
> with or don't understand the motivation behind them.
> 
> Martin
> 
> PS I continue to wonder about your motivation and ethics.  It's
> rare to have someone so openly, blatantly and persistently try
> to undermine someone else's work.

Martin, you are clearly the one being hostile here - Bernd is trying
to help.  In fact his patches are more focused, easier to undestand
and thus easier to review.

Get your feelings about "ownership" under control.

Richard.
Martin Sebor July 30, 2018, 3 p.m. UTC | #3
On 07/30/2018 12:57 AM, Richard Biener wrote:
> On Sun, 29 Jul 2018, Martin Sebor wrote:
>
>> On 07/29/2018 04:56 AM, Bernd Edlinger wrote:
>>> Hi!
>>>
>>> This fixes two wrong code bugs where string_constant
>>> returns over length string constants.  Initializers
>>> like that are rejected in C++, but valid in C.
>>
>> If by valid you are referring to declarations like the one in
>> the added test:
>>
>>     const char a[2][3] = { "1234", "xyz" };
>>
>> then (as I explained), the excess elements in "1234" make
>> the char[3] initialization and thus the test case undefined.
>> I have resolved bug 86711 as invalid on those grounds.
>>
>> Bug 86711 has a valid test case that needs to be fixed, along
>> with bug 86688 that I raised for the same underlying problem:
>> considering the excess nul as part of the string.  As has been
>> discussed in a separate bug, rather than working around
>> the excessively long strings in the middle-end, it would be
>> preferable to avoid creating them to begin with.
>>
>> I'm already working on a fix for bug 86688, in part because
>> I introduced the code change and also because I'm making other
>> changes in this area -- bug 86552.  Both of these in response
>> to your comments.
>>
>> I would normally welcome someone else helping with my work
>> but (as I already made clear last week) it's counteproductive
>> to have two people working in the very same area at the same
>> time, especially when they are working at cross purposes as
>> you seem to be hell-bent on doing.
>>
>>> I have xfailed strlenopt-49.c, which tests this feature.
>>
>> That's not appropriate.  The purpose of the test is to verify
>> the fix for bug 86428: namely, that a call to strlen() on
>> an array initialized with a string of the same length is
>> folded, such as in:
>>
>>     const char b[4] = "123\0";
>>
>> That's a valid initializer that can and should continue to be
>> folded.  The test needs to continue to exercise that feature.
>>
>> The test also happens to exercise invalid/overlong initializers.
>> This is because that, in my view, it's safer to fold the result
>> of such calls to a constant than than to call the library
>> function and have it either return an unpredictable value or
>> perhaps even crash.
>>
>>> Personally I don't think that it is worth the effort to
>>> optimize something that is per se invalid in C++.
>>
>> This is a C test, not C++.  (I don't suppose you are actually
>> saying that only the common subset between C and C++ is worth
>> optimizing.)
>>
>> Just in case it isn't clear from the above: the point of
>> the test exercising the behavior for overlong strings isn't
>> optimizing undefined behavior but rather avoiding the worst
>> consequences of it.  I have already explained this once
>> before so I'm starting to wonder if I'm being insufficiently
>> clear or if you are not receiving or reading (or understanding)
>> my responses.  We can have a broader discussion about whether
>> this is the best approach for GCC to take either in this instance
>> or in general, but in the meantime I would appreciate it if you
>> refrained from undoing my changes just because you don't agree
>> with or don't understand the motivation behind them.
>>
>> Martin
>>
>> PS I continue to wonder about your motivation and ethics.  It's
>> rare to have someone so openly, blatantly and persistently try
>> to undermine someone else's work.
>
> Martin, you are clearly the one being hostile here - Bernd is trying
> to help.  In fact his patches are more focused, easier to undestand
> and thus easier to review.

As I explained, it's unhelpful for two people to making changes
to the same code at the same time, especially when one is undoing
the other one's changes.  I have made it clear that I am working
in this area -- I welcome and address valid feedback but I can't
very well do that while someone is compromising my work.

I appreciate test cases and suggestions for improvements but
please avoid making changes to this code while I'm working on
it.  It is not helpful.

Martin
Bernd Edlinger July 30, 2018, 3:24 p.m. UTC | #4
On 07/30/18 01:05, Martin Sebor wrote:
> On 07/29/2018 04:56 AM, Bernd Edlinger wrote:
>> Hi!
>>
>> This fixes two wrong code bugs where string_constant
>> returns over length string constants.  Initializers
>> like that are rejected in C++, but valid in C.
> 
> If by valid you are referring to declarations like the one in
> the added test:
> 
>      const char a[2][3] = { "1234", "xyz" };
> 
> then (as I explained), the excess elements in "1234" make
> the char[3] initialization and thus the test case undefined.
> I have resolved bug 86711 as invalid on those grounds.
> 
> Bug 86711 has a valid test case that needs to be fixed, along
> with bug 86688 that I raised for the same underlying problem:
> considering the excess nul as part of the string.  As has been
> discussed in a separate bug, rather than working around
> the excessively long strings in the middle-end, it would be
> preferable to avoid creating them to begin with.
> 
> I'm already working on a fix for bug 86688, in part because
> I introduced the code change and also because I'm making other
> changes in this area -- bug 86552.  Both of these in response
> to your comments.
> 

Sorry, I must admit, I have completely lost track on how many things
you are trying to work in parallel.

Nevertheless I started to review you pr86552 patch here:
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html

But so far you did not respond to me.

Well actually I doubt your patch does apply to trunk,
maybe you start to re-base that one, and post it again
instead?


Thanks
Bernd.
Martin Sebor July 30, 2018, 7:52 p.m. UTC | #5
On 07/30/2018 09:24 AM, Bernd Edlinger wrote:
> On 07/30/18 01:05, Martin Sebor wrote:
>> On 07/29/2018 04:56 AM, Bernd Edlinger wrote:
>>> Hi!
>>>
>>> This fixes two wrong code bugs where string_constant
>>> returns over length string constants.  Initializers
>>> like that are rejected in C++, but valid in C.
>>
>> If by valid you are referring to declarations like the one in
>> the added test:
>>
>>      const char a[2][3] = { "1234", "xyz" };
>>
>> then (as I explained), the excess elements in "1234" make
>> the char[3] initialization and thus the test case undefined.
>> I have resolved bug 86711 as invalid on those grounds.
>>
>> Bug 86711 has a valid test case that needs to be fixed, along
>> with bug 86688 that I raised for the same underlying problem:
>> considering the excess nul as part of the string.  As has been
>> discussed in a separate bug, rather than working around
>> the excessively long strings in the middle-end, it would be
>> preferable to avoid creating them to begin with.
>>
>> I'm already working on a fix for bug 86688, in part because
>> I introduced the code change and also because I'm making other
>> changes in this area -- bug 86552.  Both of these in response
>> to your comments.
>>
>
> Sorry, I must admit, I have completely lost track on how many things
> you are trying to work in parallel.
>
> Nevertheless I started to review you pr86552 patch here:
> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html
>
> But so far you did not respond to me.
>
> Well actually I doubt your patch does apply to trunk,
> maybe you start to re-base that one, and post it again
> instead?

I read your comments and have been busy working on enhancing
the patch (among other things).  There are a large number of
additional contexts where constant strings are expected and
where a missing nul needs to be detected.  Some include
additional instances of strlen calls that my initial patch
didn't handle, many more others that involve other string
functions.  I have posted an updated patch that applies
cleanly and that handles the first set.

There is also a class of problems involving constant character
arrays initialized by a braced list, as in char [] = { x, y, z };
Those are currently not recognized as strings even if they are
nul-terminated, but they are far more likely to be a source of
these problems than string literals, certainly in C++ where
string initializers must fit in the array.  I am testing a patch
to convert those into STRING_CST so they can be handled as well.

Since initializing arrays with more elements than fit is
undefined in C and since the behavior is undefined at compile
time it seems to me that rejecting such initializers with
a hard error (as opposed to a warning) would be appropriate
and obviate having to deal with them in the middle-end.

Martin
Bernd Edlinger July 30, 2018, 8:21 p.m. UTC | #6
On 07/30/18 21:52, Martin Sebor wrote:
> On 07/30/2018 09:24 AM, Bernd Edlinger wrote:
>> On 07/30/18 01:05, Martin Sebor wrote:
>>> On 07/29/2018 04:56 AM, Bernd Edlinger wrote:
>>>> Hi!
>>>>
>>>> This fixes two wrong code bugs where string_constant
>>>> returns over length string constants.  Initializers
>>>> like that are rejected in C++, but valid in C.
>>>
>>> If by valid you are referring to declarations like the one in
>>> the added test:
>>>
>>>      const char a[2][3] = { "1234", "xyz" };
>>>
>>> then (as I explained), the excess elements in "1234" make
>>> the char[3] initialization and thus the test case undefined.
>>> I have resolved bug 86711 as invalid on those grounds.
>>>
>>> Bug 86711 has a valid test case that needs to be fixed, along
>>> with bug 86688 that I raised for the same underlying problem:
>>> considering the excess nul as part of the string.  As has been
>>> discussed in a separate bug, rather than working around
>>> the excessively long strings in the middle-end, it would be
>>> preferable to avoid creating them to begin with.
>>>
>>> I'm already working on a fix for bug 86688, in part because
>>> I introduced the code change and also because I'm making other
>>> changes in this area -- bug 86552.  Both of these in response
>>> to your comments.
>>>
>>
>> Sorry, I must admit, I have completely lost track on how many things
>> you are trying to work in parallel.
>>
>> Nevertheless I started to review you pr86552 patch here:
>> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html
>>
>> But so far you did not respond to me.
>>
>> Well actually I doubt your patch does apply to trunk,
>> maybe you start to re-base that one, and post it again
>> instead?
> 
> I read your comments and have been busy working on enhancing
> the patch (among other things).  There are a large number of
> additional contexts where constant strings are expected and
> where a missing nul needs to be detected.  Some include
> additional instances of strlen calls that my initial patch
> didn't handle, many more others that involve other string
> functions.  I have posted an updated patch that applies
> cleanly and that handles the first set.
> 
> There is also a class of problems involving constant character
> arrays initialized by a braced list, as in char [] = { x, y, z };
> Those are currently not recognized as strings even if they are
> nul-terminated, but they are far more likely to be a source of
> these problems than string literals, certainly in C++ where
> string initializers must fit in the array.  I am testing a patch
> to convert those into STRING_CST so they can be handled as well.
> 
> Since initializing arrays with more elements than fit is
> undefined in C and since the behavior is undefined at compile
> time it seems to me that rejecting such initializers with
> a hard error (as opposed to a warning) would be appropriate
> and obviate having to deal with them in the middle-end.
> 

We do not want to change what is currently accepted by the
front end. period.

But there is no reason why ambiguous string constants
have to be passed to the middle end.

For instance char c[2] = "a\0"; should look like c[1] = "a";
while c[2] = "aaa"; should look like c[2] = "aa"; varasm.c
will cut the excess precision off anyway.

That is TREE_STRING_LENGTH (str) == 3 and TREE_STRING_POINTER(str) = "aa\0";

I propose to have all STRING_CST always be created by the
FE with explicit nul termination, but the
TYPE_SIZE_UNIT (TREE_TYPE (str)) >= TREE_STRING_LENGTH(str) in normal case (null-terminated)
TREE_SIZE_UNIT (TREE_TYPE (str)) < TREE_STRING_LENGTH(str) if non zero terminated,
truncated in the initializer.

Do you understand what I mean?


Bernd.
Martin Sebor July 31, 2018, 4:01 a.m. UTC | #7
On 07/30/2018 02:21 PM, Bernd Edlinger wrote:
> On 07/30/18 21:52, Martin Sebor wrote:
>> On 07/30/2018 09:24 AM, Bernd Edlinger wrote:
>>> On 07/30/18 01:05, Martin Sebor wrote:
>>>> On 07/29/2018 04:56 AM, Bernd Edlinger wrote:
>>>>> Hi!
>>>>>
>>>>> This fixes two wrong code bugs where string_constant
>>>>> returns over length string constants.  Initializers
>>>>> like that are rejected in C++, but valid in C.
>>>>
>>>> If by valid you are referring to declarations like the one in
>>>> the added test:
>>>>
>>>>      const char a[2][3] = { "1234", "xyz" };
>>>>
>>>> then (as I explained), the excess elements in "1234" make
>>>> the char[3] initialization and thus the test case undefined.
>>>> I have resolved bug 86711 as invalid on those grounds.
>>>>
>>>> Bug 86711 has a valid test case that needs to be fixed, along
>>>> with bug 86688 that I raised for the same underlying problem:
>>>> considering the excess nul as part of the string.  As has been
>>>> discussed in a separate bug, rather than working around
>>>> the excessively long strings in the middle-end, it would be
>>>> preferable to avoid creating them to begin with.
>>>>
>>>> I'm already working on a fix for bug 86688, in part because
>>>> I introduced the code change and also because I'm making other
>>>> changes in this area -- bug 86552.  Both of these in response
>>>> to your comments.
>>>>
>>>
>>> Sorry, I must admit, I have completely lost track on how many things
>>> you are trying to work in parallel.
>>>
>>> Nevertheless I started to review you pr86552 patch here:
>>> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html
>>>
>>> But so far you did not respond to me.
>>>
>>> Well actually I doubt your patch does apply to trunk,
>>> maybe you start to re-base that one, and post it again
>>> instead?
>>
>> I read your comments and have been busy working on enhancing
>> the patch (among other things).  There are a large number of
>> additional contexts where constant strings are expected and
>> where a missing nul needs to be detected.  Some include
>> additional instances of strlen calls that my initial patch
>> didn't handle, many more others that involve other string
>> functions.  I have posted an updated patch that applies
>> cleanly and that handles the first set.
>>
>> There is also a class of problems involving constant character
>> arrays initialized by a braced list, as in char [] = { x, y, z };
>> Those are currently not recognized as strings even if they are
>> nul-terminated, but they are far more likely to be a source of
>> these problems than string literals, certainly in C++ where
>> string initializers must fit in the array.  I am testing a patch
>> to convert those into STRING_CST so they can be handled as well.
>>
>> Since initializing arrays with more elements than fit is
>> undefined in C and since the behavior is undefined at compile
>> time it seems to me that rejecting such initializers with
>> a hard error (as opposed to a warning) would be appropriate
>> and obviate having to deal with them in the middle-end.
>>
>
> We do not want to change what is currently accepted by the
> front end. period.

On whose behalf are you making such categorical statements?
It was Jakub and Richard's suggestion in bug 86714 to reject
the undefined excessive initializers and I happen to like
the idea.  I don't recall anyone making a decision about what
"we" do or don't want to change.

That said, if rejecting such initializers is not acceptable
an alternate solution that I believe Richard preferred is to
trim excess elements early on, e.g., during gimplification
(or at some other point after the front-end is done).  That's
okay with me too.

>
> But there is no reason why ambiguous string constants
> have to be passed to the middle end.
>
> For instance char c[2] = "a\0"; should look like c[1] = "a";
> while c[2] = "aaa"; should look like c[2] = "aa"; varasm.c
> will cut the excess precision off anyway.
>
> That is TREE_STRING_LENGTH (str) == 3 and TREE_STRING_POINTER(str) = "aa\0";
>
> I propose to have all STRING_CST always be created by the
> FE with explicit nul termination, but the
> TYPE_SIZE_UNIT (TREE_TYPE (str)) >= TREE_STRING_LENGTH(str) in normal case (null-terminated)
> TREE_SIZE_UNIT (TREE_TYPE (str)) < TREE_STRING_LENGTH(str) if non zero terminated,
> truncated in the initializer.
>
> Do you understand what I mean?

I don't insist on any particular internal representation as long
as it makes it possible to detect and diagnose common bugs, and
(ideally) also help mitigate their worst consequences.

Martin
Jakub Jelinek July 31, 2018, 6:33 a.m. UTC | #8
On Mon, Jul 30, 2018 at 10:01:38PM -0600, Martin Sebor wrote:
> > We do not want to change what is currently accepted by the
> > front end. period.
> 
> On whose behalf are you making such categorical statements?
> It was Jakub and Richard's suggestion in bug 86714 to reject
> the undefined excessive initializers and I happen to like
> the idea.  I don't recall anyone making a decision about what

It was not my suggestion and it is already rejected with -pedantic-errors,
which is all that is needed, reject it in pedantic mode.
Otherwise there is a warning emitted by default which is enough.

The suggestion was that if we don't reject (and no reason to change when we
reject it), that we handle it right, which is what your optimization broke.

	Jakub
Jeff Law Aug. 3, 2018, 9:15 p.m. UTC | #9
On 07/30/2018 02:21 PM, Bernd Edlinger wrote:
> On 07/30/18 21:52, Martin Sebor wrote:
>> On 07/30/2018 09:24 AM, Bernd Edlinger wrote:
>>> On 07/30/18 01:05, Martin Sebor wrote:
>>>> On 07/29/2018 04:56 AM, Bernd Edlinger wrote:
>>>>> Hi!
>>>>>
>>>>> This fixes two wrong code bugs where string_constant
>>>>> returns over length string constants.  Initializers
>>>>> like that are rejected in C++, but valid in C.
>>>>
>>>> If by valid you are referring to declarations like the one in
>>>> the added test:
>>>>
>>>>      const char a[2][3] = { "1234", "xyz" };
>>>>
>>>> then (as I explained), the excess elements in "1234" make
>>>> the char[3] initialization and thus the test case undefined.
>>>> I have resolved bug 86711 as invalid on those grounds.
>>>>
>>>> Bug 86711 has a valid test case that needs to be fixed, along
>>>> with bug 86688 that I raised for the same underlying problem:
>>>> considering the excess nul as part of the string.  As has been
>>>> discussed in a separate bug, rather than working around
>>>> the excessively long strings in the middle-end, it would be
>>>> preferable to avoid creating them to begin with.
>>>>
>>>> I'm already working on a fix for bug 86688, in part because
>>>> I introduced the code change and also because I'm making other
>>>> changes in this area -- bug 86552.  Both of these in response
>>>> to your comments.
>>>>
>>>
>>> Sorry, I must admit, I have completely lost track on how many things
>>> you are trying to work in parallel.
>>>
>>> Nevertheless I started to review you pr86552 patch here:
>>> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html
>>>
>>> But so far you did not respond to me.
>>>
>>> Well actually I doubt your patch does apply to trunk,
>>> maybe you start to re-base that one, and post it again
>>> instead?
>>
>> I read your comments and have been busy working on enhancing
>> the patch (among other things).  There are a large number of
>> additional contexts where constant strings are expected and
>> where a missing nul needs to be detected.  Some include
>> additional instances of strlen calls that my initial patch
>> didn't handle, many more others that involve other string
>> functions.  I have posted an updated patch that applies
>> cleanly and that handles the first set.
>>
>> There is also a class of problems involving constant character
>> arrays initialized by a braced list, as in char [] = { x, y, z };
>> Those are currently not recognized as strings even if they are
>> nul-terminated, but they are far more likely to be a source of
>> these problems than string literals, certainly in C++ where
>> string initializers must fit in the array.  I am testing a patch
>> to convert those into STRING_CST so they can be handled as well.
>>
>> Since initializing arrays with more elements than fit is
>> undefined in C and since the behavior is undefined at compile
>> time it seems to me that rejecting such initializers with
>> a hard error (as opposed to a warning) would be appropriate
>> and obviate having to deal with them in the middle-end.
>>
> 
> We do not want to change what is currently accepted by the
> front end. period.
?!?  I don't follow this.  When we can detect an error, we should issue
a suitable diagnostic.  As is mentioned later in the thread, some
diagnostics are considered "pedantic" and are conditional.  But I don't
see how you get to "We do not want to change what is currently accepted
by the front end. period."  That seems like far too strong of a statement.

I'm not sure I agree with this being a pedantic error only.  See below...
> 
> But there is no reason why ambiguous string constants
> have to be passed to the middle end.
Agreed -- if we're not outright rejecting, then we should present the
middle end with something reasonable.   But....

> 
> For instance char c[2] = "a\0"; should look like c[1] = "a";
> while c[2] = "aaa"; should look like c[2] = "aa"; varasm.c
> will cut the excess precision off anyway.
I get the first case.  The second is less clear.  One could argue that
c[1] should be NUL or one could argue that c[1] should be 'a'.   It's
the inability to know what is "more correct" and the security
implications of leaving the string unterminated that drives my opinion
that this should not be a pedantic error, but instead a hard error.

> 
> That is TREE_STRING_LENGTH (str) == 3 and TREE_STRING_POINTER(str) = "aa\0";
See above.  That's going to leave an unterminated string in the
resulting code.  That has a negative security impact.

> 
> I propose to have all STRING_CST always be created by the
> FE with explicit nul termination, but the
> TYPE_SIZE_UNIT (TREE_TYPE (str)) >= TREE_STRING_LENGTH(str) in normal case (null-terminated)
> TREE_SIZE_UNIT (TREE_TYPE (str)) < TREE_STRING_LENGTH(str) if non zero terminated,
> truncated in the initializer.
> 
> Do you understand what I mean?
I think you're starting from the wrong basis, namely that we shouldn't
be issuing a hard error for excess initializers like this when we don't
have a clear cut best way to handle it.

jeff
Jeff Law Aug. 3, 2018, 9:16 p.m. UTC | #10
On 07/31/2018 12:33 AM, Jakub Jelinek wrote:
> On Mon, Jul 30, 2018 at 10:01:38PM -0600, Martin Sebor wrote:
>>> We do not want to change what is currently accepted by the
>>> front end. period.
>>
>> On whose behalf are you making such categorical statements?
>> It was Jakub and Richard's suggestion in bug 86714 to reject
>> the undefined excessive initializers and I happen to like
>> the idea.  I don't recall anyone making a decision about what
> 
> It was not my suggestion and it is already rejected with -pedantic-errors,
> which is all that is needed, reject it in pedantic mode.
> Otherwise there is a warning emitted by default which is enough.
But I think that's a mistake.  I think a hard error at this point is
warranted and the safest thing to do.

> 
> The suggestion was that if we don't reject (and no reason to change when we
> reject it), that we handle it right, which is what your optimization broke.
But it's not always clear what is right.  That's my concern.  If we had
rules which were clearly right, then applying those rules and continuing
is much more reasonable.
jeff
Jeff Law Aug. 3, 2018, 9:21 p.m. UTC | #11
On 07/30/2018 01:52 PM, Martin Sebor wrote:
> On 07/30/2018 09:24 AM, Bernd Edlinger wrote:
>> On 07/30/18 01:05, Martin Sebor wrote:
>>> On 07/29/2018 04:56 AM, Bernd Edlinger wrote:
>>>> Hi!
>>>>
>>>> This fixes two wrong code bugs where string_constant
>>>> returns over length string constants.  Initializers
>>>> like that are rejected in C++, but valid in C.
>>>
>>> If by valid you are referring to declarations like the one in
>>> the added test:
>>>
>>>      const char a[2][3] = { "1234", "xyz" };
>>>
>>> then (as I explained), the excess elements in "1234" make
>>> the char[3] initialization and thus the test case undefined.
>>> I have resolved bug 86711 as invalid on those grounds.
>>>
>>> Bug 86711 has a valid test case that needs to be fixed, along
>>> with bug 86688 that I raised for the same underlying problem:
>>> considering the excess nul as part of the string.  As has been
>>> discussed in a separate bug, rather than working around
>>> the excessively long strings in the middle-end, it would be
>>> preferable to avoid creating them to begin with.
>>>
>>> I'm already working on a fix for bug 86688, in part because
>>> I introduced the code change and also because I'm making other
>>> changes in this area -- bug 86552.  Both of these in response
>>> to your comments.
>>>
>>
>> Sorry, I must admit, I have completely lost track on how many things
>> you are trying to work in parallel.
>>
>> Nevertheless I started to review you pr86552 patch here:
>> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html
>>
>> But so far you did not respond to me.
>>
>> Well actually I doubt your patch does apply to trunk,
>> maybe you start to re-base that one, and post it again
>> instead?
> 
> I read your comments and have been busy working on enhancing
> the patch (among other things).  There are a large number of
> additional contexts where constant strings are expected and
> where a missing nul needs to be detected.  Some include
> additional instances of strlen calls that my initial patch
> didn't handle, many more others that involve other string
> functions.  I have posted an updated patch that applies
> cleanly and that handles the first set.
So without seeing the code my worry here is we end up with a patch that
gets increasingly complex because it's trying to handle a large number
of cases.

If at all possible let's try to make incremental improvements, focusing
first on correctness issues.


> 
> There is also a class of problems involving constant character
> arrays initialized by a braced list, as in char [] = { x, y, z };
> Those are currently not recognized as strings even if they are
> nul-terminated, but they are far more likely to be a source of
> these problems than string literals, certainly in C++ where
> string initializers must fit in the array.  I am testing a patch
> to convert those into STRING_CST so they can be handled as well.
This feels like an independent, but very useful change.

> Since initializing arrays with more elements than fit is
> undefined in C and since the behavior is undefined at compile
> time it seems to me that rejecting such initializers with
> a hard error (as opposed to a warning) would be appropriate
> and obviate having to deal with them in the middle-end.
I tend to agree when there's no good set of rules we can apply to allow
compilation to continue.  However, I think that means getting some
consensus to change existing practice where this is just a pedantic error.

jeff
Jakub Jelinek Aug. 3, 2018, 9:28 p.m. UTC | #12
On Fri, Aug 03, 2018 at 03:16:41PM -0600, Jeff Law wrote:
> On 07/31/2018 12:33 AM, Jakub Jelinek wrote:
> > On Mon, Jul 30, 2018 at 10:01:38PM -0600, Martin Sebor wrote:
> >>> We do not want to change what is currently accepted by the
> >>> front end. period.
> >>
> >> On whose behalf are you making such categorical statements?
> >> It was Jakub and Richard's suggestion in bug 86714 to reject
> >> the undefined excessive initializers and I happen to like
> >> the idea.  I don't recall anyone making a decision about what
> > 
> > It was not my suggestion and it is already rejected with -pedantic-errors,
> > which is all that is needed, reject it in pedantic mode.
> > Otherwise there is a warning emitted by default which is enough.
> But I think that's a mistake.  I think a hard error at this point is
> warranted and the safest thing to do.

The normal behavior when it isn't an error is that the initializer is
truncated.  That is what happens if I use
struct S { char a[4]; };
struct S r = { "abc" };
struct S s = { "abcd" };
struct S t = { "abcde" };

C says that in the s.a initializer is actually just 'a', 'b',
'c', 'd' rather than 'a', 'b', 'c', 'd', '\0', so there is a silent
truncation in the language naturally, so the extension that truncates even
more with a warning enabled by default is IMHO natural and doesn't need to
be more pedantic.  We've always truncated, so there should be no surprises.

> > The suggestion was that if we don't reject (and no reason to change when we
> > reject it), that we handle it right, which is what your optimization broke.
> But it's not always clear what is right.  That's my concern.  If we had
> rules which were clearly right, then applying those rules and continuing
> is much more reasonable.

Perhaps we haven't written them down, but we've always behaved that way.
clang also truncates with a warning enabled by default, only rejects it like
gcc with -pedantic-errors.  So does icc.

	Jakub
Bernd Edlinger Aug. 3, 2018, 9:38 p.m. UTC | #13
On 08/03/18 23:15, Jeff Law wrote:
> On 07/30/2018 02:21 PM, Bernd Edlinger wrote:
>> On 07/30/18 21:52, Martin Sebor wrote:
>>> On 07/30/2018 09:24 AM, Bernd Edlinger wrote:
>>>> On 07/30/18 01:05, Martin Sebor wrote:
>>>>> On 07/29/2018 04:56 AM, Bernd Edlinger wrote:
>>>>>> Hi!
>>>>>>
>>>>>> This fixes two wrong code bugs where string_constant
>>>>>> returns over length string constants.  Initializers
>>>>>> like that are rejected in C++, but valid in C.
>>>>>
>>>>> If by valid you are referring to declarations like the one in
>>>>> the added test:
>>>>>
>>>>>       const char a[2][3] = { "1234", "xyz" };
>>>>>
>>>>> then (as I explained), the excess elements in "1234" make
>>>>> the char[3] initialization and thus the test case undefined.
>>>>> I have resolved bug 86711 as invalid on those grounds.
>>>>>
>>>>> Bug 86711 has a valid test case that needs to be fixed, along
>>>>> with bug 86688 that I raised for the same underlying problem:
>>>>> considering the excess nul as part of the string.  As has been
>>>>> discussed in a separate bug, rather than working around
>>>>> the excessively long strings in the middle-end, it would be
>>>>> preferable to avoid creating them to begin with.
>>>>>
>>>>> I'm already working on a fix for bug 86688, in part because
>>>>> I introduced the code change and also because I'm making other
>>>>> changes in this area -- bug 86552.  Both of these in response
>>>>> to your comments.
>>>>>
>>>>
>>>> Sorry, I must admit, I have completely lost track on how many things
>>>> you are trying to work in parallel.
>>>>
>>>> Nevertheless I started to review you pr86552 patch here:
>>>> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html
>>>>
>>>> But so far you did not respond to me.
>>>>
>>>> Well actually I doubt your patch does apply to trunk,
>>>> maybe you start to re-base that one, and post it again
>>>> instead?
>>>
>>> I read your comments and have been busy working on enhancing
>>> the patch (among other things).  There are a large number of
>>> additional contexts where constant strings are expected and
>>> where a missing nul needs to be detected.  Some include
>>> additional instances of strlen calls that my initial patch
>>> didn't handle, many more others that involve other string
>>> functions.  I have posted an updated patch that applies
>>> cleanly and that handles the first set.
>>>
>>> There is also a class of problems involving constant character
>>> arrays initialized by a braced list, as in char [] = { x, y, z };
>>> Those are currently not recognized as strings even if they are
>>> nul-terminated, but they are far more likely to be a source of
>>> these problems than string literals, certainly in C++ where
>>> string initializers must fit in the array.  I am testing a patch
>>> to convert those into STRING_CST so they can be handled as well.
>>>
>>> Since initializing arrays with more elements than fit is
>>> undefined in C and since the behavior is undefined at compile
>>> time it seems to me that rejecting such initializers with
>>> a hard error (as opposed to a warning) would be appropriate
>>> and obviate having to deal with them in the middle-end.
>>>
>>
>> We do not want to change what is currently accepted by the
>> front end. period.
> ?!?  I don't follow this.  When we can detect an error, we should issue
> a suitable diagnostic.  As is mentioned later in the thread, some
> diagnostics are considered "pedantic" and are conditional.  But I don't
> see how you get to "We do not want to change what is currently accepted
> by the front end. period."  That seems like far too strong of a statement.
> 

What I mean here is: we should fix a middle-end consistency issue with
the string constants.  When that is done, we can decide to make
change a pedantic error into a hard error, but IMHO it should be an independent
patch of it's own.

By the way, I found that Fortran has non-zero terminated strings with
index range starting from 1.
Then there are also overlength strings in Fortran, that have excess precision.
Should we forbid that Fortran feature as well?

Then Ada and Go do not have zero-terminated strings, but I do not know
if these can hit the strlen pass.
C++ has a -fpermissive option that also allows overlength strings,
but all test cases are C only. etc. etc.

> I'm not sure I agree with this being a pedantic error only.  See below...
>>
>> But there is no reason why ambiguous string constants
>> have to be passed to the middle end.
> Agreed -- if we're not outright rejecting, then we should present the
> middle end with something reasonable.   But....
> 
>>
>> For instance char c[2] = "a\0"; should look like c[1] = "a";
>> while c[2] = "aaa"; should look like c[2] = "aa"; varasm.c
>> will cut the excess precision off anyway.
> I get the first case.  The second is less clear.  One could argue that
> c[1] should be NUL or one could argue that c[1] should be 'a'.   It's
> the inability to know what is "more correct" and the security
> implications of leaving the string unterminated that drives my opinion
> that this should not be a pedantic error, but instead a hard error.
> 

Yes, in the moment I just look at how C works, and try to make that
the normal.

However I think those details can be discussed.


>>
>> That is TREE_STRING_LENGTH (str) == 3 and TREE_STRING_POINTER(str) = "aa\0";
> See above.  That's going to leave an unterminated string in the
> resulting code.  That has a negative security impact.
> 
>>
>> I propose to have all STRING_CST always be created by the
>> FE with explicit nul termination, but the
>> TYPE_SIZE_UNIT (TREE_TYPE (str)) >= TREE_STRING_LENGTH(str) in normal case (null-terminated)
>> TREE_SIZE_UNIT (TREE_TYPE (str)) < TREE_STRING_LENGTH(str) if non zero terminated,
>> truncated in the initializer.
>>
>> Do you understand what I mean?
> I think you're starting from the wrong basis, namely that we shouldn't
> be issuing a hard error for excess initializers like this when we don't
> have a clear cut best way to handle it.
> 
> jeff
>
Bernd Edlinger Aug. 12, 2018, 9:06 a.m. UTC | #14
Hi,

I'd like to ping for this patch: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01800.html

I will add a new BZ entry for the (minor) regression this patch
introduces in gcc.dg/strlenopt-49.c and assign it to myself.


Thanks
Bernd.

On 07/29/18 12:56, Bernd Edlinger wrote:
> Hi!
> 
> This fixes two wrong code bugs where string_constant
> returns over length string constants.  Initializers
> like that are rejected in C++, but valid in C.
> 
> I have xfailed strlenopt-49.c, which tests this feature.
> Personally I don't think that it is worth the effort to
> optimize something that is per se invalid in C++.
> 
> The hunk in builtins.c is unrelated, but I would like
> to use tree_to_shwi here, to be in line with the
> tree_fits_shwi_p check above, and the rest of that
> function which uses signed HWI throughout.
> 
> 
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?
> 
> 
> Thanks
> Bernd.
Martin Sebor Aug. 13, 2018, 2:27 p.m. UTC | #15
As I said below, the patch for PR 86552, 86711, 86714 that
was first posted on July 19 fixes both of these issues and
also diagnoses calls with unterminated strings:

   https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00155.html

On 08/12/2018 03:06 AM, Bernd Edlinger wrote:
> Hi,
>
> I'd like to ping for this patch: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01800.html
>
> I will add a new BZ entry for the (minor) regression this patch
> introduces in gcc.dg/strlenopt-49.c and assign it to myself.
>
>
> Thanks
> Bernd.
>
> On 07/29/18 12:56, Bernd Edlinger wrote:
>> Hi!
>>
>> This fixes two wrong code bugs where string_constant
>> returns over length string constants.  Initializers
>> like that are rejected in C++, but valid in C.
>>
>> I have xfailed strlenopt-49.c, which tests this feature.
>> Personally I don't think that it is worth the effort to
>> optimize something that is per se invalid in C++.
>>
>> The hunk in builtins.c is unrelated, but I would like
>> to use tree_to_shwi here, to be in line with the
>> tree_fits_shwi_p check above, and the rest of that
>> function which uses signed HWI throughout.
>>
>>
>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>> Is it OK for trunk?
>>
>>
>> Thanks
>> Bernd.
Bernd Edlinger Aug. 14, 2018, 10:29 a.m. UTC | #16
On 08/13/18 16:27, Martin Sebor wrote:
> As I said below, the patch for PR 86552, 86711, 86714 that
> was first posted on July 19 fixes both of these issues and
> also diagnoses calls with unterminated strings:
> 
>    https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00155.html
> 

Sorry, but you you read my comment on your patch here:

https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00184.html

????

So I am strictly opposed to implementing new warnings now,
which just papers over existing design issues in this area.

I think the most important problem in the strlen folding is that
c_strlen and get_range_strlen do not have the the information
what kind of string length they should return,
(as measured in char/wchar16/wchar32)
Most of the time only length in char makes any sense.

Only for sprintf processing a different char type may be
requested, but the sprintf code in the middle end does not
know which is the wchar type.

I have sent a followup patch that passes char size units
to c_strlen and get_range_strlen here:
https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00795.html

I hope you understand, that I would like to fix design issues
before implementing new features, since that will obviously
become more difficult when even more features are being
implemented without fixing some basics first.


Bernd.
Jeff Law Aug. 15, 2018, 5:37 a.m. UTC | #17
On 08/03/2018 03:28 PM, Jakub Jelinek wrote:
> On Fri, Aug 03, 2018 at 03:16:41PM -0600, Jeff Law wrote:
>> On 07/31/2018 12:33 AM, Jakub Jelinek wrote:
>>> On Mon, Jul 30, 2018 at 10:01:38PM -0600, Martin Sebor wrote:
>>>>> We do not want to change what is currently accepted by the
>>>>> front end. period.
>>>>
>>>> On whose behalf are you making such categorical statements?
>>>> It was Jakub and Richard's suggestion in bug 86714 to reject
>>>> the undefined excessive initializers and I happen to like
>>>> the idea.  I don't recall anyone making a decision about what
>>>
>>> It was not my suggestion and it is already rejected with -pedantic-errors,
>>> which is all that is needed, reject it in pedantic mode.
>>> Otherwise there is a warning emitted by default which is enough.
>> But I think that's a mistake.  I think a hard error at this point is
>> warranted and the safest thing to do.
> 
> The normal behavior when it isn't an error is that the initializer is
> truncated.  That is what happens if I use
> struct S { char a[4]; };
> struct S r = { "abc" };
> struct S s = { "abcd" };
> struct S t = { "abcde" };
> 
> C says that in the s.a initializer is actually just 'a', 'b',
> 'c', 'd' rather than 'a', 'b', 'c', 'd', '\0', so there is a silent
> truncation in the language naturally, so the extension that truncates even
> more with a warning enabled by default is IMHO natural and doesn't need to
> be more pedantic.  We've always truncated, so there should be no surprises.
I wasn't aware C mandated a behavior here.  With that in mind I think we
truncate per the C semantics and get on with our lives :-)

Jeff
Jeff Law Aug. 15, 2018, 5:47 a.m. UTC | #18
On 08/03/2018 03:38 PM, Bernd Edlinger wrote:
> On 08/03/18 23:15, Jeff Law wrote:
>> On 07/30/2018 02:21 PM, Bernd Edlinger wrote:
>>> On 07/30/18 21:52, Martin Sebor wrote:
>>>> On 07/30/2018 09:24 AM, Bernd Edlinger wrote:
>>>>> On 07/30/18 01:05, Martin Sebor wrote:
>>>>>> On 07/29/2018 04:56 AM, Bernd Edlinger wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> This fixes two wrong code bugs where string_constant
>>>>>>> returns over length string constants.  Initializers
>>>>>>> like that are rejected in C++, but valid in C.
>>>>>>
>>>>>> If by valid you are referring to declarations like the one in
>>>>>> the added test:
>>>>>>
>>>>>>       const char a[2][3] = { "1234", "xyz" };
>>>>>>
>>>>>> then (as I explained), the excess elements in "1234" make
>>>>>> the char[3] initialization and thus the test case undefined.
>>>>>> I have resolved bug 86711 as invalid on those grounds.
>>>>>>
>>>>>> Bug 86711 has a valid test case that needs to be fixed, along
>>>>>> with bug 86688 that I raised for the same underlying problem:
>>>>>> considering the excess nul as part of the string.  As has been
>>>>>> discussed in a separate bug, rather than working around
>>>>>> the excessively long strings in the middle-end, it would be
>>>>>> preferable to avoid creating them to begin with.
>>>>>>
>>>>>> I'm already working on a fix for bug 86688, in part because
>>>>>> I introduced the code change and also because I'm making other
>>>>>> changes in this area -- bug 86552.  Both of these in response
>>>>>> to your comments.
>>>>>>
>>>>>
>>>>> Sorry, I must admit, I have completely lost track on how many things
>>>>> you are trying to work in parallel.
>>>>>
>>>>> Nevertheless I started to review you pr86552 patch here:
>>>>> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html
>>>>>
>>>>> But so far you did not respond to me.
>>>>>
>>>>> Well actually I doubt your patch does apply to trunk,
>>>>> maybe you start to re-base that one, and post it again
>>>>> instead?
>>>>
>>>> I read your comments and have been busy working on enhancing
>>>> the patch (among other things).  There are a large number of
>>>> additional contexts where constant strings are expected and
>>>> where a missing nul needs to be detected.  Some include
>>>> additional instances of strlen calls that my initial patch
>>>> didn't handle, many more others that involve other string
>>>> functions.  I have posted an updated patch that applies
>>>> cleanly and that handles the first set.
>>>>
>>>> There is also a class of problems involving constant character
>>>> arrays initialized by a braced list, as in char [] = { x, y, z };
>>>> Those are currently not recognized as strings even if they are
>>>> nul-terminated, but they are far more likely to be a source of
>>>> these problems than string literals, certainly in C++ where
>>>> string initializers must fit in the array.  I am testing a patch
>>>> to convert those into STRING_CST so they can be handled as well.
>>>>
>>>> Since initializing arrays with more elements than fit is
>>>> undefined in C and since the behavior is undefined at compile
>>>> time it seems to me that rejecting such initializers with
>>>> a hard error (as opposed to a warning) would be appropriate
>>>> and obviate having to deal with them in the middle-end.
>>>>
>>>
>>> We do not want to change what is currently accepted by the
>>> front end. period.
>> ?!?  I don't follow this.  When we can detect an error, we should issue
>> a suitable diagnostic.  As is mentioned later in the thread, some
>> diagnostics are considered "pedantic" and are conditional.  But I don't
>> see how you get to "We do not want to change what is currently accepted
>> by the front end. period."  That seems like far too strong of a statement.
>>
> 
> What I mean here is: we should fix a middle-end consistency issue with
> the string constants.  When that is done, we can decide to make
> change a pedantic error into a hard error, but IMHO it should be an independent
> patch of it's own.
Note Jakub has indicated why changing this to a hard error would be a
mistake and what the semantics ought to be in this scenario.  so ignore
what I said about turning this into a hard error.  I think we do
truncation per the defined semantics and present the middle end with
"canonicalized" literals.

> 
> By the way, I found that Fortran has non-zero terminated strings with
> index range starting from 1.
> Then there are also overlength strings in Fortran, that have excess precision.
> Should we forbid that Fortran feature as well?
I have no clue.

> 
> Then Ada and Go do not have zero-terminated strings, but I do not know
> if these can hit the strlen pass.
I wouldn't think so, but maybe in code that's meant to interoperate with
C.  Or if they implemented a C looking strlen and GCC recognized it and
turned it into a strlen call.  I don't think we do anything like that
now, but you never know when someone might add it.

jeff
Bernd Edlinger Aug. 17, 2018, 9:14 a.m. UTC | #19
Hi!


After the other patch has been applied, I re-based this patch accordingly.

Except the mechanical changes, there are a few notable differences to the
previous version:

In string_constant, I added a similar check for the STRING_CSTs
because when callers don't use mem_size, they assume to be
able to read "TREE_STRING_LENGTH (array)" bytes, but that is
not always the case, for languages that don't always use
zero-terminated strings, for instance hollerith strings in fortran.

--- gcc/expr.c  2018-08-17 05:32:57.332211963 +0200
+++ gcc/expr.c  2018-08-16 23:08:23.544940795 +0200
@@ -11372,6 +11372,9 @@ string_constant (tree arg, tree *ptr_off
       *ptr_offset = fold_convert (sizetype, offset);
       if (mem_size)
        *mem_size = TYPE_SIZE_UNIT (TREE_TYPE (array));
+      else if (compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (array)),
+                                TREE_STRING_LENGTH (array)) < 0)
+       return NULL_TREE;
       return array;
     }


The range check in c_getstr was refined as well:

This I added, because vla arrays can be initialized with string constants,
especially since the 71625 patch was installed:
In this case we end up with mem_size that fails to be constant.


@@ -14606,25 +14603,17 @@ c_getstr (tree src, unsigned HOST_WIDE_I
        offset = tree_to_uhwi (offset_node);
     }

+  if (!tree_fits_uhwi_p (mem_size))
+    return NULL;
+
   /* STRING_LENGTH is the size of the string literal, including any
      embedded NULs.  STRING_SIZE is the size of the array the string
      literal is stored in.  */

Also the rest of the string length checks are refined, to return
actually zero-terminated single byte strings when strlen is not given,
and return something not necessarily zero-terminated which is
suitable for memxxx-functions otherwise.


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.
gcc:
2018-08-17  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	PR middle-end/86711
	PR middle-end/86714
	* expr.c (string_constant): Don't return truncated string literals.
	* fold-const.c (c_getstr): Fix function comment.  Remove unused third
	argument.  Fix range checks.
	* fold-const.c (c_getstr): Adjust protoype.

testsuite:
2018-08-17  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	PR middle-end/86711
	PR middle-end/86714
	* gcc.c-torture/execute/pr86711.c: New test.
	* gcc.c-torture/execute/pr86714.c: New test.

diff -Npur gcc/expr.c gcc/expr.c
--- gcc/expr.c	2018-08-17 05:32:57.332211963 +0200
+++ gcc/expr.c	2018-08-16 23:08:23.544940795 +0200
@@ -11372,6 +11372,9 @@ string_constant (tree arg, tree *ptr_off
       *ptr_offset = fold_convert (sizetype, offset);
       if (mem_size)
 	*mem_size = TYPE_SIZE_UNIT (TREE_TYPE (array));
+      else if (compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (array)),
+				 TREE_STRING_LENGTH (array)) < 0)
+	return NULL_TREE;
       return array;
     }
 
@@ -11414,26 +11417,10 @@ string_constant (tree arg, tree *ptr_off
   if (!init || TREE_CODE (init) != STRING_CST)
     return NULL_TREE;
 
-  tree array_size = DECL_SIZE_UNIT (array);
-  if (!array_size || TREE_CODE (array_size) != INTEGER_CST)
-    return NULL_TREE;
-
-  /* Avoid returning a string that doesn't fit in the array
-     it is stored in, like
-     const char a[4] = "abcde";
-     but do handle those that fit even if they have excess
-     initializers, such as in
-     const char a[4] = "abc\000\000";
-     The excess elements contribute to TREE_STRING_LENGTH()
-     but not to strlen().  */
-  unsigned HOST_WIDE_INT charsize
-    = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (init))));
-  unsigned HOST_WIDE_INT length = TREE_STRING_LENGTH (init);
-  length = string_length (TREE_STRING_POINTER (init), charsize,
-			  length / charsize);
   if (mem_size)
     *mem_size = TYPE_SIZE_UNIT (TREE_TYPE (init));
-  else if (compare_tree_int (array_size, length + 1) < 0)
+  else if (compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (init)),
+			     TREE_STRING_LENGTH (init)) < 0)
     return NULL_TREE;
 
   *ptr_offset = offset;
diff -Npur gcc/fold-const.c gcc/fold-const.c
--- gcc/fold-const.c	2018-07-16 08:49:39.000000000 +0200
+++ gcc/fold-const.c	2018-08-16 23:31:11.490869136 +0200
@@ -14577,23 +14577,20 @@ fold_build_pointer_plus_hwi_loc (locatio
 /* Return a pointer P to a NUL-terminated string representing the sequence
    of constant characters referred to by SRC (or a subsequence of such
    characters within it if SRC is a reference to a string plus some
-   constant offset).  If STRLEN is non-null, store stgrlen(P) in *STRLEN.
-   If STRSIZE is non-null, store in *STRSIZE the size of the array
-   the string is stored in; in that case, even though P points to a NUL
-   terminated string, SRC need not refer to one.  This can happen when
-   SRC refers to a constant character array initialized to all non-NUL
-   values, as in the C declaration: char a[4] = "1234";  */
+   constant offset).  If STRLEN is non-null, store the number of bytes
+   in the string constant including the terminating NUL char.  *STRLEN is
+   typically strlen(P) + 1 in the absence of embedded NUL characters.  */
 
 const char *
-c_getstr (tree src, unsigned HOST_WIDE_INT *strlen /* = NULL */,
-	  unsigned HOST_WIDE_INT *strsize /* = NULL */)
+c_getstr (tree src, unsigned HOST_WIDE_INT *strlen /* = NULL */)
 {
   tree offset_node;
+  tree mem_size;
 
   if (strlen)
     *strlen = 0;
 
-  src = string_constant (src, &offset_node);
+  src = string_constant (src, &offset_node, &mem_size);
   if (src == 0)
     return NULL;
 
@@ -14606,25 +14603,17 @@ c_getstr (tree src, unsigned HOST_WIDE_I
 	offset = tree_to_uhwi (offset_node);
     }
 
+  if (!tree_fits_uhwi_p (mem_size))
+    return NULL;
+
   /* STRING_LENGTH is the size of the string literal, including any
      embedded NULs.  STRING_SIZE is the size of the array the string
      literal is stored in.  */
   unsigned HOST_WIDE_INT string_length = TREE_STRING_LENGTH (src);
-  unsigned HOST_WIDE_INT string_size = string_length;
-  tree type = TREE_TYPE (src);
-  if (tree size = TYPE_SIZE_UNIT (type))
-    if (tree_fits_shwi_p (size))
-      string_size = tree_to_uhwi (size);
+  unsigned HOST_WIDE_INT string_size = tree_to_uhwi (mem_size);
 
-  if (strlen)
-    {
-      /* Compute and store the length of the substring at OFFSET.
-	 All offsets past the initial length refer to null strings.  */
-      if (offset <= string_length)
-	*strlen = string_length - offset;
-      else
-	*strlen = 0;
-    }
+  if (string_length > string_size)
+    string_length = string_size;
 
   const char *string = TREE_STRING_POINTER (src);
 
@@ -14632,21 +14621,26 @@ c_getstr (tree src, unsigned HOST_WIDE_I
       || offset >= string_size)
     return NULL;
 
-  if (strsize)
+  if (strlen)
     {
-      /* Support even constant character arrays that aren't proper
-	 NUL-terminated strings.  */
-      *strsize = string_size;
+      /* Compute and store the length of the substring at OFFSET.
+	 All offsets past the initial length refer to null strings.  */
+      if (offset < string_length)
+	*strlen = string_length - offset;
+      else
+	*strlen = 1;
     }
-  else if (string[string_length - 1] != '\0')
+  else
     {
-      /* Support only properly NUL-terminated strings but handle
-	 consecutive strings within the same array, such as the six
-	 substrings in "1\0002\0003".  */
-      return NULL;
+      tree eltype = TREE_TYPE (TREE_TYPE (src));
+      /* Support only properly NUL-terminated single byte strings.  */
+      if (tree_to_uhwi (TYPE_SIZE_UNIT (eltype)) != 1)
+	return NULL;
+      if (string[string_length - 1] != '\0')
+	return NULL;
     }
 
-  return offset <= string_length ? string + offset : "";
+  return offset < string_length ? string + offset : "";
 }
 
 /* Given a tree T, compute which bits in T may be nonzero.  */
diff -Npur gcc/fold-const.h gcc/fold-const.h
--- gcc/fold-const.h	2018-07-16 08:49:39.000000000 +0200
+++ gcc/fold-const.h	2018-08-16 22:38:49.962205027 +0200
@@ -187,8 +187,7 @@ extern bool expr_not_equal_to (tree t, c
 extern tree const_unop (enum tree_code, tree, tree);
 extern tree const_binop (enum tree_code, tree, tree, tree);
 extern bool negate_mathfn_p (combined_fn);
-extern const char *c_getstr (tree, unsigned HOST_WIDE_INT * = NULL,
-			     unsigned HOST_WIDE_INT * = NULL);
+extern const char *c_getstr (tree, unsigned HOST_WIDE_INT * = NULL);
 extern wide_int tree_nonzero_bits (const_tree);
 
 /* Return OFF converted to a pointer offset type suitable as offset for
diff -Npur gcc/testsuite/gcc.c-torture/execute/pr86711.c gcc/testsuite/gcc.c-torture/execute/pr86711.c
--- gcc/testsuite/gcc.c-torture/execute/pr86711.c	1970-01-01 01:00:00.000000000 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr86711.c	2018-08-16 22:38:49.963205014 +0200
@@ -0,0 +1,11 @@
+/* PR middle-end/86711 */
+
+static const char a[2][4] = { "1234", "5678" };
+
+int main ()
+{
+  void *p = __builtin_memchr (a, 0, 5);
+
+  if (p)
+    __builtin_abort ();
+}
diff -Npur gcc/testsuite/gcc.c-torture/execute/pr86714.c gcc/testsuite/gcc.c-torture/execute/pr86714.c
--- gcc/testsuite/gcc.c-torture/execute/pr86714.c	1970-01-01 01:00:00.000000000 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr86714.c	2018-08-16 22:38:49.963205014 +0200
@@ -0,0 +1,12 @@
+/* PR middle-end/86714 */
+
+const char a[2][3] = { "1234", "xyz" };
+char b[6];
+
+int main ()
+{
+  __builtin_memcpy (b, a, 4);
+  __builtin_memset (b + 4, 'a', 2);
+  if (__builtin_memcmp (b, "123xaa", 6))
+    __builtin_abort ();
+}
Jeff Law Aug. 18, 2018, 4:01 a.m. UTC | #20
On 08/17/2018 03:14 AM, Bernd Edlinger wrote:
> Hi!
> 
> 
> After the other patch has been applied, I re-based this patch accordingly.
> 
> Except the mechanical changes, there are a few notable differences to the
> previous version:
> 
> In string_constant, I added a similar check for the STRING_CSTs
> because when callers don't use mem_size, they assume to be
> able to read "TREE_STRING_LENGTH (array)" bytes, but that is
> not always the case, for languages that don't always use
> zero-terminated strings, for instance hollerith strings in fortran.
> 
> --- gcc/expr.c  2018-08-17 05:32:57.332211963 +0200
> +++ gcc/expr.c  2018-08-16 23:08:23.544940795 +0200
> @@ -11372,6 +11372,9 @@ string_constant (tree arg, tree *ptr_off
>        *ptr_offset = fold_convert (sizetype, offset);
>        if (mem_size)
>         *mem_size = TYPE_SIZE_UNIT (TREE_TYPE (array));
> +      else if (compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (array)),
> +                                TREE_STRING_LENGTH (array)) < 0)
> +       return NULL_TREE;
>        return array;
>      }
> 
Yes.  I purposefully didn't include this change in its entirety as it
was dependent upon your earlier patch.   Instead I twiddled your patch
so that it applied on the current trunk and didn't regress anything
while keeping the core concept you were introducing.


I'm also confident that change breaks one or more tests in the
testsuite.  I'm not sure why you didn't see the regression.   But I
certainly saw it with the variant shown above.

Anyway, the basic idea was to carve out the basic concept of your patch
to allow callers to specify how to count without regressing the trunk.

That allows us to carve out an issue, resolve it and move on.  The
interdependent and conflicting patches are a nightmare to sort out.

> 
> The range check in c_getstr was refined as well:
> 
> This I added, because vla arrays can be initialized with string constants,
> especially since the 71625 patch was installed:
> In this case we end up with mem_size that fails to be constant.
> 
> 
> @@ -14606,25 +14603,17 @@ c_getstr (tree src, unsigned HOST_WIDE_I
>         offset = tree_to_uhwi (offset_node);
>      }
> 
> +  if (!tree_fits_uhwi_p (mem_size))
> +    return NULL;
> +
>    /* STRING_LENGTH is the size of the string literal, including any
>       embedded NULs.  STRING_SIZE is the size of the array the string
>       literal is stored in.  */
> 
> Also the rest of the string length checks are refined, to return
> actually zero-terminated single byte strings when strlen is not given,
> and return something not necessarily zero-terminated which is
> suitable for memxxx-functions otherwise.

> 
> 
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?
Not yet.  There's a lot to go over here and I haven't finished reviewing
all the discussions around 86711/86714.


Jeff
Bernd Edlinger Aug. 18, 2018, 12:22 p.m. UTC | #21
On 08/18/18 06:01, Jeff Law wrote:
> On 08/17/2018 03:14 AM, Bernd Edlinger wrote:
>> Hi!
>>
>>
>> After the other patch has been applied, I re-based this patch accordingly.
>>
>> Except the mechanical changes, there are a few notable differences to the
>> previous version:
>>
>> In string_constant, I added a similar check for the STRING_CSTs
>> because when callers don't use mem_size, they assume to be
>> able to read "TREE_STRING_LENGTH (array)" bytes, but that is
>> not always the case, for languages that don't always use
>> zero-terminated strings, for instance hollerith strings in fortran.
>>
>> --- gcc/expr.c  2018-08-17 05:32:57.332211963 +0200
>> +++ gcc/expr.c  2018-08-16 23:08:23.544940795 +0200
>> @@ -11372,6 +11372,9 @@ string_constant (tree arg, tree *ptr_off
>>         *ptr_offset = fold_convert (sizetype, offset);
>>         if (mem_size)
>>          *mem_size = TYPE_SIZE_UNIT (TREE_TYPE (array));
>> +      else if (compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (array)),
>> +                                TREE_STRING_LENGTH (array)) < 0)
>> +       return NULL_TREE;
>>         return array;
>>       }
>>
> Yes.  I purposefully didn't include this change in its entirety as it
> was dependent upon your earlier patch.   Instead I twiddled your patch
> so that it applied on the current trunk and didn't regress anything
> while keeping the core concept you were introducing.
> 

That's a more or less theoretical possibility that I just
considered useful for symmetry, and maybe Ada/Go strings.

So it should be completely impossible to have this change anything
in C (hopefully!!!).

If it would happen, that would probably be a bug that needs
to be fixed.

> 
> I'm also confident that change breaks one or more tests in the
> testsuite.  I'm not sure why you didn't see the regression.   But I
> certainly saw it with the variant shown above.
> 

I tested the patch against an older version of the trunk,
but today, I repeated the regression test against r263644.

This time I attach the test results for your reference:
gcc-trunk-263644-test.txt: unchanged r263644
gcc-trunk-263644-0-test.txt: r263644 + this patch
gcc-trunk-263644-1-test.txt: r263644 + this patch without string_constant

As you can see, these regressions are already in r263644:
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O0  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O2  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O0  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O2  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.target/i386/20040112-1.c scan-assembler testb

I have never seen those before, so they are brand new.

FAIL: gnat.dg/dinst.adb scan-assembler loc [0-9] 5 [0-9]( is_stmt [0-9])? discriminator 1\\n
FAIL: gnat.dg/dinst.adb scan-assembler loc [0-9] 5 [0-9]( is_stmt [0-9])? discriminator 3\\n
FAIL: gnat.dg/dinst.adb scan-assembler loc [0-9] 5 [0-9]( is_stmt [0-9])? discriminator 4\\n

these were already there on monday, but not last week.

The test result with this patch as is does not change anything.

But what's surprising, is this:

--- gcc-trunk-r263644-test.txt	2018-08-18 13:49:27.214851609 +0200
+++ gcc-trunk-r263644-1-test.txt	2018-08-18 13:49:27.070851601 +0200
@@ -18,6 +18,12 @@
  
  
  Running target unix
+FAIL: gcc.c-torture/execute/pr86714.c   -O1  execution test
+FAIL: gcc.c-torture/execute/pr86714.c   -O2  execution test
+FAIL: gcc.c-torture/execute/pr86714.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
+FAIL: gcc.c-torture/execute/pr86714.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
+FAIL: gcc.c-torture/execute/pr86714.c   -O3 -g  execution test
+FAIL: gcc.c-torture/execute/pr86714.c   -Os  execution test
  FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O0  execution test
  FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O2  execution test
  FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O2 -flto -fno-use-linker-plugin -flto-partiti

There is another regression in gcc-trunk-r263644-1-test.txt only which is likely only noise
that happens randomly and can be ignored:

                 === libgo Summary for unix ===

# of expected passes            163

Running target unix/-m32
FAIL: sync


The result is very interesting, and is probably what you were looking for:

The test was run with both test cases, but only pr86714 failed, so the change in c_getstr
Fixes pr86711, while the change in string_constant fixes pr86714 (or both).

The patch could actually be split between pr86711 and pr86714.

The c_getstr / p86711 should be applied before
string_constant / pr86714.


Thanks
Bernd.

> Anyway, the basic idea was to carve out the basic concept of your patch
> to allow callers to specify how to count without regressing the trunk.
> 
> That allows us to carve out an issue, resolve it and move on.  The
> interdependent and conflicting patches are a nightmare to sort out.
> 
>>
>> The range check in c_getstr was refined as well:
>>
>> This I added, because vla arrays can be initialized with string constants,
>> especially since the 71625 patch was installed:
>> In this case we end up with mem_size that fails to be constant.
>>
>>
>> @@ -14606,25 +14603,17 @@ c_getstr (tree src, unsigned HOST_WIDE_I
>>          offset = tree_to_uhwi (offset_node);
>>       }
>>
>> +  if (!tree_fits_uhwi_p (mem_size))
>> +    return NULL;
>> +
>>     /* STRING_LENGTH is the size of the string literal, including any
>>        embedded NULs.  STRING_SIZE is the size of the array the string
>>        literal is stored in.  */
>>
>> Also the rest of the string length checks are refined, to return
>> actually zero-terminated single byte strings when strlen is not given,
>> and return something not necessarily zero-terminated which is
>> suitable for memxxx-functions otherwise.
> 
>>
>>
>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>> Is it OK for trunk?
> Not yet.  There's a lot to go over here and I haven't finished reviewing
> all the discussions around 86711/86714.
> 
> 
> Jeff
>
cat <<'EOF' |
		=== acats tests ===

		=== acats Summary ===
# of expected passes		2320
# of unexpected failures	0
Native configuration is x86_64-pc-linux-gnu

		=== brig tests ===


Running target unix

		=== brig Summary ===

# of unsupported tests		1
		=== gcc tests ===


Running target unix
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O0  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O2  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O0  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O2  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.target/i386/20040112-1.c scan-assembler testb

		=== gcc Summary ===

# of expected passes		133752
# of unexpected failures	7
# of expected failures		422
# of unsupported tests		2104
/home/ed/gnu/gcc-build-0/gcc/xgcc  version 9.0.0 20180818 (experimental) (GCC) 

		=== gfortran tests ===


Running target unix

		=== gfortran Summary ===

# of expected passes		47667
# of expected failures		104
# of unsupported tests		81
/home/ed/gnu/gcc-build-0/gcc/testsuite/gfortran/../../gfortran  version 9.0.0 20180818 (experimental) (GCC) 

		=== g++ tests ===


Running target unix
FAIL: g++.dg/pr80481.C  -std=gnu++11  scan-assembler-not vmovaps
FAIL: g++.dg/pr80481.C  -std=gnu++14  scan-assembler-not vmovaps
FAIL: g++.dg/pr80481.C  -std=gnu++98  scan-assembler-not vmovaps
FAIL: g++.dg/pr83239.C  -std=gnu++98 (test for excess errors)
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O0  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O2  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O0  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O2  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test

		=== g++ Summary ===

# of expected passes		126700
# of unexpected failures	10
# of expected failures		526
# of unsupported tests		4993
/home/ed/gnu/gcc-build-0/gcc/testsuite/g++/../../xg++  version 9.0.0 20180818 (experimental) (GCC) 

		=== gnat tests ===


Running target unix
FAIL: gnat.dg/dinst.adb scan-assembler loc [0-9] 5 [0-9]( is_stmt [0-9])? discriminator 1\\n
FAIL: gnat.dg/dinst.adb scan-assembler loc [0-9] 5 [0-9]( is_stmt [0-9])? discriminator 3\\n
FAIL: gnat.dg/dinst.adb scan-assembler loc [0-9] 5 [0-9]( is_stmt [0-9])? discriminator 4\\n

		=== gnat Summary ===

# of expected passes		2872
# of unexpected failures	3
# of expected failures		24
# of unsupported tests		3
/home/ed/gnu/gcc-build-0/gcc/gnatmake version 9.0.0 20180818 (experimental)

		=== go tests ===


Running target unix
FAIL: ./index0-out.go execution,  -O0 -g -fno-var-tracking-assignments 

		=== go Summary ===

# of expected passes		7285
# of unexpected failures	1
# of expected failures		1
# of untested testcases		6
# of unsupported tests		1
/home/ed/gnu/gcc-build-0/gcc/testsuite/go/../../gccgo  version 9.0.0 20180818 (experimental) (GCC) 

		=== obj-c++ tests ===


Running target unix

		=== obj-c++ Summary ===

# of expected passes		1456
# of expected failures		10
# of unsupported tests		77
/home/ed/gnu/gcc-build-0/gcc/testsuite/obj-c++/../../xg++  version 9.0.0 20180818 (experimental) (GCC) 

		=== objc tests ===


Running target unix

		=== objc Summary ===

# of expected passes		2797
# of expected failures		6
# of unsupported tests		68
/home/ed/gnu/gcc-build-0/gcc/xgcc  version 9.0.0 20180818 (experimental) (GCC) 

		=== gotools tests ===


		=== gotools Summary ===
# of expected passes		389
# of untested testcases		190
/home/ed/gnu/gcc-build-0/./gcc/gccgo version 9.0.0 20180818 (experimental) (GCC)

		=== libatomic tests ===


Running target unix

		=== libatomic Summary ===

# of expected passes		54
		=== libffi tests ===


Running target unix

		=== libffi Summary ===

# of expected passes		2214
		=== libgo tests ===


Running target unix
FAIL: sync

		=== libgo Summary for unix ===

# of expected passes		162
# of unexpected failures	1

Running target unix/-m32

		=== libgo Summary for unix/-m32 ===

# of expected passes		163

		=== libgo Summary ===

# of expected passes		325
# of unexpected failures	1
/home/ed/gnu/gcc-build-0/./gcc/gccgo version 9.0.0 20180818 (experimental) (GCC)

		=== libgomp tests ===


Running target unix
UNRESOLVED: libgomp.oacc-c++/non-scalar-data.C -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1  -O2  compilation failed to produce executable

		=== libgomp Summary ===

# of expected passes		6052
# of expected failures		3
# of unresolved testcases	1
# of unsupported tests		322
		=== libitm tests ===


Running target unix

		=== libitm Summary ===

# of expected passes		42
# of expected failures		3
# of unsupported tests		1
		=== libstdc++ tests ===


Running target unix

		=== libstdc++ Summary ===

# of expected passes		12647
# of expected failures		77
# of unsupported tests		582

Compiler version: 9.0.0 20180818 (experimental) (GCC) 
Platform: x86_64-pc-linux-gnu
configure flags: --prefix=/home/ed/gnu/install --enable-languages=all
EOF
Mail -s "Results for 9.0.0 20180818 (experimental) (GCC) testsuite on x86_64-pc-linux-gnu" gcc-testresults@gcc.gnu.org &&
true
cat <<'EOF' |
		=== acats tests ===

		=== acats Summary ===
# of expected passes		2320
# of unexpected failures	0
Native configuration is x86_64-pc-linux-gnu

		=== brig tests ===


Running target unix

		=== brig Summary ===

# of unsupported tests		1
		=== gcc tests ===


Running target unix
FAIL: gcc.c-torture/execute/pr86714.c   -O1  execution test
FAIL: gcc.c-torture/execute/pr86714.c   -O2  execution test
FAIL: gcc.c-torture/execute/pr86714.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.c-torture/execute/pr86714.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
FAIL: gcc.c-torture/execute/pr86714.c   -O3 -g  execution test
FAIL: gcc.c-torture/execute/pr86714.c   -Os  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O0  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O2  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O0  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O2  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.target/i386/20040112-1.c scan-assembler testb

		=== gcc Summary ===

# of expected passes		133746
# of unexpected failures	13
# of expected failures		422
# of unsupported tests		2104
/home/ed/gnu/gcc-build-1/gcc/xgcc  version 9.0.0 20180818 (experimental) (GCC) 

		=== gfortran tests ===


Running target unix

		=== gfortran Summary ===

# of expected passes		47667
# of expected failures		104
# of unsupported tests		81
/home/ed/gnu/gcc-build-1/gcc/testsuite/gfortran/../../gfortran  version 9.0.0 20180818 (experimental) (GCC) 

		=== g++ tests ===


Running target unix
FAIL: g++.dg/pr80481.C  -std=gnu++11  scan-assembler-not vmovaps
FAIL: g++.dg/pr80481.C  -std=gnu++14  scan-assembler-not vmovaps
FAIL: g++.dg/pr80481.C  -std=gnu++98  scan-assembler-not vmovaps
FAIL: g++.dg/pr83239.C  -std=gnu++98 (test for excess errors)
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O0  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O2  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O0  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O2  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test

		=== g++ Summary ===

# of expected passes		126700
# of unexpected failures	10
# of expected failures		526
# of unsupported tests		4993
/home/ed/gnu/gcc-build-1/gcc/testsuite/g++/../../xg++  version 9.0.0 20180818 (experimental) (GCC) 

		=== gnat tests ===


Running target unix
FAIL: gnat.dg/dinst.adb scan-assembler loc [0-9] 5 [0-9]( is_stmt [0-9])? discriminator 1\\n
FAIL: gnat.dg/dinst.adb scan-assembler loc [0-9] 5 [0-9]( is_stmt [0-9])? discriminator 3\\n
FAIL: gnat.dg/dinst.adb scan-assembler loc [0-9] 5 [0-9]( is_stmt [0-9])? discriminator 4\\n

		=== gnat Summary ===

# of expected passes		2872
# of unexpected failures	3
# of expected failures		24
# of unsupported tests		3
/home/ed/gnu/gcc-build-1/gcc/gnatmake version 9.0.0 20180818 (experimental)

		=== go tests ===


Running target unix
FAIL: ./index0-out.go execution,  -O0 -g -fno-var-tracking-assignments 

		=== go Summary ===

# of expected passes		7285
# of unexpected failures	1
# of expected failures		1
# of untested testcases		6
# of unsupported tests		1
/home/ed/gnu/gcc-build-1/gcc/testsuite/go/../../gccgo  version 9.0.0 20180818 (experimental) (GCC) 

		=== obj-c++ tests ===


Running target unix

		=== obj-c++ Summary ===

# of expected passes		1456
# of expected failures		10
# of unsupported tests		77
/home/ed/gnu/gcc-build-1/gcc/testsuite/obj-c++/../../xg++  version 9.0.0 20180818 (experimental) (GCC) 

		=== objc tests ===


Running target unix

		=== objc Summary ===

# of expected passes		2797
# of expected failures		6
# of unsupported tests		68
/home/ed/gnu/gcc-build-1/gcc/xgcc  version 9.0.0 20180818 (experimental) (GCC) 

		=== gotools tests ===


		=== gotools Summary ===
# of expected passes		389
# of untested testcases		190
/home/ed/gnu/gcc-build-1/./gcc/gccgo version 9.0.0 20180818 (experimental) (GCC)

		=== libatomic tests ===


Running target unix

		=== libatomic Summary ===

# of expected passes		54
		=== libffi tests ===


Running target unix

		=== libffi Summary ===

# of expected passes		2214
		=== libgo tests ===


Running target unix

		=== libgo Summary for unix ===

# of expected passes		163

Running target unix/-m32
FAIL: sync

		=== libgo Summary for unix/-m32 ===

# of expected passes		162
# of unexpected failures	1

		=== libgo Summary ===

# of expected passes		325
# of unexpected failures	1
/home/ed/gnu/gcc-build-1/./gcc/gccgo version 9.0.0 20180818 (experimental) (GCC)

		=== libgomp tests ===


Running target unix
UNRESOLVED: libgomp.oacc-c++/non-scalar-data.C -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1  -O2  compilation failed to produce executable

		=== libgomp Summary ===

# of expected passes		6052
# of expected failures		3
# of unresolved testcases	1
# of unsupported tests		322
		=== libitm tests ===


Running target unix

		=== libitm Summary ===

# of expected passes		42
# of expected failures		3
# of unsupported tests		1
		=== libstdc++ tests ===


Running target unix

		=== libstdc++ Summary ===

# of expected passes		12647
# of expected failures		77
# of unsupported tests		582

Compiler version: 9.0.0 20180818 (experimental) (GCC) 
Platform: x86_64-pc-linux-gnu
configure flags: --prefix=/home/ed/gnu/install --enable-languages=all
EOF
Mail -s "Results for 9.0.0 20180818 (experimental) (GCC) testsuite on x86_64-pc-linux-gnu" gcc-testresults@gcc.gnu.org &&
true
cat <<'EOF' |
		=== acats tests ===

		=== acats Summary ===
# of expected passes		2320
# of unexpected failures	0
Native configuration is x86_64-pc-linux-gnu

		=== brig tests ===


Running target unix

		=== brig Summary ===

# of unsupported tests		1
		=== gcc tests ===


Running target unix
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O0  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O2  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O0  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O2  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.target/i386/20040112-1.c scan-assembler testb

		=== gcc Summary ===

# of expected passes		133724
# of unexpected failures	7
# of expected failures		422
# of unsupported tests		2104
/home/ed/gnu/gcc-build/gcc/xgcc  version 9.0.0 20180818 (experimental) (GCC) 

		=== gfortran tests ===


Running target unix

		=== gfortran Summary ===

# of expected passes		47667
# of expected failures		104
# of unsupported tests		81
/home/ed/gnu/gcc-build/gcc/testsuite/gfortran/../../gfortran  version 9.0.0 20180818 (experimental) (GCC) 

		=== g++ tests ===


Running target unix
FAIL: g++.dg/pr80481.C  -std=gnu++11  scan-assembler-not vmovaps
FAIL: g++.dg/pr80481.C  -std=gnu++14  scan-assembler-not vmovaps
FAIL: g++.dg/pr80481.C  -std=gnu++98  scan-assembler-not vmovaps
FAIL: g++.dg/pr83239.C  -std=gnu++98 (test for excess errors)
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O0  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O2  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-17.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O0  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O2  execution test
FAIL: c-c++-common/torture/builtin-arith-overflow-p-17.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test

		=== g++ Summary ===

# of expected passes		126700
# of unexpected failures	10
# of expected failures		526
# of unsupported tests		4993
/home/ed/gnu/gcc-build/gcc/testsuite/g++/../../xg++  version 9.0.0 20180818 (experimental) (GCC) 

		=== gnat tests ===


Running target unix
FAIL: gnat.dg/dinst.adb scan-assembler loc [0-9] 5 [0-9]( is_stmt [0-9])? discriminator 1\\n
FAIL: gnat.dg/dinst.adb scan-assembler loc [0-9] 5 [0-9]( is_stmt [0-9])? discriminator 3\\n
FAIL: gnat.dg/dinst.adb scan-assembler loc [0-9] 5 [0-9]( is_stmt [0-9])? discriminator 4\\n

		=== gnat Summary ===

# of expected passes		2872
# of unexpected failures	3
# of expected failures		24
# of unsupported tests		3
/home/ed/gnu/gcc-build/gcc/gnatmake version 9.0.0 20180818 (experimental)

		=== go tests ===


Running target unix
FAIL: ./index0-out.go execution,  -O0 -g -fno-var-tracking-assignments 

		=== go Summary ===

# of expected passes		7285
# of unexpected failures	1
# of expected failures		1
# of untested testcases		6
# of unsupported tests		1
/home/ed/gnu/gcc-build/gcc/testsuite/go/../../gccgo  version 9.0.0 20180818 (experimental) (GCC) 

		=== obj-c++ tests ===


Running target unix

		=== obj-c++ Summary ===

# of expected passes		1456
# of expected failures		10
# of unsupported tests		77
/home/ed/gnu/gcc-build/gcc/testsuite/obj-c++/../../xg++  version 9.0.0 20180818 (experimental) (GCC) 

		=== objc tests ===


Running target unix

		=== objc Summary ===

# of expected passes		2797
# of expected failures		6
# of unsupported tests		68
/home/ed/gnu/gcc-build/gcc/xgcc  version 9.0.0 20180818 (experimental) (GCC) 

		=== gotools tests ===


		=== gotools Summary ===
# of expected passes		389
# of untested testcases		190
/home/ed/gnu/gcc-build/./gcc/gccgo version 9.0.0 20180818 (experimental) (GCC)

		=== libatomic tests ===


Running target unix

		=== libatomic Summary ===

# of expected passes		54
		=== libffi tests ===


Running target unix

		=== libffi Summary ===

# of expected passes		2214
		=== libgo tests ===


Running target unix

		=== libgo Summary for unix ===

# of expected passes		163

Running target unix/-m32

		=== libgo Summary for unix/-m32 ===

# of expected passes		163

		=== libgo Summary ===

# of expected passes		326
/home/ed/gnu/gcc-build/./gcc/gccgo version 9.0.0 20180818 (experimental) (GCC)

		=== libgomp tests ===


Running target unix
UNRESOLVED: libgomp.oacc-c++/non-scalar-data.C -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1  -O2  compilation failed to produce executable

		=== libgomp Summary ===

# of expected passes		6052
# of expected failures		3
# of unresolved testcases	1
# of unsupported tests		322
		=== libitm tests ===


Running target unix

		=== libitm Summary ===

# of expected passes		42
# of expected failures		3
# of unsupported tests		1
		=== libstdc++ tests ===


Running target unix

		=== libstdc++ Summary ===

# of expected passes		12647
# of expected failures		77
# of unsupported tests		582

Compiler version: 9.0.0 20180818 (experimental) (GCC) 
Platform: x86_64-pc-linux-gnu
configure flags: --prefix=/home/ed/gnu/install --enable-languages=all
EOF
Mail -s "Results for 9.0.0 20180818 (experimental) (GCC) testsuite on x86_64-pc-linux-gnu" gcc-testresults@gcc.gnu.org &&
true
Jeff Law Aug. 29, 2018, 5:18 p.m. UTC | #22
On 08/17/2018 03:14 AM, Bernd Edlinger wrote:
> Hi!
> 
> 
> After the other patch has been applied, I re-based this patch accordingly.
> 
> Except the mechanical changes, there are a few notable differences to the
> previous version:
> 
> In string_constant, I added a similar check for the STRING_CSTs
> because when callers don't use mem_size, they assume to be
> able to read "TREE_STRING_LENGTH (array)" bytes, but that is
> not always the case, for languages that don't always use
> zero-terminated strings, for instance hollerith strings in fortran.
> 
> --- gcc/expr.c  2018-08-17 05:32:57.332211963 +0200
> +++ gcc/expr.c  2018-08-16 23:08:23.544940795 +0200
> @@ -11372,6 +11372,9 @@ string_constant (tree arg, tree *ptr_off
>        *ptr_offset = fold_convert (sizetype, offset);
>        if (mem_size)
>         *mem_size = TYPE_SIZE_UNIT (TREE_TYPE (array));
> +      else if (compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (array)),
> +                                TREE_STRING_LENGTH (array)) < 0)
> +       return NULL_TREE;
>        return array;
>      }
> 
> 
> The range check in c_getstr was refined as well:
> 
> This I added, because vla arrays can be initialized with string constants,
> especially since the 71625 patch was installed:
> In this case we end up with mem_size that fails to be constant.
> 
> 
> @@ -14606,25 +14603,17 @@ c_getstr (tree src, unsigned HOST_WIDE_I
>         offset = tree_to_uhwi (offset_node);
>      }
> 
> +  if (!tree_fits_uhwi_p (mem_size))
> +    return NULL;
> +
>    /* STRING_LENGTH is the size of the string literal, including any
>       embedded NULs.  STRING_SIZE is the size of the array the string
>       literal is stored in.  */
> 
> Also the rest of the string length checks are refined, to return
> actually zero-terminated single byte strings when strlen is not given,
> and return something not necessarily zero-terminated which is
> suitable for memxxx-functions otherwise.
> 
> 
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?
> 
> 
> Thanks
> Bernd.
> 
> 
> patch-pr86711.diff
> 
> 
> gcc:
> 2018-08-17  Bernd Edlinger  <bernd.edlinger@hotmail.de>
> 
> 	PR middle-end/86711
> 	PR middle-end/86714
> 	* expr.c (string_constant): Don't return truncated string literals.
> 	* fold-const.c (c_getstr): Fix function comment.  Remove unused third
> 	argument.  Fix range checks.
> 	* fold-const.c (c_getstr): Adjust protoype.
> 
> testsuite:
> 2018-08-17  Bernd Edlinger  <bernd.edlinger@hotmail.de>
> 
> 	PR middle-end/86711
> 	PR middle-end/86714
> 	* gcc.c-torture/execute/pr86711.c: New test.
> 	* gcc.c-torture/execute/pr86714.c: New test.



Note that Martin's patch covers both these tests in slightly better ways.
Jeff
diff mbox series

Patch

Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	(revision 263045)
+++ gcc/builtins.c	(working copy)
@@ -617,7 +617,7 @@  c_strlen (tree src, int only_value)
   if (tree size = TYPE_SIZE_UNIT (type))
     if (tree_fits_shwi_p (size))
       {
-	maxelts = tree_to_uhwi (size);
+	maxelts = tree_to_shwi (size);
 	maxelts = maxelts / eltsize - 1;
       }
 
Index: gcc/expr.c
===================================================================
--- gcc/expr.c	(revision 263045)
+++ gcc/expr.c	(working copy)
@@ -11410,24 +11410,14 @@  string_constant (tree arg, tree *ptr_offset)
   if (!init || TREE_CODE (init) != STRING_CST)
     return NULL_TREE;
 
-  tree array_size = DECL_SIZE_UNIT (array);
-  if (!array_size || TREE_CODE (array_size) != INTEGER_CST)
-    return NULL_TREE;
-
   /* Avoid returning a string that doesn't fit in the array
-     it is stored in, like
-     const char a[4] = "abcde";
-     but do handle those that fit even if they have excess
-     initializers, such as in
-     const char a[4] = "abc\000\000";
-     The excess elements contribute to TREE_STRING_LENGTH()
-     but not to strlen().  */
-  unsigned HOST_WIDE_INT charsize
-    = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (init))));
-  unsigned HOST_WIDE_INT length = TREE_STRING_LENGTH (init);
-  length = string_length (TREE_STRING_POINTER (init), charsize,
-			  length / charsize);
-  if (compare_tree_int (array_size, length + 1) < 0)
+     it is stored in, like:
+     const char a[4] = "abcd";
+     because callers expect to be able to access the string
+     up to the limit imposed by TREE_STRING_LENGTH which
+     always includes the terminating NUL char.  */
+  if (compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (init)),
+			TREE_STRING_LENGTH (init)) < 0)
     return NULL_TREE;
 
   *ptr_offset = offset;
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c	(revision 263045)
+++ gcc/fold-const.c	(working copy)
@@ -14577,16 +14577,12 @@  fold_build_pointer_plus_hwi_loc (location_t loc, t
 /* Return a pointer P to a NUL-terminated string representing the sequence
    of constant characters referred to by SRC (or a subsequence of such
    characters within it if SRC is a reference to a string plus some
-   constant offset).  If STRLEN is non-null, store stgrlen(P) in *STRLEN.
-   If STRSIZE is non-null, store in *STRSIZE the size of the array
-   the string is stored in; in that case, even though P points to a NUL
-   terminated string, SRC need not refer to one.  This can happen when
-   SRC refers to a constant character array initialized to all non-NUL
-   values, as in the C declaration: char a[4] = "1234";  */
+   constant offset).  If STRLEN is non-null, store the number of bytes
+   in the string constant including the terminating NUL char.  *STRLEN is
+   typically strlen(P) + 1 in the absence of embedded NUL characters.  */
 
 const char *
-c_getstr (tree src, unsigned HOST_WIDE_INT *strlen /* = NULL */,
-	  unsigned HOST_WIDE_INT *strsize /* = NULL */)
+c_getstr (tree src, unsigned HOST_WIDE_INT *strlen /* = NULL */)
 {
   tree offset_node;
 
@@ -14613,40 +14609,24 @@  const char *
   unsigned HOST_WIDE_INT string_size = string_length;
   tree type = TREE_TYPE (src);
   if (tree size = TYPE_SIZE_UNIT (type))
-    if (tree_fits_shwi_p (size))
+    if (tree_fits_uhwi_p (size))
       string_size = tree_to_uhwi (size);
 
+  if (offset >= string_size)
+    return NULL;
+
   if (strlen)
     {
       /* Compute and store the length of the substring at OFFSET.
 	 All offsets past the initial length refer to null strings.  */
-      if (offset <= string_length)
+      if (offset < string_length)
 	*strlen = string_length - offset;
       else
-	*strlen = 0;
+	*strlen = 1;
     }
 
   const char *string = TREE_STRING_POINTER (src);
-
-  if (string_length == 0
-      || offset >= string_size)
-    return NULL;
-
-  if (strsize)
-    {
-      /* Support even constant character arrays that aren't proper
-	 NUL-terminated strings.  */
-      *strsize = string_size;
-    }
-  else if (string[string_length - 1] != '\0')
-    {
-      /* Support only properly NUL-terminated strings but handle
-	 consecutive strings within the same array, such as the six
-	 substrings in "1\0002\0003".  */
-      return NULL;
-    }
-
-  return offset <= string_length ? string + offset : "";
+  return offset < string_length ? string + offset : "";
 }
 
 /* Given a tree T, compute which bits in T may be nonzero.  */
Index: gcc/fold-const.h
===================================================================
--- gcc/fold-const.h	(revision 263045)
+++ gcc/fold-const.h	(working copy)
@@ -187,8 +187,7 @@  extern bool expr_not_equal_to (tree t, const wide_
 extern tree const_unop (enum tree_code, tree, tree);
 extern tree const_binop (enum tree_code, tree, tree, tree);
 extern bool negate_mathfn_p (combined_fn);
-extern const char *c_getstr (tree, unsigned HOST_WIDE_INT * = NULL,
-			     unsigned HOST_WIDE_INT * = NULL);
+extern const char *c_getstr (tree, unsigned HOST_WIDE_INT * = NULL);
 extern wide_int tree_nonzero_bits (const_tree);
 
 /* Return OFF converted to a pointer offset type suitable as offset for
Index: gcc/testsuite/gcc.c-torture/execute/pr86711.c
===================================================================
--- gcc/testsuite/gcc.c-torture/execute/pr86711.c	(revision 0)
+++ gcc/testsuite/gcc.c-torture/execute/pr86711.c	(working copy)
@@ -0,0 +1,11 @@ 
+/* PR middle-end/86711 */
+
+static const char a[2][4] = { "1234", "5678" };
+
+int main ()
+{
+  void *p = __builtin_memchr (a, 0, 5);
+
+  if (p)
+    __builtin_abort ();
+}
Index: gcc/testsuite/gcc.c-torture/execute/pr86714.c
===================================================================
--- gcc/testsuite/gcc.c-torture/execute/pr86714.c	(revision 0)
+++ gcc/testsuite/gcc.c-torture/execute/pr86714.c	(working copy)
@@ -0,0 +1,12 @@ 
+/* PR middle-end/86714 */
+
+const char a[2][3] = { "1234", "xyz" };
+char b[6];
+
+int main ()
+{
+  __builtin_memcpy (b, a, 4);
+  __builtin_memset (b + 4, 'a', 2);
+  if (__builtin_memcmp (b, "123xaa", 6))
+    __builtin_abort ();
+}
Index: gcc/testsuite/gcc.dg/strlenopt-49.c
===================================================================
--- gcc/testsuite/gcc.dg/strlenopt-49.c	(revision 263045)
+++ gcc/testsuite/gcc.dg/strlenopt-49.c	(working copy)
@@ -45,9 +45,9 @@  int cmp88 (void)
   return cmp88;
 }
 
-/* { dg-final { scan-tree-dump-times "strlen" 0 "gimple" } }
-   { dg-final { scan-tree-dump-times "len0 = 0;" 1 "gimple" } }
-   { dg-final { scan-tree-dump-times "len = 18;" 1 "gimple" } }
-   { dg-final { scan-tree-dump-times "lenx = 8;" 1 "gimple" } }
-   { dg-final { scan-tree-dump-times "leny = 0;" 1 "gimple" } }
-   { dg-final { scan-tree-dump-times "cmp88 = 0;" 1 "gimple" } } */
+/* { dg-final { scan-tree-dump-times "strlen" 0 "gimple" { xfail *-*-* } } }
+   { dg-final { scan-tree-dump-times "len0 = 0;" 1 "gimple" { xfail *-*-* } } }
+   { dg-final { scan-tree-dump-times "len = 18;" 1 "gimple" { xfail *-*-* } } }
+   { dg-final { scan-tree-dump-times "lenx = 8;" 1 "gimple" { xfail *-*-* } } }
+   { dg-final { scan-tree-dump-times "leny = 0;" 1 "gimple" { xfail *-*-* } } }
+   { dg-final { scan-tree-dump-times "cmp88 = 0;" 1 "gimple" { xfail *-*-* } } } */