Set behavior of sprintf-like functions with overlapping source and destination

Message ID 20181220215409.8971-1-gabriel@inconstante.eti.br
State New
Headers show
Series
  • Set behavior of sprintf-like functions with overlapping source and destination
Related show

Commit Message

Gabriel F. T. Gomes Dec. 20, 2018, 9:54 p.m.
According to ISO C99, passing the same buffer as source and destination
to sprintf, snprintf, vsprintf, or vsnprintf has undefined behavior.
Until the commit

  commit 4e2f43f842ef5e253cc23383645adbaa03cedb86
  Author: Zack Weinberg <zackw@panix.com>
  Date:   Wed Mar 7 14:32:03 2018 -0500

      Use PRINTF_FORTIFY instead of _IO_FLAGS2_FORTIFY (bug 11319)

a call to sprintf or vsprintf with overlapping buffers, for instance
vsprintf (buf, "%sTEXT", buf), would append `TEXT' into buf, while a
call to snprintf or vsnprintf would override the contents of buf.
After the aforementioned commit, the behavior of sprintf and vsprintf
changed (so that they also override the contents of buf).

This patch reverts this behavioral change, because it will likely break
applications that rely on the previous behavior, even though it is
undefined by ISO C.  As noted by Szabolcs Nagy, this is used in SPEC2017
507.cactuBSSN_r/src/PUGH/PughUtils.c:

  sprintf(mess,"  Size:");
  for (i=0;i<dim+1;i++)
  {
      sprintf(mess,"%s %d",mess,pughGH->GFExtras[dim]->nsize[i]);
  }

More important to notice is the fact that the overwriting of the
destination buffer is not the only behavior affected by the refactoring.
Before the refactoring, sprintf and vsprintf would use _IO_str_jumps,
whereas __sprintf_chk and __vsprintf_chk would use _IO_str_chk_jumps.
After the refactoring, all use _IO_str_chk_jumps, which would make
sprintf and vsprintf report buffer overflows and terminate the program.
This patch also reverts this behavior, by installing the appropriate
jump table for each *sprintf functions.

Apart from reverting the changes, this patch adds a test case that has
the old behavior hardcoded, so that regressions are noticed if something
else unintentionally changes the behavior.

Tested for powerpc64le.

	* debug/sprintf_chk.c (___sprintf_chk): Use PRINTF_CHK.
	* debug/vsprintf_chk.c (___vsprintf_chk): Likewise.
	* libio/Makefile (tests): Add tst-sprintf-ub and
	tst-sprintf-chk-ub.
	CFLAGS-tst-sprintf-ub.c: New variable.
	CFLAGS-tst-sprintf-chk-ub.c: Likewise.
	* libio/iovsprintf.c (__vsprintf_internal): Only erase the
	destination buffer and check for overflows in fortified mode.
	* libio/libioP.h (PRINTF_CHK): New macro.
	* libio/tst-sprintf-chk-ub.c: New file.
	* libio/tst-sprintf-ub.c: Likewise.
---
 debug/sprintf_chk.c        |   4 ++
 debug/vsprintf_chk.c       |   4 ++
 libio/Makefile             |   7 +++-
 libio/iovsprintf.c         |  14 ++++++-
 libio/libioP.h             |   6 ++-
 libio/tst-sprintf-chk-ub.c |   2 +
 libio/tst-sprintf-ub.c     | 102 +++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 135 insertions(+), 4 deletions(-)
 create mode 100644 libio/tst-sprintf-chk-ub.c
 create mode 100644 libio/tst-sprintf-ub.c

Comments

Paul Eggert Dec. 20, 2018, 10:17 p.m. | #1
Dumb question: if fortification is enabled, why can't sprintf-like 
functions report an error when the source and destination overlap? The 
point of fortification is to catch and report undefined behavior when 
it's easy, as is the case here.

> /* Test the sprintf (buf, "%s", buf) does not override buf.

I'm leery of adding this test case, as it tests undefined behavior that 
the glibc manual does not document as an extension (and it shouldn't be 
documented either).

Traditionally we didn't worry about breaking code like PughUtils.c's 
'sprintf(mess,"%s %d",mess,...)' under the principle that such code was 
already broken. Why depart from that tradition here?
Gabriel F. T. Gomes Dec. 21, 2018, 1:21 a.m. | #2
On Thu, 20 Dec 2018, Paul Eggert wrote:

>Dumb question: if fortification is enabled, why can't sprintf-like 
>functions report an error when the source and destination overlap? The 
>point of fortification is to catch and report undefined behavior when 
>it's easy, as is the case here.

With fortification enabled, they still do.  This change is more about the
non-fortified case...

Before commit ID 4e2f43f842ef, non-fortified sprintf and vsprintf did
*not*:

  1. report buffer overflows and terminate the program;
  2. overwrite overlapping buffers.

But since commit ID 4e2f43f842ef, they do (I haven't noticed these changes
when working on that patch, now they are being questioned here:
https://sourceware.org/ml/libc-alpha/2018-12/msg00634.html and
https://sourceware.org/ml/libc-alpha/2018-12/msg00839.html).

This new patch is about reverting these two items for the non-fortified
case (maybe they should be considered separately).

>> /* Test the sprintf (buf, "%s", buf) does not override buf.  
>
>I'm leery of adding this test case, as it tests undefined behavior that 
>the glibc manual does not document as an extension (and it shouldn't be 
>documented either).
>
>Traditionally we didn't worry about breaking code like PughUtils.c's 
>'sprintf(mess,"%s %d",mess,...)' under the principle that such code was 
>already broken. Why depart from that tradition here?

I don't know how to answer that...  I'm sort of a rookie, and I wasn't
here to witness past, similar changes and what was the fallout.  Maybe
other people have better, backed opinions about it...
Joseph Myers Dec. 21, 2018, 1:24 a.m. | #3
On Thu, 20 Dec 2018, Gabriel F. T. Gomes wrote:

> >Traditionally we didn't worry about breaking code like PughUtils.c's 
> >'sprintf(mess,"%s %d",mess,...)' under the principle that such code was 
> >already broken. Why depart from that tradition here?
> 
> I don't know how to answer that...  I'm sort of a rookie, and I wasn't
> here to witness past, similar changes and what was the fallout.  Maybe
> other people have better, backed opinions about it...

We have the example of x86_64 memcpy (where a GLIBC_2.14 version was added 
to avoid breaking old *binaries* expecting it to have memmove semantics, 
but in that case breaking *sources* expecting overlapping copies to work 
was considered fine).
Florian Weimer Dec. 21, 2018, 4:21 p.m. | #4
* Joseph Myers:

> On Thu, 20 Dec 2018, Gabriel F. T. Gomes wrote:
>
>> >Traditionally we didn't worry about breaking code like PughUtils.c's 
>> >'sprintf(mess,"%s %d",mess,...)' under the principle that such code was 
>> >already broken. Why depart from that tradition here?
>> 
>> I don't know how to answer that...  I'm sort of a rookie, and I wasn't
>> here to witness past, similar changes and what was the fallout.  Maybe
>> other people have better, backed opinions about it...
>
> We have the example of x86_64 memcpy (where a GLIBC_2.14 version was added 
> to avoid breaking old *binaries* expecting it to have memmove semantics, 
> but in that case breaking *sources* expecting overlapping copies to work 
> was considered fine).

The fallout from that was largely negative, though, because
memcpy@GLIBC_2.14 was the only symbol that required the GLIBC_2.14
symbol version for quite some time.

These days, I'd expect us to provide an LD_PRELOAD library instead of a
symbol version (like libBrokenLocale), or wait until something else that
requires a new symbol version for many binaries comes around.

Thanks,
Florian
Siddhesh Poyarekar Dec. 26, 2018, 10:52 a.m. | #5
On 21/12/18 3:47 AM, Paul Eggert wrote:
> Dumb question: if fortification is enabled, why can't sprintf-like 
> functions report an error when the source and destination overlap? The 
> point of fortification is to catch and report undefined behavior when 
> it's easy, as is the case here.
> 
>> /* Test the sprintf (buf, "%s", buf) does not override buf.
> 
> I'm leery of adding this test case, as it tests undefined behavior that 
> the glibc manual does not document as an extension (and it shouldn't be 
> documented either).
> 
> Traditionally we didn't worry about breaking code like PughUtils.c's 
> 'sprintf(mess,"%s %d",mess,...)' under the principle that such code was 
> already broken. Why depart from that tradition here?

Is the disagreement here only about testing UB or also about retaining 
old behaviour in case of UB?  If it's just the former then we could make 
forward progress by just removing the UB test case and just keeping the 
ub-chk test case.

It may not be too hard for the compiler to see this undefined behaviour 
and warn about it either, at least in some trivial cases...

Siddhesh
Paul Eggert Dec. 26, 2018, 6:56 p.m. | #6
Siddhesh Poyarekar wrote:
> On 21/12/18 3:47 AM, Paul Eggert wrote:
>>> /* Test the sprintf (buf, "%s", buf) does not override buf.
>>
>> I'm leery of adding this test case, as it tests undefined behavior that the 
>> glibc manual does not document as an extension (and it shouldn't be documented 
>> either).
>>
>> Traditionally we didn't worry about breaking code like PughUtils.c's 
>> 'sprintf(mess,"%s %d",mess,...)' under the principle that such code was 
>> already broken. Why depart from that tradition here?
> 
> Is the disagreement here only about testing UB or also about retaining old 
> behaviour in case of UB?

Primarily the former. I don't want us to test for and/or guarantee support for 
this particular implementation of UB. (Although I'm not happy about the extra 
code inserted into every printf call to deal with this situation, it's not like 
printf is particularly fast now....)

> If it's just the former then we could make forward 
> progress by just removing the UB test case and just keeping the ub-chk test case.
> 
> It may not be too hard for the compiler to see this undefined behaviour and warn 
> about it either, at least in some trivial cases...

Yes, that'd be good.
Jonathan Nieder Dec. 26, 2018, 8:19 p.m. | #7
Paul Eggert wrote:
> Siddhesh Poyarekar wrote:
>> On 21/12/18 3:47 AM, Paul Eggert wrote:

>>> Traditionally we didn't worry about breaking code like PughUtils.c's
>>> 'sprintf(mess,"%s %d",mess,...)' under the principle that such code
>>> was already broken. Why depart from that tradition here?
>>
>> Is the disagreement here only about testing UB or also about retaining
>> old behaviour in case of UB?
>
> Primarily the former. I don't want us to test for and/or guarantee support
> for this particular implementation of UB. (Although I'm not happy about the
> extra code inserted into every printf call to deal with this situation, it's
> not like printf is particularly fast now....)

Can we keep the test and just include a (source) comment describing the
lack of support behind it?

It feels odd to bend over backwards to support a behavior on one hand,
while not having a test for it on the other.

>> If it's just the former then we could make forward progress by just
>> removing the UB test case and just keeping the ub-chk test case.
>>
>> It may not be too hard for the compiler to see this undefined behaviour
>> and warn about it either, at least in some trivial cases...
>
> Yes, that'd be good.

Learning from the GLIBC_2.14 memcpy, I wonder what our best option is.
I wonder if something like the following would be too complex:

 1. Introduce GLIBC_2.29 printf that does *not* support this undefined
    behavior.  Use __asm__(".symver [...]@@GLIBC_2.0") to ensure we
    still default to the GLIBC_2.0 version even when building new
    programs.

 2. After a critical mass of users are using glibc 2.29 or newer (e.g.
    after a year), switch the default version to GLIBC_2.29.

What do you think?

Thanks,
Jonathan
Paul Eggert Dec. 26, 2018, 10:22 p.m. | #8
Jonathan Nieder wrote:

> Can we keep the test and just include a (source) comment describing the
> lack of support behind it?

How about if we comment out the test, or enable it only if some weird macro is 
already defined?

> It feels odd to bend over backwards to support a behavior on one hand,
> while not having a test for it on the other.

It is a special case, indeed.

> I wonder if something like the following would be too complex:
> 
>   1. Introduce GLIBC_2.29 printf that does *not* support this undefined
>      behavior.  Use __asm__(".symver [...]@@GLIBC_2.0") to ensure we
>      still default to the GLIBC_2.0 version even when building new
>      programs.
> 
>   2. After a critical mass of users are using glibc 2.29 or newer (e.g.
>      after a year), switch the default version to GLIBC_2.29.
> 
> What do you think?

I'm afraid it sounds confusing, as GLIBC_2.29 wouldn't be the default for glibc 
2.29 when it's released. Perhaps the symbol should be GLIBC_UNSTABLE instead? 
But even then, I don't see why users would use the new version before the 
default was switched, so if there are issues we won't find them any more gently 
with this approach.
Siddhesh Poyarekar Dec. 27, 2018, 12:51 a.m. | #9
On 27/12/18 12:26 AM, Paul Eggert wrote:
>> It may not be too hard for the compiler to see this undefined 
>> behaviour and warn about it either, at least in some trivial cases...
> 
> Yes, that'd be good.

Turns out the compiler does identify aliases in this case  (although in 
a very generic way because of which the warning may not be super-clear 
to a lot of devs at first glance):

foo.c:9:11: warning: passing argument 1 to restrict-qualified parameter 
aliases with argument 3 [-Wrestrict]
    sprintf(buf, "%sCD", buf);
            ^~~          ~~~

So I suppose we have this part covered.

Siddhesh
Siddhesh Poyarekar Dec. 27, 2018, 1:19 a.m. | #10
On 27/12/18 3:52 AM, Paul Eggert wrote:
>> I wonder if something like the following would be too complex:
>>
>>   1. Introduce GLIBC_2.29 printf that does *not* support this undefined
>>      behavior.  Use __asm__(".symver [...]@@GLIBC_2.0") to ensure we
>>      still default to the GLIBC_2.0 version even when building new
>>      programs.
>>
>>   2. After a critical mass of users are using glibc 2.29 or newer (e.g.
>>      after a year), switch the default version to GLIBC_2.29.
>>
>> What do you think?
> 
> I'm afraid it sounds confusing, as GLIBC_2.29 wouldn't be the default 
> for glibc 2.29 when it's released. Perhaps the symbol should be 
> GLIBC_UNSTABLE instead? But even then, I don't see why users would use 
> the new version before the default was switched, so if there are issues 
> we won't find them any more gently with this approach.

We should not retain compatible behaviour for building, only for linking 
like we normally do.  The symbol versioning patch won't get backported 
(since it is an ABI event) to older distributions and as long as it is 
alongside gcc8+ (I tested 8, the warning might have been introduced 
earlier) there is an adequate warning that tells users their houses may 
burn down if they depend on this behaviour.  Or something like that.

The risk is not completely gone though and there may be environments 
(gcc5 for example) where there is no warning.  These would be custom 
built environments though and I suppose we can depend on these devs to 
read the manpage more carefully.  Or maybe add more code in the headers 
to link against the compat *printf if an older compiler is detected? 
It's debatable if all of this is necessary for undefined behaviour...

In any case, this doesn't look like something that'll be done within 
this week, so I propose we add this compatibility change in now (without 
the UB test) in the interest of not breaking things and then add 
versioning to only retain backward compatible behaviour for GLIBC_2.0 
printf after discussing at length how much breakage we can stand.  Maybe 
also get friends at LWN to write about how sprintf using aliased buffers 
is asking for trouble.

With this action plan we don't really need to test the undefined 
behaviour since we don't intend to retain it except in the compat case, 
just flip the switch in the ub-chk test later so that it always does the 
check instead of only under _FORTIFY_SOURCE.

Siddhesh
Zack Weinberg Dec. 27, 2018, 2:15 a.m. | #11
On Wed, Dec 26, 2018 at 8:19 PM Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> On 27/12/18 3:52 AM, Paul Eggert wrote:
> >> I wonder if something like the following would be too complex:
> >>
> >>   1. Introduce GLIBC_2.29 printf that does *not* support this undefined
> >>      behavior.  Use __asm__(".symver [...]@@GLIBC_2.0") to ensure we
> >>      still default to the GLIBC_2.0 version even when building new
> >>      programs.
> >>
> >>   2. After a critical mass of users are using glibc 2.29 or newer (e.g.
> >>      after a year), switch the default version to GLIBC_2.29.
> >>
> >> What do you think?
> >
> > I'm afraid it sounds confusing, as GLIBC_2.29 wouldn't be the default
> > for glibc 2.29 when it's released. Perhaps the symbol should be
> > GLIBC_UNSTABLE instead? But even then, I don't see why users would use
> > the new version before the default was switched, so if there are issues
> > we won't find them any more gently with this approach.
>
> We should not retain compatible behaviour for building, only for linking
> like we normally do.  The symbol versioning patch won't get backported
> (since it is an ABI event) to older distributions and as long as it is
> alongside gcc8+ (I tested 8, the warning might have been introduced
> earlier) there is an adequate warning that tells users their houses may
> burn down if they depend on this behaviour.  Or something like that.

Despite having written the patch that broke the old behavior, I think
this is much too aggressive.  The fact that we almost immediately
discovered breakage after the patch landed means there are probably a
whole lot of programs out there relying on it, and I don't think it's
safe to assume people will pay attention to warnings _or_ read
documentation.  Witness how people are _still_ complaining about the
memcpy change.

I'm inclined to say that this degree of freedom is now frozen and we
need to accept that the old behavior has become a supported GNU
extension and we should document it as such, test for it, etc.  Not a
good extension, but one we are stuck with.  Failing that, I think we
need to preserve the old behavior for at least one more full release
and we need to announce as loudly and widely as possible that we are
changing it.  If we do change it, we should also make sure that the
new behavior is well-defined and tested for all cases of overlapping
buffers, and what the new behavior is must be documented, and we need
to stick to it.

zw
Siddhesh Poyarekar Dec. 27, 2018, 5:45 a.m. | #12
On 27/12/18 7:45 AM, Zack Weinberg wrote:
> Despite having written the patch that broke the old behavior, I think
> this is much too aggressive.  The fact that we almost immediately
> discovered breakage after the patch landed means there are probably a
> whole lot of programs out there relying on it, and I don't think it's
> safe to assume people will pay attention to warnings _or_ read
> documentation.  Witness how people are _still_ complaining about the
> memcpy change.
> 
> I'm inclined to say that this degree of freedom is now frozen and we
> need to accept that the old behavior has become a supported GNU
> extension and we should document it as such, test for it, etc.  Not a
> good extension, but one we are stuck with.  Failing that, I think we
> need to preserve the old behavior for at least one more full release
> and we need to announce as loudly and widely as possible that we are
> changing it.  If we do change it, we should also make sure that the
> new behavior is well-defined and tested for all cases of overlapping
> buffers, and what the new behavior is must be documented, and we need
> to stick to it.

Thanks, to clarify, is your position that we revert to old behaviour 
(for now) for the default case only or for everything, including 
_FORTIFY_SOURCE?

Siddhesh
Gabriel F. T. Gomes Dec. 27, 2018, 1:46 p.m. | #13
On Thu, Dec 27, 2018 at 11:15:06AM +0530, Siddhesh Poyarekar wrote:
> Thanks, to clarify, is your position that we revert to old behaviour (for
> now) for the default case only or for everything, including _FORTIFY_SOURCE?

The _FORTIFY_SOURCE version did not change behaviour...  Even before
Zack's patches, it would overwrite overlapping buffers.  This means that
this patch, as it is today, reverts all behavior to what it used to be.
Florian Weimer Dec. 27, 2018, 3:27 p.m. | #14
* Jonathan Nieder:

> Learning from the GLIBC_2.14 memcpy,

I assume you've taken my observations in

  <https://sourceware.org/ml/libc-alpha/2018-12/msg00871.html>

into account.

> I wonder what our best option is.  I wonder if something like the
> following would be too complex:
>
>  1. Introduce GLIBC_2.29 printf that does *not* support this undefined
>     behavior.  Use __asm__(".symver [...]@@GLIBC_2.0") to ensure we
>     still default to the GLIBC_2.0 version even when building new
>     programs.
>
>  2. After a critical mass of users are using glibc 2.29 or newer (e.g.
>     after a year), switch the default version to GLIBC_2.29.
>
> What do you think?

So you want to introduce a compat symbol for GLIBC_2.29, but leave the
default symbol version at whatever version we have today?

Waiting a year (until a 2020 release—how time flies!) will not make a
substantial difference in deployment due to the release schedules of
distributions currently under development and their glibc version
choices.  That part is a non-starter, I'm afraid.  A multi-year delay
would make a difference, of course, but I'm not sure if that's
appropriate.  I think the most feasible course of action would be to
bundle several such changes together, so that the impact is felt only
once.

Historically, we have used symbol versioning (in the sense that we
changed the default symbol version) for two different situations:

(1) A type changed size or a function changed its prototype in a
    binary-incompatible way, usually prompted by a desire to increase
    standards conformance.

(2) A function changed behavior and existing software broke.

memcpy@GLIBC_2.14 is clearly in the second category.  A more recent
example is glob@GLIBC_2.27.

The key difference between (1) and (2) is that rebuilding existing
software for (1) either works as before, or you get at a compiler
error and have to apply a simple fix.  For (2), on the other hand, the
bug is still there, and recompilation reintroduces it.

Over the years, I have come to the realization that (2) really only
benefits proprietary software, and unmaintained proprietary software
at that.  We still receive bug reports about the glob@GLIBC_2.27
transition because people try to build the wrong version of GNU make
on glibc 2.27 or later.  I would say that using symbol versioning to
make such backwards-incompatible changes is very confusing and may not
be worth it.  Instead, we should make the change without introducing a
symbol version (to treat everyone the same), or not make the change at
all.

The sprintf change clearly is in category (2).  However, there's a
mitigation circumstance: Distributions already build with
_FORTIFY_SOURCE (at least they try to), so they are largely exposed to
the new behavior.  So I'd expect that in this particular case, the
recompilation problem would largely affect unpackaged, in-house
software.  Maybe this makes the change more acceptable and could
actually justify introducing a symbol version in this case?
Zack Weinberg Dec. 27, 2018, 4:59 p.m. | #15
On Thu, Dec 27, 2018 at 8:46 AM Gabriel F. T. Gomes
<gabriel@inconstante.eti.br> wrote:
> On Thu, Dec 27, 2018 at 11:15:06AM +0530, Siddhesh Poyarekar wrote:
> > Thanks, to clarify, is your position that we revert to old behaviour (for
> > now) for the default case only or for everything, including _FORTIFY_SOURCE?
>
> The _FORTIFY_SOURCE version did not change behaviour...  Even before
> Zack's patches, it would overwrite overlapping buffers.  This means that
> this patch, as it is today, reverts all behavior to what it used to be.

Yes.  My position right now is that we should merge Gabriel's patch as
is -- including the test case! -- for 2.29, and consider whether we
want to make any further changes after the release.

zw
Zack Weinberg Dec. 27, 2018, 6:03 p.m. | #16
On Thu, Dec 27, 2018 at 10:27 AM Florian Weimer <fw@deneb.enyo.de> wrote:
> Historically, we have used symbol versioning (in the sense that we
> changed the default symbol version) for two different situations:
>
> (1) A type changed size or a function changed its prototype in a
>     binary-incompatible way, usually prompted by a desire to increase
>     standards conformance.
>
> (2) A function changed behavior and existing software broke.
>
> memcpy@GLIBC_2.14 is clearly in the second category.  A more recent
> example is glob@GLIBC_2.27.
>
> The key difference between (1) and (2) is that rebuilding existing
> software for (1) either works as before, or you get at a compiler
> error and have to apply a simple fix.  For (2), on the other hand, the
> bug is still there, and recompilation reintroduces it.
>
> Over the years, I have come to the realization that (2) really only
> benefits proprietary software, and unmaintained proprietary software
> at that.  We still receive bug reports about the glob@GLIBC_2.27
> transition because people try to build the wrong version of GNU make
> on glibc 2.27 or later.  I would say that using symbol versioning to
> make such backwards-incompatible changes is very confusing and may not
> be worth it.  Instead, we should make the change without introducing a
> symbol version (to treat everyone the same), or not make the change at
> all.

I agree with this and I also think we have historically been too
aggressive about breaking programs that depend on behavior that is
formally undefined in the C standard but has been well-defined for
many years in the GNU ecosystem.  In this case I am particularly
concerned about breaking programs that are distributed as source but
not carefully maintained.  It's really easy to miss one more warning
in a program that already has lots.

> The sprintf change clearly is in category (2).  However, there's a
> mitigation circumstance: Distributions already build with
> _FORTIFY_SOURCE (at least they try to), so they are largely exposed to
> the new behavior.  So I'd expect that in this particular case, the
> recompilation problem would largely affect unpackaged, in-house
> software.  Maybe this makes the change more acceptable and could
> actually justify introducing a symbol version in this case?

It seems to me it's the other way around: distribution-packaged
software is more likely to receive fixes for this type of problem than
unpackaged in-house stuff is.

zw
Florian Weimer Dec. 27, 2018, 6:16 p.m. | #17
* Zack Weinberg:

>> The sprintf change clearly is in category (2).  However, there's a
>> mitigation circumstance: Distributions already build with
>> _FORTIFY_SOURCE (at least they try to), so they are largely exposed to
>> the new behavior.  So I'd expect that in this particular case, the
>> recompilation problem would largely affect unpackaged, in-house
>> software.  Maybe this makes the change more acceptable and could
>> actually justify introducing a symbol version in this case?
>
> It seems to me it's the other way around: distribution-packaged
> software is more likely to receive fixes for this type of problem than
> unpackaged in-house stuff is.

On common code paths, yes, but the same applies to in-house code.
What I tried to say is that the distribution fixes have already
happened due to the impact of -D_FORTIFY_SOURCE=2 by default and
hopefully have been upstreamed by now.  (This may have given us the
delay that Jonathan was suggesting.)  As a result, we wouldn't punish
free software over proprietary software in this case.  Sorry if I
didn't explain my line of reasoning clearly enough.
Siddhesh Poyarekar Dec. 29, 2018, 1:37 a.m. | #18
On 27/12/18 10:29 PM, Zack Weinberg wrote:
> On Thu, Dec 27, 2018 at 8:46 AM Gabriel F. T. Gomes
> <gabriel@inconstante.eti.br> wrote:
>> On Thu, Dec 27, 2018 at 11:15:06AM +0530, Siddhesh Poyarekar wrote:
>>> Thanks, to clarify, is your position that we revert to old behaviour (for
>>> now) for the default case only or for everything, including _FORTIFY_SOURCE?
>>
>> The _FORTIFY_SOURCE version did not change behaviour...  Even before
>> Zack's patches, it would overwrite overlapping buffers.  This means that
>> this patch, as it is today, reverts all behavior to what it used to be.
> 
> Yes.  My position right now is that we should merge Gabriel's patch as
> is -- including the test case! -- for 2.29, and consider whether we
> want to make any further changes after the release.

OK thaks for the clarification.  Paul, do you have a strong opposition 
to this or is it OK if we take this course for this release?

Siddhesh
Joseph Myers Dec. 31, 2018, 5:46 p.m. | #19
On Thu, 27 Dec 2018, Siddhesh Poyarekar wrote:

> With this action plan we don't really need to test the undefined behaviour
> since we don't intend to retain it except in the compat case, just flip the

Compat symbol semantics should be tested in the testsuite (with 
appropriate TEST_COMPAT conditionals and use of compat_symbol_reference).  
(There are exceptional cases where the new symbol's semantics are a 
refinement of the old and so the two symbol versions alias each other and 
no separate test of the old version is useful, but a new version is added 
to stop new binaries working with old libc - e.g. when some fenv.h 
functions changed from void to int return type following C99TC1 - but the 
normal case is that the compat symbols should be tested to ensure they 
keep working as expected.)
Siddhesh Poyarekar Dec. 31, 2018, 5:52 p.m. | #20
On 31/12/18 11:16 PM, Joseph Myers wrote:
> Compat symbol semantics should be tested in the testsuite (with
> appropriate TEST_COMPAT conditionals and use of compat_symbol_reference).
> (There are exceptional cases where the new symbol's semantics are a
> refinement of the old and so the two symbol versions alias each other and
> no separate test of the old version is useful, but a new version is added
> to stop new binaries working with old libc - e.g. when some fenv.h
> functions changed from void to int return type following C99TC1 - but the
> normal case is that the compat symbols should be tested to ensure they
> keep working as expected.)

That's a good point, and something to keep in mind when we revive this 
discussion for 2.30.  For 2.29, do you agree that we should revert to 
old behaviour?

Siddhesh
Joseph Myers Dec. 31, 2018, 5:58 p.m. | #21
On Thu, 27 Dec 2018, Florian Weimer wrote:

> The key difference between (1) and (2) is that rebuilding existing
> software for (1) either works as before, or you get at a compiler
> error and have to apply a simple fix.  For (2), on the other hand, the
> bug is still there, and recompilation reintroduces it.

Recompilation reintroduces it *if the software doesn't have tests that 
expose the issue*.  If the software has sufficient tests, (2) reduces to 
(1) - recompilation produces a (test) error.  (And software is more likely 
to have tests run when it is recompiled than when glibc is updated without 
the software being recompiled - there are of course the differences here 
between GNU/Linux distributions as to whether packaged software without 
any changes gets rebuilt in bulk for each distribution release.)
Joseph Myers Dec. 31, 2018, 6:12 p.m. | #22
On Mon, 31 Dec 2018, Siddhesh Poyarekar wrote:

> That's a good point, and something to keep in mind when we revive this
> discussion for 2.30.  For 2.29, do you agree that we should revert to old
> behaviour?

Yes (and make sure it's tested in the testsuite).

Patch

diff --git a/debug/sprintf_chk.c b/debug/sprintf_chk.c
index 649e8ab4d5..dccdec92c3 100644
--- a/debug/sprintf_chk.c
+++ b/debug/sprintf_chk.c
@@ -29,6 +29,10 @@  ___sprintf_chk (char *s, int flag, size_t slen, const char *format, ...)
   va_list ap;
   int ret;
 
+  /* Regardless of the value of flag, let __vsprintf_internal know that
+     this is a call from *printf_chk.  */
+  mode |= PRINTF_CHK;
+
   if (slen == 0)
     __chk_fail ();
 
diff --git a/debug/vsprintf_chk.c b/debug/vsprintf_chk.c
index c1b1a8da4f..3db6f624de 100644
--- a/debug/vsprintf_chk.c
+++ b/debug/vsprintf_chk.c
@@ -25,6 +25,10 @@  ___vsprintf_chk (char *s, int flag, size_t slen, const char *format,
      can only come from read-only format strings.  */
   unsigned int mode = (flag > 0) ? PRINTF_FORTIFY : 0;
 
+  /* Regardless of the value of flag, let __vsprintf_internal know that
+     this is a call from *printf_chk.  */
+  mode |= PRINTF_CHK;
+
   if (slen == 0)
     __chk_fail ();
 
diff --git a/libio/Makefile b/libio/Makefile
index 8239fba828..e1b0bf629d 100644
--- a/libio/Makefile
+++ b/libio/Makefile
@@ -64,7 +64,8 @@  tests = tst_swprintf tst_wprintf tst_swscanf tst_wscanf tst_getwc tst_putwc   \
 	bug-memstream1 bug-wmemstream1 \
 	tst-setvbuf1 tst-popen1 tst-fgetwc bug-wsetpos tst-fseek \
 	tst-fwrite-error tst-ftell-partial-wide tst-ftell-active-handler \
-	tst-ftell-append tst-fputws tst-bz22415 tst-fgetc-after-eof
+	tst-ftell-append tst-fputws tst-bz22415 tst-fgetc-after-eof \
+	tst-sprintf-ub tst-sprintf-chk-ub
 
 tests-internal = tst-vtables tst-vtables-interposed tst-readline
 
@@ -152,6 +153,10 @@  CFLAGS-oldtmpfile.c += -fexceptions
 
 CFLAGS-tst_putwc.c += -DOBJPFX=\"$(objpfx)\"
 
+# These test cases intentionally use overlapping arguments
+CFLAGS-tst-sprintf-ub.c += -Wno-restrict
+CFLAGS-tst-sprintf-chk-ub.c += -Wno-restrict
+
 tst_wprintf2-ARGS = "Some Text"
 
 test-fmemopen-ENV = MALLOC_TRACE=$(objpfx)test-fmemopen.mtrace
diff --git a/libio/iovsprintf.c b/libio/iovsprintf.c
index 08e4002625..f58519d172 100644
--- a/libio/iovsprintf.c
+++ b/libio/iovsprintf.c
@@ -77,8 +77,18 @@  __vsprintf_internal (char *string, size_t maxlen,
   sf._sbf._f._lock = NULL;
 #endif
   _IO_no_init (&sf._sbf._f, _IO_USER_LOCK, -1, NULL, NULL);
-  _IO_JUMPS (&sf._sbf) = &_IO_str_chk_jumps;
-  string[0] = '\0';
+  /* When called from fortified sprintf/vsprintf, erase the destination
+     buffer and try to detect overflows.  When called from regular
+     sprintf/vsprintf, do not erase the destination buffer, because
+     known user code relies on this behavior (even though its undefined
+     by ISO C), nor try to detect overflows.  */
+  if ((mode_flags & PRINTF_CHK) != 0)
+    {
+      _IO_JUMPS (&sf._sbf) = &_IO_str_chk_jumps;
+      string[0] = '\0';
+    }
+  else
+    _IO_JUMPS (&sf._sbf) = &_IO_str_jumps;
   _IO_str_init_static_internal (&sf, string,
 				(maxlen == -1) ? -1 : maxlen - 1,
 				string);
diff --git a/libio/libioP.h b/libio/libioP.h
index 958ef9bffe..e5f8d2381d 100644
--- a/libio/libioP.h
+++ b/libio/libioP.h
@@ -705,9 +705,13 @@  extern int __vswprintf_internal (wchar_t *string, size_t maxlen,
    PRINTF_FORTIFY, when set to one, indicates that fortification checks
    are to be performed in input parameters.  This is used by the
    __*printf_chk functions, which are used when _FORTIFY_SOURCE is
-   defined to 1 or 2.  Otherwise, such checks are ignored.  */
+   defined to 1 or 2.  Otherwise, such checks are ignored.
+
+   PRINTF_CHK indicates, to the internal function being called, that the
+   call is originated from one of the __*printf_chk functions.  */
 #define PRINTF_LDBL_IS_DBL 0x0001
 #define PRINTF_FORTIFY     0x0002
+#define PRINTF_CHK	   0x0004
 
 extern size_t _IO_getline (FILE *,char *, size_t, int, int);
 libc_hidden_proto (_IO_getline)
diff --git a/libio/tst-sprintf-chk-ub.c b/libio/tst-sprintf-chk-ub.c
new file mode 100644
index 0000000000..77ec64229a
--- /dev/null
+++ b/libio/tst-sprintf-chk-ub.c
@@ -0,0 +1,2 @@ 
+#define _FORTIFY_SOURCE 2
+#include <tst-sprintf-ub.c>
diff --git a/libio/tst-sprintf-ub.c b/libio/tst-sprintf-ub.c
new file mode 100644
index 0000000000..35c6711122
--- /dev/null
+++ b/libio/tst-sprintf-ub.c
@@ -0,0 +1,102 @@ 
+/* Test the sprintf (buf, "%s", buf) does not override buf.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+
+#include <support/check.h>
+
+enum
+{
+  FUNCTION_FIRST,
+  FUNCTION_SPRINTF = FUNCTION_FIRST,
+  FUNCTION_SNPRINF,
+  FUNCTION_VSPRINTF,
+  FUNCTION_VSNPRINTF,
+  FUNCTION_LAST
+};
+
+static void
+do_one_test (int function, char *buf, ...)
+{
+  va_list args;
+  char *arg;
+
+  /* Expected results for fortified and non-fortified sprintf.  */
+#if defined _FORTIFY_SOURCE && _FORTIFY_SOURCE > 1
+  const char *expected = "CD";
+#else
+  const char *expected = "ABCD";
+#endif
+
+  va_start (args, buf);
+  arg = va_arg (args, char *);
+  va_end (args);
+
+  switch (function)
+    {
+      /* The regular sprintf and vsprintf functions do not override the
+         destination buffer, even if source and destination overlap.  */
+      case FUNCTION_SPRINTF:
+        sprintf (buf, "%sCD", buf);
+        TEST_COMPARE_STRING (buf, expected);
+        break;
+      case FUNCTION_VSPRINTF:
+        va_start (args, buf);
+        vsprintf (arg, "%sCD", args);
+        TEST_COMPARE_STRING (arg, expected);
+        va_end (args);
+        break;
+      /* On the other hand, snprint and vsnprint override overlapping
+         source and destination buffers.  */
+      case FUNCTION_SNPRINF:
+        snprintf (buf, 3, "%sCD", buf);
+        TEST_COMPARE_STRING (buf, "CD");
+        break;
+      case FUNCTION_VSNPRINTF:
+        va_start (args, buf);
+        vsnprintf (arg, 3, "%sCD", args);
+        TEST_COMPARE_STRING (arg, "CD");
+        va_end (args);
+        break;
+      default:
+        support_record_failure ();
+    }
+}
+
+static int
+do_test (void)
+{
+  char buf[8];
+  int i;
+
+  /* For each function in the enum do:
+       - reset the buffer to the initial state "AB";
+       - call the function with buffer as source and destination;
+       - check for the desired behavior.  */
+  for (i = FUNCTION_FIRST; i < FUNCTION_LAST; i++)
+    {
+      strncpy (buf, "AB", 3);
+      do_one_test (i, buf, buf);
+    }
+
+  return 0;
+}
+
+#include <support/test-driver.c>