Message ID | 20121123172337.GA26106@gmail.com |
---|---|
State | New |
Headers | show |
On Fri, Nov 23, 2012 at 09:23:37AM -0800, H.J. Lu wrote: > This patch allocates extra 16 bytes for -fsanitize=address so that > asan won't report read beyond memory buffer. It is used by > bootstrap-asan. OK to install? As valgrind warns about that too, I'd say we should do that unconditionally, the additional 16-bytes just aren't a big deal. But, _cpp_convert_input isn't the only place which needs that, IMHO you want to also change the caller, read_file_guts, where it does buf = XNEWVEC (uchar, size + 1); and buf = XRESIZEVEC (uchar, buf, size + 1); I'll defer the review to Tom though. > 2012-11-21 H.J. Lu <hongjiu.lu@intel.com> > > PR bootstrap/55380 > * charset.c (_cpp_convert_input): Allocate extra 16 bytes for > -fsanitize=address if __SANITIZE_ADDRESS__ is defined. > > diff --git a/libcpp/charset.c b/libcpp/charset.c > index cba19a6..dea8bb1 100644 > --- a/libcpp/charset.c > +++ b/libcpp/charset.c > @@ -1729,9 +1729,16 @@ _cpp_convert_input (cpp_reader *pfile, const char *input_charset, > iconv_close (input_cset.cd); > > /* Resize buffer if we allocated substantially too much, or if we > - haven't enough space for the \n-terminator. */ > + haven't enough space for the \n-terminator. Allocate extra 16 > + bytes for -fsanitize=address. */ > if (to.len + 4096 < to.asize || to.len >= to.asize) > - to.text = XRESIZEVEC (uchar, to.text, to.len + 1); > + { > +#ifdef __SANITIZE_ADDRESS__ > + to.text = XRESIZEVEC (uchar, to.text, to.len + 17); > +#else > + to.text = XRESIZEVEC (uchar, to.text, to.len + 1); > +#endif > + } > > /* If the file is using old-school Mac line endings (\r only), > terminate with another \r, not an \n, so that we do not mistake Jakub
On Fri, Nov 23, 2012 at 9:38 AM, Jakub Jelinek <jakub@redhat.com> wrote: > On Fri, Nov 23, 2012 at 09:23:37AM -0800, H.J. Lu wrote: >> This patch allocates extra 16 bytes for -fsanitize=address so that >> asan won't report read beyond memory buffer. It is used by >> bootstrap-asan. OK to install? > > As valgrind warns about that too, I'd say we should do that unconditionally, > the additional 16-bytes just aren't a big deal. This isn't sufficient for valgrind since valgrind will also report reading uninitialized data, which requires additional memset on extra 16 bytes. > But, _cpp_convert_input isn't the only place which needs that, IMHO you want This patch is sufficient to compile libcpp with -fsanitize=address. > to also change the caller, read_file_guts, where it does > buf = XNEWVEC (uchar, size + 1); > and > buf = XRESIZEVEC (uchar, buf, size + 1); I don't think it is necessary since there is no real data in those extra 16 bytes. They can have garbage and libcpp still functions correctly. They are purely used to silence ASAN. > I'll defer the review to Tom though. > >> 2012-11-21 H.J. Lu <hongjiu.lu@intel.com> >> >> PR bootstrap/55380 >> * charset.c (_cpp_convert_input): Allocate extra 16 bytes for >> -fsanitize=address if __SANITIZE_ADDRESS__ is defined. >> >> diff --git a/libcpp/charset.c b/libcpp/charset.c >> index cba19a6..dea8bb1 100644 >> --- a/libcpp/charset.c >> +++ b/libcpp/charset.c >> @@ -1729,9 +1729,16 @@ _cpp_convert_input (cpp_reader *pfile, const char *input_charset, >> iconv_close (input_cset.cd); >> >> /* Resize buffer if we allocated substantially too much, or if we >> - haven't enough space for the \n-terminator. */ >> + haven't enough space for the \n-terminator. Allocate extra 16 >> + bytes for -fsanitize=address. */ >> if (to.len + 4096 < to.asize || to.len >= to.asize) >> - to.text = XRESIZEVEC (uchar, to.text, to.len + 1); >> + { >> +#ifdef __SANITIZE_ADDRESS__ >> + to.text = XRESIZEVEC (uchar, to.text, to.len + 17); >> +#else >> + to.text = XRESIZEVEC (uchar, to.text, to.len + 1); >> +#endif >> + } >> >> /* If the file is using old-school Mac line endings (\r only), >> terminate with another \r, not an \n, so that we do not mistake > > Jakub
On Fri, Nov 23, 2012 at 10:08 PM, H.J. Lu <hjl.tools@gmail.com> wrote: > On Fri, Nov 23, 2012 at 9:38 AM, Jakub Jelinek <jakub@redhat.com> wrote: >> On Fri, Nov 23, 2012 at 09:23:37AM -0800, H.J. Lu wrote: >>> This patch allocates extra 16 bytes for -fsanitize=address so that >>> asan won't report read beyond memory buffer. It is used by >>> bootstrap-asan. OK to install? >> >> As valgrind warns about that too, I'd say we should do that unconditionally, >> the additional 16-bytes just aren't a big deal. > > This isn't sufficient for valgrind since valgrind will also report > reading uninitialized data, which requires additional memset > on extra 16 bytes. Valgrind should not report reading uninitialized data if that data is actually not used later (extra bytes are cleared by AND-ing with zeroes). If the code was mine, I would simply use (to.len + 17) w/o any conditionals. --kcc > >> But, _cpp_convert_input isn't the only place which needs that, IMHO you want > > This patch is sufficient to compile libcpp with -fsanitize=address. > >> to also change the caller, read_file_guts, where it does >> buf = XNEWVEC (uchar, size + 1); >> and >> buf = XRESIZEVEC (uchar, buf, size + 1); > > I don't think it is necessary since there is no real data in > those extra 16 bytes. They can have garbage and libcpp still > functions correctly. They are purely used to silence ASAN. > >> I'll defer the review to Tom though. >> >>> 2012-11-21 H.J. Lu <hongjiu.lu@intel.com> >>> >>> PR bootstrap/55380 >>> * charset.c (_cpp_convert_input): Allocate extra 16 bytes for >>> -fsanitize=address if __SANITIZE_ADDRESS__ is defined. >>> >>> diff --git a/libcpp/charset.c b/libcpp/charset.c >>> index cba19a6..dea8bb1 100644 >>> --- a/libcpp/charset.c >>> +++ b/libcpp/charset.c >>> @@ -1729,9 +1729,16 @@ _cpp_convert_input (cpp_reader *pfile, const char *input_charset, >>> iconv_close (input_cset.cd); >>> >>> /* Resize buffer if we allocated substantially too much, or if we >>> - haven't enough space for the \n-terminator. */ >>> + haven't enough space for the \n-terminator. Allocate extra 16 >>> + bytes for -fsanitize=address. */ >>> if (to.len + 4096 < to.asize || to.len >= to.asize) >>> - to.text = XRESIZEVEC (uchar, to.text, to.len + 1); >>> + { >>> +#ifdef __SANITIZE_ADDRESS__ >>> + to.text = XRESIZEVEC (uchar, to.text, to.len + 17); >>> +#else >>> + to.text = XRESIZEVEC (uchar, to.text, to.len + 1); >>> +#endif >>> + } >>> >>> /* If the file is using old-school Mac line endings (\r only), >>> terminate with another \r, not an \n, so that we do not mistake >> >> Jakub > > > > -- > H.J.
On Fri, Nov 23, 2012 at 10:08:11AM -0800, H.J. Lu wrote: > > to also change the caller, read_file_guts, where it does > > buf = XNEWVEC (uchar, size + 1); > > and > > buf = XRESIZEVEC (uchar, buf, size + 1); > > I don't think it is necessary since there is no real data in > those extra 16 bytes. They can have garbage and libcpp still > functions correctly. They are purely used to silence ASAN. The thing is, if the buf from the caller has such size/total combination that if (to.len + 4096 < to.asize || to.len >= to.asize) isn't true, there won't be any reallocation and the buffer passed in from the caller will be used instead. And, if it doesn't have those extra 16 bytes, it will still result in asan warning... Guess you need to read file from stdin instead of a file for that to trigger that. For valgrind I bet just clearing those 16 bytes might still be cheap enough not to worry about it. > > I'll defer the review to Tom though. > > > >> 2012-11-21 H.J. Lu <hongjiu.lu@intel.com> > >> > >> PR bootstrap/55380 > >> * charset.c (_cpp_convert_input): Allocate extra 16 bytes for > >> -fsanitize=address if __SANITIZE_ADDRESS__ is defined. > >> > >> diff --git a/libcpp/charset.c b/libcpp/charset.c > >> index cba19a6..dea8bb1 100644 > >> --- a/libcpp/charset.c > >> +++ b/libcpp/charset.c > >> @@ -1729,9 +1729,16 @@ _cpp_convert_input (cpp_reader *pfile, const char *input_charset, > >> iconv_close (input_cset.cd); > >> > >> /* Resize buffer if we allocated substantially too much, or if we > >> - haven't enough space for the \n-terminator. */ > >> + haven't enough space for the \n-terminator. Allocate extra 16 > >> + bytes for -fsanitize=address. */ > >> if (to.len + 4096 < to.asize || to.len >= to.asize) > >> - to.text = XRESIZEVEC (uchar, to.text, to.len + 1); > >> + { > >> +#ifdef __SANITIZE_ADDRESS__ > >> + to.text = XRESIZEVEC (uchar, to.text, to.len + 17); > >> +#else > >> + to.text = XRESIZEVEC (uchar, to.text, to.len + 1); > >> +#endif > >> + } > >> > >> /* If the file is using old-school Mac line endings (\r only), > >> terminate with another \r, not an \n, so that we do not mistake Jakub
On Fri, Nov 23, 2012 at 10:12 AM, Jakub Jelinek <jakub@redhat.com> wrote: > On Fri, Nov 23, 2012 at 10:08:11AM -0800, H.J. Lu wrote: >> > to also change the caller, read_file_guts, where it does >> > buf = XNEWVEC (uchar, size + 1); >> > and >> > buf = XRESIZEVEC (uchar, buf, size + 1); >> >> I don't think it is necessary since there is no real data in >> those extra 16 bytes. They can have garbage and libcpp still >> functions correctly. They are purely used to silence ASAN. > > The thing is, if the buf from the caller has such size/total > combination that if (to.len + 4096 < to.asize || to.len >= to.asize) > isn't true, there won't be any reallocation and the buffer passed > in from the caller will be used instead. And, if it doesn't have those > extra 16 bytes, it will still result in asan warning... > Guess you need to read file from stdin instead of a file for that > to trigger that. I see. I will double check and update my patch. > For valgrind I bet just clearing those 16 bytes might still be cheap enough > not to worry about it. This is what I did for valgrind: http://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=libcpp/charset.c;h=16a37c36f0cbf7a69c93ae7acb5d79893d57b643;hp=cba19a67178c796dcef1c8c70ac5c43dcbc69071;hb=8ab0e2cd4ae0fae01d0b84d6ccc382acb81ab876;hpb=e5e0d29f8b8ccff799a26fc3e6435af8dbf358fc /* Resize buffer if we allocated substantially too much, or if we - haven't enough space for the \n-terminator. */ + haven't enough space for the \n-terminator. Allocate and + initialize extra 16 bytes for --enable-checking=valgrind. */ if (to.len + 4096 < to.asize || to.len >= to.asize) - to.text = XRESIZEVEC (uchar, to.text, to.len + 1); + { +#ifdef ENABLE_VALGRIND_CHECKING + to.text = XRESIZEVEC (uchar, to.text, to.len + 17); + memset (to.text + to.len + 1, 0, 16); +#else + to.text = XRESIZEVEC (uchar, to.text, to.len + 1); +#endif + } I will check if I need to update it for stdin. Thanks.
On Fri, 23 Nov 2012, H.J. Lu wrote: > On Fri, Nov 23, 2012 at 9:38 AM, Jakub Jelinek <jakub@redhat.com> wrote: > > On Fri, Nov 23, 2012 at 09:23:37AM -0800, H.J. Lu wrote: > >> This patch allocates extra 16 bytes for -fsanitize=address so that > >> asan won't report read beyond memory buffer. It is used by > >> bootstrap-asan. OK to install? > > > > As valgrind warns about that too, I'd say we should do that unconditionally, > > the additional 16-bytes just aren't a big deal. > > This isn't sufficient for valgrind since valgrind will also report > reading uninitialized data, Only if that initialized data is actually used in a conditional. > which requires additional memset > on extra 16 bytes. (For plain operations, valgrind just operate on, typically passing along, the undefinedness.) Maybe that's what you meant, in which case my comment just clarifies what I saw as an ambiguity in your statement. brgds, H-P
diff --git a/libcpp/charset.c b/libcpp/charset.c index cba19a6..dea8bb1 100644 --- a/libcpp/charset.c +++ b/libcpp/charset.c @@ -1729,9 +1729,16 @@ _cpp_convert_input (cpp_reader *pfile, const char *input_charset, iconv_close (input_cset.cd); /* Resize buffer if we allocated substantially too much, or if we - haven't enough space for the \n-terminator. */ + haven't enough space for the \n-terminator. Allocate extra 16 + bytes for -fsanitize=address. */ if (to.len + 4096 < to.asize || to.len >= to.asize) - to.text = XRESIZEVEC (uchar, to.text, to.len + 1); + { +#ifdef __SANITIZE_ADDRESS__ + to.text = XRESIZEVEC (uchar, to.text, to.len + 17); +#else + to.text = XRESIZEVEC (uchar, to.text, to.len + 1); +#endif + } /* If the file is using old-school Mac line endings (\r only), terminate with another \r, not an \n, so that we do not mistake