Message ID | 20211018221548.76024-1-eggert@cs.ucla.edu |
---|---|
State | New |
Series | regex: fix buffer read overrun in search [BZ#28470] |
On Oct 18 2021, Paul Eggert wrote:
>        /* If MATCH_FIRST is out of the buffer, leave it as '\0'.
>           Note that MATCH_FIRST must not be smaller than 0.  */
> -      ch = (match_first >= length
> +      ch = (mctx.input.valid_len <= offset
That needs to update the comment.
Andreas.
On 10/19/21 00:17, Andreas Schwab wrote:
> That needs to update the comment.
Thanks, revised patch attached.
On Oct 19 2021, Paul Eggert wrote:
> + ch = (mctx.input.valid_len <= offset
This is backwards.
Andreas.
On 10/19/21 01:25, Andreas Schwab wrote:
> On Oct 19 2021, Paul Eggert wrote:
>
>> +      ch = (mctx.input.valid_len <= offset
>
> This is backwards.
It's correct as-is, so that comment is merely about style. I revamped the patch to turn the comparison around; see attached. Let's not have our longstanding style disagreement distract us from the fix.
On Oct 19 2021, Paul Eggert wrote:
> diff --git a/posix/regexec.c b/posix/regexec.c
> index 83e9aaf8ca..6aeba3c0b4 100644
> --- a/posix/regexec.c
> +++ b/posix/regexec.c
> @@ -758,10 +758,9 @@ re_search_internal (const regex_t *preg, const char *string, Idx length,
>
>            offset = match_first - mctx.input.raw_mbs_idx;
>          }
> -      /* If MATCH_FIRST is out of the buffer, leave it as '\0'.
> -         Note that MATCH_FIRST must not be smaller than 0.  */
> -      ch = (match_first >= length
> -            ? 0 : re_string_byte_at (&mctx.input, offset));
> +      /* Use buffer byte if OFFSET is in buffer, otherwise '\0'.  */
> +      ch = (offset < mctx.input.valid_len
> +            ? re_string_byte_at (&mctx.input, offset) : 0);
Why is the bug not in re_string_reconstruct? Since string[match_first] exists, so should re_string_byte_at (&mctx.input, offset).
Andreas.
On 10/19/21 08:09, Andreas Schwab wrote:
> Why is the bug not in re_string_reconstruct? Since string[match_first]
> exists, so should re_string_byte_at (&mctx.input, offset).
I don't know, as I lacked the time to investigate re_string_reconstruct. Although the patch I proposed fixes the test case that prompted it, possibly it is only a partial fix for a more-general problem.
No further comment, and the patch is safe and has been used in Gnulib for some time even if it doesn't necessarily fix all the underlying problem, so I installed it. Tests pass on x86-64.
On Nov 24 2021, Paul Eggert wrote:
> the patch is safe
Is it? Why?
Andreas.
On 11/24/21 14:45, Andreas Schwab wrote:
> Is it? Why?
Partly because it refuses to read past the bounds of an array, where the
old code would. And partly because it's been run through several tests
- not just glibc tests, but also grep and coreutils and probably some
others by now.
Of course this is not a 100% guarantee of safety, but it's close enough.
On Nov 24 2021, Paul Eggert wrote:
> On 11/24/21 14:45, Andreas Schwab wrote:
>> Is it? Why?
>
> Partly because it refuses to read past the bounds of an array, where the
> old code would.
That's just papering over a bug, not fixing it.
> And partly because it's been run through several tests - not just
> glibc tests, but also grep and coreutils and probably some others by
> now.
How much coverage do they provide? Also, you failed to add a test.
Andreas.
On 11/25/21 01:01, Andreas Schwab wrote:
>> Partly because it refuses to read past the bounds of an array, where the
>> old code would.
>
> That's just papering over a bug, not fixing it.
That's not clear to me. Perhaps you're right, but perhaps it really does fix the bug.
>> And partly because it's been run through several tests - not just
>> glibc tests, but also grep and coreutils and probably some others by
>> now.
>
> How much coverage do they provide?
Someone who has more time could presumably determine this by looking at the respective test suites. I forgot to mention, Gnulib also has its own regex tests (which also pass).
> Also, you failed to add a test.
Yes, that's correct. It would be nice if someone could do that. However, it'd be some work and like you I'm pressed for time.
On Nov 26 2021, Paul Eggert wrote:
> On 11/25/21 01:01, Andreas Schwab wrote:
>
>>> Partly because it refuses to read past the bounds of an array, where the
>>> old code would.
>> That's just papering over a bug, not fixing it.
>
> That's not clear to me. Perhaps you're right, but perhaps it really does
> fix the bug.
That's why we need a proper test case. Not voodoo programming.
Andreas.
diff --git a/posix/regexec.c b/posix/regexec.c
index 83e9aaf8ca..a955aa2182 100644
--- a/posix/regexec.c
+++ b/posix/regexec.c
@@ -760,7 +760,7 @@ re_search_internal (const regex_t *preg, const char *string, Idx length,
         }
       /* If MATCH_FIRST is out of the buffer, leave it as '\0'.
          Note that MATCH_FIRST must not be smaller than 0.  */
-      ch = (match_first >= length
+      ch = (mctx.input.valid_len <= offset
             ? 0 : re_string_byte_at (&mctx.input, offset));
       if (fastmap[ch])
         break;
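
For readers following the thread, the shape of the guard under discussion can be sketched in isolation. This is only an illustration, not the glibc code: `struct input_view` and `byte_at_or_nul` are hypothetical names, and `size_t` stands in for glibc's `Idx` type as a simplification. It shows the revised condition, which reads a byte only when the offset lies inside the valid part of the buffer and otherwise yields 0. It does not address the open question of whether the buffer should already contain that byte.

```c
#include <stddef.h>

/* Hypothetical stand-in for the regex input state; only the two
   fields needed for this illustration are modeled.  */
struct input_view
{
  const unsigned char *bytes; /* start of the buffered window */
  size_t valid_len;           /* number of bytes that may be read */
};

/* Return the byte at OFFSET if it lies inside the valid window,
   otherwise 0.  This mirrors the form of the patched expression:
   the bound checked is the window's valid length, not the length
   of the original string.  */
static unsigned char
byte_at_or_nul (const struct input_view *in, size_t offset)
{
  return offset < in->valid_len ? in->bytes[offset] : 0;
}
```

With a three-byte array and `valid_len` set to 2, offsets 0 and 1 return the stored bytes, while offset 2 returns 0 even though the underlying array holds a byte there. That is the behavior questioned in the thread: the guard prevents the out-of-bounds read, but a byte present in the original string can still be reported as '\0'.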