Message ID | 20180713184628.24488-1-dzickus@redhat.com |
---|---|
State | Changes Requested |
Headers | show |
Series | Add stricter checks when parsing incoming patches | expand |
On Fri, 2018-07-13 at 14:46 -0400, Don Zickus wrote: > The patch parser has a multi-stage if-then-else statement that tries to > determine which part of a patch is the comment and what piece is the > patch itself. > > Unfortunately there is a gap in between state 1 and state 2 where chunks > of the comment can be accidentally added to the patch. > > For example if a comment has a line that begins with 'diff' or 'Index'. > That will trigger the state 0 to state 1 transition and sit there until > many comment lines later a '--- ' line is found to move it to state 2. > > As a result many comment lines are truncated and stuck into a patch buffer > instead. This makes it more difficult to process metadata found in the > comment buffer. > > This patch adds some strict rules based on various common patch preambles > like git, quilt, and rcs. > > Now if the patch is in state 1 because of a 'diff ' or 'Index:', it needs > to expect the next common preamble to continue to stay in state 1 otherwise > accept the state 0 to state 1 transition was an accident and move back to > state 0. > > This patch has been in our internal patchwork instance for 8 years now and is > the result of various patches we have seen internally. > > Signed-off-by: Don Zickus <dzickus@redhat.com> I applied this locally and ran the unit tests. $ docker-compose run --rm web tox -e py27-django111 It's failing pretty hard. Ran 455 tests in 145.925s FAILED (failures=62, errors=6, skipped=1) Destroying test database for alias 'default'... I'm guessing a good few of these are as a result of how we auto- generate patches but I imagine a few more of them might be valid. Could you have a look at these failures and see how we can remedy them? Stephen > --- > patchwork/parser.py | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/patchwork/parser.py b/patchwork/parser.py > index a2db403..5566c69 100644 > --- a/patchwork/parser.py > +++ b/patchwork/parser.py > @@ -792,6 +792,19 @@ def parse_patch(content): > > if line.startswith(('rename from ', 'rename to ')): > state = 6 > + elif line.startswith('diff ') or line.startswith('Index: ') \ > + or line.startswith('deleted file ') \ > + or line.startswith('index ') \ > + or line.startswith('new file ') \ > + or line.startswith('====') \ > + or line.startswith('RCS file: ') \ > + or line.startswith('retrieving revision '): > + state = 1 > + else: > + state = 0 > + commentbuf += buf > + buf = '' > + > elif state == 2: > if line.startswith('+++ '): > state = 3
diff --git a/patchwork/parser.py b/patchwork/parser.py index a2db403..5566c69 100644 --- a/patchwork/parser.py +++ b/patchwork/parser.py @@ -792,6 +792,19 @@ def parse_patch(content): if line.startswith(('rename from ', 'rename to ')): state = 6 + elif line.startswith('diff ') or line.startswith('Index: ') \ + or line.startswith('deleted file ') \ + or line.startswith('index ') \ + or line.startswith('new file ') \ + or line.startswith('====') \ + or line.startswith('RCS file: ') \ + or line.startswith('retrieving revision '): + state = 1 + else: + state = 0 + commentbuf += buf + buf = '' + elif state == 2: if line.startswith('+++ '): state = 3
The patch parser has a multi-stage if-then-else statement that tries to determine which part of a patch is the comment and what piece is the patch itself. Unfortunately there is a gap in between state 1 and state 2 where chunks of the comment can be accidentally added to the patch. For example if a comment has a line that begins with 'diff' or 'Index'. That will trigger the state 0 to state 1 transition and sit there until many comment lines later a '--- ' line is found to move it to state 2. As a result many comment lines are truncated and stuck into a patch buffer instead. This makes it more difficult to process metadata found in the comment buffer. This patch adds some strict rules based on various common patch preambles like git, quilt, and rcs. Now if the patch is in state 1 because of a 'diff ' or 'Index:', it needs to expect the next common preamble to continue to stay in state 1 otherwise accept the state 0 to state 1 transition was an accident and move back to state 0. This patch has been in our internal patchwork instance for 8 years now and is the result of various patches we have seen internally. Signed-off-by: Don Zickus <dzickus@redhat.com> --- patchwork/parser.py | 13 +++++++++++++ 1 file changed, 13 insertions(+)