Add stricter checks when parsing incoming patches

Message ID 20180713184628.24488-1-dzickus@redhat.com
State New
Headers show
Series
  • Add stricter checks when parsing incoming patches
Related show

Commit Message

Don Zickus July 13, 2018, 6:46 p.m.
The patch parser has a multi-stage if-then-else statement that tries to
determine which part of a patch is the comment and what piece is the
patch itself.

Unfortunately there is a gap in between state 1 and state 2 where chunks
of the comment can be accidentally added to the patch.

For example if a comment has a line that begins with 'diff' or 'Index'.
That will trigger the state 0 to state 1 transition and sit there until
many comment lines later a '--- ' line is found to move it to state 2.

As a result many comment lines are truncated and stuck into a patch buffer
instead.  This makes it more difficult to process metadata found in the
comment buffer.

This patch adds some strict rules based on various common patch preambles
like git, quilt, and rcs.

Now if the patch is in state 1 because of a 'diff ' or 'Index:', it needs
to expect the next common preamble to continue to stay in state 1 otherwise
accept the state 0 to state 1 transition was an accident and move back to
state 0.

This patch has been in our internal patchwork instance for 8 years now and is
the result of various patches we have seen internally.

Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 patchwork/parser.py | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Patch

diff --git a/patchwork/parser.py b/patchwork/parser.py
index a2db403..5566c69 100644
--- a/patchwork/parser.py
+++ b/patchwork/parser.py
@@ -792,6 +792,19 @@  def parse_patch(content):
 
             if line.startswith(('rename from ', 'rename to ')):
                 state = 6
+            elif line.startswith('diff ') or line.startswith('Index: ') \
+                    or line.startswith('deleted file ') \
+                    or line.startswith('index ') \
+                    or line.startswith('new file ') \
+                    or line.startswith('====') \
+                    or line.startswith('RCS file: ') \
+                    or line.startswith('retrieving revision '):
+                state = 1
+            else:
+                state = 0
+                commentbuf += buf
+                buf = ''
+
         elif state == 2:
             if line.startswith('+++ '):
                 state = 3