parser: Ensure whitespace is stripped for long headers

Message ID 20181104140603.29412-1-stephen@that.guru
State New

Commit Message

Stephen Finucane Nov. 4, 2018, 2:06 p.m.
RFC 2822 states that long headers can be wrapped using CRLF followed by
WSP [1]. For example:

    Subject: Foo bar,
     baz

This should be parsed as:

    Foo bar,baz

While we were stripping the former (the CRLF), we were not stripping the
latter (the WSP). This meant that we ended up with the following:

    Foo bar, baz

Resolve this.
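
As a rough illustration of the change, the sketch below compares the old and
new behaviour on the example above. The function names
normalise_space_before and normalise_space_after are hypothetical and exist
only for this comparison; the real function is normalise_space in
patchwork/parser.py.

```python
import re


def normalise_space_before(value):
    # Hypothetical name: mirrors normalise_space() prior to this patch.
    # Every run of whitespace, including "\n ", collapses to one space.
    return re.sub(r'\s+', ' ', value).strip()


def normalise_space_after(value):
    # Hypothetical name: mirrors normalise_space() with this patch applied.
    # Each newline is dropped together with the whitespace that follows it,
    # then any remaining whitespace runs are collapsed as before.
    value = ''.join(re.split(r'\n\s+', value))
    return re.sub(r'\s+', ' ', value).strip()


folded = 'Foo bar,\n baz'
print(normalise_space_before(folded))  # Foo bar, baz
print(normalise_space_after(folded))   # Foo bar,baz
```

The sketch assumes the header value reaches the function with bare "\n" line
endings, as in the test case added below.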

Signed-off-by: Stephen Finucane <stephen@that.guru>
Closes: #197
---
 patchwork/parser.py            | 1 +
 patchwork/tests/test_parser.py | 2 ++
 2 files changed, 3 insertions(+)

Patch

diff --git a/patchwork/parser.py b/patchwork/parser.py
index d6fa8437..946b6685 100644
--- a/patchwork/parser.py
+++ b/patchwork/parser.py
@@ -47,6 +47,7 @@  class DuplicateMailError(Exception):
 
 
 def normalise_space(value):
+    value = ''.join(re.split(r'\n\s+', value))
     whitespace_re = re.compile(r'\s+')
     return whitespace_re.sub(' ', value).strip()
 
diff --git a/patchwork/tests/test_parser.py b/patchwork/tests/test_parser.py
index a9df5e35..664edd5b 100644
--- a/patchwork/tests/test_parser.py
+++ b/patchwork/tests/test_parser.py
@@ -832,6 +832,8 @@  class SubjectTest(TestCase):
         self.assertEqual(clean_subject('[PATCH] meep'), ('meep', []))
         self.assertEqual(clean_subject("[PATCH] meep \n meep"),
                          ('meep meep', []))
+        self.assertEqual(clean_subject("[PATCH] meep,\n meep"),
+                         ('meep,meep', []))
         self.assertEqual(clean_subject('[PATCH RFC] meep'),
                          ('[RFC] meep', ['RFC']))
         self.assertEqual(clean_subject('[PATCH,RFC] meep'),