diff mbox

[v2,kteam-tools] ktl: Don't treat all strings matching "#[0-9]+" as a bug numbers

Message ID 1470852612-20243-1-git-send-email-seth.forshee@canonical.com
State New
Headers show

Commit Message

Seth Forshee Aug. 10, 2016, 6:10 p.m. UTC
The bug number regular expression matching correctly identifies
lines containing launchpad bug numbers, but once such a line is
identified it treats any string matching "#[0-9]+" as a bug
number. For example, in this changelog entry #1 is identified as
a bug number:

  * [LTCTest][Opal][OP820] Machine crashed with Oops: Kernel access of bad area,
    sig: 11 [#1] while executing Froze PE Error injection (LP: #1603449)

As far as I can tell the format of bug strings that we need to
match is the string "LP:" followed by whitespace, followed by one
or more bug numbers of the format "#[0-9]+" separated by a comma
and whitespace. This can be expressed as the following python
regex:

  LP:\s#[0-9]+(,\s#[0-9]+)*

I'm not a regex wizard, so there may be a more optimal way to
express this regex, but it seems to work.

Therefore this patch updates the bug number matching as follows:

 1. Use re.finditer with the above regex (modified to use non-
    capturing grouping) to match all groups of bug numbers on a
    line.

 2. Within each match, use findall to create a list of all bug
    numbers by matching against "#([0-9]+)".

 3. Append the list returned by re.findall directly to the bug
    list. The list returned by re.findall will never be empty
    since the string being searched matched the regex used in
    (1), but even if it was appending an empty list to another
    list is a nop.

In my testing this matches the same set of bug numbers matched by
the current code but does not match the #1 in the example above.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
---
 ktl/debian.py | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

Comments

Kamal Mostafa Aug. 10, 2016, 6:18 p.m. UTC | #1
Looks good to me.

 -Kamal
Stefan Bader Aug. 23, 2016, 12:59 p.m. UTC | #2
Think it looks reasonable.
Kamal Mostafa Aug. 23, 2016, 6:07 p.m. UTC | #3

Stefan Bader Aug. 24, 2016, 6:16 a.m. UTC | #4
On 23.08.2016 20:07, Kamal Mostafa wrote:
> 
> 
and pushed?
Andy Whitcroft Aug. 24, 2016, 8:23 a.m. UTC | #5
On Wed, Aug 10, 2016 at 01:10:12PM -0500, Seth Forshee wrote:
> The bug number regular expression matching correctly identifies
> lines containing launchpad bug numbers, but once such a line is
> identified it treats any string matching "#[0-9]+" as a bug
> number. For example, in this changelog entry #1 is identified as
> a bug number:
> 
>   * [LTCTest][Opal][OP820] Machine crashed with Oops: Kernel access of bad area,
>     sig: 11 [#1] while executing Froze PE Error injection (LP: #1603449)
> 
> As far as I can tell the format of bug strings that we need to
> match is the string "LP:" followed by whitespace, followed by one
> or more bug numbers of the format "#[0-9]+" separated by a comma
> and whitespace. This can be expressed as the following python
> regex:
> 
>   LP:\s#[0-9]+(,\s#[0-9]+)*

That prolly should be LP:\s+ for safety, at times these are typed by
hand in some packages.

> I'm not a regex wizard, so there may be a more optimal way to
> express this regex, but it seems to work.
> 
> Therefore this patch updates the bug number matching as follows:
> 
>  1. Use re.finditer with the above regex (modified to use non-
>     capturing grouping) to match all groups of bug numbers on a
>     line.
> 
>  2. Within each match, use findall to create a list of all bug
>     numbers by matching against "#([0-9]+)".
> 
>  3. Append the list returned by re.findall directly to the bug
>     list. The list returned by re.findall will never be empty
>     since the string being searched matched the regex used in
>     (1), but even if it was appending an empty list to another
>     list is a nop.
> 
> In my testing this matches the same set of bug numbers matched by
> the current code but does not match the #1 in the example above.
> 
> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
> ---
>  ktl/debian.py | 14 ++++++--------
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/ktl/debian.py b/ktl/debian.py
> index dd2703a..9ba2836 100644
> --- a/ktl/debian.py
> +++ b/ktl/debian.py
> @@ -8,7 +8,7 @@
>  from ktl.git                            import Git, GitError
>  from ktl.utils                          import debug, dump
>  from ktl.kernel                         import Kernel
> -from re                                 import compile, findall
> +from re                                 import compile, findall, finditer
>  from os                                 import path, listdir, walk
>  
>  # DebianError
> @@ -30,8 +30,8 @@ class Debian:
>  
>      package_rc = compile("^(linux[-\S])*.*$")
>      ver_rc     = compile("^linux[-\S]* \(([0-9]+\.[0-9]+\.[0-9]+[-\.][0-9]+\.[0-9]+[~a-z0-9]*)\).*$")
> -    bug_line_rc = compile("LP:\s#[0-9]+")
> -    bug_rc = compile("#([0-9]+)")
> +    bug_rc = compile("LP:\s#[0-9]+(?:,\s#[0-9]+)*")
> +    bug_nr_rc = compile("#([0-9]+)")
>  
>      # debian_directories
>      #
> @@ -176,11 +176,9 @@ class Debian:
>                  bugs = []
>              else:
>                  # find bug numbers and append them to the list
> -                bug_line_matches = cls.bug_line_rc.search(line)
> -                if bug_line_matches:
> -                    bug_matches = findall(cls.bug_rc,line)
> -                    if bug_matches:
> -                        bugs.extend( bug_matches )
> +                for bug_line_match in finditer(cls.bug_rc, line):
> +                    bug_matches = findall(cls.bug_nr_rc, bug_line_match.group(0))
> +                    bugs.extend(bug_matches)
>  
>                  content.append(line)
Kamal Mostafa Aug. 24, 2016, 2:42 p.m. UTC | #6
On Wed, Aug 24, 2016 at 08:16:14AM +0200, Stefan Bader wrote:
> On 23.08.2016 20:07, Kamal Mostafa wrote:
> > 
> > 
> and pushed?

It had been pushed to the wrong mirror; now resolved.

 -Kamal
diff mbox

Patch

diff --git a/ktl/debian.py b/ktl/debian.py
index dd2703a..9ba2836 100644
--- a/ktl/debian.py
+++ b/ktl/debian.py
@@ -8,7 +8,7 @@ 
 from ktl.git                            import Git, GitError
 from ktl.utils                          import debug, dump
 from ktl.kernel                         import Kernel
-from re                                 import compile, findall
+from re                                 import compile, findall, finditer
 from os                                 import path, listdir, walk
 
 # DebianError
@@ -30,8 +30,8 @@  class Debian:
 
     package_rc = compile("^(linux[-\S])*.*$")
     ver_rc     = compile("^linux[-\S]* \(([0-9]+\.[0-9]+\.[0-9]+[-\.][0-9]+\.[0-9]+[~a-z0-9]*)\).*$")
-    bug_line_rc = compile("LP:\s#[0-9]+")
-    bug_rc = compile("#([0-9]+)")
+    bug_rc = compile("LP:\s#[0-9]+(?:,\s#[0-9]+)*")
+    bug_nr_rc = compile("#([0-9]+)")
 
     # debian_directories
     #
@@ -176,11 +176,9 @@  class Debian:
                 bugs = []
             else:
                 # find bug numbers and append them to the list
-                bug_line_matches = cls.bug_line_rc.search(line)
-                if bug_line_matches:
-                    bug_matches = findall(cls.bug_rc,line)
-                    if bug_matches:
-                        bugs.extend( bug_matches )
+                for bug_line_match in finditer(cls.bug_rc, line):
+                    bug_matches = findall(cls.bug_nr_rc, bug_line_match.group(0))
+                    bugs.extend(bug_matches)
 
                 content.append(line)