diff mbox

[06/10] parser: Use full regexps for delegation rules paths

Message ID 1448712886-3221-7-git-send-email-mchehab@osg.samsung.com
State Rejected
Delegated to: Stephen Finucane
Headers show

Commit Message

Mauro Carvalho Chehab Nov. 28, 2015, 12:14 p.m. UTC
From: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

Paths are validated by trying to compile it as a regexp using a custom
validator.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
---
 patchwork/bin/parsemail.py | 19 +++++++++++++++++--
 patchwork/models.py        |  9 ++++++++-
 2 files changed, 25 insertions(+), 3 deletions(-)

Comments

Stephen Finucane Dec. 24, 2015, 2:06 p.m. UTC | #1
On 28 Nov 10:14, Mauro Carvalho Chehab wrote:
> From: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> 
> Paths are validated by trying to compile it as a regexp using a custom
> validator.
> 
> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>

Some small nits that I can fix myself. Other than that,

Acked-by: Stephen Finucane <stephen.finucane@intel.com>

> ---
>  patchwork/bin/parsemail.py | 19 +++++++++++++++++--
>  patchwork/models.py        |  9 ++++++++-
>  2 files changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/patchwork/bin/parsemail.py b/patchwork/bin/parsemail.py
> index 4f22c7f2d6a0..5d6ddf45d264 100755
> --- a/patchwork/bin/parsemail.py
> +++ b/patchwork/bin/parsemail.py
> @@ -339,18 +339,33 @@ def get_state(state_name):
>              pass
>      return get_default_initial_patch_state()
>  
> +class Rule(object):
> +    pass
> +

nit: a defaultdict ({user: [path, path...]}) would be more semantic
here. I can change this, unless you've any reason I shouldn't.

>  def auto_delegate(project, filenames):
>      if not filenames:
>          return None
>  
> -    rules = list(DelegationRule.objects.filter(project = project))
> +    # Precompile the path regexps
> +    rules = []
> +    for rule in DelegationRule.objects.filter(project = project):
> +        r = Rule()
> +        r.user = rule.user
> +
> +        try:
> +            r.path = re.compile(rule.path)
> +        except:

I'm going to make this capture 're.error' only, if that's OK.

> +            print '%s is not a valid regular expression' % rule.path
> +            continue
> +
> +        rules.append(r)
>  
>      patch_delegate = None
>  
>      for filename in filenames:
>          file_delegate = None
>          for rule in rules:
> -            if fnmatch(filename, rule.path):
> +            if rule.path.match(filename):
>                  file_delegate = rule.user
>                  break;
>  
> diff --git a/patchwork/models.py b/patchwork/models.py
> index 101a9af9746f..e552217a3cbc 100644
> --- a/patchwork/models.py
> +++ b/patchwork/models.py
> @@ -19,6 +19,7 @@
>  
>  from django.db import models
>  from django.contrib.auth.models import User
> +from django.core.exceptions import ValidationError
>  from django.core.urlresolvers import reverse
>  from django.contrib.sites.models import Site
>  from django.conf import settings
> @@ -78,9 +79,15 @@ class Project(models.Model):
>          ordering = ['linkname']
>  
>  
> +def validate_rule_path(value):
> +    try:
> +        re.compile(value)
> +    except:
> +        raise ValidationError(u'`%s\' is not a valid regulator expression' % value)
> +
>  class DelegationRule(models.Model):
>      user = models.ForeignKey(User)
> -    path = models.CharField(max_length=255)
> +    path = models.CharField(max_length = 255, validators = [validate_rule_path])
>      project = models.ForeignKey(Project)
>      priority = models.IntegerField(default = 0)
>  
> -- 
> 2.5.0
> 
> _______________________________________________
> Patchwork mailing list
> Patchwork@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/patchwork
Johannes Berg Dec. 25, 2015, 3:59 p.m. UTC | #2
On Thu, 2015-12-24 at 14:06 +0000, Finucane, Stephen wrote:

> > Paths are validated by trying to compile it as a regexp using a
> > custom
> > validator.
> > 
> > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
> 
> Some small nits that I can fix myself. Other than that,
> 
A small comment that may or may not be relevant - but there's a bunch
of things one can do with regexes, from taking a lot of CPU to taking a
lot of memory.

What's the trust model for running regexes? I haven't found it in the
patches easily right now. If it's configured only in the config file it
should be OK, but if any kind of remote user can configure it then it
may need safeguards of some kind?

I'm just thinking of a use case like kernel.org where you don't really
even trust the people who are the typical delegates/admins in patchwork
for a given project, since they are pretty much just random people the
admin doesn't necessarily trust much.

(Or put another way - I'd hate for them to patch out/disable this
feature because of security concerns, since I'd want to use it on
kernel.org, but I'm not sure the admins would want me configuring
arbitrary regexes there)

johannes
Laurent Pinchart Dec. 26, 2015, 11 a.m. UTC | #3
Hi Stephen,

On Thursday 24 December 2015 14:06:58 Finucane, Stephen wrote:
> On 28 Nov 10:14, Mauro Carvalho Chehab wrote:
> > From: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > 
> > Paths are validated by trying to compile it as a regexp using a custom
> > validator.
> > 
> > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
> 
> Some small nits that I can fix myself. Other than that,
> 
> Acked-by: Stephen Finucane <stephen.finucane@intel.com>
> 
> > ---
> > 
> >  patchwork/bin/parsemail.py | 19 +++++++++++++++++--
> >  patchwork/models.py        |  9 ++++++++-
> >  2 files changed, 25 insertions(+), 3 deletions(-)
> > 
> > diff --git a/patchwork/bin/parsemail.py b/patchwork/bin/parsemail.py
> > index 4f22c7f2d6a0..5d6ddf45d264 100755
> > --- a/patchwork/bin/parsemail.py
> > +++ b/patchwork/bin/parsemail.py
> > @@ -339,18 +339,33 @@ def get_state(state_name):
> >              pass
> >      return get_default_initial_patch_state()
> > 
> > +class Rule(object):
> > +    pass
> > +
> 
> nit: a defaultdict ({user: [path, path...]}) would be more semantic
> here. I can change this, unless you've any reason I shouldn't.

No reason, please go ahead. I didn't know about defaultdict, thanks for the 
tip.

> >  def auto_delegate(project, filenames):
> >      if not filenames:
> >          return None
> > 
> > -    rules = list(DelegationRule.objects.filter(project = project))
> > +    # Precompile the path regexps
> > +    rules = []
> > +    for rule in DelegationRule.objects.filter(project = project):
> > +        r = Rule()
> > +        r.user = rule.user
> > +
> > +        try:
> > +            r.path = re.compile(rule.path)
> > +        except:
>
> I'm going to make this capture 're.error' only, if that's OK.

No issue with that.

> > +            print '%s is not a valid regular expression' % rule.path
> > +            continue
> > +
> > +        rules.append(r)
> > 
> >      patch_delegate = None
> >      
> >      for filename in filenames:
> >          file_delegate = None
> >          for rule in rules:
> > -            if fnmatch(filename, rule.path):
> > +            if rule.path.match(filename):
> >                  file_delegate = rule.user
> >                  break;
Laurent Pinchart Dec. 26, 2015, 11:05 a.m. UTC | #4
Hi Johannes,

On Friday 25 December 2015 16:59:50 Johannes Berg wrote:
> On Thu, 2015-12-24 at 14:06 +0000, Finucane, Stephen wrote:
> >
> > > Paths are validated by trying to compile it as a regexp using a
> > > custom validator.
> > > 
> > > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > > Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
> > 
> > Some small nits that I can fix myself. Other than that,
> 
> A small comment that may or may not be relevant - but there's a bunch
> of things one can do with regexes, from taking a lot of CPU to taking a
> lot of memory.
> 
> What's the trust model for running regexes? I haven't found it in the
> patches easily right now. If it's configured only in the config file it
> should be OK, but if any kind of remote user can configure it then it
> may need safeguards of some kind?
> 
> I'm just thinking of a use case like kernel.org where you don't really
> even trust the people who are the typical delegates/admins in patchwork
> for a given project, since they are pretty much just random people the
> admin doesn't necessarily trust much.
> 
> (Or put another way - I'd hate for them to patch out/disable this
> feature because of security concerns, since I'd want to use it on
> kernel.org, but I'm not sure the admins would want me configuring
> arbitrary regexes there)

I agree with your concerns but haven't given them a thought to be honest. 
Right now only patchwork admins can changes the rules, but as you mention we 
might not trust them.

I've used regexps for convenience, we could possibly replace them with a less 
dangerous type of pattern. One option I was toying with was to create rules 
automatically from MAINTAINERS, but I don't think that would be flexible 
enough.
Mauro Carvalho Chehab Dec. 26, 2015, 5:56 p.m. UTC | #5
Em Sat, 26 Dec 2015 13:05:55 +0200
Laurent Pinchart <laurent.pinchart@ideasonboard.com> escreveu:

> Hi Johannes,
> 
> On Friday 25 December 2015 16:59:50 Johannes Berg wrote:
> > On Thu, 2015-12-24 at 14:06 +0000, Finucane, Stephen wrote:
> > >
> > > > Paths are validated by trying to compile it as a regexp using a
> > > > custom validator.
> > > > 
> > > > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > > > Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
> > > 
> > > Some small nits that I can fix myself. Other than that,
> > 
> > A small comment that may or may not be relevant - but there's a bunch
> > of things one can do with regexes, from taking a lot of CPU to taking a
> > lot of memory.
> > 
> > What's the trust model for running regexes? I haven't found it in the
> > patches easily right now. If it's configured only in the config file it
> > should be OK, but if any kind of remote user can configure it then it
> > may need safeguards of some kind?
> > 
> > I'm just thinking of a use case like kernel.org where you don't really
> > even trust the people who are the typical delegates/admins in patchwork
> > for a given project, since they are pretty much just random people the
> > admin doesn't necessarily trust much.
> > 
> > (Or put another way - I'd hate for them to patch out/disable this
> > feature because of security concerns, since I'd want to use it on
> > kernel.org, but I'm not sure the admins would want me configuring
> > arbitrary regexes there)
> 
> I agree with your concerns but haven't given them a thought to be honest. 
> Right now only patchwork admins can changes the rules, but as you mention we 
> might not trust them.
> 
> I've used regexps for convenience, we could possibly replace them with a less 
> dangerous type of pattern. One option I was toying with was to create rules 
> automatically from MAINTAINERS, but I don't think that would be flexible 
> enough.
> 

IMHO, the real problem here is the need that everybody with write
access should be project maintainer in patchwork, and it lacks logs
when a patch is delegated or changed its status.

I would very much prefer to be able to delegate a patch to a driver
maintainer (with I don't have much trustee enough to promote it to
a Project Maintainer), but, in this case, I would need logs if such
person changes the patch status or delegates the patch to somebody 
else. I would also expect that the project maintainers would receive
any notification e-mails if such person changes the status.

Even better, I would like to be able to approve such changes for the
ones I don't trust enough, as I would need to confirm if the patch
change is associated with enough review emails at the ML, and, in
the case of patch acceptance, I would need to take the action of
adding the patch on my tree.

Regards,
Mauro
Stephen Finucane Jan. 4, 2016, 9:55 a.m. UTC | #6
On 26 Dec 15:56, Mauro Carvalho Chehab wrote:

[snip]

> > I've used regexps for convenience, we could possibly replace them with a less 
> > dangerous type of pattern. One option I was toying with was to create rules 
> > automatically from MAINTAINERS, but I don't think that would be flexible 
> > enough.
> > 
> 
> IMHO, the real problem here is the need that everybody with write
> access should be project maintainer in patchwork, and it lacks logs
> when a patch is delegated or changed its status.
> 
> I would very much prefer to be able to delegate a patch to a driver
> maintainer (with I don't have much trustee enough to promote it to
> a Project Maintainer), but, in this case, I would need logs if such
> person changes the patch status or delegates the patch to somebody 
> else. I would also expect that the project maintainers would receive
> any notification e-mails if such person changes the status.
> 
> Even better, I would like to be able to approve such changes for the
> ones I don't trust enough, as I would need to confirm if the patch
> change is associated with enough review emails at the ML, and, in
> the case of patch acceptance, I would need to take the action of
> adding the patch on my tree.

Agreed - we do need some form of authentication model: patchwork is
"trusting" to a fault at the moment. We should be concerned that by
adding such a feature we'd subtract from one of patchwork's guiding
principles, which is that patchwork should be simple. However, I
think this is important enough that we can accept any potential
complexity.

I've opened an issue to track this feature request [1]. I'm happy
to accept patches from anyone who'd care to investigate it.

Stephen

[1] https://waffle.io/getpatchwork/patchwork/cards/568a410a5367701600afa13a
Stephen Finucane Jan. 4, 2016, 10 a.m. UTC | #7
On 26 Dec 13:05, Laurent Pinchart wrote:
> Hi Johannes,
> 
> On Friday 25 December 2015 16:59:50 Johannes Berg wrote:
> > On Thu, 2015-12-24 at 14:06 +0000, Finucane, Stephen wrote:
> > >
> > > > Paths are validated by trying to compile it as a regexp using a
> > > > custom validator.
> > > > 
> > > > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > > > Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
> > > 
> > > Some small nits that I can fix myself. Other than that,
> > 
> > A small comment that may or may not be relevant - but there's a bunch
> > of things one can do with regexes, from taking a lot of CPU to taking a
> > lot of memory.
> > 
> > What's the trust model for running regexes? I haven't found it in the
> > patches easily right now. If it's configured only in the config file it
> > should be OK, but if any kind of remote user can configure it then it
> > may need safeguards of some kind?
> > 
> > I'm just thinking of a use case like kernel.org where you don't really
> > even trust the people who are the typical delegates/admins in patchwork
> > for a given project, since they are pretty much just random people the
> > admin doesn't necessarily trust much.
> > 
> > (Or put another way - I'd hate for them to patch out/disable this
> > feature because of security concerns, since I'd want to use it on
> > kernel.org, but I'm not sure the admins would want me configuring
> > arbitrary regexes there)
> 
> I agree with your concerns but haven't given them a thought to be honest. 
> Right now only patchwork admins can changes the rules, but as you mention we 
> might not trust them.
> 
> I've used regexps for convenience, we could possibly replace them with a less 
> dangerous type of pattern. One option I was toying with was to create rules 
> automatically from MAINTAINERS, but I don't think that would be flexible 
> enough.

Could we use fnmatch instead? This is the suggestion on StackOverflow [1] and
documentation for the function suggests that the grammar is a very simple one
without the possibility for backrefs or other "dangerous" things [2].

Stephen

[1] http://stackoverflow.com/a/1998868/613428
[2] https://books.google.com/books?id=GxKWdn7u4w8C&lpg=PA232&dq=fnmatch%20backtracking&pg=PA232#v=onepage&q=fnmatch%20backtracking&f=false
Johannes Berg Jan. 5, 2016, 8:38 a.m. UTC | #8
On Mon, 2016-01-04 at 10:00 +0000, Finucane, Stephen wrote:

> > I agree with your concerns but haven't given them a thought to be honest. 
> > Right now only patchwork admins can changes the rules, but as you mention we 
> > might not trust them.

Frankly, I'm not quite sure of the permissions model, and even what
"admin" means.

I'm "maintainer" of the linux-wireless project on the kernel.org
patchwork, and in that role I think I should be able to change the
auto-delegate settings.
However, the kernel.org server admin might not trust me with arbitrary
regexps.

> Could we use fnmatch instead? This is the suggestion on StackOverflow
> [1] and
> documentation for the function suggests that the grammar is a very
> simple one
> without the possibility for backrefs or other "dangerous" things [2].

I see no problem with that.

johannes
Stephen Finucane Jan. 6, 2016, 5:17 p.m. UTC | #9
On 05 Jan 09:38, Johannes Berg wrote:
> On Mon, 2016-01-04 at 10:00 +0000, Finucane, Stephen wrote:
> 
> > > I agree with your concerns but haven't given them a thought to be honest. 
> > > Right now only patchwork admins can changes the rules, but as you mention we 
> > > might not trust them.
> 
> Frankly, I'm not quite sure of the permissions model, and even what
> "admin" means.
> 
> I'm "maintainer" of the linux-wireless project on the kernel.org
> patchwork, and in that role I think I should be able to change the
> auto-delegate settings.
> However, the kernel.org server admin might not trust me with arbitrary
> regexps.
> 
> > Could we use fnmatch instead? This is the suggestion on StackOverflow
> > [1] and
> > documentation for the function suggests that the grammar is a very
> > simple one
> > without the possibility for backrefs or other "dangerous" things [2].
> 
> I see no problem with that.
> 
> johannes

Actually, this patch adds regex support in place of the fnmatch already
used. In light of the security risks, I'm reluctant to add support for
this in its current form. Far as I see it, we can either avoid regex
support or if it's valuable enough to include, make it an optional
feature that can be enabled/disabled accordingly. I'd rather the former
for simplicity, though I don't have any visibility into how useful this
is so I'd like input. Thoughts?

Stephen
Johannes Berg Jan. 6, 2016, 8:53 p.m. UTC | #10
On Wed, 2016-01-06 at 17:17 +0000, Finucane, Stephen wrote:

> Actually, this patch adds regex support in place of the fnmatch
> already used. In light of the security risks, I'm reluctant to add
> support for this in its current form. Far as I see it, we can either
> avoid regex support or if it's valuable enough to include, make it an
> optional feature that can be enabled/disabled accordingly. I'd rather
> the former for simplicity, though I don't have any visibility into
> how useful this is so I'd like input. Thoughts?

Personally, I see very little use for regex matches over fnmatch. I'd
expect that, as far as we'd use it for wireless, to really only have
patterns like
   my/path/*
   your/path/*

johannes
Stephen Finucane Jan. 8, 2016, 10:27 a.m. UTC | #11
On 06 Jan 21:53, Johannes Berg wrote:
> On Wed, 2016-01-06 at 17:17 +0000, Finucane, Stephen wrote:
> > 
> > Actually, this patch adds regex support in place of the fnmatch
> > already used. In light of the security risks, I'm reluctant to add
> > support for this in its current form. Far as I see it, we can either
> > avoid regex support or if it's valuable enough to include, make it an
> > optional feature that can be enabled/disabled accordingly. I'd rather
> > the former for simplicity, though I don't have any visibility into
> > how useful this is so I'd like input. Thoughts?
> 
> Personally, I see very little use for regex matches over fnmatch. I'd
> expect that, as far as we'd use it for wireless, to really only have
> patterns like
>    my/path/*
>    your/path/*
> 
> johannes

I agree. I'm happy to drop this patch for now and reassess it as
necessary in the future.

Stephen
diff mbox

Patch

diff --git a/patchwork/bin/parsemail.py b/patchwork/bin/parsemail.py
index 4f22c7f2d6a0..5d6ddf45d264 100755
--- a/patchwork/bin/parsemail.py
+++ b/patchwork/bin/parsemail.py
@@ -339,18 +339,33 @@  def get_state(state_name):
             pass
     return get_default_initial_patch_state()
 
+class Rule(object):
+    pass
+
 def auto_delegate(project, filenames):
     if not filenames:
         return None
 
-    rules = list(DelegationRule.objects.filter(project = project))
+    # Precompile the path regexps
+    rules = []
+    for rule in DelegationRule.objects.filter(project = project):
+        r = Rule()
+        r.user = rule.user
+
+        try:
+            r.path = re.compile(rule.path)
+        except:
+            print '%s is not a valid regular expression' % rule.path
+            continue
+
+        rules.append(r)
 
     patch_delegate = None
 
     for filename in filenames:
         file_delegate = None
         for rule in rules:
-            if fnmatch(filename, rule.path):
+            if rule.path.match(filename):
                 file_delegate = rule.user
                 break;
 
diff --git a/patchwork/models.py b/patchwork/models.py
index 101a9af9746f..e552217a3cbc 100644
--- a/patchwork/models.py
+++ b/patchwork/models.py
@@ -19,6 +19,7 @@ 
 
 from django.db import models
 from django.contrib.auth.models import User
+from django.core.exceptions import ValidationError
 from django.core.urlresolvers import reverse
 from django.contrib.sites.models import Site
 from django.conf import settings
@@ -78,9 +79,15 @@  class Project(models.Model):
         ordering = ['linkname']
 
 
+def validate_rule_path(value):
+    try:
+        re.compile(value)
+    except:
+        raise ValidationError(u'`%s\' is not a valid regulator expression' % value)
+
 class DelegationRule(models.Model):
     user = models.ForeignKey(User)
-    path = models.CharField(max_length=255)
+    path = models.CharField(max_length = 255, validators = [validate_rule_path])
     project = models.ForeignKey(Project)
     priority = models.IntegerField(default = 0)