Message ID | 1340736849-14875-3-git-send-email-yinghai@kernel.org |
---|---|
State | Rejected |
Headers | show |
On Tue, Jun 26, 2012 at 11:53 AM, Yinghai Lu <yinghai@kernel.org> wrote: > It is changed from busn_res only version, because Bjorn found that version > was not holding resource_lock. > Even it may be ok for busn_res not holding resource_lock. > It would be better to have it to be generic and use lock and we may > use it for other resources. > > probe_resource() will try to find specified size or more in parent bus. > If can not find current parent resource, and it will try to expand parents > top. > If still can not find that specified on top, it will try to reduce target size > until find one. > > It will return 0, if it find any resource that could be used. > > Returned resource is registered in the tree. > So caller may need to use replace_resource to put real resource in tree. > > -v3: remove two parameters that is for debug purpose. > -v4: fix stop_flags checking. > -v5: adjust stop_flags checking position to avoid not needed calling > into allocate_resource(). please check attached one that is updated after first patch with __allocate_resource changes. except this one and first one are changed. left ones are not needed to be updated. So i'm not going to resend them. the git branch does get updated. Thanks Yinghai
On Tue, Jun 26, 2012 at 3:07 PM, Yinghai Lu <yinghai@kernel.org> wrote: > On Tue, Jun 26, 2012 at 11:53 AM, Yinghai Lu <yinghai@kernel.org> wrote: >> It is changed from busn_res only version, because Bjorn found that version >> was not holding resource_lock. >> Even it may be ok for busn_res not holding resource_lock. >> It would be better to have it to be generic and use lock and we may >> use it for other resources. >> >> probe_resource() will try to find specified size or more in parent bus. >> If can not find current parent resource, and it will try to expand parents >> top. >> If still can not find that specified on top, it will try to reduce target size >> until find one. >> >> It will return 0, if it find any resource that could be used. >> >> Returned resource is registered in the tree. >> So caller may need to use replace_resource to put real resource in tree. >> >> -v3: remove two parameters that is for debug purpose. >> -v4: fix stop_flags checking. >> -v5: adjust stop_flags checking position to avoid not needed calling >> into allocate_resource(). > > please check attached one that is updated after first patch with > __allocate_resource changes. > except this one and first one are changed. left ones are not needed to > be updated. > So i'm not going to resend them. please check update one. -v7 Thanks Yinghai
On Tue, Aug 28, 2012 at 9:09 AM, Yinghai Lu <yinghai@kernel.org> wrote: > > please check update one. -v7 Ugh. Ok, looking closer at this, I think I agree with Bjorn. This is too ugly. The whole "reduce size/needed_size by 1" double loop is O(n**2). And it does ugly "insert fake resource to check, and then remove it". Ugh. Maybe this works in practice for the special case of some PCI bus numbers that are small integers, but the code is nasty nasty horrible and not a generic resource allocation thing. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Aug 28, 2012 at 10:05 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > Ugh. Ok, looking closer at this, Btw, looking at that code, I also found what looks like a potential locking bug in allocate_resource(). The code does if (new->parent) .. reallocate .. to check whether a resource was already allocated. HOWEVER, it does so without actually holding the resource lock. Which means that "new->parent" might in theory change. I don't really know if we care. Anybody who does a "allocate_resource()" on an existing resource that might already be in the resource tree hopefully does *not* do that in parallel with another user trying to change the resource allocation, so maybe the right thing to do is to just say "whatever, if there is a race with two threads reallocating the same resource at the same time, the bug is a much more serious one at a higher level". So this may well be a non-issue. It was introduced some time ago, in commit 23c570a67448e ("resource: ability to resize an allocated resource"), since before that we never even allowed re-allocation of an already allocated resource in the first place, and everything happened under the lock. I dunno. Here's a (UNTESTED!) patch that should fix it. Because it's extremely doubtful whether this is a real problem, I'm certainly not going to commit it now, much less mark it for stable. But I'm throwing it out as an RFC. Technically, if people rely on any races being handled by the resource subsystem, this *could* trigger. But I'm hoping that the PCI layer has some higher-level locking for "reallocate the resources of this PCI device". I did *not* check the callers. Btw, reallocate_resource() isn't actually used anywhere in the tree that I can see, so maybe we should remove it and the export, and just have the __reallocate_resource() that is static to resource.c and is to be called only with the lock held. Linus
On Tue, Aug 28, 2012 at 5:10 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Tue, Aug 28, 2012 at 10:05 AM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> >> Ugh. Ok, looking closer at this, > > Btw, looking at that code, I also found what looks like a potential > locking bug in allocate_resource(). > > The code does > > if (new->parent) > .. reallocate .. > > to check whether a resource was already allocated. HOWEVER, it does so > without actually holding the resource lock. Which means that > "new->parent" might in theory change. > > I don't really know if we care. Anybody who does a > "allocate_resource()" on an existing resource that might already be in > the resource tree hopefully does *not* do that in parallel with > another user trying to change the resource allocation, so maybe the > right thing to do is to just say "whatever, if there is a race with > two threads reallocating the same resource at the same time, the bug > is a much more serious one at a higher level". yes, another patch that that split __allocate_resource out have the similar fix. http://git.kernel.org/?p=linux/kernel/git/yinghai/linux-yinghai.git;a=commitdiff;h=5d52b21303b7271ba4c5302b387766628f074ae2 but the changelog does not that mention the reason. also have another version for probe_resource, please check attached version -v8. Thanks Yinghai
On Wed, Aug 29, 2012 at 8:57 AM, Yinghai Lu <yinghai@kernel.org> wrote: > also have another version for probe_resource, please check attached version -v8. > sorry, v8 forget removing two lines. please -v9 instead. -v8: Linus said: allocation/return is not right, and -1 step tricks make it not work as generic resource probe. So try to remove the needed_size tricks, and also use __adjust_resource for probing instead. -v9: remove two lines that is supposed to be removed after converting to use __adjust_resource Thanks Yinghai
On Wed, Aug 29, 2012 at 10:36 AM, Yinghai Lu <yinghai@kernel.org> wrote: > On Wed, Aug 29, 2012 at 8:57 AM, Yinghai Lu <yinghai@kernel.org> wrote: >> also have another version for probe_resource, please check attached version -v8. >> > > sorry, v8 forget removing two lines. > > please -v9 instead. > > -v8: Linus said: allocation/return is not right, and -1 step tricks make it > not work as generic resource probe. > So try to remove the needed_size tricks, and also use __adjust_resource > for probing instead. > -v9: remove two lines that is supposed to be removed after converting to use > __adjust_resource These tweaks might be slight improvements, but they completely miss the point of my objection. I just don't think the probe_resource() interface is a reasonable addition to kernel/resource.c. I think it's too hard to describe what it does, and it seems like it's too specific to what PCI needs in this particular case. We should be able to look at the prototype and get a pretty good idea of what the function does, but I can't do that with this: +int probe_resource(struct resource *b_res, + struct resource *busn_res, + resource_size_t needed_size, struct resource **p, + int skip_nr, int limit, int stop_flags) We already have adjust_resource(), which grows or shrinks a resource while maintaining the invariants that the adjusted resource (1) doesn't overlap any of its siblings and (2) still contains all its children. adjust_resource() seems like a fairly generic, generally useful interface. What you're trying to do with probe_resource() is quite similar, except that probe_resource() adds the idea of walking up the tree. I think you should consider something like an "expand_resource()" that just balloons a resource at both ends until it abuts its siblings, i.e., it grows the resource as much as possible. Then you know the largest possible size, and you can use adjust_resource() to shrink it again if you don't need that much. You can walk up the tree in the caller when you need to. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/linux/ioport.h b/include/linux/ioport.h index 589e0e7..8292e8b 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -156,6 +156,13 @@ extern int allocate_resource(struct resource *root, struct resource *new, resource_size_t, resource_size_t), void *alignf_data); +void resource_shrink_parents_top(struct resource *b_res, + long size, struct resource *parent_res); +struct device; +int probe_resource(struct resource *b_res, + struct resource *busn_res, + resource_size_t needed_size, struct resource **p, + int skip_nr, int limit, int flags); struct resource *lookup_resource(struct resource *root, resource_size_t start); int adjust_resource(struct resource *res, resource_size_t start, resource_size_t size); diff --git a/kernel/resource.c b/kernel/resource.c index 3f6e522..d5d9aef 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -980,6 +980,155 @@ void __release_region(struct resource *parent, resource_size_t start, } EXPORT_SYMBOL(__release_region); +static void __resource_extend_parents_top(struct resource *b_res, + long size, struct resource *parent_res) +{ + struct resource *res = b_res; + + if (!size) + return; + + while (res && res != parent_res) { + res->end += size; + res = res->parent; + } +} + +void resource_shrink_parents_top(struct resource *b_res, + long size, struct resource *parent_res) +{ + write_lock(&resource_lock); + __resource_extend_parents_top(b_res, -size, parent_res); + write_unlock(&resource_lock); +} + +static resource_size_t __find_res_top_free_size(struct resource *res, + int skip_nr) +{ + resource_size_t n_size; + struct resource tmp_res; + + /* + * find out free number below res->end that we can use. + * res->start to res->start + skip_nr - 1 can not be used. + */ + n_size = resource_size(res); + if (n_size <= skip_nr) + return 0; + + n_size -= skip_nr; + memset(&tmp_res, 0, sizeof(struct resource)); + while (n_size > 0) { + int ret; + + ret = __allocate_resource(res, &tmp_res, n_size, + res->end - n_size + skip_nr, res->end, + 1, NULL, NULL, false, false); + if (ret == 0) { + __release_resource(&tmp_res); + break; + } + n_size--; + } + + return n_size; +} + +/** + * probe_resource - Probe resource in parent resource. + * @b_res: parent resource descriptor + * @busn_res: return probed resource + * @needed_size: target size + * @p: pointer to farest parent that we extend the top + * @skip_nr: number in b_res start that we need to skip. + * @limit: local boundary + * @stop_flags: flags for stopping extend parent res + * + * will try to allocate resource in b_res, if can not find the range + * will try to extend parent resources' top. + * if still can not make it, will reduce needed_size. + */ +int probe_resource(struct resource *b_res, + struct resource *busn_res, + resource_size_t needed_size, struct resource **p, + int skip_nr, int limit, int stop_flags) +{ + int ret = -ENOMEM; + resource_size_t n_size; + struct resource *parent_res = NULL; + resource_size_t tmp = b_res->end + 1; + +again: + /* + * We first try to allocate biggest range in b_res that + * we can use in b_res directly. + * we also need to skip skip_nr from start of b_res. + */ + n_size = resource_size(b_res); + if (n_size > skip_nr) + n_size -= skip_nr; + else + n_size = 0; + memset(busn_res, 0, sizeof(struct resource)); + while (n_size >= needed_size) { + ret = allocate_resource(b_res, busn_res, n_size, + b_res->start + skip_nr, b_res->end, + 1, NULL, NULL); + if (!ret) + return ret; + n_size--; + } + + /* Try to extend the top of parent resources to meet needed_size */ + write_lock(&resource_lock); + + /* b_res could be root bus resource and can not be extended */ + if (b_res->flags & stop_flags) + goto reduce_needed_size; + + /* find out free range under top at first */ + n_size = __find_res_top_free_size(b_res, skip_nr); + /* can not extend cross local boundary */ + if ((limit - b_res->end) < (needed_size - n_size)) + goto reduce_needed_size; + + /* Probe extended range above top */ + memset(busn_res, 0, sizeof(struct resource)); + parent_res = b_res->parent; + while (parent_res && !(parent_res->flags & stop_flags)) { + ret = __allocate_resource(parent_res, busn_res, + needed_size - n_size, + tmp, tmp + needed_size - n_size - 1, + 1, NULL, NULL, false, false); + if (!ret) { + /* save parent_res, we need it as stopper later */ + *p = parent_res; + + /* prepare busn_res for return */ + __release_resource(busn_res); + busn_res->start -= n_size; + + /* extend parent resources top */ + __resource_extend_parents_top(b_res, + needed_size - n_size, parent_res); + __request_resource(b_res, busn_res); + + write_unlock(&resource_lock); + return ret; + } + parent_res = parent_res->parent; + } + +reduce_needed_size: + write_unlock(&resource_lock); + /* ret must not be 0 here */ + needed_size--; + if (needed_size) + goto again; + + return ret; +} + /* * Managed region resource */
It is changed from busn_res only version, because Bjorn found that version was not holding resource_lock. Even it may be ok for busn_res not holding resource_lock. It would be better to have it to be generic and use lock and we may use it for other resources. probe_resource() will try to find specified size or more in parent bus. If can not find current parent resource, and it will try to expand parents top. If still can not find that specified on top, it will try to reduce target size until find one. It will return 0, if it find any resource that could be used. Returned resource is registered in the tree. So caller may need to use replace_resource to put real resource in tree. -v3: remove two parameters that is for debug purpose. -v4: fix stop_flags checking. -v5: adjust stop_flags checking position to avoid not needed calling into allocate_resource(). Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> --- include/linux/ioport.h | 7 ++ kernel/resource.c | 149 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 156 insertions(+), 0 deletions(-)