Message ID | 1383042632-7102-2-git-send-email-vfalico@redhat.com |
---|---|
State | Not Applicable |
Headers | show |
On Tue, Oct 29, 2013 at 11:30:30AM +0100, Veaceslav Falico wrote: > Currently, we first do kobject_put(&entry->kobj) and the kfree(entry), > however kobject_put() doesn't guarantee us that it was the last reference > and that the kobj isn't used currently by someone else, so after we > kfree(entry) with the struct kobject - other users will begin using the > freed memory, instead of the actual kobject. > > Fix this by using the kobject->release callback, which is called last when > the kobject is indeed not used and is cleaned up - it's msi_kobj_release(), > which can do the kfree(entry) safely (kobject_put/cleanup doesn't use the > kobj itself after ->release() was called, so we're safe). > > In case we've failed to create the sysfs directories - just kfree() > it - cause we don't have the kobjects attached. > > Also, remove the same functionality from populate_msi_sysfs(), cause on > failure we anyway call free_msi_irqs(), which will take care of all the > kobjects properly. > > And add the forgotten pci_dev_put(pdev) in case of failure to register the > kobject in populate_msi_sysfs(). > > CC: Bjorn Helgaas <bhelgaas@google.com> > CC: Neil Horman <nhorman@tuxdriver.com> > CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > CC: linux-pci@vger.kernel.org > CC: linux-kernel@vger.kernel.org > Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 29, 2013 at 3:30 AM, Veaceslav Falico <vfalico@redhat.com> wrote: > /* > * Its possible that we get into this path > * When populate_msi_sysfs fails, which means the entries > * were not registered with sysfs. In that case don't > - * unregister them. > + * unregister them, and just free. Otherwise the > + * kobject->release will take care of freeing the entry via > + * msi_kobj_release(). > */ > if (entry->kobj.parent) { > kobject_del(&entry->kobj); > kobject_put(&entry->kobj); > + } else { > + kfree(entry); > } > - > - list_del(&entry->list); > - kfree(entry); So this code sequence still makes me very unhappy. Why does not just a simple unconditional kobject_del(&entry->kobj); kobject_put(&entry->kobj); work for the "not registered with sysfs" case? And if the sysfs code really gets confused, why not if (entry->kobj.parent) kobject_del(&entry->kobj); kobject_put(&entry->kobj); (btw, looking at the sysfs code, this looks *very* suspicious in sysfs_remove_dir(): struct sysfs_dirent *sd = kobj->sd; spin_lock(&sysfs_assoc_lock); kobj->sd = NULL; spin_unlock(&sysfs_assoc_lock); and I would suggest that "sd = kobj->sd" should be done under the lock, because otherwise the lock is kind of pointless..) Greg? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 29, 2013 at 09:34:28AM -0700, Linus Torvalds wrote: >On Tue, Oct 29, 2013 at 3:30 AM, Veaceslav Falico <vfalico@redhat.com> wrote: >> /* >> * Its possible that we get into this path >> * When populate_msi_sysfs fails, which means the entries >> * were not registered with sysfs. In that case don't >> - * unregister them. >> + * unregister them, and just free. Otherwise the >> + * kobject->release will take care of freeing the entry via >> + * msi_kobj_release(). >> */ >> if (entry->kobj.parent) { >> kobject_del(&entry->kobj); >> kobject_put(&entry->kobj); >> + } else { >> + kfree(entry); >> } >> - >> - list_del(&entry->list); >> - kfree(entry); > >So this code sequence still makes me very unhappy. > >Why does not just a simple unconditional > > kobject_del(&entry->kobj); > kobject_put(&entry->kobj); > >work for the "not registered with sysfs" case? And if the sysfs code >really gets confused, why not > > if (entry->kobj.parent) > kobject_del(&entry->kobj); > kobject_put(&entry->kobj); It was fixed this way in 424eb39 ("PCI: msi: fix imbalanced refcount of msi irq sysfs objects"). kobject_put() still failed in case it wasn't registered with sysfs earlier. populate_msi_sysfs() creates an kobject for each entry in msi_list, and we have no idea (on fallback) up to which entry was it already registered, on which it failed and which entries are still not kobject_init_and_add()ed. > >(btw, looking at the sysfs code, this looks *very* suspicious in >sysfs_remove_dir(): > > struct sysfs_dirent *sd = kobj->sd; > > spin_lock(&sysfs_assoc_lock); > kobj->sd = NULL; > spin_unlock(&sysfs_assoc_lock); > >and I would suggest that "sd = kobj->sd" should be done under the >lock, because otherwise the lock is kind of pointless..) > >Greg? > > Linus -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 29, 2013 at 09:34:28AM -0700, Linus Torvalds wrote: > On Tue, Oct 29, 2013 at 3:30 AM, Veaceslav Falico <vfalico@redhat.com> wrote: > > /* > > * Its possible that we get into this path > > * When populate_msi_sysfs fails, which means the entries > > * were not registered with sysfs. In that case don't > > - * unregister them. > > + * unregister them, and just free. Otherwise the > > + * kobject->release will take care of freeing the entry via > > + * msi_kobj_release(). > > */ > > if (entry->kobj.parent) { > > kobject_del(&entry->kobj); > > kobject_put(&entry->kobj); > > + } else { > > + kfree(entry); > > } > > - > > - list_del(&entry->list); > > - kfree(entry); > > So this code sequence still makes me very unhappy. > > Why does not just a simple unconditional > > kobject_del(&entry->kobj); > kobject_put(&entry->kobj); > > work for the "not registered with sysfs" case? And if the sysfs code > really gets confused, why not > > if (entry->kobj.parent) > kobject_del(&entry->kobj); > kobject_put(&entry->kobj); > > (btw, looking at the sysfs code, this looks *very* suspicious in > sysfs_remove_dir(): > > struct sysfs_dirent *sd = kobj->sd; > > spin_lock(&sysfs_assoc_lock); > kobj->sd = NULL; > spin_unlock(&sysfs_assoc_lock); > > and I would suggest that "sd = kobj->sd" should be done under the > lock, because otherwise the lock is kind of pointless..) > > Greg? That is really odd, but I guess it works as-is because no one ever calls that function on the same kobject at the same time. I don't know what that is trying to do. There has been some work by Tejun in this area for linux-next, but that lock and logic is still there, I'll look into fixing that up... thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 29, 2013 at 11:30:30AM +0100, Veaceslav Falico wrote: > Currently, we first do kobject_put(&entry->kobj) and the kfree(entry), > however kobject_put() doesn't guarantee us that it was the last reference > and that the kobj isn't used currently by someone else, so after we > kfree(entry) with the struct kobject - other users will begin using the > freed memory, instead of the actual kobject. > > Fix this by using the kobject->release callback, which is called last when > the kobject is indeed not used and is cleaned up - it's msi_kobj_release(), > which can do the kfree(entry) safely (kobject_put/cleanup doesn't use the > kobj itself after ->release() was called, so we're safe). > > In case we've failed to create the sysfs directories - just kfree() > it - cause we don't have the kobjects attached. > > Also, remove the same functionality from populate_msi_sysfs(), cause on > failure we anyway call free_msi_irqs(), which will take care of all the > kobjects properly. > > And add the forgotten pci_dev_put(pdev) in case of failure to register the > kobject in populate_msi_sysfs(). > > CC: Bjorn Helgaas <bhelgaas@google.com> > CC: Neil Horman <nhorman@tuxdriver.com> > CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > CC: linux-pci@vger.kernel.org > CC: linux-kernel@vger.kernel.org > Signed-off-by: Veaceslav Falico <vfalico@redhat.com> I'm still hoping that Greg (or somebody else; Greg already posted the bulk of the work) will finish up the conversion to attributes, and that Neil will confirm that works with no changes to irqbalance. So I'm going to drop these patches for now. Poke me if we need to revive them. Bjorn > --- > drivers/pci/msi.c | 31 ++++++++++++------------------- > 1 file changed, 12 insertions(+), 19 deletions(-) > > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c > index d5f90d6..5d70f49 100644 > --- a/drivers/pci/msi.c > +++ b/drivers/pci/msi.c > @@ -374,19 +374,22 @@ static void free_msi_irqs(struct pci_dev *dev) > iounmap(entry->mask_base); > } > > + list_del(&entry->list); > + > /* > * Its possible that we get into this path > * When populate_msi_sysfs fails, which means the entries > * were not registered with sysfs. In that case don't > - * unregister them. > + * unregister them, and just free. Otherwise the > + * kobject->release will take care of freeing the entry via > + * msi_kobj_release(). > */ > if (entry->kobj.parent) { > kobject_del(&entry->kobj); > kobject_put(&entry->kobj); > + } else { > + kfree(entry); > } > - > - list_del(&entry->list); > - kfree(entry); > } > } > > @@ -509,6 +512,7 @@ static void msi_kobj_release(struct kobject *kobj) > struct msi_desc *entry = to_msi_desc(kobj); > > pci_dev_put(entry->dev); > + kfree(entry); > } > > static struct kobj_type msi_irq_ktype = { > @@ -522,7 +526,6 @@ static int populate_msi_sysfs(struct pci_dev *pdev) > struct msi_desc *entry; > struct kobject *kobj; > int ret; > - int count = 0; > > pdev->msi_kset = kset_create_and_add("msi_irqs", NULL, &pdev->dev.kobj); > if (!pdev->msi_kset) > @@ -534,23 +537,13 @@ static int populate_msi_sysfs(struct pci_dev *pdev) > pci_dev_get(pdev); > ret = kobject_init_and_add(kobj, &msi_irq_ktype, NULL, > "%u", entry->irq); > - if (ret) > - goto out_unroll; > - > - count++; > + if (ret) { > + pci_dev_put(pdev); > + return ret; > + } > } > > return 0; > - > -out_unroll: > - list_for_each_entry(entry, &pdev->msi_list, list) { > - if (!count) > - break; > - kobject_del(&entry->kobj); > - kobject_put(&entry->kobj); > - count--; > - } > - return ret; > } > > /** > -- > 1.8.4 > -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Nov 25, 2013 at 04:12:48PM -0700, Bjorn Helgaas wrote: > On Tue, Oct 29, 2013 at 11:30:30AM +0100, Veaceslav Falico wrote: > > Currently, we first do kobject_put(&entry->kobj) and the kfree(entry), > > however kobject_put() doesn't guarantee us that it was the last reference > > and that the kobj isn't used currently by someone else, so after we > > kfree(entry) with the struct kobject - other users will begin using the > > freed memory, instead of the actual kobject. > > > > Fix this by using the kobject->release callback, which is called last when > > the kobject is indeed not used and is cleaned up - it's msi_kobj_release(), > > which can do the kfree(entry) safely (kobject_put/cleanup doesn't use the > > kobj itself after ->release() was called, so we're safe). > > > > In case we've failed to create the sysfs directories - just kfree() > > it - cause we don't have the kobjects attached. > > > > Also, remove the same functionality from populate_msi_sysfs(), cause on > > failure we anyway call free_msi_irqs(), which will take care of all the > > kobjects properly. > > > > And add the forgotten pci_dev_put(pdev) in case of failure to register the > > kobject in populate_msi_sysfs(). > > > > CC: Bjorn Helgaas <bhelgaas@google.com> > > CC: Neil Horman <nhorman@tuxdriver.com> > > CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > > CC: linux-pci@vger.kernel.org > > CC: linux-kernel@vger.kernel.org > > Signed-off-by: Veaceslav Falico <vfalico@redhat.com> > > I'm still hoping that Greg (or somebody else; Greg already posted the bulk > of the work) will finish up the conversion to attributes, and that Neil > will confirm that works with no changes to irqbalance. So I'm going to > drop these patches for now. Poke me if we need to revive them. Ah, sorry about that. I'll redo my series, dropping the existing format and using the attribute-only code. Give me a day or so, thanks for the reminder. greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Nov 25, 2013 at 03:29:36PM -0800, Greg Kroah-Hartman wrote: > On Mon, Nov 25, 2013 at 04:12:48PM -0700, Bjorn Helgaas wrote: > > On Tue, Oct 29, 2013 at 11:30:30AM +0100, Veaceslav Falico wrote: > > > Currently, we first do kobject_put(&entry->kobj) and the kfree(entry), > > > however kobject_put() doesn't guarantee us that it was the last reference > > > and that the kobj isn't used currently by someone else, so after we > > > kfree(entry) with the struct kobject - other users will begin using the > > > freed memory, instead of the actual kobject. > > > > > > Fix this by using the kobject->release callback, which is called last when > > > the kobject is indeed not used and is cleaned up - it's msi_kobj_release(), > > > which can do the kfree(entry) safely (kobject_put/cleanup doesn't use the > > > kobj itself after ->release() was called, so we're safe). > > > > > > In case we've failed to create the sysfs directories - just kfree() > > > it - cause we don't have the kobjects attached. > > > > > > Also, remove the same functionality from populate_msi_sysfs(), cause on > > > failure we anyway call free_msi_irqs(), which will take care of all the > > > kobjects properly. > > > > > > And add the forgotten pci_dev_put(pdev) in case of failure to register the > > > kobject in populate_msi_sysfs(). > > > > > > CC: Bjorn Helgaas <bhelgaas@google.com> > > > CC: Neil Horman <nhorman@tuxdriver.com> > > > CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > > > CC: linux-pci@vger.kernel.org > > > CC: linux-kernel@vger.kernel.org > > > Signed-off-by: Veaceslav Falico <vfalico@redhat.com> > > > > I'm still hoping that Greg (or somebody else; Greg already posted the bulk > > of the work) will finish up the conversion to attributes, and that Neil > > will confirm that works with no changes to irqbalance. So I'm going to > > drop these patches for now. Poke me if we need to revive them. > > Ah, sorry about that. I'll redo my series, dropping the existing format > and using the attribute-only code. Give me a day or so, thanks for the > reminder. > > greg k-h > Thank you greg, I'll make sure Irqbalance doesn't choke on the new format as soon as you post them. Neil -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index d5f90d6..5d70f49 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -374,19 +374,22 @@ static void free_msi_irqs(struct pci_dev *dev) iounmap(entry->mask_base); } + list_del(&entry->list); + /* * Its possible that we get into this path * When populate_msi_sysfs fails, which means the entries * were not registered with sysfs. In that case don't - * unregister them. + * unregister them, and just free. Otherwise the + * kobject->release will take care of freeing the entry via + * msi_kobj_release(). */ if (entry->kobj.parent) { kobject_del(&entry->kobj); kobject_put(&entry->kobj); + } else { + kfree(entry); } - - list_del(&entry->list); - kfree(entry); } } @@ -509,6 +512,7 @@ static void msi_kobj_release(struct kobject *kobj) struct msi_desc *entry = to_msi_desc(kobj); pci_dev_put(entry->dev); + kfree(entry); } static struct kobj_type msi_irq_ktype = { @@ -522,7 +526,6 @@ static int populate_msi_sysfs(struct pci_dev *pdev) struct msi_desc *entry; struct kobject *kobj; int ret; - int count = 0; pdev->msi_kset = kset_create_and_add("msi_irqs", NULL, &pdev->dev.kobj); if (!pdev->msi_kset) @@ -534,23 +537,13 @@ static int populate_msi_sysfs(struct pci_dev *pdev) pci_dev_get(pdev); ret = kobject_init_and_add(kobj, &msi_irq_ktype, NULL, "%u", entry->irq); - if (ret) - goto out_unroll; - - count++; + if (ret) { + pci_dev_put(pdev); + return ret; + } } return 0; - -out_unroll: - list_for_each_entry(entry, &pdev->msi_list, list) { - if (!count) - break; - kobject_del(&entry->kobj); - kobject_put(&entry->kobj); - count--; - } - return ret; } /**
Currently, we first do kobject_put(&entry->kobj) and the kfree(entry), however kobject_put() doesn't guarantee us that it was the last reference and that the kobj isn't used currently by someone else, so after we kfree(entry) with the struct kobject - other users will begin using the freed memory, instead of the actual kobject. Fix this by using the kobject->release callback, which is called last when the kobject is indeed not used and is cleaned up - it's msi_kobj_release(), which can do the kfree(entry) safely (kobject_put/cleanup doesn't use the kobj itself after ->release() was called, so we're safe). In case we've failed to create the sysfs directories - just kfree() it - cause we don't have the kobjects attached. Also, remove the same functionality from populate_msi_sysfs(), cause on failure we anyway call free_msi_irqs(), which will take care of all the kobjects properly. And add the forgotten pci_dev_put(pdev) in case of failure to register the kobject in populate_msi_sysfs(). CC: Bjorn Helgaas <bhelgaas@google.com> CC: Neil Horman <nhorman@tuxdriver.com> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> CC: linux-pci@vger.kernel.org CC: linux-kernel@vger.kernel.org Signed-off-by: Veaceslav Falico <vfalico@redhat.com> --- drivers/pci/msi.c | 31 ++++++++++++------------------- 1 file changed, 12 insertions(+), 19 deletions(-)