Message ID | 1468942904-26464-1-git-send-email-andrew.smirnov@gmail.com |
---|---|
State | Superseded |
Headers | show |
Am 19.07.2016 um 17:41 schrieb Andrey Smirnov: > If no user specified chip->select_chip() function is provided, code in > nand_base.c will automatically set this hook to nand_select_chip(), > which in turn depends on chip->cmd_ctrl() hook being valid. Not > providing both of those functions in NAND controller driver (for example > by mistake) will result in a bit cryptic segfault. Replace it with > explicit BUG_ON statement so it would be obvious what went wrong. > > Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> > --- > drivers/mtd/nand/nand_base.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c > index ce7b2ca..57043a6 100644 > --- a/drivers/mtd/nand/nand_base.c > +++ b/drivers/mtd/nand/nand_base.c > @@ -3128,8 +3128,10 @@ static void nand_set_defaults(struct nand_chip *chip, int busw) > if (chip->waitfunc == NULL) > chip->waitfunc = nand_wait; > > - if (!chip->select_chip) > + if (!chip->select_chip) { > + BUG_ON(!chip->cmd_ctrl); Please don't add new BUG_ON() calls. WARN_ON() is good enough to raise the driver developer's attention and won't kill the machine. Thanks, //richard
On Tue, 19 Jul 2016 17:44:48 +0200 Richard Weinberger <richard@nod.at> wrote: > Am 19.07.2016 um 17:41 schrieb Andrey Smirnov: > > If no user specified chip->select_chip() function is provided, code in > > nand_base.c will automatically set this hook to nand_select_chip(), > > which in turn depends on chip->cmd_ctrl() hook being valid. Not > > providing both of those functions in NAND controller driver (for example > > by mistake) will result in a bit cryptic segfault. Replace it with > > explicit BUG_ON statement so it would be obvious what went wrong. > > > > Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> > > --- > > drivers/mtd/nand/nand_base.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c > > index ce7b2ca..57043a6 100644 > > --- a/drivers/mtd/nand/nand_base.c > > +++ b/drivers/mtd/nand/nand_base.c > > @@ -3128,8 +3128,10 @@ static void nand_set_defaults(struct nand_chip *chip, int busw) > > if (chip->waitfunc == NULL) > > chip->waitfunc = nand_wait; > > > > - if (!chip->select_chip) > > + if (!chip->select_chip) { > > + BUG_ON(!chip->cmd_ctrl); > > Please don't add new BUG_ON() calls. WARN_ON() is good enough to raise the driver developer's > attention and won't kill the machine. Not sure a BUG_ON() is worst than a NULL-pointer exception ;-).
Am 19.07.2016 um 17:59 schrieb Boris Brezillon: > On Tue, 19 Jul 2016 17:44:48 +0200 > Richard Weinberger <richard@nod.at> wrote: > >> Am 19.07.2016 um 17:41 schrieb Andrey Smirnov: >>> If no user specified chip->select_chip() function is provided, code in >>> nand_base.c will automatically set this hook to nand_select_chip(), >>> which in turn depends on chip->cmd_ctrl() hook being valid. Not >>> providing both of those functions in NAND controller driver (for example >>> by mistake) will result in a bit cryptic segfault. Replace it with >>> explicit BUG_ON statement so it would be obvious what went wrong. >>> >>> Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> >>> --- >>> drivers/mtd/nand/nand_base.c | 4 +++- >>> 1 file changed, 3 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c >>> index ce7b2ca..57043a6 100644 >>> --- a/drivers/mtd/nand/nand_base.c >>> +++ b/drivers/mtd/nand/nand_base.c >>> @@ -3128,8 +3128,10 @@ static void nand_set_defaults(struct nand_chip *chip, int busw) >>> if (chip->waitfunc == NULL) >>> chip->waitfunc = nand_wait; >>> >>> - if (!chip->select_chip) >>> + if (!chip->select_chip) { >>> + BUG_ON(!chip->cmd_ctrl); >> >> Please don't add new BUG_ON() calls. WARN_ON() is good enough to raise the driver developer's >> attention and won't kill the machine. > > Not sure a BUG_ON() is worst than a NULL-pointer exception ;-). When this really just triggers a NULL-pointer exception, we don't need a BUG_ON or WARN_ON at all since the kernel can tell anyway what went wrong. From the patch description I thought it is a more cryptic problem... Thanks, //richard
On Tue, 19 Jul 2016 18:02:27 +0200 Richard Weinberger <richard@nod.at> wrote: > Am 19.07.2016 um 17:59 schrieb Boris Brezillon: > > On Tue, 19 Jul 2016 17:44:48 +0200 > > Richard Weinberger <richard@nod.at> wrote: > > > >> Am 19.07.2016 um 17:41 schrieb Andrey Smirnov: > >>> If no user specified chip->select_chip() function is provided, code in > >>> nand_base.c will automatically set this hook to nand_select_chip(), > >>> which in turn depends on chip->cmd_ctrl() hook being valid. Not > >>> providing both of those functions in NAND controller driver (for example > >>> by mistake) will result in a bit cryptic segfault. Replace it with > >>> explicit BUG_ON statement so it would be obvious what went wrong. > >>> > >>> Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> > >>> --- > >>> drivers/mtd/nand/nand_base.c | 4 +++- > >>> 1 file changed, 3 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c > >>> index ce7b2ca..57043a6 100644 > >>> --- a/drivers/mtd/nand/nand_base.c > >>> +++ b/drivers/mtd/nand/nand_base.c > >>> @@ -3128,8 +3128,10 @@ static void nand_set_defaults(struct nand_chip *chip, int busw) > >>> if (chip->waitfunc == NULL) > >>> chip->waitfunc = nand_wait; > >>> > >>> - if (!chip->select_chip) > >>> + if (!chip->select_chip) { > >>> + BUG_ON(!chip->cmd_ctrl); > >> > >> Please don't add new BUG_ON() calls. WARN_ON() is good enough to raise the driver developer's > >> attention and won't kill the machine. > > > > Not sure a BUG_ON() is worst than a NULL-pointer exception ;-). > > When this really just triggers a NULL-pointer exception, we don't need a BUG_ON or WARN_ON at > all since the kernel can tell anyway what went wrong. Hm, that's not entirely true, depending on your debug options you don't have all the information to guess which line triggered the NULL pointer exception, and this makes it harder to debug. And I agree with Andrey here, it's better to complain at registration time than letting the controller register all its NAND devices and generate exceptions when the NAND is really used. BTW, I don't quite understand the rational behind BUG_ON() eradication. I agree that they should not be used when the driver can recover from a specific failure, but that's not really the case here (some NAND controller drivers don't check nand_scan_tail() or nand_scan() return code). The best solution would probably be to patch all those drivers and then return an error when one of the mandatory hooks is missing, but in the meantime I don't see any problem in adding BUG_ON() calls.
Am 19.07.2016 um 18:12 schrieb Boris Brezillon: >>> Not sure a BUG_ON() is worst than a NULL-pointer exception ;-). >> >> When this really just triggers a NULL-pointer exception, we don't need a BUG_ON or WARN_ON at >> all since the kernel can tell anyway what went wrong. > > Hm, that's not entirely true, depending on your debug options you don't > have all the information to guess which line triggered the NULL pointer > exception, and this makes it harder to debug. > And I agree with Andrey here, it's better to complain at registration > time than letting the controller register all its NAND devices and > generate exceptions when the NAND is really used. > > BTW, I don't quite understand the rational behind BUG_ON() eradication. > I agree that they should not be used when the driver can recover from a > specific failure, but that's not really the case here (some NAND > controller drivers don't check nand_scan_tail() or nand_scan() return > code). I've been told that new code (except core code) should not BUG()/_ON(). > The best solution would probably be to patch all those drivers and then > return an error when one of the mandatory hooks is missing, but in the > meantime I don't see any problem in adding BUG_ON() calls. Yes, definitely. Thanks, //richard
On Tue, Jul 19, 2016 at 9:22 AM, Richard Weinberger <richard@nod.at> wrote: > Am 19.07.2016 um 18:12 schrieb Boris Brezillon: >>>> Not sure a BUG_ON() is worst than a NULL-pointer exception ;-). >>> >>> When this really just triggers a NULL-pointer exception, we don't need a BUG_ON or WARN_ON at >>> all since the kernel can tell anyway what went wrong. >> >> Hm, that's not entirely true, depending on your debug options you don't >> have all the information to guess which line triggered the NULL pointer >> exception, and this makes it harder to debug. >> And I agree with Andrey here, it's better to complain at registration >> time than letting the controller register all its NAND devices and >> generate exceptions when the NAND is really used. >> >> BTW, I don't quite understand the rational behind BUG_ON() eradication. >> I agree that they should not be used when the driver can recover from a >> specific failure, but that's not really the case here (some NAND >> controller drivers don't check nand_scan_tail() or nand_scan() return >> code). > > I've been told that new code (except core code) should not BUG()/_ON(). > >> The best solution would probably be to patch all those drivers and then >> return an error when one of the mandatory hooks is missing, but in the >> meantime I don't see any problem in adding BUG_ON() calls. > > Yes, definitely. I don't have any preferences as far BUG_ON/WARN_ON are concerned and am more than happy to change one for another. The reason I came up with that patch is that I stumbled on that segfault (by not providing custom select_chip() and not setting up cmd_ctrl()) and it took me good 20 minutes to figure out the nature of the problem, whereas, IMHO, having a BUG/WARN statement at the would have been more self-documenting/explanatory. What if I modify the patch to change nand_set_default's signature to return a error code, add corresponding checking in nand_get_flash_type()/nand_scan_ident() and replace BUG_ON with WARN_ON? Would it be more agreeable solution? Andrey
On Tue, 19 Jul 2016 11:11:54 -0700 Andrey Smirnov <andrew.smirnov@gmail.com> wrote: > On Tue, Jul 19, 2016 at 9:22 AM, Richard Weinberger <richard@nod.at> wrote: > > Am 19.07.2016 um 18:12 schrieb Boris Brezillon: > >>>> Not sure a BUG_ON() is worst than a NULL-pointer exception ;-). > >>> > >>> When this really just triggers a NULL-pointer exception, we don't need a BUG_ON or WARN_ON at > >>> all since the kernel can tell anyway what went wrong. > >> > >> Hm, that's not entirely true, depending on your debug options you don't > >> have all the information to guess which line triggered the NULL pointer > >> exception, and this makes it harder to debug. > >> And I agree with Andrey here, it's better to complain at registration > >> time than letting the controller register all its NAND devices and > >> generate exceptions when the NAND is really used. > >> > >> BTW, I don't quite understand the rational behind BUG_ON() eradication. > >> I agree that they should not be used when the driver can recover from a > >> specific failure, but that's not really the case here (some NAND > >> controller drivers don't check nand_scan_tail() or nand_scan() return > >> code). > > > > I've been told that new code (except core code) should not BUG()/_ON(). > > > >> The best solution would probably be to patch all those drivers and then > >> return an error when one of the mandatory hooks is missing, but in the > >> meantime I don't see any problem in adding BUG_ON() calls. > > > > Yes, definitely. > > I don't have any preferences as far BUG_ON/WARN_ON are concerned and > am more than happy to change one for another. > > The reason I came up with that patch is that I stumbled on that > segfault (by not providing custom select_chip() and not setting up > cmd_ctrl()) and it took me good 20 minutes to figure out the nature of > the problem, whereas, IMHO, having a BUG/WARN statement at the would > have been more self-documenting/explanatory. > > What if I modify the patch to change nand_set_default's signature to > return a error code, add corresponding checking in > nand_get_flash_type()/nand_scan_ident() and replace BUG_ON with > WARN_ON? Would it be more agreeable solution? Agreed.
On Tue, Jul 19, 2016 at 06:12:48PM +0200, Boris Brezillon wrote: > On Tue, 19 Jul 2016 18:02:27 +0200 > Richard Weinberger <richard@nod.at> wrote: > > Am 19.07.2016 um 17:59 schrieb Boris Brezillon: > > > On Tue, 19 Jul 2016 17:44:48 +0200 > > > Richard Weinberger <richard@nod.at> wrote: > > >> Am 19.07.2016 um 17:41 schrieb Andrey Smirnov: > > >>> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c > > >>> index ce7b2ca..57043a6 100644 > > >>> --- a/drivers/mtd/nand/nand_base.c > > >>> +++ b/drivers/mtd/nand/nand_base.c > > >>> @@ -3128,8 +3128,10 @@ static void nand_set_defaults(struct nand_chip *chip, int busw) > > >>> if (chip->waitfunc == NULL) > > >>> chip->waitfunc = nand_wait; > > >>> > > >>> - if (!chip->select_chip) > > >>> + if (!chip->select_chip) { > > >>> + BUG_ON(!chip->cmd_ctrl); > > >> > > >> Please don't add new BUG_ON() calls. WARN_ON() is good enough to raise the driver developer's > > >> attention and won't kill the machine. > > > > > > Not sure a BUG_ON() is worst than a NULL-pointer exception ;-). > > > > When this really just triggers a NULL-pointer exception, we don't need a BUG_ON or WARN_ON at > > all since the kernel can tell anyway what went wrong. > > Hm, that's not entirely true, depending on your debug options you don't > have all the information to guess which line triggered the NULL pointer > exception, and this makes it harder to debug. > And I agree with Andrey here, it's better to complain at registration > time than letting the controller register all its NAND devices and > generate exceptions when the NAND is really used. Yes, definitely better to complain at registration. But complaining doesn't have to be BUG_ON(). > BTW, I don't quite understand the rational behind BUG_ON() eradication. > I agree that they should not be used when the driver can recover from a > specific failure, but that's not really the case here (some NAND > controller drivers don't check nand_scan_tail() or nand_scan() return > code). It's really not helpful to anyone to have a single picky/buggy/whatever driver crash the entire system (e.g., on an early prototype board; or while somebody is tinkering and forgets something) when we could perfectly easily just fail to register the driver. There are plenty of other subsystems that do this, and the world hasn't caught fire yet. And regarding the "drivers don't check ... return code": I'm pretty tired of that excuse. I don't want to gate any more correct error handling on the fact that drivers are s**t. > The best solution would probably be to patch all those drivers and then > return an error when one of the mandatory hooks is missing, but in the > meantime I don't see any problem in adding BUG_ON() calls. I do. Regards, Brian
On Tue, Jul 19, 2016 at 08:16:11PM +0200, Boris Brezillon wrote: > On Tue, 19 Jul 2016 11:11:54 -0700 > Andrey Smirnov <andrew.smirnov@gmail.com> wrote: > > > On Tue, Jul 19, 2016 at 9:22 AM, Richard Weinberger <richard@nod.at> wrote: > > > Am 19.07.2016 um 18:12 schrieb Boris Brezillon: > > >>>> Not sure a BUG_ON() is worst than a NULL-pointer exception ;-). > > >>> > > >>> When this really just triggers a NULL-pointer exception, we don't need a BUG_ON or WARN_ON at > > >>> all since the kernel can tell anyway what went wrong. > > >> > > >> Hm, that's not entirely true, depending on your debug options you don't > > >> have all the information to guess which line triggered the NULL pointer > > >> exception, and this makes it harder to debug. > > >> And I agree with Andrey here, it's better to complain at registration > > >> time than letting the controller register all its NAND devices and > > >> generate exceptions when the NAND is really used. > > >> > > >> BTW, I don't quite understand the rational behind BUG_ON() eradication. > > >> I agree that they should not be used when the driver can recover from a > > >> specific failure, but that's not really the case here (some NAND > > >> controller drivers don't check nand_scan_tail() or nand_scan() return > > >> code). > > > > > > I've been told that new code (except core code) should not BUG()/_ON(). > > > > > >> The best solution would probably be to patch all those drivers and then > > >> return an error when one of the mandatory hooks is missing, but in the > > >> meantime I don't see any problem in adding BUG_ON() calls. > > > > > > Yes, definitely. > > > > I don't have any preferences as far BUG_ON/WARN_ON are concerned and > > am more than happy to change one for another. > > > > The reason I came up with that patch is that I stumbled on that > > segfault (by not providing custom select_chip() and not setting up > > cmd_ctrl()) and it took me good 20 minutes to figure out the nature of > > the problem, whereas, IMHO, having a BUG/WARN statement at the would > > have been more self-documenting/explanatory. Would a normal print statement and error return have helped, like most sane drivers? Like: if (!chip->cmd_ctrl) { pr_err("No cmd_ctrl() provided\n"); return -EINVAL; } > > What if I modify the patch to change nand_set_default's signature to > > return a error code, add corresponding checking in > > nand_get_flash_type()/nand_scan_ident() and replace BUG_ON with > > WARN_ON? Would it be more agreeable solution? Sounds better to me, though I still don't see why even WARN_ON() is necessary. I guess we are infected by plenty of those already anyway, since I guess that's easier than writing a descriptive error message... > Agreed. Glad we're on mostly the same page. Brian
On Tue, Jul 19, 2016 at 11:23 AM, Brian Norris <computersforpeace@gmail.com> wrote: > On Tue, Jul 19, 2016 at 08:16:11PM +0200, Boris Brezillon wrote: >> On Tue, 19 Jul 2016 11:11:54 -0700 >> Andrey Smirnov <andrew.smirnov@gmail.com> wrote: >> >> > On Tue, Jul 19, 2016 at 9:22 AM, Richard Weinberger <richard@nod.at> wrote: >> > > Am 19.07.2016 um 18:12 schrieb Boris Brezillon: >> > >>>> Not sure a BUG_ON() is worst than a NULL-pointer exception ;-). >> > >>> >> > >>> When this really just triggers a NULL-pointer exception, we don't need a BUG_ON or WARN_ON at >> > >>> all since the kernel can tell anyway what went wrong. >> > >> >> > >> Hm, that's not entirely true, depending on your debug options you don't >> > >> have all the information to guess which line triggered the NULL pointer >> > >> exception, and this makes it harder to debug. >> > >> And I agree with Andrey here, it's better to complain at registration >> > >> time than letting the controller register all its NAND devices and >> > >> generate exceptions when the NAND is really used. >> > >> >> > >> BTW, I don't quite understand the rational behind BUG_ON() eradication. >> > >> I agree that they should not be used when the driver can recover from a >> > >> specific failure, but that's not really the case here (some NAND >> > >> controller drivers don't check nand_scan_tail() or nand_scan() return >> > >> code). >> > > >> > > I've been told that new code (except core code) should not BUG()/_ON(). >> > > >> > >> The best solution would probably be to patch all those drivers and then >> > >> return an error when one of the mandatory hooks is missing, but in the >> > >> meantime I don't see any problem in adding BUG_ON() calls. >> > > >> > > Yes, definitely. >> > >> > I don't have any preferences as far BUG_ON/WARN_ON are concerned and >> > am more than happy to change one for another. >> > >> > The reason I came up with that patch is that I stumbled on that >> > segfault (by not providing custom select_chip() and not setting up >> > cmd_ctrl()) and it took me good 20 minutes to figure out the nature of >> > the problem, whereas, IMHO, having a BUG/WARN statement at the would >> > have been more self-documenting/explanatory. > > Would a normal print statement and error return have helped, like most > sane drivers? Like: > > if (!chip->cmd_ctrl) { > pr_err("No cmd_ctrl() provided\n"); > return -EINVAL; > } Yes, that would've worked perfectly fine. > >> > What if I modify the patch to change nand_set_default's signature to >> > return a error code, add corresponding checking in >> > nand_get_flash_type()/nand_scan_ident() and replace BUG_ON with >> > WARN_ON? Would it be more agreeable solution? > > Sounds better to me, though I still don't see why even WARN_ON() is > necessary. I guess we are infected by plenty of those already anyway, > since I guess that's easier than writing a descriptive error message... It's not necessary, WARN_ON might be slightly more visible when skimming through dmesg, but pr_err should work as well. I'll use the latter in v2 then. Andrey
On Tue, 19 Jul 2016 11:19:16 -0700 Brian Norris <computersforpeace@gmail.com> wrote: > On Tue, Jul 19, 2016 at 06:12:48PM +0200, Boris Brezillon wrote: > > On Tue, 19 Jul 2016 18:02:27 +0200 > > Richard Weinberger <richard@nod.at> wrote: > > > Am 19.07.2016 um 17:59 schrieb Boris Brezillon: > > > > On Tue, 19 Jul 2016 17:44:48 +0200 > > > > Richard Weinberger <richard@nod.at> wrote: > > > >> Am 19.07.2016 um 17:41 schrieb Andrey Smirnov: > > > >>> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c > > > >>> index ce7b2ca..57043a6 100644 > > > >>> --- a/drivers/mtd/nand/nand_base.c > > > >>> +++ b/drivers/mtd/nand/nand_base.c > > > >>> @@ -3128,8 +3128,10 @@ static void nand_set_defaults(struct nand_chip *chip, int busw) > > > >>> if (chip->waitfunc == NULL) > > > >>> chip->waitfunc = nand_wait; > > > >>> > > > >>> - if (!chip->select_chip) > > > >>> + if (!chip->select_chip) { > > > >>> + BUG_ON(!chip->cmd_ctrl); > > > >> > > > >> Please don't add new BUG_ON() calls. WARN_ON() is good enough to raise the driver developer's > > > >> attention and won't kill the machine. > > > > > > > > Not sure a BUG_ON() is worst than a NULL-pointer exception ;-). > > > > > > When this really just triggers a NULL-pointer exception, we don't need a BUG_ON or WARN_ON at > > > all since the kernel can tell anyway what went wrong. > > > > Hm, that's not entirely true, depending on your debug options you don't > > have all the information to guess which line triggered the NULL pointer > > exception, and this makes it harder to debug. > > And I agree with Andrey here, it's better to complain at registration > > time than letting the controller register all its NAND devices and > > generate exceptions when the NAND is really used. > > Yes, definitely better to complain at registration. But complaining > doesn't have to be BUG_ON(). > > > BTW, I don't quite understand the rational behind BUG_ON() eradication. > > I agree that they should not be used when the driver can recover from a > > specific failure, but that's not really the case here (some NAND > > controller drivers don't check nand_scan_tail() or nand_scan() return > > code). > > It's really not helpful to anyone to have a single picky/buggy/whatever > driver crash the entire system (e.g., on an early prototype board; or > while somebody is tinkering and forgets something) when we could > perfectly easily just fail to register the driver. There are plenty of > other subsystems that do this, and the world hasn't caught fire yet. Hey, I'm already convinced that properly handling error codes in all drivers and core code is the best approach, but we should also patch all the code that does not follow this rule and not only framework code. > > And regarding the "drivers don't check ... return code": I'm pretty > tired of that excuse. I don't want to gate any more correct error > handling on the fact that drivers are s**t. That's a bit unfair. I'm trying to improve things in the NAND framework (and will keep doing so as much as I can), but last time I suggested to patch a driver to properly handle nand_scan_tail() return code instead of ignoring it you said it was not required [1]. You'll say that these drivers have already been used/tested by the people who submitted them, and that they're known to work fine as is, but keeping these old drivers in an unclean state just encourages new comers to submit code reproducing the same mistake, so let's tackle the problem and patch all offenders. > > > The best solution would probably be to patch all those drivers and then > > return an error when one of the mandatory hooks is missing, but in the > > meantime I don't see any problem in adding BUG_ON() calls. > > I do. I'm not against dropping all BUG_ON()/BUG() usage in the NAND/MTD framework, but we should also patch all the offending drivers. Regards, Boris [1]http://thread.gmane.org/gmane.linux.drivers.mtd/66245
Hi, On Tue, Jul 19, 2016 at 08:47:03PM +0200, Boris Brezillon wrote: > On Tue, 19 Jul 2016 11:19:16 -0700 > Brian Norris <computersforpeace@gmail.com> wrote: > > > On Tue, Jul 19, 2016 at 06:12:48PM +0200, Boris Brezillon wrote: > > > On Tue, 19 Jul 2016 18:02:27 +0200 > > > Richard Weinberger <richard@nod.at> wrote: > > > > Am 19.07.2016 um 17:59 schrieb Boris Brezillon: > > > > > On Tue, 19 Jul 2016 17:44:48 +0200 > > > > > Richard Weinberger <richard@nod.at> wrote: > > > > >> Am 19.07.2016 um 17:41 schrieb Andrey Smirnov: > > > > >>> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c > > > > >>> index ce7b2ca..57043a6 100644 > > > > >>> --- a/drivers/mtd/nand/nand_base.c > > > > >>> +++ b/drivers/mtd/nand/nand_base.c > > > > >>> @@ -3128,8 +3128,10 @@ static void nand_set_defaults(struct nand_chip *chip, int busw) > > > > >>> if (chip->waitfunc == NULL) > > > > >>> chip->waitfunc = nand_wait; > > > > >>> > > > > >>> - if (!chip->select_chip) > > > > >>> + if (!chip->select_chip) { > > > > >>> + BUG_ON(!chip->cmd_ctrl); > > > > >> > > > > >> Please don't add new BUG_ON() calls. WARN_ON() is good enough to raise the driver developer's > > > > >> attention and won't kill the machine. > > > > > > > > > > Not sure a BUG_ON() is worst than a NULL-pointer exception ;-). > > > > > > > > When this really just triggers a NULL-pointer exception, we don't need a BUG_ON or WARN_ON at > > > > all since the kernel can tell anyway what went wrong. > > > > > > Hm, that's not entirely true, depending on your debug options you don't > > > have all the information to guess which line triggered the NULL pointer > > > exception, and this makes it harder to debug. > > > And I agree with Andrey here, it's better to complain at registration > > > time than letting the controller register all its NAND devices and > > > generate exceptions when the NAND is really used. > > > > Yes, definitely better to complain at registration. But complaining > > doesn't have to be BUG_ON(). > > > > > BTW, I don't quite understand the rational behind BUG_ON() eradication. > > > I agree that they should not be used when the driver can recover from a > > > specific failure, but that's not really the case here (some NAND > > > controller drivers don't check nand_scan_tail() or nand_scan() return > > > code). > > > > It's really not helpful to anyone to have a single picky/buggy/whatever > > driver crash the entire system (e.g., on an early prototype board; or > > while somebody is tinkering and forgets something) when we could > > perfectly easily just fail to register the driver. There are plenty of > > other subsystems that do this, and the world hasn't caught fire yet. > > Hey, I'm already convinced that properly handling error codes in all > drivers and core code is the best approach, but we should also patch > all the code that does not follow this rule and not only framework code. OK. > > And regarding the "drivers don't check ... return code": I'm pretty > > tired of that excuse. I don't want to gate any more correct error > > handling on the fact that drivers are s**t. > > That's a bit unfair. I'm trying to improve things in the NAND framework > (and will keep doing so as much as I can), but last time I suggested to Sorry if I was unfair there. Wasn't intending to blame you (you didn't write 90%+ of the code); just suggesting different priorities. > patch a driver to properly handle nand_scan_tail() return code instead > of ignoring it you said it was not required [1]. I believe I suggested that it was not required as a dependency for returning errors in the core code. Not that such patches to fix drivers were unwanted. > You'll say that these drivers have already been used/tested by the > people who submitted them, and that they're known to work fine as is, > but keeping these old drivers in an unclean state just encourages new > comers to submit code reproducing the same mistake, so let's tackle the > problem and patch all offenders. Patching all offenders is a great goal. Reviewing new drivers to avoid repeating mistakes is good too. And the former can affect the latter, certainly. I just don't want the poor drivers to be an excuse for doing things poorly in the core. > > > The best solution would probably be to patch all those drivers and then > > > return an error when one of the mandatory hooks is missing, but in the > > > meantime I don't see any problem in adding BUG_ON() calls. > > > > I do. > > I'm not against dropping all BUG_ON()/BUG() usage in the NAND/MTD > framework, but we should also patch all the offending drivers. Yes, we're in agreement on that point then. Brian
diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c index ce7b2ca..57043a6 100644 --- a/drivers/mtd/nand/nand_base.c +++ b/drivers/mtd/nand/nand_base.c @@ -3128,8 +3128,10 @@ static void nand_set_defaults(struct nand_chip *chip, int busw) if (chip->waitfunc == NULL) chip->waitfunc = nand_wait; - if (!chip->select_chip) + if (!chip->select_chip) { + BUG_ON(!chip->cmd_ctrl); chip->select_chip = nand_select_chip; + } /* set for ONFI nand */ if (!chip->onfi_set_features)
If no user specified chip->select_chip() function is provided, code in nand_base.c will automatically set this hook to nand_select_chip(), which in turn depends on chip->cmd_ctrl() hook being valid. Not providing both of those functions in NAND controller driver (for example by mistake) will result in a bit cryptic segfault. Replace it with explicit BUG_ON statement so it would be obvious what went wrong. Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> --- drivers/mtd/nand/nand_base.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)