Message ID | 1337589758-8775-3-git-send-email-johan.gunnarsson@axis.com |
---|---|
State | Superseded, archived |
Headers | show |
On Mon, 2012-05-21 at 10:42 +0200, Johan Gunnarsson wrote: > @@ -533,19 +541,29 @@ static void panic_nand_wait_ready(struct mtd_info *mtd, unsigned long timeo) > void nand_wait_ready(struct mtd_info *mtd) > { > struct nand_chip *chip = mtd->priv; > - unsigned long timeo = jiffies + 2; > + struct hrtimer timer; > > /* 400ms timeout */ > if (in_interrupt() || oops_in_progress) > return panic_nand_wait_ready(mtd, 400); > > led_trigger_event(nand_led_trigger, LED_FULL); > + > + /* Arm timeout timer for 20ms timeout */ > + hrtimer_init(&timer, CLOCK_REALTIME, HRTIMER_MODE_REL); > + timer.function = nand_wait_timeout_callback; > + hrtimer_start(&timer, ns_to_ktime(20 * 1000 * 1000), > + HRTIMER_MODE_REL); > + > /* Wait until command is processed or timeout occurs */ > do { > if (chip->dev_ready(mtd)) > break; > touch_softlockup_watchdog(); > - } while (time_before(jiffies, timeo)); Si this function waits for the chip, but if we cannot access the "ready" pin - it waits for 400msecs? Where this number 400 comes from? Why not 500? How many chips we have which do not provide "dev_ready()" function? Can we simplify this function and assume everyone has "dev_ready()" and add BUG_ON(!chip->dev_ready) - could you make a small research ? Or can we do like this: if (!dev_ready()) { msleep(400); return } do { } while Why should we iterate if "dev_ready" is NULL? My point is that this stuff ancient horror which needs love and cleaning up. Your does not make it less scary. I'd prefer to first clean the whole thing up, and then fix it. I'm CCing Brian who spent a lot of time digging nand_base.c recently, he has probably got some ideas.
> -----Original Message----- > From: Artem Bityutskiy [mailto:dedekind1@gmail.com] > Sent: den 22 maj 2012 09:53 > To: Johan Gunnarsson; Brian Norris > Cc: linux-mtd@lists.infradead.org; Jesper Nilsson > Subject: Re: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in > nand_wait{_ready, } > > On Mon, 2012-05-21 at 10:42 +0200, Johan Gunnarsson wrote: > > @@ -533,19 +541,29 @@ static void panic_nand_wait_ready(struct > mtd_info *mtd, unsigned long timeo) > > void nand_wait_ready(struct mtd_info *mtd) > > { > > struct nand_chip *chip = mtd->priv; > > - unsigned long timeo = jiffies + 2; > > + struct hrtimer timer; > > > > /* 400ms timeout */ > > if (in_interrupt() || oops_in_progress) > > return panic_nand_wait_ready(mtd, 400); > > > > led_trigger_event(nand_led_trigger, LED_FULL); > > + > > + /* Arm timeout timer for 20ms timeout */ > > + hrtimer_init(&timer, CLOCK_REALTIME, HRTIMER_MODE_REL); > > + timer.function = nand_wait_timeout_callback; > > + hrtimer_start(&timer, ns_to_ktime(20 * 1000 * 1000), > > + HRTIMER_MODE_REL); > > + > > /* Wait until command is processed or timeout occurs */ > > do { > > if (chip->dev_ready(mtd)) > > break; > > touch_softlockup_watchdog(); > > - } while (time_before(jiffies, timeo)); > > Si this function waits for the chip, but if we cannot access the > "ready" > pin - it waits for 400msecs? Where this number 400 comes from? Why not > 500? How many chips we have which do not provide "dev_ready()" > function? Not an expert on the MTD framework -- but I believe the 400 comes from the comment above nand_wait(): "Wait for command done. This applies to erase and program only. Erase can take up to 400ms and program up to 20ms according to general NAND and SmartMedia specs." > Can we simplify this function and assume everyone has "dev_ready()" and > add BUG_ON(!chip->dev_ready) - could you make a small research ? Or can The drivers/mtd/nand/h1910.c driver sets dev_ready to NULL on init. The others seem to have dev_ready implementations. > we do like this: > > if (!dev_ready()) { > msleep(400); > return > } > > do { > } while > > Why should we iterate if "dev_ready" is NULL? Actually, it's worse than that. If dev_ready is NULL (as in the h1910.c case) we'll crash the kernel, right? > > My point is that this stuff ancient horror which needs love and > cleaning > up. Your does not make it less scary. I'd prefer to first clean the > whole thing up, and then fix it. > > I'm CCing Brian who spent a lot of time digging nand_base.c recently, > he > has probably got some ideas. > > -- > Best Regards, > Artem Bityutskiy
On Tue, 2012-05-22 at 10:52 +0200, Johan Gunnarsson wrote: > Not an expert on the MTD framework -- but I believe the 400 comes from > the comment above nand_wait(): "Wait for command done. This applies to > erase and program only. Erase can take up to 400ms and program up to > 20ms according to general NAND and SmartMedia specs." Most probably this legacy can be killed along with the IPAQ h1910 driver you pointed. But could you please start with cleaning these functions up and turning the "!chip->dev_ready()" case into a simple msleep() or mdelay() for the panic case? Then I may get some ideas how to get rid of the jiffies without using hrtimer.
> -----Original Message----- > From: Artem Bityutskiy [mailto:dedekind1@gmail.com] > Sent: den 22 maj 2012 12:25 > To: Johan Gunnarsson > Cc: Brian Norris; linux-mtd@lists.infradead.org; Jesper Nilsson > Subject: RE: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in > nand_wait{_ready, } > > On Tue, 2012-05-22 at 10:52 +0200, Johan Gunnarsson wrote: > > Not an expert on the MTD framework -- but I believe the 400 comes > from > > the comment above nand_wait(): "Wait for command done. This applies > to > > erase and program only. Erase can take up to 400ms and program up to > > 20ms according to general NAND and SmartMedia specs." > > Most probably this legacy can be killed along with the IPAQ h1910 > driver you pointed. But could you please start with cleaning these > functions up and turning the "!chip->dev_ready()" case into a simple > msleep() or > mdelay() for the panic case? > I don't understand what clean-up is needed here. Can you please elaborate? The case you mentioned earlier doesn't really make sense to me: if (!dev_ready()) { msleep(400); return } Making a single call to dev_ready() and giving up if FALSE is wrong IMHO, since we very much want to poll dev_ready() and return as soon as it turns TRUE (or timeout.) Assuming you actually mean "if(!dev_ready)) {" also doesn't make sense since checks like that exists on all path leading to nand_wait_ready().
On Tue, May 22, 2012 at 09:52:22AM +0100, Johan Gunnarsson wrote: > > -----Original Message----- > > From: Artem Bityutskiy [mailto:dedekind1@gmail.com] > > Sent: den 22 maj 2012 09:53 > > To: Johan Gunnarsson; Brian Norris > > Cc: linux-mtd@lists.infradead.org; Jesper Nilsson > > Subject: Re: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in > > nand_wait{_ready, } (...) > > Si this function waits for the chip, but if we cannot access the > > "ready" > > pin - it waits for 400msecs? Where this number 400 comes from? Why not > > 500? How many chips we have which do not provide "dev_ready()" > > function? > > Not an expert on the MTD framework -- but I believe the 400 comes from the comment above nand_wait(): "Wait for command done. This applies to erase and program only. Erase can take up to 400ms and program up to 20ms according to general NAND and SmartMedia specs." > Hi Johan and Artem, Current NAND devices require approximately 1-3 ms per block erase, and 100-200 us per page program operation. But I think this timeout is only useful to detect and recover from broken hardware: IMHO there is no point in trying to "optimize" the timeout delay to 20ms or 400ms depending on which operation is being done (program or erase). Why not simplify and just use a single 1s (1000ms) timeout ? Johan, with such a timeout value, your bug would manifest itself only if interrupts are blocked for nearly a second: would that be acceptable ? > > Can we simplify this function and assume everyone has "dev_ready()" and > > add BUG_ON(!chip->dev_ready) - could you make a small research ? Or can > > The drivers/mtd/nand/h1910.c driver sets dev_ready to NULL on init. The others seem to have dev_ready implementations. > > > we do like this: > > > > if (!dev_ready()) { > > msleep(400); > > return > > } > > > > do { > > } while > > > > Why should we iterate if "dev_ready" is NULL? nand_wait() polls the device until it is ready or a timeout has elapsed. More details here: http://lists.infradead.org/pipermail/linux-mtd/2011-June/036795.html > Actually, it's worse than that. If dev_ready is NULL (as in the h1910.c case) we'll crash the kernel, right? > > > > > My point is that this stuff ancient horror which needs love and > > cleaning > > up. Your does not make it less scary. I'd prefer to first clean the > > whole thing up, and then fix it. I agree it needs some cleaning; here are a few things I have noticed: * panic_nand_wait() and panic_nand_wait_ready() are nearly identical * Some patterns are repeated: /* Apply delay or wait for ready/busy pin */ if (!chip->dev_ready) udelay(chip->chip_delay); else nand_wait_ready(mtd); * the udelay() fallback should probably be phased out if only one driver (h1910) depends on it because it does not define chip->dev_ready() ? * nand_wait() and nand_wait_ready() use different strategies to wait: - nand_wait_ready = busy loop - nand_wait = busy loop with cond_resched call Presumably, this is because nand_wait_ready() should not wait for more than ~25 us (typically), so it is more efficient to avoid the cond_resched() call ? * We could separate dev_ready() and STATUS polling inside nand_wait(), or even better, directly call nand_wait_ready() from nand_wait() ? The only difference between the 2 functions (besides cond_resched) is that nand_wait_ready() does not send any STATUS command: as a consequence, it does not interfere with the device mode (polling the device with STATUS reads while waiting during a read operation requires re-sending a READ0 command at the end to return to read mode). I would suggest the following approach: 1. Use a single timeout value (1s for instance). 2. Rewrite and simplify nand_wait() as follows: a) loop until dev_ready() returns true OR timeout b) send a STATUS command c) loop until status is ready OR timeout, and return NAND status The idea is that in most cases, dev_ready() will be defined and functional, and the loop in c) will finish immediately. If dev_ready() is not available or defective, step c) still ensures we properly wait. Step a) could be replaced with a call to a modified version of nand_wait_ready() with optional cond_resched-uling. What do you think ? Any suggestions ? I'm willing to submit patches if we reach a consensus on this, Best regards, -- Ivan
On Tue, 2012-05-22 at 19:10 +0200, Ivan Djelic wrote: > Current NAND devices require approximately 1-3 ms per block erase, and 100-200 us per page program operation. > But I think this timeout is only useful to detect and recover from broken hardware: IMHO there is no point in > trying to "optimize" the timeout delay to 20ms or 400ms depending on which operation is being done (program or erase). > > Why not simplify and just use a single 1s (1000ms) timeout ? I have many things on my plate now so my answers are are sloppy - I spent 1 minute or less looking at the code. But I thought that this timeout is important for the the case when 'chip->device_ready' is NULL. This is why I wanted to get rid of that case, and then the timeout would only become about detecting the forever loops. And this is exactly what I meant by saying that I have further idea on improving that without hrtimer - just have one single 1 second timeout. We indeed do not have to be precise for the errors detection. Oops, SIGBABY.
Hi, On Tue, May 22, 2012 at 10:10 AM, Ivan Djelic <ivan.djelic@parrot.com> wrote: > On Tue, May 22, 2012 at 09:52:22AM +0100, Johan Gunnarsson wrote: >> > -----Original Message----- >> > From: Artem Bityutskiy [mailto:dedekind1@gmail.com] >> > Sent: den 22 maj 2012 09:53 >> > To: Johan Gunnarsson; Brian Norris >> > Cc: linux-mtd@lists.infradead.org; Jesper Nilsson >> > Subject: Re: [PATCH 2/2] mtd: nand: use hrtimer to measure timeout in >> > nand_wait{_ready, } > > (...) > >> > Si this function waits for the chip, but if we cannot access the >> > "ready" >> > pin - it waits for 400msecs? Where this number 400 comes from? Why not >> > 500? How many chips we have which do not provide "dev_ready()" >> > function? > > Current NAND devices require approximately 1-3 ms per block erase, and 100-200 us per page program operation. > But I think this timeout is only useful to detect and recover from broken hardware: IMHO there is no point in > trying to "optimize" the timeout delay to 20ms or 400ms depending on which operation is being done (program or erase). > > Why not simplify and just use a single 1s (1000ms) timeout ? > >> > Can we simplify this function and assume everyone has "dev_ready()" and >> > add BUG_ON(!chip->dev_ready) - could you make a small research ? Or can >> >> The drivers/mtd/nand/h1910.c driver sets dev_ready to NULL on init. The others seem to have dev_ready implementations. Not true: there are several drivers which do not have dev_ready, but this is also because they are controller-based, so they provide their own cmdfunc as well, mostly avoiding the nand_wait_ready stuff. (Drivers: denali.c, docg4.c, my out-of-tree driver, ...) Note: another use of nand_wait_ready is in nand_do_read_ops and nand_do_read_oob, under the following condition: if (!(chip->options & NAND_NO_READRDY)) { Isn't NAND_NO_READRDY no longer needed, since I killed autoincrement? (i.e., we *always* "do not support autoincrement", as its comment says in nand.h) So if we kill this option, we can kill the only use of nand_wait_ready outside of cmdfunc(); this brings us closer to being able to using BUG_ON(!chip->dev_ready). I'll send a patch for this, and you guys can comment. >> Actually, it's worse than that. If dev_ready is NULL (as in the h1910.c case) we'll crash the kernel, right? No, I don't think so. It looks like nand_wait_ready is never called when dev_ready is NULL. >> > >> > My point is that this stuff ancient horror which needs love and >> > cleaning >> > up. Your does not make it less scary. I'd prefer to first clean the >> > whole thing up, and then fix it. > > I agree it needs some cleaning; here are a few things I have noticed: Even though I'm mostly not a user of this code, I agree :) > * Some patterns are repeated: > /* Apply delay or wait for ready/busy pin */ > if (!chip->dev_ready) > udelay(chip->chip_delay); > else > nand_wait_ready(mtd); I'm going to kill this pattern, I think. > * the udelay() fallback should probably be phased out if only one driver (h1910) depends on it because it does not define chip->dev_ready() ? Clarification: only one driver does leaves both chip->dev_ready and chip->cmdfunc NULL, right? > What do you think ? Any suggestions ? > I'm willing to submit patches if we reach a consensus on this, Feel free to send a patch; it's probably easier to talk more concretely about a patch that implements your (seemingly reasonable) suggestions, and then provide comments based on concrete ideas. Brian
> Why not simplify and just use a single 1s (1000ms) timeout ? > > Johan, > with such a timeout value, your bug would manifest itself only if > interrupts are blocked for nearly a second: would that be acceptable ? That indeed would be acceptable from my side.
On Wed, May 23, 2012 at 07:39:51AM +0100, Brian Norris wrote: (...) > >> > Can we simplify this function and assume everyone has "dev_ready()" and > >> > add BUG_ON(!chip->dev_ready) - could you make a small research ? Or can > >> > >> The drivers/mtd/nand/h1910.c driver sets dev_ready to NULL on init. The others seem to have dev_ready implementations. > > Not true: there are several drivers which do not have dev_ready, but > this is also because they are controller-based, so they provide their > own cmdfunc as well, mostly avoiding the nand_wait_ready stuff. > (Drivers: denali.c, docg4.c, my out-of-tree driver, ...) > > Note: another use of nand_wait_ready is in nand_do_read_ops and > nand_do_read_oob, under the following condition: > if (!(chip->options & NAND_NO_READRDY)) { > Isn't NAND_NO_READRDY no longer needed, since I killed autoincrement? > (i.e., we *always* "do not support autoincrement", as its comment says > in nand.h) So if we kill this option, we can kill the only use of > nand_wait_ready outside of cmdfunc(); this brings us closer to being > able to using BUG_ON(!chip->dev_ready). I'll send a patch for this, > and you guys can comment. Agreed. > >> Actually, it's worse than that. If dev_ready is NULL (as in the h1910.c case) we'll crash the kernel, right? > > No, I don't think so. It looks like nand_wait_ready is never called > when dev_ready is NULL. True, but since nand_wait_ready() is a public function, it could be called from a driver that has (dev_ready == NULL), resulting in a crash. So all drivers must make sure they define dev_ready if they are going to call nand_wait_ready() (like cafe_nand.c). (...) > > > * the udelay() fallback should probably be phased out if only one driver (h1910) depends on it because it does not define chip->dev_ready() ? > > Clarification: only one driver does leaves both chip->dev_ready and > chip->cmdfunc NULL, right? Yes, exactly. I meant that we could get rid of the udelay in the default chip->cmdfunc implementations (nand_command and nand_command_lp). In other words, if chip->cmdfunc == NULL, we require chip->dev_ready != NULL. And move the existing udelay in an implementation of chip->dev_ready for the h1910 driver. > Feel free to send a patch; it's probably easier to talk more > concretely about a patch that implements your (seemingly reasonable) > suggestions, and then provide comments based on concrete ideas. Will do, Thanks, -- Ivan
diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c index b927e64..0db920b 100644 --- a/drivers/mtd/nand/nand_base.c +++ b/drivers/mtd/nand/nand_base.c @@ -47,6 +47,7 @@ #include <linux/bitops.h> #include <linux/leds.h> #include <linux/io.h> +#include <linux/hrtimer.h> #include <linux/mtd/partitions.h> /* Define default oob placement schemes for large and small page devices */ @@ -99,6 +100,13 @@ static int nand_get_device(struct nand_chip *chip, struct mtd_info *mtd, static int nand_do_write_oob(struct mtd_info *mtd, loff_t to, struct mtd_oob_ops *ops); +/** + * NOP nand_wait timeout callback. + */ +static enum hrtimer_restart nand_wait_timeout_callback(struct hrtimer* timer) { + return HRTIMER_NORESTART; +} + /* * For devices which display every fart in the system on a separate LED. Is * compiled away when LED support is disabled. @@ -533,19 +541,29 @@ static void panic_nand_wait_ready(struct mtd_info *mtd, unsigned long timeo) void nand_wait_ready(struct mtd_info *mtd) { struct nand_chip *chip = mtd->priv; - unsigned long timeo = jiffies + 2; + struct hrtimer timer; /* 400ms timeout */ if (in_interrupt() || oops_in_progress) return panic_nand_wait_ready(mtd, 400); led_trigger_event(nand_led_trigger, LED_FULL); + + /* Arm timeout timer for 20ms timeout */ + hrtimer_init(&timer, CLOCK_REALTIME, HRTIMER_MODE_REL); + timer.function = nand_wait_timeout_callback; + hrtimer_start(&timer, ns_to_ktime(20 * 1000 * 1000), + HRTIMER_MODE_REL); + /* Wait until command is processed or timeout occurs */ do { if (chip->dev_ready(mtd)) break; touch_softlockup_watchdog(); - } while (time_before(jiffies, timeo)); + } while (ktime_to_ns(hrtimer_get_expires(&timer)) > 0); + + hrtimer_cancel(&timer); + led_trigger_event(nand_led_trigger, LED_OFF); } EXPORT_SYMBOL_GPL(nand_wait_ready); @@ -868,7 +886,6 @@ static void panic_nand_wait(struct mtd_info *mtd, struct nand_chip *chip, static int nand_wait(struct mtd_info *mtd, struct nand_chip *chip) { unsigned long timeout_ms; - unsigned long timeo = jiffies; int status, state = chip->state; if (state == FL_ERASING) @@ -876,8 +893,6 @@ static int nand_wait(struct mtd_info *mtd, struct nand_chip *chip) else timeout_ms = 20; - timeo += (HZ * timeout_ms) / 1000; - led_trigger_event(nand_led_trigger, LED_FULL); /* @@ -894,7 +909,14 @@ static int nand_wait(struct mtd_info *mtd, struct nand_chip *chip) if (in_interrupt() || oops_in_progress) panic_nand_wait(mtd, chip, timeout_ms); else { - while (time_before(jiffies, timeo)) { + struct hrtimer timer; + + hrtimer_init(&timer, CLOCK_REALTIME, HRTIMER_MODE_REL); + timer.function = nand_wait_timeout_callback; + hrtimer_start(&timer, ns_to_ktime(timeout_ms * 1000 * 1000), + HRTIMER_MODE_REL); + + while(ktime_to_ns(hrtimer_get_expires(&timer)) > 0) { if (chip->dev_ready) { if (chip->dev_ready(mtd)) break; @@ -904,6 +926,8 @@ static int nand_wait(struct mtd_info *mtd, struct nand_chip *chip) } cond_resched(); } + + hrtimer_cancel(&timer); } led_trigger_event(nand_led_trigger, LED_OFF);
Previous, jiffy-based, implementation had three issues: 1. For low HZ values (<100 and >=50) the timeout would be 1 jiffy, which can cause premature timeouts if a timer interrupt happens right after the "timeo" value is assigned. 2. For very low HZ values (<50) the timeout is 0 jiffies, which also cause premature timeouts. 3. The jiffies counter is not reliable on multicore systems when interrupts are disabled for long times (a couple of timer interrupt periods). For example with excessive printk and serial output in interrupt context. The jiffies counting stops during the period with disabled interrupts and recovers by counting up all lost jiffy increments very, very fast when interrupts are later enabled. This, together with bad luck, can cause a third type of premature timeouts. All three issues can cause good blocks being marked bad. Signed-off-by: Johan Gunnarsson <johan.gunnarsson@axis.com> --- drivers/mtd/nand/nand_base.c | 36 ++++++++++++++++++++++++++++++------ 1 files changed, 30 insertions(+), 6 deletions(-)