Message ID | CAN8TOE-jYm3pRa3utwnqYRB=DonxboW7PfFq-CqdqG+_OzCh9A@mail.gmail.com |
---|---|
State | RFC |
Headers | show |
On Tue, 23 Aug 2011, Brian Norris wrote: > On Mon, Aug 22, 2011 at 11:48 PM, Ricard Wanderlof > <ricard.wanderlof@axis.com> wrote: >>>>> On Wed, 2011-08-17 at 16:50 -0700, Brian Norris wrote: >>>>>> 1) Broken "noecc" writes; `nandwrite -n -o' does not work on my >>>>>> hardware, at least, without these patches >> >> This particular issue was fixed in a patch by Peter Wippich on June 6th: >> >> http://lists.infradead.org/pipermail/linux-mtd/2011-June/036016.html > > Right, I actually found this patch already and have it lined up in my > test kernel, but apparently my issue is separate. > >> I could test patch 1 of this set to see if it accomplishes the same thing, >> but initially to me it seems that the issue has already been fixed, albeit >> perhaps not pushed yet? > > I am now doubting that my fix will solve your problem like Peter > Wippich's patch did. I believe the problem I am having is specifically > related to my NAND controller's characteristics (how it handles ECC). > > So for hardware on which Peter's fix solves your problems, I suppose I > would be interested mostly in seeing if my patch *breaks* your build. > FYI, I already see an issue for those who use both the builtin > `nand_write_page_raw()' and `nand_write_buf()' functions. It can be > fixed if you amend my patch with the following. I can resend later, > but test these for breakage if you can! Ok. I tried applying patch #1 of your patch set, to a tree in which Peter's patch has already been applying, to see what would happen. I had a problem in that in the mtdchar.c I have it looks like this: memset(chip->oob_poi, 0xff, mtd->oobsize); nand_fill_oob(chip, ops->oobbuf, ops->ooblen, ops); status = chip->ecc.write_oob(mtd, chip, page & chip->pagemask); memset(chip->oob_poi, 0xff, mtd->oobsize); whereas your patch looks like it was made against a version which lacks the memsets. First I thought it was because I was running an older kernel (2.6.35), but I looked at HEAD of the linux-2.6 and mtd-2.6 trees at git.infradead.org, and it's the same there. So I'm not sure exactly which version your patch was made against. Perhaps it's obvious to someone but not me right now. Nevertheless, after applying the patch, as you suspected, using 'nandwrite -o -n' fails, in this case the application hangs after outputting Writing data to block 0 After applying the following patch from your email, 'nandwrite -o -n' initially appears to work, however upon closer inspection the oob's of all the written pages are all-ff, i.e. the oob's are never written. > --- a/drivers/mtd/nand/nand_base.c > +++ b/drivers/mtd/nand/nand_base.c > @@ -1898,7 +1898,8 @@ out: > static void nand_write_page_raw(struct mtd_info *mtd, struct nand_chip *chip, > const uint8_t *buf) > { > - chip->write_buf(mtd, buf, mtd->writesize); > + if (buf) > + chip->write_buf(mtd, buf, mtd->writesize); > chip->write_buf(mtd, chip->oob_poi, mtd->oobsize); > } > Now, this may all be nonsense as as I mentioned I'm not sure that I've had the right version to start with. The memsets bug me, as they would explain the all-ff's stuff, but I don't really feel like just experimenting. /Ricard
On Wed, Aug 24, 2011 at 8:36 AM, Ricard Wanderlof <ricard.wanderlof@axis.com> wrote: ... > I had a problem > in that in the mtdchar.c I have it looks like this: Did you mean nand_base.c? > whereas your patch looks like it was made against a version which lacks the > memsets. First I thought it was because I was running an older kernel > (2.6.35), but I looked at HEAD of the linux-2.6 and mtd-2.6 trees at > git.infradead.org, and it's the same there. So I'm not sure exactly which > version your patch was made against. Perhaps it's obvious to someone but not > me right now. My patches were based on l2-mtd-2.6.git, actually. David Woodhouse rarely pulls patches into his mtd-2.6 tree, so I have moved to working with Artem's l2-mtd-2.6 tree, where all the MTD work that's waiting for upstream sits (some stuff's been there since May). This is not obvious, and usually when it matters, I try to mention it in the patch summaries. Artem: is there any official change in policy on patch submission? I see documentation that says to base off mtd-2.6.git, but I've been using l2-mtd-2.6 to help you avoid merge conflicts: http://www.linux-mtd.infradead.org/source.html Anyway, I believe the relevant patch dependency (from l2-mtd-2.6.git) is: commit a8ee364bbf14861d5d0af39c4da06c30441895fb Author: THOMSON, Adam (Adam) <adam.thomson@alcatel-lucent.com> mtd: nand_base: always initialise oob_poi before writing OOB data > Nevertheless, after applying the patch, as you suspected, using 'nandwrite > -o -n' fails, in this case the application hangs after outputting > > Writing data to block 0 I guessed you would be more likely to get a segmentation fault on a NULL pointer, but I may be wrong. > Now, this may all be nonsense as as I mentioned I'm not sure that I've had > the right version to start with. The memsets bug me, as they would explain > the all-ff's stuff, but I don't really feel like just experimenting. Right, that's fair enough. I still think this patch is not 100% ready, anyway. Thanks for the help though! If you're still up for trying, you can try applying Adam Thomson's patch, then my RFC, then the inlined amendment I sent. With your feedback, I will I need to figure out structurally how to handle the differences between hardware that uses the standard functions (and works fine without my patch) and hardware like mine that needs more information passed to it. Perhaps there needs to be a replaceable `chip->ecc.write_oob_raw' function? Or at least some modifications to the other "replaceable" raw functions. Brian
On Wed, 2011-08-24 at 11:01 -0700, Brian Norris wrote: > On Wed, Aug 24, 2011 at 8:36 AM, Ricard Wanderlof > <ricard.wanderlof@axis.com> wrote: > ... > > I had a problem > > in that in the mtdchar.c I have it looks like this: > > Did you mean nand_base.c? > > > whereas your patch looks like it was made against a version which lacks the > > memsets. First I thought it was because I was running an older kernel > > (2.6.35), but I looked at HEAD of the linux-2.6 and mtd-2.6 trees at > > git.infradead.org, and it's the same there. So I'm not sure exactly which > > version your patch was made against. Perhaps it's obvious to someone but not > > me right now. > > My patches were based on l2-mtd-2.6.git, actually. David Woodhouse > rarely pulls patches into his mtd-2.6 tree, so I have moved to working > with Artem's l2-mtd-2.6 tree, where all the MTD work that's waiting > for upstream sits (some stuff's been there since May). This is not > obvious, and usually when it matters, I try to mention it in the patch > summaries. David's tree is desperately out-of-date now, I did not talk to him lately, he is not very reachable now. There are patches from May because David did merge anything this merge window, probably he had some issues/etc, let's hope he'll merge everything next merge window. May be he wanted to ask me to merge it, but I have been having vacation and was not available at the IRC chat. > Artem: is there any official change in policy on patch submission? I > see documentation that says to base off mtd-2.6.git, but I've been > using l2-mtd-2.6 to help you avoid merge conflicts: > http://www.linux-mtd.infradead.org/source.html There is no official policy, this all works because enthusiasts who just like MTD stuff and keep it alive. When I noticed that dwmw2 does not give MTD ML enough attention, I just started my l2 tree to help him - it was faster/easier for him to look with reviewed patches in my tree rather than look through whole MTD ML, find out which acks/reviewed-by to add and where, which patch versions are out of date, etc. At this point I think, that you have to use the l2 tree, because David's tree is very out-of-date. Also, beware that the l2 tree is currently in linux-next.
On Wed, 24 Aug 2011, Brian Norris wrote: > On Wed, Aug 24, 2011 at 8:36 AM, Ricard Wanderlof > <ricard.wanderlof@axis.com> wrote: > ... >> I had a problem >> in that in the mtdchar.c I have it looks like this: > > Did you mean nand_base.c? Yes, sorry about that. >> whereas your patch looks like it was made against a version which lacks the >> memsets. First I thought it was because I was running an older kernel >> ... > My patches were based on l2-mtd-2.6.git, actually. David Woodhouse > rarely pulls patches into his mtd-2.6 tree, so I have moved to working > with Artem's l2-mtd-2.6 tree, where all the MTD work that's waiting > for upstream sits (some stuff's been there since May). This is not > obvious, and usually when it matters, I try to mention it in the patch > summaries. I guess I should have thought of it but never thought to look there. > Anyway, I believe the relevant patch dependency (from l2-mtd-2.6.git) is: > > commit a8ee364bbf14861d5d0af39c4da06c30441895fb > Author: THOMSON, Adam (Adam) <adam.thomson@alcatel-lucent.com> > mtd: nand_base: always initialise oob_poi before writing OOB data That seems right. >> Nevertheless, after applying the patch, as you suspected, using 'nandwrite >> -o -n' fails, in this case the application hangs after outputting >> >> Writing data to block 0 > > I guessed you would be more likely to get a segmentation fault on a > NULL pointer, but I may be wrong. Something seems to hang at a lower level, because the system becomes unresponsive at this point (I can telnet in, but trying to access the flash with ls for instance causes the shell to hang). > If you're still up for trying, you can try applying Adam Thomson's > patch, then my RFC, then the inlined amendment I sent. With your > feedback, I will (something seems missing here ... nevertheless) I applied the patches as you mentioned, which brought success. Dumping a partition with nanddump, then writing it back with nandwrite -o -n results in the correct data being written both to the main and oob areas. (Without your inlinend patch, the nandwrite application still hangs after Writing to data block 0). So the conclusion would be that this combination of patches does not break Peter Wippich's patch. /Ricard
Hi Ricard, On Thu, Aug 25, 2011 at 2:33 AM, Ricard Wanderlof <ricard.wanderlof@axis.com> wrote: > On Wed, 24 Aug 2011, Brian Norris wrote: >> On Wed, Aug 24, 2011 at 8:36 AM, Ricard Wanderlof <ricard.wanderlof@axis.com> wrote: >> Anyway, I believe the relevant patch dependency (from l2-mtd-2.6.git) is: >> >> commit a8ee364bbf14861d5d0af39c4da06c30441895fb >> Author: THOMSON, Adam (Adam) <adam.thomson@alcatel-lucent.com> >> mtd: nand_base: always initialise oob_poi before writing OOB data > > That seems right. BTW, that patch seems a bit broken to me; I sent a fixup for it: http://lists.infradead.org/pipermail/linux-mtd/2011-August/037698.html >> I guessed you would be more likely to get a segmentation fault on a >> NULL pointer, but I may be wrong. > > Something seems to hang at a lower level, because the system becomes > unresponsive at this point (I can telnet in, but trying to access the flash > with ls for instance causes the shell to hang). I think that I had a little bit of the wrong approach. I was doing some ill advised hacking to the existing write functions. I spun off my first two patches here as a different series (I believe I CC'd you); I used a different approach that should make as little impact on currently working hardware as possible: http://lists.infradead.org/pipermail/linux-mtd/2011-August/037695.html > (something seems missing here ... Acknowledged :) > I applied the patches as you mentioned, which brought success. Dumping a > partition with nanddump, then writing it back with nandwrite -o -n results > in the correct data being written both to the main and oob areas. Great! > (Without your inlinend patch, the nandwrite application still hangs after > Writing to data block 0). Not quite sure what the hanging is all about - may be related to my improper use of CMD_SEQIN, then a nand wait function being called later? - but that's why I've changed my approach. > So the conclusion would be that this combination of patches does not break > Peter Wippich's patch. Thanks a lot for the testing. I think that my first approach still may easily have unintended consequences, though. I welcome any testing on my new patch series, and any more systems with broken "noecc" should be handled through that thread. Brian
On Thu, 25 Aug 2011, Brian Norris wrote: >>> Author: THOMSON, Adam (Adam) <adam.thomson@alcatel-lucent.com> >>> mtd: nand_base: always initialise oob_poi before writing OOB data >> >> That seems right. > > BTW, that patch seems a bit broken to me; I sent a fixup for it: > http://lists.infradead.org/pipermail/linux-mtd/2011-August/037698.html Thanks, I did see it but it failed to register. I've applied it to my tree now. > I think that I had a little bit of the wrong approach. I was doing > some ill advised hacking to the existing write functions. I spun off > my first two patches here as a different series (I believe I CC'd > you); I used a different approach that should make as little impact on > currently working hardware as possible: > http://lists.infradead.org/pipermail/linux-mtd/2011-August/037695.html > > Thanks a lot for the testing. I think that my first approach still may > easily have unintended consequences, though. I welcome any testing on my > new patch series, and any more systems with broken "noecc" should be > handled through that thread. I applied the two patches in the above mentioned patch series to my tree (after applying Peter's patch, Adam's, and your amendment to it). Conclusion: nandwrite -o -n still works for me, i.e. the patch doesn't break it. One little caveat, the tree I'm using is 2.6.35 based. The patches don't apply cleanly, but it just takes a little effort. So I think the results are relevant, just wanted to mention it. /Ricard
--- a/drivers/mtd/nand/nand_base.c +++ b/drivers/mtd/nand/nand_base.c @@ -1898,7 +1898,8 @@ out: static void nand_write_page_raw(struct mtd_info *mtd, struct nand_chip *chip, const uint8_t *buf) { - chip->write_buf(mtd, buf, mtd->writesize); + if (buf) + chip->write_buf(mtd, buf, mtd->writesize); chip->write_buf(mtd, chip->oob_poi, mtd->oobsize); }