Message ID | 20141022123929.5FD0B3834D1@gemini.denx.de |
---|---|
State | Not Applicable |
Delegated to: | Tom Rini |
I had exactly the same behaviour some time ago and tracked it down to
this (and posted it on the mailing list, but sadly got no feedback):

In my latest u-boot builds I had some strange behaviour that I finally
tracked down to flash addresses that are not fixed up in the relocated
u-boot. These addresses come from symbols in the .data.rel.ro.local
section, which is not handled by the u-boot linker scripts at the moment.

Some background on relro:
http://www.airs.com/blog/archives/189

Joerg Albert already inquired about this on the gcc ML:
https://gcc.gnu.org/ml/gcc-help/2014-02/msg00017.html

and he already suggested a solution:
https://gcc.gnu.org/ml/gcc-help/2014-02/msg00054.html

So there are three things to notice:
1. Do not use gcc 4.8 and u-boot at the moment.
2. You might not notice that you have a problem until you erase u-boot
   from flash (and get your cache flushed).
3. Handling relro properly should be on the TODO-List

Maybe this is already common knowledge and maybe somebody is already
working on this - but I did not notice yet. So in this case: sorry for
the noise :)

2014-10-22 14:39 GMT+02:00 Wolfgang Denk <wd@denx.de>:
> Hi,
>
> I'm trying to track down a "syntax error" issue that gets triggered
> when erasing the U-Boot image in NOR flash. Symptoms look like this:
>
> => print update
> update=protect off 0xfc000000 +${filesize};erase 0xfc000000 +${filesize};cp.b 200000 0xfc000000 ${filesize};protect on 0xfc000000 +${filesize}
> => run update
> Un-Protected 2 sectors
>
> .. done
> Erased 2 sectors
> syntax error
> Protected 2 sectors
> => run update
> syntax error
>
> git bisect found commit 199adb6 "common/misc: sparse fixes" as
> culprit; breaking this down further showed a single line in
> common/cli_hush.c to trigger the problem. This patch fixes it:
>
> ---
>  common/cli_hush.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/common/cli_hush.c b/common/cli_hush.c
> index 38da5a0..5bbcfe6 100644
> --- a/common/cli_hush.c
> +++ b/common/cli_hush.c
> @@ -3127,7 +3127,7 @@ static void mapset(const unsigned char *set, int code)
>  	for (s=set; *s; s++) map[*s] = code;
>  }
>
> -static void update_ifs_map(void)
> +void update_ifs_map(void)
>  {
>  	/* char *ifs and char map[256] are both globals. */
>  	ifs = (uchar *)getenv("IFS");
> --
> 1.8.3.1
>
> But I still have bad feelings - symptoms indicate that this is
> actually a relocation issue, as it only gets triggered when erasing
> the U-Boot image in NOR flash, so probably there are still pointers to
> data in NOR being used. This patch here is not suited to fix the
> original cause of this issue. But then, I do not see where there
> might be a relocation problem. To be sure I even verified that "ifs"
> and "map[]" are really in RAM all the time.
>
> Has anybody an idea how to further track this down? Or is the patch
> above actually a real fix? If so, why?
>
> Best regards,
>
> Wolfgang Denk
>
> --
> DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
> Old programmers never die, they just branch to a new address.
> _______________________________________________
> U-Boot mailing list
> U-Boot@lists.denx.de
> http://lists.denx.de/mailman/listinfo/u-boot
Dear Dirk,

In message <CANVMifLGzKz+=-K-E9_sSXBxpYPdG1YqEXc-tsCApi7WVxQAHg@mail.gmail.com> you wrote:
> I had exactly the same behaviour some time ago and tracked it down to
> this (and posted it on the mailing list, but sadly got no feedback):

Thanks a lot for this pointer.

> So there are three things to notice:
> 1. Do not use gcc 4.8 and u-boot at the moment.
> 2. You might not notice that you have a problem until you erase u-boot
> from flash (and get your cache flushed).
> 3. Handling relro properly should be on the TODO-List

I confirm that in my case the problem occurs with gcc 4.8.1, too. I did
not try another compiler yet.

> Maybe this is already common knowledge and maybe somebody is already
> working on this - but I did not notice yet. So in this case: sorry for
> the noise :)

I highly appreciate your hint, it was definitely very useful. Thanks!

Best regards,

Wolfgang Denk
On Wed, Oct 22, 2014 at 2:56 PM, Wolfgang Denk <wd@denx.de> wrote:

>> So there are three things to notice:
>> 1. Do not use gcc 4.8 and u-boot at the moment.
>> 2. You might not notice that you have a problem until you erase u-boot
>> from flash (and get your cache flushed).
>> 3. Handling relro properly should be on the TODO-List
>
> I confirm that the problem is in my case with gcc 4.8.1, too. I did
> not try another compiler yet.

Yes, there have been reported issues when using gcc 4.8.1 for building
an ARM kernel as well:
https://lkml.org/lkml/2014/10/10/272

Regards,

Fabio Estevam
On Wed, Oct 22, 2014 at 06:56:11PM +0200, Wolfgang Denk wrote:
> Dear Dirk,
>
> In message <CANVMifLGzKz+=-K-E9_sSXBxpYPdG1YqEXc-tsCApi7WVxQAHg@mail.gmail.com> you wrote:
> > I had exactly the same behaviour some time ago and tracked it down to
> > this (and posted it on the mailing list, but sadly got no feedback):
>
> Thanks a lot for this pointer.
>
> > So there are three things to notice:
> > 1. Do not use gcc 4.8 and u-boot at the moment.
> > 2. You might not notice that you have a problem until you erase u-boot
> > from flash (and get your cache flushed).
> > 3. Handling relro properly should be on the TODO-List
>
> I confirm that the problem is in my case with gcc 4.8.1, too. I did
> not try another compiler yet.
>
> > Maybe this is already common knowledge and maybe somebody is already
> > working on this - but I did not notice yet. So in this case: sorry for
> > the noise :)
>
> I highly appreciate your hint, it was definitely very useful. Thanks!

Is this ARM or PowerPC? The kernel has blacklisted 4.8.x for ARM in
some cases, and this may or may not be related (see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58854)
Dear Tom,

In message <20141022172811.GD25506@bill-the-cat> you wrote:
>
> Is this ARM or PowerPC? The kernel has blacklisted 4.8.x for ARM in
> some cases, and this may or may not be related (see
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58854)

This is on PowerPC (MPC5200, i.e. the TQM5200S board I've had in my
fingers yesterday for other reasons).

Best regards,

Wolfgang Denk
Hi!

> > In message <CANVMifLGzKz+=-K-E9_sSXBxpYPdG1YqEXc-tsCApi7WVxQAHg@mail.gmail.com> you wrote:
> > > I had exactly the same behaviour some time ago and tracked it down to
> > > this (and posted it on the mailing list, but sadly got no feedback):
> >
> > Thanks a lot for this pointer.
> >
> > > So there are three things to notice:
> > > 1. Do not use gcc 4.8 and u-boot at the moment.
> > > 2. You might not notice that you have a problem until you erase u-boot
> > > from flash (and get your cache flushed).
> > > 3. Handling relro properly should be on the TODO-List
> >
> > I confirm that the problem is in my case with gcc 4.8.1, too. I did
> > not try another compiler yet.
> >
> > > Maybe this is already common knowledge and maybe somebody is already
> > > working on this - but I did not notice yet. So in this case: sorry for
> > > the noise :)
> >
> > I highly appreciate your hint, it was definitely very useful. Thanks!
>
> Is this ARM or PowerPC? The kernel has blacklisted 4.8.x for ARM in
> some cases, and this may or may not be related (see
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58854)

Just for the record, I also see strange issues with 4.8.1 on
arm/socfpga. u-boot works ok when compiled with 4.7.2, and behaviour
on 4.8.1 seems to change based on compiler flags (-Os vs. -O2).

If anyone knows some kind of workaround, that would be nice... I'm
using bitbake with eldk-5.5, and changing that would not be too easy
:-(.

Best regards,

Pavel
On Wednesday, October 22, 2014 at 11:27:42 PM, Pavel Machek wrote:
> Hi!
>
> > > In message <CANVMifLGzKz+=-K-E9_sSXBxpYPdG1YqEXc-tsCApi7WVxQAHg@mail.gmail.com> you wrote:
> > > > I had exactly the same behaviour some time ago and tracked it down to
> > > > this (and posted it on the mailing list, but sadly got no feedback):
> > >
> > > Thanks a lot for this pointer.
> > >
> > > > So there are three things to notice:
> > > > 1. Do not use gcc 4.8 and u-boot at the moment.
> > > > 2. You might not notice that you have a problem until you erase
> > > > u-boot from flash (and get your cache flushed).
> > > > 3. Handling relro properly should be on the TODO-List
> > >
> > > I confirm that the problem is in my case with gcc 4.8.1, too. I did
> > > not try another compiler yet.
> > >
> > > > Maybe this is already common knowledge and maybe somebody is already
> > > > working on this - but I did not notice yet. So in this case: sorry
> > > > for the noise :)
> > >
> > > I highly appreciate your hint, it was definitely very useful. Thanks!
> >
> > Is this ARM or PowerPC? The kernel has blacklisted 4.8.x for ARM in
> > some cases, and this may or may not be related (see
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58854)
>
> Just for the record, I also see strange issues with 4.8.1 on
> arm/socfpga. u-boot works ok when compiled with 4.7.2, and behaviour
> on 4.8.1 seems to change based on compiler flags (-Os vs. -O2).
>
> If anyone knows some kind of workaround, that would be nice... I'm
> using bitbake with eldk-5.5, and changing that would not be too easy

What is the issue that you see and I do not see? What are the symptoms?

Best regards,

Marek Vasut
On Wed 2014-10-22 23:46:45, Marek Vasut wrote:
> On Wednesday, October 22, 2014 at 11:27:42 PM, Pavel Machek wrote:
> > Hi!
> >
> > > > In message <CANVMifLGzKz+=-K-E9_sSXBxpYPdG1YqEXc-tsCApi7WVxQAHg@mail.gmail.com> you wrote:
> > > > > I had exactly the same behaviour some time ago and tracked it down to
> > > > > this (and posted it on the mailing list, but sadly got no feedback):
> > > >
> > > > Thanks a lot for this pointer.
> > > >
> > > > > So there are three things to notice:
> > > > > 1. Do not use gcc 4.8 and u-boot at the moment.
> > > > > 2. You might not notice that you have a problem until you erase
> > > > > u-boot from flash (and get your cache flushed).
> > > > > 3. Handling relro properly should be on the TODO-List
> > > >
> > > > I confirm that the problem is in my case with gcc 4.8.1, too. I did
> > > > not try another compiler yet.
> > > >
> > > > > Maybe this is already common knowledge and maybe somebody is already
> > > > > working on this - but I did not notice yet. So in this case: sorry
> > > > > for the noise :)
> > > >
> > > > I highly appreciate your hint, it was definitely very useful. Thanks!
> > >
> > > Is this ARM or PowerPC? The kernel has blacklisted 4.8.x for ARM in
> > > some cases, and this may or may not be related (see
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58854)
> >
> > Just for the record, I also see strange issues with 4.8.1 on
> > arm/socfpga. u-boot works ok when compiled with 4.7.2, and behaviour
> > on 4.8.1 seems to change based on compiler flags (-Os vs. -O2).
> >
> > If anyone knows some kind of workaround, that would be nice... I'm
> > using bitbake with eldk-5.5, and changing that would not be too easy
>
> What is the issue that you do see and I do not see ? What are the symptoms?

I'm not sure if you should be seeing this issue, are you using gcc
4.8.1?

I get a hang during MMC init. If I comment it out, it hangs at setenv of
a random variable. With changed compiler flags, commenting out MMC init
does not get me to the prompt.

I'll try to update to gcc-4.8.3 as gcc-4.8.1 has known issues.

Pavel
On Wednesday, October 22, 2014 at 11:57:39 PM, Pavel Machek wrote:
> On Wed 2014-10-22 23:46:45, Marek Vasut wrote:
> > On Wednesday, October 22, 2014 at 11:27:42 PM, Pavel Machek wrote:
> > > Hi!
> > >
> > > > > In message <CANVMifLGzKz+=-K-E9_sSXBxpYPdG1YqEXc-tsCApi7WVxQAHg@mail.gmail.com> you wrote:
> > > > > > I had exactly the same behaviour some time ago and tracked it
> > > > > > down to this (and posted it on the mailing list, but sadly got
> > > > > > no feedback):
> > > > >
> > > > > Thanks a lot for this pointer.
> > > > >
> > > > > > So there are three things to notice:
> > > > > > 1. Do not use gcc 4.8 and u-boot at the moment.
> > > > > > 2. You might not notice that you have a problem until you erase
> > > > > > u-boot from flash (and get your cache flushed).
> > > > > > 3. Handling relro properly should be on the TODO-List
> > > > >
> > > > > I confirm that the problem is in my case with gcc 4.8.1, too. I
> > > > > did not try another compiler yet.
> > > > >
> > > > > > Maybe this is already common knowledge and maybe somebody is
> > > > > > already working on this - but I did not notice yet. So in this
> > > > > > case: sorry for the noise :)
> > > > >
> > > > > I highly appreciate your hint, it was definitely very useful.
> > > > > Thanks!
> > > >
> > > > Is this ARM or PowerPC? The kernel has blacklisted 4.8.x for ARM in
> > > > some cases, and this may or may not be related (see
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58854)
> > >
> > > Just for the record, I also see strange issues with 4.8.1 on
> > > arm/socfpga. u-boot works ok when compiled with 4.7.2, and behaviour
> > > on 4.8.1 seems to change based on compiler flags (-Os vs. -O2).
> > >
> > > If anyone knows some kind of workaround, that would be nice... I'm
> > > using bitbake with eldk-5.5, and changing that would not be too easy
> >
> > What is the issue that you do see and I do not see ? What are the
> > symptoms?
>
> I'm not sure if you should be seeing this issue, are you using gcc
> 4.8.1?
>
> I get hang during MMC init. If I comment it out, it hangs at setenv of
> random variable. With changed compiler flags, commenting out MMC init
> does not get me to prompt.
>
> I'll try to update to gcc-4.8.3 as gcc-4.8.1 has known issues.

Actually 4.8.2 from ELDK 5.6. I recall there were some fixes for GCC,
but this should be fixed in ELDK 5.5.2. See:
https://www.mail-archive.com/eldk@lists.denx.de/msg00908.html

Best regards,

Marek Vasut
Hello Wolfgang,

2014-10-22 18:56 GMT+02:00 Wolfgang Denk <wd@denx.de>:
> Dear Dirk,
>
> In message <CANVMifLGzKz+=-K-E9_sSXBxpYPdG1YqEXc-tsCApi7WVxQAHg@mail.gmail.com> you wrote:
>> I had exactly the same behaviour some time ago and tracked it down to
>> this (and posted it on the mailing list, but sadly got no feedback):
>
> Thanks a lot for this pointer.

I am really glad this was helpful. It was very nasty to track down, so
I was really concerned when I found it. For that reason I chose "u-boot
ppc does not work with gcc 4.8" as a topic when I reported it to the
U-Boot mailing list and put you on CC on August 5th. But maybe I
should have been more explicit, something like "APOCALYPSE NOW: u-boot
ppc does not work with gcc 4.8" ;)

This problem is *not* fixed by the links Marek referenced.

Just a quick explanation of what is going on:
Since gcc 4.8 we have new sections .data.rel.ro and
.data.rel.ro.local. They contain absolute addresses that should really
be fixed up in our relocation process but are not considered yet.
In your case you were running u-boot referencing the not fixed-up
addresses, which worked perfectly as long as they still pointed to
valid content. But as soon as you erased flash this was no longer the
case. To make debugging even more fun, behaviour also depends on cache
contents.

In my original mail I referenced this potential solution, at least it
worked for me:
https://gcc.gnu.org/ml/gcc-help/2014-02/msg00054.html

Cheers
Dirk
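The failure mode described in this message can be reproduced in miniature. The following is a minimal sketch, not U-Boot code: it models a tiny address space in which an image is copied from "flash" to "RAM" and only one of two pointer slots is fixed up. All names, offsets and sizes (`FLASH_BASE`, `RAM_BASE`, `OFF_FIXED`, `OFF_MISSED`, `read_via`, and so on) are hypothetical and chosen only for illustration; the "missed" slot stands in for an entry in a section the relocation code does not handle.

```c
#include <stdint.h>
#include <string.h>

#define FLASH_BASE 0u    /* simulated link/load address of the image */
#define RAM_BASE   128u  /* simulated relocation target */
#define IMG_SIZE   32u   /* size of the simulated image in bytes */

/* Offsets inside the image (hypothetical layout for this sketch). */
#define OFF_TEXT   0u    /* string data */
#define OFF_FIXED  16u   /* pointer slot the relocation code knows about */
#define OFF_MISSED 20u   /* pointer slot in a section it does not handle */

static uint8_t mem[256]; /* simulated address space: flash plus RAM */

static uint32_t load32(uint32_t addr)
{
    uint32_t v;
    memcpy(&v, &mem[addr], sizeof(v));
    return v;
}

static void store32(uint32_t addr, uint32_t v)
{
    memcpy(&mem[addr], &v, sizeof(v));
}

/* Build the image in "flash": both slots hold the absolute address of text. */
void build_image(void)
{
    memset(mem, 0, sizeof(mem));
    strcpy((char *)&mem[FLASH_BASE + OFF_TEXT], "hello");
    store32(FLASH_BASE + OFF_FIXED, FLASH_BASE + OFF_TEXT);
    store32(FLASH_BASE + OFF_MISSED, FLASH_BASE + OFF_TEXT);
}

/* Copy the image to "RAM" and fix up only the slot the relocator handles. */
void relocate(void)
{
    memcpy(&mem[RAM_BASE], &mem[FLASH_BASE], IMG_SIZE);
    store32(RAM_BASE + OFF_FIXED,
            load32(RAM_BASE + OFF_FIXED) + (RAM_BASE - FLASH_BASE));
    /* The slot at OFF_MISSED is left untouched and still points into flash. */
}

/* Erased NOR flash reads back as 0xFF. */
void erase_flash(void)
{
    memset(&mem[FLASH_BASE], 0xFF, IMG_SIZE);
}

/* Dereference the pointer stored in a slot and return the first byte. */
uint8_t read_via(uint32_t slot_addr)
{
    return mem[load32(slot_addr)];
}
```

Calling `build_image()` and `relocate()` and then reading through both RAM slots yields the same byte, so nothing looks wrong. Only after `erase_flash()` does the slot at `OFF_MISSED` return 0xFF while the slot at `OFF_FIXED` still returns valid data, which mirrors why the problem stays hidden until the image in flash is erased. The sketch has no cache, so it does not model the cache-dependent behaviour mentioned above.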
>
> Hello Wolfgang,
>
> 2014-10-22 18:56 GMT+02:00 Wolfgang Denk <wd@denx.de>:
> > Dear Dirk,
> >
> > In message <CANVMifLGzKz+=-K-E9_sSXBxpYPdG1YqEXc-tsCApi7WVxQAHg@mail.gmail.com> you wrote:
> >> I had exactly the same behaviour some time ago and tracked it down to
> >> this (and posted it on the mailing list, but sadly got no feedback):
> >
> > Thanks a lot for this pointer.
>
> I am really glad this was helpful. It was very nasty to track down, so
> I was really concerned when I found it. For that reason I chose "u-boot
> ppc does not work with gcc 4.8" as a topic when I reported it to
> U-Boot mailing list and put you on CC on August 5th. But maybe I
> should have been more explicit, something like "APOCALYPSE NOW: u-boot
> ppc does not work with gcc 4.8" ;)
>
> This problem is *not* fixed by the links Marek addressed.
>
> Just a quick explanation of what is going on:
> Since gcc 4.8 we have new sections .data.rel.ro and
> .data.rel.ro.local. They contain absolute addresses that should really
> be fixed up in our relocation process but are not considered yet.
> In your case you were running u-boot referencing the not fixed-up
> addresses which worked perfectly as long as they still pointed to
> valid content. But as soon as you erased flash this was no longer the
> case. To make debugging even more fun, behaviour also depends on cache
> contents.

Ouch, that was a nasty surprise.

>
> In my original mail I referenced this potential solution, at least it
> worked for me:
> https://gcc.gnu.org/ml/gcc-help/2014-02/msg00054.html

That looks like the correct fix, but I presume both .data.rel.ro and
.data.rel.ro.local should be added?

Jocke
diff --git a/common/cli_hush.c b/common/cli_hush.c
index 38da5a0..5bbcfe6 100644
--- a/common/cli_hush.c
+++ b/common/cli_hush.c
@@ -3127,7 +3127,7 @@ static void mapset(const unsigned char *set, int code)
 	for (s=set; *s; s++) map[*s] = code;
 }
 
-static void update_ifs_map(void)
+void update_ifs_map(void)
 {
 	/* char *ifs and char map[256] are both globals. */
 	ifs = (uchar *)getenv("IFS");
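For readers unfamiliar with the function touched by this patch, here is a minimal stand-alone sketch of the idea visible in the diff context: a 256-entry table classifies each character, and the set of separator characters comes from the `IFS` environment variable. This is not the U-Boot implementation. The fallback separator set, the class codes, the `";&|"` set and the helper `update_ifs_map_from()` are assumptions made for illustration, and the C library `getenv()` stands in for U-Boot's own environment lookup.

```c
#include <stdlib.h>
#include <string.h>

unsigned char char_map[256];     /* per-character class, 0 = ordinary */
const unsigned char *ifs_chars;  /* current separator set */

/* Mark every character in the NUL-terminated set with the given class. */
static void mapset(const unsigned char *set, int code)
{
    const unsigned char *s;

    for (s = set; *s; s++)
        char_map[*s] = (unsigned char)code;
}

/*
 * Hypothetical helper: rebuild the map from an explicit IFS value.
 * NULL means "IFS is not set"; the fallback set below is an assumption.
 */
void update_ifs_map_from(const char *ifs_value)
{
    ifs_chars = (const unsigned char *)(ifs_value ? ifs_value : " \t\n");

    memset(char_map, 0, sizeof(char_map));
    mapset((const unsigned char *)";&|", 1); /* assumed "operator" class */
    mapset(ifs_chars, 2);                     /* separator class */
}

/* Same shape as the patched function: read IFS, then rebuild the map. */
void update_ifs_map(void)
{
    update_ifs_map_from(getenv("IFS"));
}
```

Note that the fallback set is a string literal: in a relocated image, a pointer to such a literal is exactly the kind of absolute address that has to be fixed up. Whether that is what triggered the "syntax error" in this thread is not established here.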