diff mbox

[OpenWrt-Devel,V2] Fix for mvebu (WRT1900AC/WRT1200AC/etc) boot counter

Message ID 1439938574-26818-1-git-send-email-nyt-openwrt@countercultured.net
State Not Applicable
Headers show

Commit Message

nyt-openwrt@countercultured.net Aug. 18, 2015, 10:56 p.m. UTC
The uboot boot counter was never reset after a successful boot.
This patch prevents bootcmd and boot_part from becoming out of sync
after a sysupgrade and ensures the proper partition is booted.

Signed-off-by: Jonas Gorski <jogo@openwrt.org>
Signed-off-by: Rob Mosher <nyt-openwrt@countercultured.net>
---
 package/system/mtd/Makefile                        |   2 +-
 package/system/mtd/src/Makefile                    |   1 +
 package/system/mtd/src/linksys_bootcount.c         | 114 +++++++++++++++++++++
 package/system/mtd/src/mtd.c                       |  12 +++
 package/system/mtd/src/mtd.h                       |   1 +
 .../linux/mvebu/base-files/etc/init.d/u-boot_env   |   1 +
 6 files changed, 130 insertions(+), 1 deletion(-)
 create mode 100644 package/system/mtd/src/linksys_bootcount.c

Comments

Claudio Leite Aug. 19, 2015, 12:16 a.m. UTC | #1
* Rob Mosher (nyt-openwrt@countercultured.net) wrote:
> +++ b/target/linux/mvebu/base-files/etc/init.d/u-boot_env
> @@ -9,6 +9,7 @@ boot() {
>  case $(mvebu_board_name) in
>  	armada-385-linksys-caiman|armada-385-linksys-cobra|armada-xp-linksys-mamba)
>  		fw_setenv auto_recovery off

With s_env now properly updated, is there still a reason to turn off
auto_recovery?

I know it's generally a poor recovery mechanism, but as long as it
doesn't interfere with booting OpenWrt it might be better than nothing.

> +		mtd resetbc s_env
>  		;;
>  esac
>  }
> -- 
> 2.1.4
nyt-openwrt@countercultured.net Aug. 19, 2015, 12:29 a.m. UTC | #2
Auto recovery is enabled when flashing, so if it goes wrong, you can get 
back to your previous image.  Otherwise, I don't think there's a need 
for it as all it does is switch which kernel/root partitions are booted 
which shouldn't ever change unless flashing (famous last words?).


root@ZOMGWTFBBQWIFI:/lib/upgrade# cat linksys.sh
#
# Copyright (C) 2014-2015 OpenWrt.org
#

linksys_get_target_firmware() {
         cur_boot_part=`/usr/sbin/fw_printenv -n boot_part`
         target_firmware=""
         if [ "$cur_boot_part" = "1" ]
         then
                 # current primary boot - update alt boot
                 target_firmware="kernel2"
                 fw_setenv boot_part 2
                 fw_setenv bootcmd "run altnandboot"
         elif [ "$cur_boot_part" = "2" ]
         then
                 # current alt boot - update primary boot
                 target_firmware="kernel1"
                 fw_setenv boot_part 1
                 fw_setenv bootcmd "run nandboot"
         fi

         # re-enable recovery so we get back if the new firmware is broken
         fw_setenv auto_recovery yes

         echo "$target_firmware"
}
[...]


On 8/18/2015 8:16 PM, Claudio Leite wrote:
> * Rob Mosher (nyt-openwrt@countercultured.net) wrote:
>> +++ b/target/linux/mvebu/base-files/etc/init.d/u-boot_env
>> @@ -9,6 +9,7 @@ boot() {
>>   case $(mvebu_board_name) in
>>   	armada-385-linksys-caiman|armada-385-linksys-cobra|armada-xp-linksys-mamba)
>>   		fw_setenv auto_recovery off
> With s_env now properly updated, is there still a reason to turn off
> auto_recovery?
>
> I know it's generally a poor recovery mechanism, but as long as it
> doesn't interfere with booting OpenWrt it might be better than nothing.
>
>> +		mtd resetbc s_env
>>   		;;
>>   esac
>>   }
>> -- 
>> 2.1.4
Claudio Leite Aug. 19, 2015, 1 a.m. UTC | #3
* Rob Mosher (nyt-openwrt@countercultured.net) wrote:
> Auto recovery is enabled when flashing, so if it goes wrong, you can get
> back to your previous image.  Otherwise, I don't think there's a need for it
> as all it does is switch which kernel/root partitions are booted which
> shouldn't ever change unless flashing (famous last words?).

The use case for leaving it on is for somebody without serial console
access who tries something slightly risky and breaks the active image.
Say, an image that gets far enough to get to mtd resetbc and set
auto_recovery to off, but not actually work (no network, etc.) There are
also enough cases of people messing up their active images on the
WRT1900AC forum thread, one way or another.

If auto_recovery were kept on, they could then use that trick posted
in the wiki ("Stock Firmware Recovery") to manually trigger the
auto recovery to flip back to the other image. (that of course assumes
the other firmware works, but again, this isn't a great recovery
mechanism...)

I tested a build with this patch but auto_recovery kept to "on" and it
works as expected after many reboots. I also manually triggered the
auto_recovery flip, which also worked as expected. So, it seems like it
doesn't really cost anything to keep it on (not yet, at least) while
providing a fringe benefit.

-Claudio
nyt-openwrt@countercultured.net Aug. 19, 2015, 1:10 a.m. UTC | #4
Good point.  It looks like Linksys actually just did the same thing as 
well in their firmware.  Anyway, this patch is just to fix the boot 
counter, I'll submit another to leave auto recovery on.

On 8/18/2015 9:00 PM, Claudio Leite wrote:
> * Rob Mosher (nyt-openwrt@countercultured.net) wrote:
>> Auto recovery is enabled when flashing, so if it goes wrong, you can get
>> back to your previous image.  Otherwise, I don't think there's a need for it
>> as all it does is switch which kernel/root partitions are booted which
>> shouldn't ever change unless flashing (famous last words?).
> The use case for leaving it on is for somebody without serial console
> access who tries something slightly risky and breaks the active image.
> Say, an image that gets far enough to get to mtd resetbc and set
> auto_recovery to off, but not actually work (no network, etc.) There are
> also enough cases of people messing up their active images on the
> WRT1900AC forum thread, one way or another.
>
> If auto_recovery were kept on, they could then use that trick posted
> in the wiki ("Stock Firmware Recovery") to manually trigger the
> auto recovery to flip back to the other image. (that of course assumes
> the other firmware works, but again, this isn't a great recovery
> mechanism...)
>
> I tested a build with this patch but auto_recovery kept to "on" and it
> works as expected after many reboots. I also manually triggered the
> auto_recovery flip, which also worked as expected. So, it seems like it
> doesn't really cost anything to keep it on (not yet, at least) while
> providing a fringe benefit.
>
> -Claudio
Bjørn Mork Aug. 19, 2015, 7:27 a.m. UTC | #5
Rob Mosher <nyt-openwrt@countercultured.net> writes:

> +#define BOOTCOUNT_MAGIC	0x20110811
> +
> +struct bootcounter {
> +	uint32_t magic;
> +	uint32_t count;
> +	uint32_t checksum;
> +};

Maybe make it clear that these numbers are stored in little endian
order?  Or will that always be the native endianness?

> +static char page[2048];
> +
> +int mtd_resetbc(const char *mtd)
> +{
> +	struct mtd_info_user mtd_info;
> +	struct bootcounter *curr = (struct bootcounter *)page;
> +	unsigned int i;
> +	int last_count = 0;
> +	int num_bc;
> +	int fd;
> +	int ret;
> +
> +	fd = mtd_check_open(mtd);
> +
> +	if (ioctl(fd, MEMGETINFO, &mtd_info) < 0) {
> +		fprintf(stderr, "failed to get mtd info!\n");
> +		return -1;
> +	}
> +
> +	num_bc = mtd_info.size / mtd_info.writesize;
> +
> +	for (i = 0; i < num_bc; i++) {
> +		pread(fd, curr, sizeof(*curr), i * mtd_info.writesize);
> +
> +		if (curr->magic != BOOTCOUNT_MAGIC && curr->magic != 0xffffffff) {
> +			fprintf(stderr, "unexpected magic %08x, bailing out\n", curr->magic);
> +			goto out;
> +		}

Maybe verify the checksum sanity as well as an extra failsafe?

> +		mtd resetbc s_env

What good does it do to make the partition name an input parameter?
It's just as static as the BOOTCOUNT_MAGIC, AFAICT.


Bjørn
diff mbox

Patch

diff --git a/package/system/mtd/Makefile b/package/system/mtd/Makefile
index 8d7bb44..ae1922f 100644
--- a/package/system/mtd/Makefile
+++ b/package/system/mtd/Makefile
@@ -9,7 +9,7 @@  include $(TOPDIR)/rules.mk
 include $(INCLUDE_DIR)/kernel.mk
 
 PKG_NAME:=mtd
-PKG_RELEASE:=20
+PKG_RELEASE:=21
 
 PKG_BUILD_DIR := $(KERNEL_BUILD_DIR)/$(PKG_NAME)
 STAMP_PREPARED := $(STAMP_PREPARED)_$(call confvar,CONFIG_MTD_REDBOOT_PARTS)
diff --git a/package/system/mtd/src/Makefile b/package/system/mtd/src/Makefile
index 27ac339..40a165e 100644
--- a/package/system/mtd/src/Makefile
+++ b/package/system/mtd/src/Makefile
@@ -10,6 +10,7 @@  obj.brcm47xx = $(obj.brcm)
 obj.bcm53xx = $(obj.brcm)
 obj.brcm63xx = imagetag.o
 obj.ramips = $(obj.seama)
+obj.mvebu = linksys_bootcount.o
 
 ifdef FIS_SUPPORT
   obj += fis.o
diff --git a/package/system/mtd/src/linksys_bootcount.c b/package/system/mtd/src/linksys_bootcount.c
new file mode 100644
index 0000000..95f75fe
--- /dev/null
+++ b/package/system/mtd/src/linksys_bootcount.c
@@ -0,0 +1,114 @@ 
+/*
+ * Linksys boot counter reset code for mtd
+ *
+ * Copyright (C) 2013 Jonas Gorski <jogo@openwrt.org>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License v2
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <endian.h>
+#include <string.h>
+#include <errno.h>
+
+#include <sys/ioctl.h>
+#include <mtd/mtd-user.h>
+
+#include "mtd.h"
+
+#define BOOTCOUNT_MAGIC	0x20110811
+
+struct bootcounter {
+	uint32_t magic;
+	uint32_t count;
+	uint32_t checksum;
+};
+
+static char page[2048];
+
+int mtd_resetbc(const char *mtd)
+{
+	struct mtd_info_user mtd_info;
+	struct bootcounter *curr = (struct bootcounter *)page;
+	unsigned int i;
+	int last_count = 0;
+	int num_bc;
+	int fd;
+	int ret;
+
+	fd = mtd_check_open(mtd);
+
+	if (ioctl(fd, MEMGETINFO, &mtd_info) < 0) {
+		fprintf(stderr, "failed to get mtd info!\n");
+		return -1;
+	}
+
+	num_bc = mtd_info.size / mtd_info.writesize;
+
+	for (i = 0; i < num_bc; i++) {
+		pread(fd, curr, sizeof(*curr), i * mtd_info.writesize);
+
+		if (curr->magic != BOOTCOUNT_MAGIC && curr->magic != 0xffffffff) {
+			fprintf(stderr, "unexpected magic %08x, bailing out\n", curr->magic);
+			goto out;
+		}
+
+		if (curr->magic == 0xffffffff)
+			break;
+
+		last_count = curr->count;
+	}
+
+	/* no need to do writes when last boot count is already 0 */
+	if (last_count == 0)
+		goto out;
+
+
+	if (i == num_bc) {
+		struct erase_info_user erase_info;
+		erase_info.start = 0;
+		erase_info.length = mtd_info.size;
+
+		/* erase block */
+		ret = ioctl(fd, MEMERASE, &erase_info);
+		if (ret < 0) {
+			fprintf(stderr, "failed to erase block: %i\n", ret);
+			return -1;
+		}
+
+		i = 0;
+	}
+
+	memset(curr, 0xff, mtd_info.writesize);
+
+	curr->magic = BOOTCOUNT_MAGIC;
+	curr->count = 0;
+	curr->checksum = BOOTCOUNT_MAGIC;
+
+	ret = pwrite(fd, curr, mtd_info.writesize, i * mtd_info.writesize);
+	if (ret < 0)
+		fprintf(stderr, "failed to write: %i\n", ret);
+	sync();
+out:
+	close(fd);
+
+	return 0;
+}
diff --git a/package/system/mtd/src/mtd.c b/package/system/mtd/src/mtd.c
index 741b57b..0247630 100644
--- a/package/system/mtd/src/mtd.c
+++ b/package/system/mtd/src/mtd.c
@@ -640,6 +640,10 @@  static void usage(void)
 	"        verify <imagefile>|-    verify <imagefile> (use - for stdin) to device\n"
 	"        write <imagefile>|-     write <imagefile> (use - for stdin) to device\n"
 	"        jffs2write <file>       append <file> to the jffs2 partition on the device\n");
+	if (mtd_resetbc) {
+	    fprintf(stderr,
+	"        resetbc <device>        reset the uboot boot counter\n");
+	}
 	if (mtd_fixtrx) {
 	    fprintf(stderr,
 	"        fixtrx                  fix the checksum in a trx header on first boot\n");
@@ -706,6 +710,7 @@  int main (int argc, char **argv)
 		CMD_FIXSEAMA,
 		CMD_VERIFY,
 		CMD_DUMP,
+		CMD_RESETBC,
 	} cmd = -1;
 
 	erase[0] = NULL;
@@ -800,6 +805,9 @@  int main (int argc, char **argv)
 	} else if ((strcmp(argv[0], "erase") == 0) && (argc == 2)) {
 		cmd = CMD_ERASE;
 		device = argv[1];
+	} else if (((strcmp(argv[0], "resetbc") == 0) && (argc == 2)) && mtd_resetbc) {
+		cmd = CMD_RESETBC;
+		device = argv[1];
 	} else if (((strcmp(argv[0], "fixtrx") == 0) && (argc == 2)) && mtd_fixtrx) {
 		cmd = CMD_FIXTRX;
 		device = argv[1];
@@ -892,6 +900,10 @@  int main (int argc, char **argv)
 		    if (mtd_fixtrx) {
 			    mtd_fixtrx(device, offset);
             }
+		case CMD_RESETBC:
+		    if (mtd_resetbc) {
+			    mtd_resetbc(device);
+            }
 		case CMD_FIXSEAMA:
 			if (mtd_fixseama)
 			    mtd_fixseama(device, 0);
diff --git a/package/system/mtd/src/mtd.h b/package/system/mtd/src/mtd.h
index d94f394..fb37b8b 100644
--- a/package/system/mtd/src/mtd.h
+++ b/package/system/mtd/src/mtd.h
@@ -27,4 +27,5 @@  extern int trx_fixup(int fd, const char *name)  __attribute__ ((weak));
 extern int trx_check(int imagefd, const char *mtd, char *buf, int *len) __attribute__ ((weak));
 extern int mtd_fixtrx(const char *mtd, size_t offset) __attribute__ ((weak));
 extern int mtd_fixseama(const char *mtd, size_t offset) __attribute__ ((weak));
+extern int mtd_resetbc(const char *mtd) __attribute__ ((weak));
 #endif /* __mtd_h */
diff --git a/target/linux/mvebu/base-files/etc/init.d/u-boot_env b/target/linux/mvebu/base-files/etc/init.d/u-boot_env
index 82f36cb..bf038b2 100755
--- a/target/linux/mvebu/base-files/etc/init.d/u-boot_env
+++ b/target/linux/mvebu/base-files/etc/init.d/u-boot_env
@@ -9,6 +9,7 @@  boot() {
 case $(mvebu_board_name) in
 	armada-385-linksys-caiman|armada-385-linksys-cobra|armada-xp-linksys-mamba)
 		fw_setenv auto_recovery off
+		mtd resetbc s_env
 		;;
 esac
 }