Fix test for loaded kernel

Message ID 48CABE27.10301@am.sony.com
State Not Applicable

Commit Message

Geoff Levand Sept. 12, 2008, 7:08 p.m. UTC
Fix these reboot errors with NFS mounted root filesystems:

  nfs: server 192.168.1.1 not responding, still trying

The main kexec code that uses kexec_loaded() expects a non-zero 
return to mean a kexec kernel has been loaded for execution.
Here is the current check:

	if ((result == 0) && (do_shutdown || do_exec) && !kexec_loaded())
		die

In cases where the currently running kernel does not have kexec enabled,
or in cases where the distro init scripts (YDL, maybe others) have unmounted
the sys filesystem prior to running kexec, the open of
"/sys/kernel/kexec_loaded" will fail.  This result should be returned as
(0), meaning NOT LOADED.  The current kexec_loaded() code returns (-1),
meaning LOADED.
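To make the effect of the return value concrete, here is a minimal sketch of the guard quoted above. The helper names guard_would_die() and print_guard_outcomes() are hypothetical and do not exist in kexec.c; the `loaded` argument stands in for whatever kexec_loaded() returns.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of kexec.c): evaluates the guard quoted
 * in the commit message. `loaded` stands in for the return value of
 * kexec_loaded(). Returns 1 when kexec would die, 0 when it continues.
 */
int guard_would_die(int result, int do_shutdown, int do_exec, int loaded)
{
	return (result == 0) && (do_shutdown || do_exec) && !loaded;
}

/* Prints the outcome for the three return values discussed in the text. */
void print_guard_outcomes(void)
{
	const int values[] = { -1, 0, 1 };

	for (size_t i = 0; i < sizeof(values) / sizeof(values[0]); i++)
		printf("kexec_loaded() = %2d -> %s\n", values[i],
		       guard_would_die(0, 1, 1, values[i]) ? "die" : "continue");
}
```

Because !(-1) evaluates to 0 in C, a return of -1 is indistinguishable from "loaded" as far as this guard is concerned, so kexec continues rather than dying.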

With the current code, kexec will continue on.  The next steps are to
shut down the network, then call sys_kexec.  The shutdown of the network
will succeed, but the call to sys_kexec will fail.  In this case, control
will pass back to the init scripts, but the network will be down.
Systems with NFS mounted root filesystem cannot work in this state.

Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>

Comments

Geoff Levand Sept. 12, 2008, 7:33 p.m. UTC | #1
Hi Simon,

Sorry, this does not work correctly.  Please ignore. 

Geoff Levand wrote:
> Fix these reboot errors with NFS mounted root filesystems:
> 
>   nfs: server 192.168.1.1 not responding, still trying
> 
> The main kexec code that uses kexec_loaded() expects a non-zero 
> return to mean a kexec kernel has been loaded for execution.
> Here is the current check:
> 
> 	if ((result == 0) && (do_shutdown || do_exec) && !kexec_loaded())
> 		die
> 
> In cases where the currently running kernel does not have kexec enabled,
> or in cases where the distro init scripts (YDL, maybe others) have unmounted
> the sys filesystem prior to running kexec, the open of
> "/sys/kernel/kexec_loaded" will fail.  This result should be returned as
> (0), meaning NOT LOADED.  The current kexec_loaded() code returns (-1),
> meaning LOADED.

Unfortunately, in the case where a kernel has been loaded, but the init
scripts unmount sysfs, my change will not allow the kexec to continue.

The only way to fix the NFS problem is to change the init scripts to
pass the -x option to kexec.
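The limitation can be sketched with a parameterised variant of the function. This is only an illustration: kexec_loaded_at() and its parameters are hypothetical, not code from kexec.c, and the fscanf() result check is an addition of the sketch.

```c
#include <stdio.h>

/*
 * Hypothetical variant of kexec_loaded(), for illustration only: the path
 * and the value returned when the file cannot be opened are parameters,
 * so the original (-1) and the patched (0) behaviour can be compared.
 */
int kexec_loaded_at(const char *path, int on_open_failure)
{
	int ret = 0;
	FILE *fp = fopen(path, "r");

	if (fp == NULL)
		return on_open_failure;
	/* Assumption of this sketch: an unparsable file counts as "not loaded". */
	if (fscanf(fp, "%d", &ret) != 1)
		ret = 0;
	fclose(fp);
	return ret;
}
```

When the open fails, the function returns the same fallback whether or not a kernel was actually loaded, so no single fallback value is right for both the "kexec not enabled" case and the "sysfs unmounted" case.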

-Geoff

Mohan Kumar M Sept. 15, 2008, 6:18 a.m. UTC | #2
Hi Geoff,

> 
> The only way to fix the NFS problem is to change the init scripts to
> pass the -x option to kexec.

Can we simply call ifup (that function does not exist now) to bring up 
the network interface while returning from main() in kexec.c? This will 
be executed only if kexec'ing a kernel fails.

Regards,
Mohan.

Simon Horman Sept. 15, 2008, 7:03 a.m. UTC | #3
On Mon, Sep 15, 2008 at 11:48:56AM +0530, Mohan Kumar M wrote:
> Hi Geoff,
>
>>
>> The only way to fix the NFS problem is to change the init scripts to
>> pass the -x option to kexec.
>
> Can we simply call ifup (that function does not exist now) to bring up  
> the network interface while returning from main() in kexec.c? This will  
> be executed only if kexec'ing a kernel fails.

Good grief, that ifdown() stuff seems horrible. I wonder what the
motivation for it is/was.

Patch

--- a/kexec/kexec.c
+++ b/kexec/kexec.c
@@ -805,7 +805,7 @@  static int kexec_loaded(void)
 
 	fp = fopen("/sys/kernel/kexec_loaded", "r");
 	if (fp == NULL)
-		return -1;
+		return 0;
 	fscanf(fp, "%d", &ret);
 	fclose(fp);
 	return ret;