[RFC,3/3] kexec: extend kexec_file_load system call
diff mbox

Message ID 20160712014201.11456-4-takahiro.akashi@linaro.org
State Not Applicable
Headers show

Commit Message

AKASHI Takahiro July 12, 2016, 1:42 a.m. UTC
Device tree blob must be passed to a second kernel on DTB-capable
archs, like powerpc and arm64, but the current kernel interface
lacks this support.

This patch extends kexec_file_load system call by adding an extra
argument to this syscall so that an arbitrary number of file descriptors
can be handed out from user space to the kernel.

	long sys_kexec_file_load(int kernel_fd, int initrd_fd,
				 unsigned long cmdline_len,
				 const char __user *cmdline_ptr,
				 unsigned long flags,
				 const struct kexec_fdset __user *ufdset);

If KEXEC_FILE_EXTRA_FDS is set to the "flags" argument, the "ufdset"
argument points to the following struct buffer:

	struct kexec_fdset {
		int nr_fds;
		struct kexec_file_fd fds[0];
	}

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 include/linux/fs.h         |  1 +
 include/linux/kexec.h      |  2 +-
 include/linux/syscalls.h   |  4 +++-
 include/uapi/linux/kexec.h | 17 ++++++++++++++
 kernel/kexec_file.c        | 57 ++++++++++++++++++++++++++++++++++++++++------
 5 files changed, 72 insertions(+), 9 deletions(-)

Comments

Vivek Goyal July 15, 2016, 1:09 p.m. UTC | #1
On Tue, Jul 12, 2016 at 10:42:01AM +0900, AKASHI Takahiro wrote:

[..]
> -SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
> +SYSCALL_DEFINE6(kexec_file_load, int, kernel_fd, int, initrd_fd,
>  		unsigned long, cmdline_len, const char __user *, cmdline_ptr,
> -		unsigned long, flags)
> +		unsigned long, flags, const struct kexec_fdset __user *, ufdset)

Can one add more parameters to existing syscall. Can it break existing
programs with new kernel? I was of the impression that one can't do that.
But may be I am missing something.

Vivek
Mark Rutland July 15, 2016, 1:19 p.m. UTC | #2
On Fri, Jul 15, 2016 at 09:09:55AM -0400, Vivek Goyal wrote:
> On Tue, Jul 12, 2016 at 10:42:01AM +0900, AKASHI Takahiro wrote:
> 
> [..]
> > -SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
> > +SYSCALL_DEFINE6(kexec_file_load, int, kernel_fd, int, initrd_fd,
> >  		unsigned long, cmdline_len, const char __user *, cmdline_ptr,
> > -		unsigned long, flags)
> > +		unsigned long, flags, const struct kexec_fdset __user *, ufdset)
> 
> Can one add more parameters to existing syscall. Can it break existing
> programs with new kernel? I was of the impression that one can't do that.
> But may be I am missing something.

I think the idea was that we would only look at the new params if a new
flags was set, and otherwise it would behave as the old syscall.

Regardless, I think it makes far more sense to add a kexec_file_load2
syscall if we're going to modify the prototype at all. It's a rather
different proposition to the existing syscall, and needs to be treated
as such.

Thanks,
Mark.
Dave Young July 18, 2016, 2:30 a.m. UTC | #3
On 07/15/16 at 02:19pm, Mark Rutland wrote:
> On Fri, Jul 15, 2016 at 09:09:55AM -0400, Vivek Goyal wrote:
> > On Tue, Jul 12, 2016 at 10:42:01AM +0900, AKASHI Takahiro wrote:
> > 
> > [..]
> > > -SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
> > > +SYSCALL_DEFINE6(kexec_file_load, int, kernel_fd, int, initrd_fd,
> > >  		unsigned long, cmdline_len, const char __user *, cmdline_ptr,
> > > -		unsigned long, flags)
> > > +		unsigned long, flags, const struct kexec_fdset __user *, ufdset)
> > 
> > Can one add more parameters to existing syscall. Can it break existing
> > programs with new kernel? I was of the impression that one can't do that.
> > But may be I am missing something.
> 
> I think the idea was that we would only look at the new params if a new
> flags was set, and otherwise it would behave as the old syscall.
> 
> Regardless, I think it makes far more sense to add a kexec_file_load2
> syscall if we're going to modify the prototype at all. It's a rather
> different proposition to the existing syscall, and needs to be treated
> as such.

I do not think it is worth to add another syscall for extra fds.
We have open(2) as an example for different numbers of arguments
already.

Thanks
Dave
Dave Young July 18, 2016, 2:33 a.m. UTC | #4
On 07/15/16 at 09:09am, Vivek Goyal wrote:
> On Tue, Jul 12, 2016 at 10:42:01AM +0900, AKASHI Takahiro wrote:
> 
> [..]
> > -SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
> > +SYSCALL_DEFINE6(kexec_file_load, int, kernel_fd, int, initrd_fd,
> >  		unsigned long, cmdline_len, const char __user *, cmdline_ptr,
> > -		unsigned long, flags)
> > +		unsigned long, flags, const struct kexec_fdset __user *, ufdset)
> 
> Can one add more parameters to existing syscall. Can it break existing
> programs with new kernel? I was of the impression that one can't do that.
> But may be I am missing something.

It will not break existing programs because we can use the new param only
when the new flag is set.

But we have a case below, but I think it is fine?
Originally kexec_file_load with the new flags will fail, but now it will
succeed and will access the new argument.

Thanks
Dave
Mark Rutland July 18, 2016, 10:07 a.m. UTC | #5
On Mon, Jul 18, 2016 at 10:30:24AM +0800, Dave Young wrote:
> On 07/15/16 at 02:19pm, Mark Rutland wrote:
> > On Fri, Jul 15, 2016 at 09:09:55AM -0400, Vivek Goyal wrote:
> > > On Tue, Jul 12, 2016 at 10:42:01AM +0900, AKASHI Takahiro wrote:
> > > 
> > > [..]
> > > > -SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
> > > > +SYSCALL_DEFINE6(kexec_file_load, int, kernel_fd, int, initrd_fd,
> > > >  		unsigned long, cmdline_len, const char __user *, cmdline_ptr,
> > > > -		unsigned long, flags)
> > > > +		unsigned long, flags, const struct kexec_fdset __user *, ufdset)
> > > 
> > > Can one add more parameters to existing syscall. Can it break existing
> > > programs with new kernel? I was of the impression that one can't do that.
> > > But may be I am missing something.
> > 
> > I think the idea was that we would only look at the new params if a new
> > flags was set, and otherwise it would behave as the old syscall.
> > 
> > Regardless, I think it makes far more sense to add a kexec_file_load2
> > syscall if we're going to modify the prototype at all. It's a rather
> > different proposition to the existing syscall, and needs to be treated
> > as such.
> 
> I do not think it is worth to add another syscall for extra fds.
> We have open(2) as an example for different numbers of arguments
> already.

Did we change the syscall interface for that?

I was under the impression that there was always one underlying syscall,
and the C library did the right thing to pass the expected information
to the underlying syscall.

That's rather different to changing the underlying syscall.

Regardless of how this is wrapped in userspace, I do not think modifying
the existing prototype is a good idea, and I think this kind of
extension needs to be a new syscall.

Thanks,
Mark.
Dave Young July 19, 2016, 12:55 a.m. UTC | #6
On 07/18/16 at 11:07am, Mark Rutland wrote:
> On Mon, Jul 18, 2016 at 10:30:24AM +0800, Dave Young wrote:
> > On 07/15/16 at 02:19pm, Mark Rutland wrote:
> > > On Fri, Jul 15, 2016 at 09:09:55AM -0400, Vivek Goyal wrote:
> > > > On Tue, Jul 12, 2016 at 10:42:01AM +0900, AKASHI Takahiro wrote:
> > > > 
> > > > [..]
> > > > > -SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
> > > > > +SYSCALL_DEFINE6(kexec_file_load, int, kernel_fd, int, initrd_fd,
> > > > >  		unsigned long, cmdline_len, const char __user *, cmdline_ptr,
> > > > > -		unsigned long, flags)
> > > > > +		unsigned long, flags, const struct kexec_fdset __user *, ufdset)
> > > > 
> > > > Can one add more parameters to existing syscall. Can it break existing
> > > > programs with new kernel? I was of the impression that one can't do that.
> > > > But may be I am missing something.
> > > 
> > > I think the idea was that we would only look at the new params if a new
> > > flags was set, and otherwise it would behave as the old syscall.
> > > 
> > > Regardless, I think it makes far more sense to add a kexec_file_load2
> > > syscall if we're going to modify the prototype at all. It's a rather
> > > different proposition to the existing syscall, and needs to be treated
> > > as such.
> > 
> > I do not think it is worth to add another syscall for extra fds.
> > We have open(2) as an example for different numbers of arguments
> > already.
> 
> Did we change the syscall interface for that?
> 
> I was under the impression that there was always one underlying syscall,
> and the C library did the right thing to pass the expected information
> to the underlying syscall.

I'm not sure kexec_load and kexec_file_load were included in glibc, we use
syscall directly in kexec-tools.

kexec_load man pages says there are no wrappers for both kexec_load and
kexec_file_load in glibc.

> 
> That's rather different to changing the underlying syscall.
> 
> Regardless of how this is wrapped in userspace, I do not think modifying
> the existing prototype is a good idea, and I think this kind of
> extension needs to be a new syscall.

Hmm, as I replied to Vivek, there is one case about the flags, previously
the new flag will be regarded as invalid, but not we extend it it will be
valid, this maybe the only potential bad case.

Thanks
Dave
Mark Rutland July 19, 2016, 10:52 a.m. UTC | #7
On Tue, Jul 19, 2016 at 08:55:56AM +0800, Dave Young wrote:
> On 07/18/16 at 11:07am, Mark Rutland wrote:
> > On Mon, Jul 18, 2016 at 10:30:24AM +0800, Dave Young wrote:
> > > I do not think it is worth to add another syscall for extra fds.
> > > We have open(2) as an example for different numbers of arguments
> > > already.
> > 
> > Did we change the syscall interface for that?
> > 
> > I was under the impression that there was always one underlying syscall,
> > and the C library did the right thing to pass the expected information
> > to the underlying syscall.
> 
> I'm not sure kexec_load and kexec_file_load were included in glibc, we use
> syscall directly in kexec-tools.
> 
> kexec_load man pages says there are no wrappers for both kexec_load and
> kexec_file_load in glibc.

For the above, I was talking about how open() was handled.

If there are no userspace wrappers, then the two cases aren't comparable
in the first place...

> > That's rather different to changing the underlying syscall.
> > 
> > Regardless of how this is wrapped in userspace, I do not think modifying
> > the existing prototype is a good idea, and I think this kind of
> > extension needs to be a new syscall.
> 
> Hmm, as I replied to Vivek, there is one case about the flags, previously
> the new flag will be regarded as invalid, but not we extend it it will be
> valid, this maybe the only potential bad case.

It's true that adding suport for new flags will change the behaviour of
what used to be error cases. We generally expect real users to not be
making pointless calls for which they rely on an error being returned in
all cases.

Regardless, this extended syscall changes some underlying assumptions
made with the development of kexec_file_load, and I think treating this
as an extension is not a great idea. From a user's perspective there is
little difference between passing an additional flag or using a
different syscall number, so I don't think that we gain much by altering
the existing prototype relative to allocating a new syscall number.

Thus, I think that if this is necessary it should be treated as a new
syscall.

Thanks,
Mark.
Vivek Goyal July 19, 2016, 12:24 p.m. UTC | #8
On Tue, Jul 19, 2016 at 11:52:00AM +0100, Mark Rutland wrote:
> On Tue, Jul 19, 2016 at 08:55:56AM +0800, Dave Young wrote:
> > On 07/18/16 at 11:07am, Mark Rutland wrote:
> > > On Mon, Jul 18, 2016 at 10:30:24AM +0800, Dave Young wrote:
> > > > I do not think it is worth to add another syscall for extra fds.
> > > > We have open(2) as an example for different numbers of arguments
> > > > already.
> > > 
> > > Did we change the syscall interface for that?
> > > 
> > > I was under the impression that there was always one underlying syscall,
> > > and the C library did the right thing to pass the expected information
> > > to the underlying syscall.
> > 
> > I'm not sure kexec_load and kexec_file_load were included in glibc, we use
> > syscall directly in kexec-tools.
> > 
> > kexec_load man pages says there are no wrappers for both kexec_load and
> > kexec_file_load in glibc.
> 
> For the above, I was talking about how open() was handled.
> 
> If there are no userspace wrappers, then the two cases aren't comparable
> in the first place...
> 
> > > That's rather different to changing the underlying syscall.
> > > 
> > > Regardless of how this is wrapped in userspace, I do not think modifying
> > > the existing prototype is a good idea, and I think this kind of
> > > extension needs to be a new syscall.
> > 
> > Hmm, as I replied to Vivek, there is one case about the flags, previously
> > the new flag will be regarded as invalid, but not we extend it it will be
> > valid, this maybe the only potential bad case.
> 
> It's true that adding suport for new flags will change the behaviour of
> what used to be error cases. We generally expect real users to not be
> making pointless calls for which they rely on an error being returned in
> all cases.
> 
> Regardless, this extended syscall changes some underlying assumptions
> made with the development of kexec_file_load, and I think treating this
> as an extension is not a great idea. From a user's perspective there is
> little difference between passing an additional flag or using a
> different syscall number, so I don't think that we gain much by altering
> the existing prototype relative to allocating a new syscall number.

If we are providing/opening up additional flags, I can't think what will
it break. Same flag was invalid in old kernel but new kernel supports 
it and will accept it. So it sounds reasonable to me to add new flags.

If existing users are not broken, then I think it might be a good idea
to extend existing syscall. Otherwise userspace will have to be modified
to understand a 3rd syscall also and an additional option will show up
which asks users to specify which syscall to use. So extending existing
syscall might keep it little simple for users.

This is only if conclusion in the end is that DT needs to be passed in
from user space.

BTW, does kexec_load() needs to be modified too to handle DT?

Vivek
Mark Rutland July 19, 2016, 12:47 p.m. UTC | #9
On Tue, Jul 19, 2016 at 08:24:06AM -0400, Vivek Goyal wrote:
> On Tue, Jul 19, 2016 at 11:52:00AM +0100, Mark Rutland wrote:
> > Regardless, this extended syscall changes some underlying assumptions
> > made with the development of kexec_file_load, and I think treating this
> > as an extension is not a great idea. From a user's perspective there is
> > little difference between passing an additional flag or using a
> > different syscall number, so I don't think that we gain much by altering
> > the existing prototype relative to allocating a new syscall number.
> 
> If we are providing/opening up additional flags, I can't think what will
> it break. Same flag was invalid in old kernel but new kernel supports 
> it and will accept it. So it sounds reasonable to me to add new flags.
> 
> If existing users are not broken, then I think it might be a good idea
> to extend existing syscall. Otherwise userspace will have to be modified
> to understand a 3rd syscall also and an additional option will show up
> which asks users to specify which syscall to use. So extending existing
> syscall might keep it little simple for users.

I don't follow.

To use the new feature, you have to modify userspace anyway, as you
require userspace to pass information which it did not previously pass
(in the new arguments added to the syscall).

The presence of a new syscall does not imply the absence of the old
syscall, so you can always use that be default unless the user asks for
asomething only the new syscall provides. Regardless of the
syscall/flags difference, you still have to detect whether the new
functionality is present somehow.

> BTW, does kexec_load() needs to be modified too to handle DT?

No, at least for arm64. In the kexec_load case userspace provides the
DTB as a raw segment, and the user-provided purgatory sets up registers
to pass that to the new kernel.

Thanks,
Mark.
Vivek Goyal July 19, 2016, 1:26 p.m. UTC | #10
On Tue, Jul 19, 2016 at 01:47:28PM +0100, Mark Rutland wrote:
> On Tue, Jul 19, 2016 at 08:24:06AM -0400, Vivek Goyal wrote:
> > On Tue, Jul 19, 2016 at 11:52:00AM +0100, Mark Rutland wrote:
> > > Regardless, this extended syscall changes some underlying assumptions
> > > made with the development of kexec_file_load, and I think treating this
> > > as an extension is not a great idea. From a user's perspective there is
> > > little difference between passing an additional flag or using a
> > > different syscall number, so I don't think that we gain much by altering
> > > the existing prototype relative to allocating a new syscall number.
> > 
> > If we are providing/opening up additional flags, I can't think what will
> > it break. Same flag was invalid in old kernel but new kernel supports 
> > it and will accept it. So it sounds reasonable to me to add new flags.
> > 
> > If existing users are not broken, then I think it might be a good idea
> > to extend existing syscall. Otherwise userspace will have to be modified
> > to understand a 3rd syscall also and an additional option will show up
> > which asks users to specify which syscall to use. So extending existing
> > syscall might keep it little simple for users.
> 
> I don't follow.
> 
> To use the new feature, you have to modify userspace anyway, as you
> require userspace to pass information which it did not previously pass
> (in the new arguments added to the syscall).
> 
> The presence of a new syscall does not imply the absence of the old
> syscall, so you can always use that be default unless the user asks for
> asomething only the new syscall provides. Regardless of the
> syscall/flags difference, you still have to detect whether the new
> functionality is present somehow.
> 

Hmm., so current idea is that we have two syscalls() which are *ideally*
supposed to work for all arches. Difference between two is that first
one does not support kernel signature verification while second one does.

By default old syscall is used and user can force using new syscall using
option --kexec-file-load.

If a user DTB is present, I was hoping that it will continue to work the
same way. Both the sycalls can be used and can handle DTB. If we introduce
a 3rd syscall, that means only first and 3rd syscall can handle DTB and
we need to introduce one more option which tells whether to use
kexec_load() or use the 3rd new syscall. And that's what I am trying
to avoid.

Vivek

> > BTW, does kexec_load() needs to be modified too to handle DT?
> 
> No, at least for arm64. In the kexec_load case userspace provides the
> DTB as a raw segment, and the user-provided purgatory sets up registers
> to pass that to the new kernel.
> 
> Thanks,
> Mark.
David Laight July 20, 2016, 11:41 a.m. UTC | #11
From: Dave Young

> On 07/15/16 at 02:19pm, Mark Rutland wrote:

> > On Fri, Jul 15, 2016 at 09:09:55AM -0400, Vivek Goyal wrote:

> > > On Tue, Jul 12, 2016 at 10:42:01AM +0900, AKASHI Takahiro wrote:

> > >

> > > [..]

> > > > -SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,

> > > > +SYSCALL_DEFINE6(kexec_file_load, int, kernel_fd, int, initrd_fd,

> > > >  		unsigned long, cmdline_len, const char __user *, cmdline_ptr,

> > > > -		unsigned long, flags)

> > > > +		unsigned long, flags, const struct kexec_fdset __user *, ufdset)

> > >

> > > Can one add more parameters to existing syscall. Can it break existing

> > > programs with new kernel? I was of the impression that one can't do that.

> > > But may be I am missing something.

> >

> > I think the idea was that we would only look at the new params if a new

> > flags was set, and otherwise it would behave as the old syscall.

> >

> > Regardless, I think it makes far more sense to add a kexec_file_load2

> > syscall if we're going to modify the prototype at all. It's a rather

> > different proposition to the existing syscall, and needs to be treated

> > as such.

> 

> I do not think it is worth to add another syscall for extra fds.

> We have open(2) as an example for different numbers of arguments

> already.


Probably works 'by luck' and no one has actually thought about why.
That ioctl() works is (probably) even more lucky.

There are ABI that use different calling conventions for varags functions
(eg always stack all the arguments). I guess linux doesn't run on any of them.

ioctl() is a particular problem because the 'arg' might be an integer or a pointer.
Fortunately all the 64bit ABI linux uses pass the arg parameter in a register
(and don't use different registers for pointer and data arguments).

You could have two 'libc' functions that refer to the same system call entry.
Certainly safer than a varargs function.

	David
Russell King - ARM Linux admin July 21, 2016, 9:21 a.m. UTC | #12
On Wed, Jul 20, 2016 at 11:41:35AM +0000, David Laight wrote:
> From: Dave Young
> > I do not think it is worth to add another syscall for extra fds.
> > We have open(2) as an example for different numbers of arguments
> > already.
> 
> Probably works 'by luck' and no one has actually thought about why.
> That ioctl() works is (probably) even more lucky.
> 
> There are ABI that use different calling conventions for varags functions
> (eg always stack all the arguments). I guess linux doesn't run on any of them.
> 
> ioctl() is a particular problem because the 'arg' might be an integer or a pointer.
> Fortunately all the 64bit ABI linux uses pass the arg parameter in a register
> (and don't use different registers for pointer and data arguments).
> 
> You could have two 'libc' functions that refer to the same system call entry.
> Certainly safer than a varargs function.

Don't forget that the syscall API is not a normal C function API - it's
special, because there's little point stacking arguments on the userspace
stack and then having the kernel function try and read them off the
kernelspace stack.

If an architecture does such a thing, then it needs special veneers to
handle that (reading off the userspace stack and placing them onto the
kernelspace stack, or the arch needs to define some other method of
handling the situation.)

So, really, the actual C APIs don't matter that much - what matters more
is the definition of a sane way to pass such arguments.  Given the
extensive historical nature of open() and ioctl(), it would be completely
silly not to create something which allows these calls to work.

Patch
diff mbox

diff --git a/include/linux/fs.h b/include/linux/fs.h
index dd28814..6dd6fdf 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2634,6 +2634,7 @@  extern int do_pipe_flags(int *, int);
 	id(MODULE, kernel-module)		\
 	id(KEXEC_IMAGE, kexec-image)		\
 	id(KEXEC_INITRAMFS, kexec-initramfs)	\
+	id(KEXEC_DTB, kexec-dtb)		\
 	id(POLICY, security-policy)		\
 	id(MAX_ID, )
 
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 554c848..5f11bd5 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -277,7 +277,7 @@  extern int kexec_load_disabled;
 
 /* List of defined/legal kexec file flags */
 #define KEXEC_FILE_FLAGS	(KEXEC_FILE_UNLOAD | KEXEC_FILE_ON_CRASH | \
-				 KEXEC_FILE_NO_INITRAMFS)
+				 KEXEC_FILE_NO_INITRAMFS | KEXEC_FILE_EXTRA_FDS)
 
 #define VMCOREINFO_BYTES           (4096)
 #define VMCOREINFO_NOTE_NAME       "VMCOREINFO"
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index d022390..fc072bd 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -66,6 +66,7 @@  struct perf_event_attr;
 struct file_handle;
 struct sigaltstack;
 union bpf_attr;
+struct kexec_fdset;
 
 #include <linux/types.h>
 #include <linux/aio_abi.h>
@@ -321,7 +322,8 @@  asmlinkage long sys_kexec_load(unsigned long entry, unsigned long nr_segments,
 asmlinkage long sys_kexec_file_load(int kernel_fd, int initrd_fd,
 				    unsigned long cmdline_len,
 				    const char __user *cmdline_ptr,
-				    unsigned long flags);
+				    unsigned long flags,
+				    const struct kexec_fdset __user *ufdset);
 
 asmlinkage long sys_exit(int error_code);
 asmlinkage long sys_exit_group(int error_code);
diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
index aae5ebf..adf53b6 100644
--- a/include/uapi/linux/kexec.h
+++ b/include/uapi/linux/kexec.h
@@ -23,6 +23,23 @@ 
 #define KEXEC_FILE_UNLOAD	0x00000001
 #define KEXEC_FILE_ON_CRASH	0x00000002
 #define KEXEC_FILE_NO_INITRAMFS	0x00000004
+#define KEXEC_FILE_EXTRA_FDS	0x00000008
+
+enum kexec_file_type {
+	KEXEC_FILE_TYPE_KERNEL,
+	KEXEC_FILE_TYPE_INITRAMFS,
+	KEXEC_FILE_TYPE_DTB,
+};
+
+struct kexec_file_fd {
+	enum kexec_file_type type;
+	int fd;
+};
+
+struct kexec_fdset {
+	int nr_fds;
+	struct kexec_file_fd fds[0];
+};
 
 /* These values match the ELF architecture values.
  * Unless there is a good reason that should continue to be the case.
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 7278329..451b4b0 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -137,11 +137,14 @@  void kimage_file_post_load_cleanup(struct kimage *image)
 static int
 kimage_file_prepare_segments(struct kimage *image, int kernel_fd, int initrd_fd,
 			     const char __user *cmdline_ptr,
-			     unsigned long cmdline_len, unsigned flags)
+			     unsigned long cmdline_len, unsigned long flags,
+			     const struct kexec_fdset __user *ufdset)
 {
-	int ret = 0;
+	int ret = 0, nr_fds, i;
 	void *ldata;
 	loff_t size;
+	struct kexec_fdset *fdset = NULL;
+	size_t fdset_size;
 
 	ret = kernel_read_file_from_fd(kernel_fd, &image->kernel_buf,
 				       &size, INT_MAX, READING_KEXEC_IMAGE);
@@ -174,6 +177,42 @@  kimage_file_prepare_segments(struct kimage *image, int kernel_fd, int initrd_fd,
 		image->initrd_buf_len = size;
 	}
 
+	if (flags & KEXEC_FILE_EXTRA_FDS) {
+		ret = copy_from_user(&nr_fds, ufdset, sizeof(int));
+		if (ret) {
+			ret = -EFAULT;
+			goto out;
+		}
+
+		fdset_size = sizeof(struct kexec_fdset)
+				+ nr_fds * sizeof(struct kexec_file_fd);
+		fdset = kmalloc(fdset_size, GFP_KERNEL);
+		if (!fdset) {
+			ret = -ENOMEM;
+			goto out;
+		}
+
+		ret = copy_from_user(fdset, ufdset, fdset_size);
+		if (ret) {
+			ret = -EFAULT;
+			goto out;
+		}
+
+		for (i = 0; i < fdset->nr_fds; i++) {
+			if (fdset->fds[i].type == KEXEC_FILE_TYPE_DTB) {
+				ret = kernel_read_file_from_fd(fdset->fds[i].fd,
+						&image->dtb_buf, &size, INT_MAX,
+						READING_KEXEC_DTB);
+				if (ret)
+					goto out;
+				image->dtb_buf_len = size;
+			} else {
+				pr_debug("unknown file type %d failed.\n",
+						fdset->fds[i].type);
+			}
+		}
+	}
+
 	if (cmdline_len) {
 		image->cmdline_buf = kzalloc(cmdline_len, GFP_KERNEL);
 		if (!image->cmdline_buf) {
@@ -208,6 +247,8 @@  kimage_file_prepare_segments(struct kimage *image, int kernel_fd, int initrd_fd,
 	image->image_loader_data = ldata;
 out:
 	/* In case of error, free up all allocated memory in this function */
+	kfree(fdset);
+
 	if (ret)
 		kimage_file_post_load_cleanup(image);
 	return ret;
@@ -216,7 +257,8 @@  out:
 static int
 kimage_file_alloc_init(struct kimage **rimage, int kernel_fd,
 		       int initrd_fd, const char __user *cmdline_ptr,
-		       unsigned long cmdline_len, unsigned long flags)
+		       unsigned long cmdline_len, unsigned long flags,
+		       const struct kexec_fdset __user *ufdset)
 {
 	int ret;
 	struct kimage *image;
@@ -235,7 +277,8 @@  kimage_file_alloc_init(struct kimage **rimage, int kernel_fd,
 	}
 
 	ret = kimage_file_prepare_segments(image, kernel_fd, initrd_fd,
-					   cmdline_ptr, cmdline_len, flags);
+					   cmdline_ptr, cmdline_len, flags,
+					   ufdset);
 	if (ret)
 		goto out_free_image;
 
@@ -270,9 +313,9 @@  out_free_image:
 	return ret;
 }
 
-SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
+SYSCALL_DEFINE6(kexec_file_load, int, kernel_fd, int, initrd_fd,
 		unsigned long, cmdline_len, const char __user *, cmdline_ptr,
-		unsigned long, flags)
+		unsigned long, flags, const struct kexec_fdset __user *, ufdset)
 {
 	int ret = 0, i;
 	struct kimage **dest_image, *image;
@@ -309,7 +352,7 @@  SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
 		kimage_free(xchg(&kexec_crash_image, NULL));
 
 	ret = kimage_file_alloc_init(&image, kernel_fd, initrd_fd, cmdline_ptr,
-				     cmdline_len, flags);
+				     cmdline_len, flags, ufdset);
 	if (ret)
 		goto out;