Patchwork [2/2] ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag

login
register
mail settings
Submitter Yann Droneaud
Date Aug. 24, 2013, 8:14 p.m.
Message ID <b7ae23be3a3d4a2f57eca6cfc7f9361da883ceca.1377372576.git.ydroneaud@opteya.com>
Download mbox | patch
Permalink /patch/269667/
State New
Headers show

Comments

Yann Droneaud - Aug. 24, 2013, 8:14 p.m.
KVM uses anon_inode_get() to allocate file descriptors as part
of some of its ioctls. But those ioctls are lacking a flag argument
allowing userspace to choose options for the newly opened file descriptor.

In such case it's advised to use O_CLOEXEC by default so that
userspace is allowed to choose, without race, if the file descriptor
is going to be inherited across exec().

This patch set O_CLOEXEC flag on all file descriptors created
with anon_inode_getfd() to not leak file descriptors across exec().

Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://lkml.kernel.org/r/cover.1377372576.git.ydroneaud@opteya.com

---
 arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
 arch/powerpc/kvm/book3s_64_vio.c    | 2 +-
 arch/powerpc/kvm/book3s_hv.c        | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)
Alexander Graf - Aug. 25, 2013, 3:04 p.m.
On 24.08.2013, at 21:14, Yann Droneaud wrote:

> KVM uses anon_inode_get() to allocate file descriptors as part
> of some of its ioctls. But those ioctls are lacking a flag argument
> allowing userspace to choose options for the newly opened file descriptor.
> 
> In such case it's advised to use O_CLOEXEC by default so that
> userspace is allowed to choose, without race, if the file descriptor
> is going to be inherited across exec().
> 
> This patch set O_CLOEXEC flag on all file descriptors created
> with anon_inode_getfd() to not leak file descriptors across exec().
> 
> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
> Link: http://lkml.kernel.org/r/cover.1377372576.git.ydroneaud@opteya.com

Reviewed-by: Alexander Graf <agraf@suse.de>

Would it make sense to simply inherit the O_CLOEXEC flag from the parent kvm fd instead? That would give user space the power to keep fds across exec() if it wants to.


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paolo Bonzini - Aug. 26, 2013, 7:39 a.m.
Il 25/08/2013 17:04, Alexander Graf ha scritto:
> 
> On 24.08.2013, at 21:14, Yann Droneaud wrote:
> 
>> KVM uses anon_inode_get() to allocate file descriptors as part
>> of some of its ioctls. But those ioctls are lacking a flag argument
>> allowing userspace to choose options for the newly opened file descriptor.
>>
>> In such case it's advised to use O_CLOEXEC by default so that
>> userspace is allowed to choose, without race, if the file descriptor
>> is going to be inherited across exec().
>>
>> This patch set O_CLOEXEC flag on all file descriptors created
>> with anon_inode_getfd() to not leak file descriptors across exec().
>>
>> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
>> Link: http://lkml.kernel.org/r/cover.1377372576.git.ydroneaud@opteya.com
> 
> Reviewed-by: Alexander Graf <agraf@suse.de>
> 
> Would it make sense to simply inherit the O_CLOEXEC flag from the
> parent kvm fd instead? That would give user space the power to keep
> fds across exec() if it wants to.

Does it make sense to use non-O_CLOEXEC file descriptors with KVM at
all?  Besides fork() not being supported by KVM, as described in
Documentation/virtual/kvm/api.txt, the VMAs of the parent process go
away as soon as you exec().  I'm not sure how you can use the inherited
file descriptor in a sensible way after exec().

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yann Droneaud - Aug. 26, 2013, 8:23 a.m.
Le 26.08.2013 09:39, Paolo Bonzini a écrit :
> Il 25/08/2013 17:04, Alexander Graf ha scritto:
>> On 24.08.2013, at 21:14, Yann Droneaud wrote:
>> 
>>> 
>>> This patch set O_CLOEXEC flag on all file descriptors created
>>> with anon_inode_getfd() to not leak file descriptors across exec().
>>> 
>>> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
>>> Link: 
>>> http://lkml.kernel.org/r/cover.1377372576.git.ydroneaud@opteya.com
>> 
>> Reviewed-by: Alexander Graf <agraf@suse.de>
>> 
>> Would it make sense to simply inherit the O_CLOEXEC flag from the
>> parent kvm fd instead? That would give user space the power to keep
>> fds across exec() if it wants to.
> 
> Does it make sense to use non-O_CLOEXEC file descriptors with KVM at
> all?  Besides fork() not being supported by KVM, as described in
> Documentation/virtual/kvm/api.txt, the VMAs of the parent process go
> away as soon as you exec().  I'm not sure how you can use the inherited
> file descriptor in a sensible way after exec().
> 

Sounds a lot like InfiniBand subsystem behavor: IB file descriptors
are of no use accross exec() since memory mappings tied to those fds
won't be available in the new process:

https://lkml.org/lkml/2013/7/8/380
http://mid.gmane.org/f58540dc64fec1ac0e496dfcd3cc1af7@meuh.org

Regards.
Paolo Bonzini - Aug. 26, 2013, 8:28 a.m.
Il 26/08/2013 10:23, Yann Droneaud ha scritto:
> 
> Sounds a lot like InfiniBand subsystem behavor: IB file descriptors
> are of no use accross exec() since memory mappings tied to those fds
> won't be available in the new process:
> 
> https://lkml.org/lkml/2013/7/8/380
> http://mid.gmane.org/f58540dc64fec1ac0e496dfcd3cc1af7@meuh.org

Yes, it is very similar.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 710d313..f7c9e8a 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -1579,7 +1579,7 @@  int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *ghf)
 	ctx->first_pass = 1;
 
 	rwflag = (ghf->flags & KVM_GET_HTAB_WRITE) ? O_WRONLY : O_RDONLY;
-	ret = anon_inode_getfd("kvm-htab", &kvm_htab_fops, ctx, rwflag);
+	ret = anon_inode_getfd("kvm-htab", &kvm_htab_fops, ctx, rwflag | O_CLOEXEC);
 	if (ret < 0) {
 		kvm_put_kvm(kvm);
 		return ret;
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index b2d3f3b..54cf9bc 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -136,7 +136,7 @@  long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
 	mutex_unlock(&kvm->lock);
 
 	return anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
-				stt, O_RDWR);
+				stt, O_RDWR | O_CLOEXEC);
 
 fail:
 	if (stt) {
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index e8d51cb..3503829 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1556,7 +1556,7 @@  long kvm_vm_ioctl_allocate_rma(struct kvm *kvm, struct kvm_allocate_rma *ret)
 	if (!ri)
 		return -ENOMEM;
 
-	fd = anon_inode_getfd("kvm-rma", &kvm_rma_fops, ri, O_RDWR);
+	fd = anon_inode_getfd("kvm-rma", &kvm_rma_fops, ri, O_RDWR | O_CLOEXEC);
 	if (fd < 0)
 		kvm_release_rma(ri);