Patchwork [for-1.4,04/12] kvm: Create kvm_arch_vcpu_id() function

Submitter Eduardo Habkost
Date Jan. 17, 2013, 8:59 p.m.
Message ID <1358456378-29248-5-git-send-email-ehabkost@redhat.com>
Permalink /patch/213373/
State New

Comments

Eduardo Habkost - Jan. 17, 2013, 8:59 p.m.
This will allow each architecture to define how the VCPU ID is set on
the KVM_CREATE_VCPU ioctl call.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
Cc: kvm@vger.kernel.org
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>

Changes v2:
 - Get CPUState as argument instead of CPUArchState
---
 include/sysemu/kvm.h | 3 +++
 kvm-all.c            | 2 +-
 target-i386/kvm.c    | 5 +++++
 target-ppc/kvm.c     | 5 +++++
 target-s390x/kvm.c   | 5 +++++
 5 files changed, 19 insertions(+), 1 deletion(-)
Andreas Färber - Jan. 18, 2013, 11:11 a.m.
On 17.01.2013 21:59, Eduardo Habkost wrote:
> This will allow each architecture to define how the VCPU ID is set on
> the KVM_CREATE_VCPU ioctl call.
> 
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> ---
> Cc: kvm@vger.kernel.org
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Gleb Natapov <gleb@redhat.com>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> 
> Changes v2:
>  - Get CPUState as argument instead of CPUArchState
> ---
>  include/sysemu/kvm.h | 3 +++
>  kvm-all.c            | 2 +-
>  target-i386/kvm.c    | 5 +++++
>  target-ppc/kvm.c     | 5 +++++
>  target-s390x/kvm.c   | 5 +++++
>  5 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index 22acf91..384ee66 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -196,6 +196,9 @@ int kvm_arch_init(KVMState *s);
>  
>  int kvm_arch_init_vcpu(CPUState *cpu);
>  
> +/* Returns VCPU ID to be used on KVM_CREATE_VCPU ioctl() */
> +unsigned long kvm_arch_vcpu_id(CPUState *cpu);
> +
>  void kvm_arch_reset_vcpu(CPUState *cpu);
>  
>  int kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr);
> diff --git a/kvm-all.c b/kvm-all.c
> index 6278d61..995220d 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -222,7 +222,7 @@ int kvm_init_vcpu(CPUState *cpu)
>  
>      DPRINTF("kvm_init_vcpu\n");
>  
> -    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, cpu->cpu_index);
> +    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, kvm_arch_vcpu_id(cpu));
>      if (ret < 0) {
>          DPRINTF("kvm_create_vcpu failed\n");
>          goto err;

This is changing the vararg from int to unsigned long. I have no
insights yet on how this is handled and whether that is okay; I would at
least expect this change to be mentioned in the commit message.

> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index 3acff40..5f3f789 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -411,6 +411,11 @@ static void cpu_update_state(void *opaque, int running, RunState state)
>      }
>  }
>  
> +unsigned long kvm_arch_vcpu_id(CPUState *cpu)
> +{
> +    return cpu->cpu_index;
> +}
> +
>  int kvm_arch_init_vcpu(CPUState *cs)
>  {
>      struct {

Minor nit: If you change this to CPUState *cs you spare the renaming in
05/12. Alternatively use x86_cpu there (not much code affected so you
can just ignore this, no need to respin just for that).

Otherwise looks okay to me.

Andreas

> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 19e9f25..1e544ae 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -384,6 +384,11 @@ static inline void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>  
>  #endif /* !defined (TARGET_PPC64) */
>  
> +unsigned long kvm_arch_vcpu_id(CPUState *cpu)
> +{
> +    return cpu->cpu_index;
> +}
> +
>  int kvm_arch_init_vcpu(CPUState *cs)
>  {
>      PowerPCCPU *cpu = POWERPC_CPU(cs);
> diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
> index 6ec5e6d..bd9864c 100644
> --- a/target-s390x/kvm.c
> +++ b/target-s390x/kvm.c
> @@ -72,6 +72,11 @@ int kvm_arch_init(KVMState *s)
>      return 0;
>  }
>  
> +unsigned long kvm_arch_vcpu_id(CPUState *cpu)
> +{
> +    return cpu->cpu_index;
> +}
> +
>  int kvm_arch_init_vcpu(CPUState *cpu)
>  {
>      int ret = 0;
Eduardo Habkost - Jan. 18, 2013, 12:53 p.m.
On Fri, Jan 18, 2013 at 12:11:29PM +0100, Andreas Färber wrote:
[...]
> > +/* Returns VCPU ID to be used on KVM_CREATE_VCPU ioctl() */
> > +unsigned long kvm_arch_vcpu_id(CPUState *cpu);
> > +
> >  void kvm_arch_reset_vcpu(CPUState *cpu);
> >  
> >  int kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr);
> > diff --git a/kvm-all.c b/kvm-all.c
> > index 6278d61..995220d 100644
> > --- a/kvm-all.c
> > +++ b/kvm-all.c
> > @@ -222,7 +222,7 @@ int kvm_init_vcpu(CPUState *cpu)
> >  
> >      DPRINTF("kvm_init_vcpu\n");
> >  
> > -    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, cpu->cpu_index);
> > +    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, kvm_arch_vcpu_id(cpu));
> >      if (ret < 0) {
> >          DPRINTF("kvm_create_vcpu failed\n");
> >          goto err;
> 
> This is changing the vararg from int to unsigned long. I have no
> insights yet on how this is handled and whether that is okay; I would at
> least expect this change to be mentioned in the commit message.

It was an unexpected change (I didn't notice that cpu_index was int),
but strictly speaking the previous code was incorrect (as ioctl() gets
an unsigned long argument, not int). I doubt there are cases where it
would really break, but it is a good thing to fix it.

I agree this should be mentioned in the commit message, though. Will you
add it before committing, or should I resubmit?

> 
> > diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> > index 3acff40..5f3f789 100644
> > --- a/target-i386/kvm.c
> > +++ b/target-i386/kvm.c
> > @@ -411,6 +411,11 @@ static void cpu_update_state(void *opaque, int running, RunState state)
> >      }
> >  }
> >  
> > +unsigned long kvm_arch_vcpu_id(CPUState *cpu)
> > +{
> > +    return cpu->cpu_index;
> > +}
> > +
> >  int kvm_arch_init_vcpu(CPUState *cs)
> >  {
> >      struct {
> 
> Minor nit: If you change this to CPUState *cs you spare the renaming in
> 05/12. Alternatively use x86_cpu there (not much code affected so you
> can just ignore this, no need to respin just for that).
> 
> Otherwise looks okay to me.

I actually wanted to rename the variable only when necessary, otherwise
this patch would be confusing if all architectures used 'cpu' and i386
used 'cs'.

(And I like using "cpu" for the more specific CPU type in the function
[e.g.  CPUState or X86CPUState depending on the case] and abbreviations
[like 'cs'] for the more generic types. I believe I have seen this style
used in other parts of the code.)

> 
> Andreas
> 
> > diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> > index 19e9f25..1e544ae 100644
> > --- a/target-ppc/kvm.c
> > +++ b/target-ppc/kvm.c
> > @@ -384,6 +384,11 @@ static inline void kvm_fixup_page_sizes(PowerPCCPU *cpu)
> >  
> >  #endif /* !defined (TARGET_PPC64) */
> >  
> > +unsigned long kvm_arch_vcpu_id(CPUState *cpu)
> > +{
> > +    return cpu->cpu_index;
> > +}
> > +
> >  int kvm_arch_init_vcpu(CPUState *cs)
> >  {
> >      PowerPCCPU *cpu = POWERPC_CPU(cs);
> > diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
> > index 6ec5e6d..bd9864c 100644
> > --- a/target-s390x/kvm.c
> > +++ b/target-s390x/kvm.c
> > @@ -72,6 +72,11 @@ int kvm_arch_init(KVMState *s)
> >      return 0;
> >  }
> >  
> > +unsigned long kvm_arch_vcpu_id(CPUState *cpu)
> > +{
> > +    return cpu->cpu_index;
> > +}
> > +
> >  int kvm_arch_init_vcpu(CPUState *cpu)
> >  {
> >      int ret = 0;
> 
> -- 
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Andreas Färber - Jan. 18, 2013, 1:03 p.m.
On 18.01.2013 13:53, Eduardo Habkost wrote:
> On Fri, Jan 18, 2013 at 12:11:29PM +0100, Andreas Färber wrote:
> [...]
>>> +/* Returns VCPU ID to be used on KVM_CREATE_VCPU ioctl() */
>>> +unsigned long kvm_arch_vcpu_id(CPUState *cpu);
>>> +
>>>  void kvm_arch_reset_vcpu(CPUState *cpu);
>>>  
>>>  int kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr);
>>> diff --git a/kvm-all.c b/kvm-all.c
>>> index 6278d61..995220d 100644
>>> --- a/kvm-all.c
>>> +++ b/kvm-all.c
>>> @@ -222,7 +222,7 @@ int kvm_init_vcpu(CPUState *cpu)
>>>  
>>>      DPRINTF("kvm_init_vcpu\n");
>>>  
>>> -    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, cpu->cpu_index);
>>> +    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, kvm_arch_vcpu_id(cpu));
>>>      if (ret < 0) {
>>>          DPRINTF("kvm_create_vcpu failed\n");
>>>          goto err;
>>
>> This is changing the vararg from int to unsigned long. I have no
>> insights yet on how this is handled and whether that is okay; I would at
>> least expect this change to be mentioned in the commit message.
> 
> It was an unexpected change (I didn't notice that cpu_index was int),
> but strictly speaking the previous code was incorrect (as ioctl() gets
> an unsigned long argument, not int). I doubt there are cases where it
> would really break, but it is a good thing to fix it.
> 
> I agree this should be mentioned in the commit message, though. Will you
> add it before committing, or should I resubmit?

Could you suggest a text for me to add please?

>>> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
>>> index 3acff40..5f3f789 100644
>>> --- a/target-i386/kvm.c
>>> +++ b/target-i386/kvm.c
>>> @@ -411,6 +411,11 @@ static void cpu_update_state(void *opaque, int running, RunState state)
>>>      }
>>>  }
>>>  
>>> +unsigned long kvm_arch_vcpu_id(CPUState *cpu)
>>> +{
>>> +    return cpu->cpu_index;
>>> +}
>>> +
>>>  int kvm_arch_init_vcpu(CPUState *cs)
>>>  {
>>>      struct {
>>
>> Minor nit: If you change this to CPUState *cs you spare the renaming in
>> 05/12. Alternatively use x86_cpu there (not much code affected so you
>> can just ignore this, no need to respin just for that).
>>
>> Otherwise looks okay to me.
> 
> I actually wanted to rename the variable only when necessary, otherwise
> this patch would be confusing if all architectures used 'cpu' and i386
> used 'cs'.

It's inconsistent anyway, 'cs' is relatively new and I see no reason to
use it in the prototype.
But OK, once 03/12 gets an ack I'll start applying.

> 
> (And I like using "cpu" for the more specific CPU type in the function
> [e.g.  CPUState or X86CPUState depending on the case] and abbreviations
> [like 'cs'] for the more generic types. I believe I have seen this style
> used in other parts of the code.)

Yes, I chose to use "cpu" for the more frequently used type. I don't
really like "cs" but it seemed better than "base_cpu" or so. When
changing something over from "env" something with three letters looks
nicer though, easier to review. Can't have everything. ;)

Andreas
Eduardo Habkost - Jan. 18, 2013, 2:20 p.m.
On Fri, Jan 18, 2013 at 02:03:09PM +0100, Andreas Färber wrote:
> On 18.01.2013 13:53, Eduardo Habkost wrote:
> > On Fri, Jan 18, 2013 at 12:11:29PM +0100, Andreas Färber wrote:
> > [...]
> >>> +/* Returns VCPU ID to be used on KVM_CREATE_VCPU ioctl() */
> >>> +unsigned long kvm_arch_vcpu_id(CPUState *cpu);
> >>> +
> >>>  void kvm_arch_reset_vcpu(CPUState *cpu);
> >>>  
> >>>  int kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr);
> >>> diff --git a/kvm-all.c b/kvm-all.c
> >>> index 6278d61..995220d 100644
> >>> --- a/kvm-all.c
> >>> +++ b/kvm-all.c
> >>> @@ -222,7 +222,7 @@ int kvm_init_vcpu(CPUState *cpu)
> >>>  
> >>>      DPRINTF("kvm_init_vcpu\n");
> >>>  
> >>> -    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, cpu->cpu_index);
> >>> +    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, kvm_arch_vcpu_id(cpu));
> >>>      if (ret < 0) {
> >>>          DPRINTF("kvm_create_vcpu failed\n");
> >>>          goto err;
> >>
> >> This is changing the vararg from int to unsigned long. I have no
> >> insights yet on how this is handled and whether that is okay; I would at
> >> least expect this change to be mentioned in the commit message.
> > 
> > It was an unexpected change (I didn't notice that cpu_index was int),
> > but strictly speaking the previous code was incorrect (as ioctl() gets
> > an unsigned long argument, not int). I doubt there are cases where it
> > would really break, but it is a good thing to fix it.
> > 
> > I agree this should be mentioned in the commit message, though. Will you
> > add it before committing, or should I resubmit?
> 
> Could you suggest a text for me to add please?

"The argument passed to KVM_CREATE_VCPU now has 'unsigned long' type
instead of 'int', as expected by the Linux ioctl() syscall. Maybe an int
works on most or all architectures supporting KVM, but it is safer to
use an appropriate 'unsigned long' parameter."

To find out if 'int' breaks on any architecture, I would need to check
the ABI specification for each architecture. I didn't do that, but I am
sure we should pass an unsigned long instead, if that's the type
expected by the kernel.
Eric Blake - Jan. 18, 2013, 4:11 p.m.
On 01/18/2013 07:20 AM, Eduardo Habkost wrote:
>> Could you suggest a text for me to add please?
> 
> "The argument passed to KVM_CREATE_VCPU now has 'unsigned long' type
> instead of 'int', as expected by the Linux ioctl() syscall. Maybe an int
> works on most or all architectures supporting KVM, but it is safer to
> use an appropriate 'unsigned long' parameter."

Interestingly enough, while the Linux syscall uses 'unsigned long', the
POSIX definition of ioctl() uses 'int'; so the Linux kernel is already
constrained to never use an ioctl value that doesn't fit within 'int',
and glibc is already responsible for ensuring that argument promotion of
an int doesn't change the behavior of ioctl() in libc when converting it
over to the unsigned long syscall semantics expected by the kernel.

> 
> To find out if 'int' breaks on any architecture, I would need to check
> the ABI specification for each architecture. I didn't do that, but I am
> sure we should pass an unsigned long instead, if that's the type
> expected by the kernel.
>
Eduardo Habkost - Jan. 18, 2013, 4:40 p.m.
On Fri, Jan 18, 2013 at 09:11:42AM -0700, Eric Blake wrote:
> On 01/18/2013 07:20 AM, Eduardo Habkost wrote:
> >> Could you suggest a text for me to add please?
> > 
> > "The argument passed to KVM_CREATE_VCPU now has 'unsigned long' type
> > instead of 'int', as expected by the Linux ioctl() syscall. Maybe an int
> > works on most or all architectures supporting KVM, but it is safer to
> > use an appropriate 'unsigned long' parameter."
> 
> Interestingly enough, while the Linux syscall uses 'unsigned long', the
> POSIX definition of ioctl() uses 'int'; so the Linux kernel is already
> constrained to never use an ioctl value that doesn't fit within 'int',

Really? What about the ioctl()s that get a pointer as argument on
architectures where pointers don't fit in an int?

Do you have a pointer to the POSIX definition you are talking about?

Note that I'm talking about the extra ioctl() argument, not the
ioctl() number (that is an unsigned int in the kernel code).


> and glibc is already responsible for ensuring that argument promotion of
> an int doesn't change the behavior of ioctl() in libc when converting it
> over to the unsigned long syscall semantics expected by the kernel.
> 
> > 
> > To find out if 'int' breaks on any architecture, I would need to check
> > the ABI specification for each architecture. I didn't do that, but I am
> > sure we should pass an unsigned long instead, if that's the type
> > expected by the kernel.
> > 
> 
> -- 
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
Eric Blake - Jan. 18, 2013, 5:46 p.m.
On 01/18/2013 09:40 AM, Eduardo Habkost wrote:
> On Fri, Jan 18, 2013 at 09:11:42AM -0700, Eric Blake wrote:
>> On 01/18/2013 07:20 AM, Eduardo Habkost wrote:
>>>> Could you suggest a text for me to add please?
>>>
>>> "The argument passed to KVM_CREATE_VCPU now has 'unsigned long' type
>>> instead of 'int', as expected by the Linux ioctl() syscall. Maybe an int
>>> works on most or all architectures supporting KVM, but it is safer to
>>> use an appropriate 'unsigned long' parameter."
>>
>> Interestingly enough, while the Linux syscall uses 'unsigned long', the
>> POSIX definition of ioctl() uses 'int'; so the Linux kernel is already
>> constrained to never use an ioctl value that doesn't fit within 'int',
> 
> Really? What about the ioctl()s that get a pointer as argument on
> architectures where pointers don't fit in an int?
> 
> Do you have a pointer to the POSIX definition you are talking about?
> 
> Note that I'm talking about the extra ioctl() argument, not the
> ioctl() number (that is an unsigned int in the kernel code).

Okay, now you made me go back and check sources.

POSIX 2008 says:
#include <stropts.h>
int ioctl(int fildes, int request, ... /* arg */);

Gnulib says this about a bug that it works around:
@item
On glibc platforms, the second parameter is of type @code{unsigned long}
rather than @code{int}.

But gnulib also suggests using <sys/ioctl.h> instead of the POSIX header
<stropts.h> for getting ioctl(), because <stropts.h> was declared
obsolete in POSIX 2008 and was never implemented in glibc.

Sure enough, looking at Fedora 18 /usr/include/sys/ioctl.h, I still see:
extern int ioctl (int __fd, unsigned long int __request, ...) __THROW;

Meanwhile, you are correct that the kernel defines request as 32 bits:
linux.git:include/uapi/asm-generic/ioctl.h
/* ioctl command encoding: 32 bits total, command in lower 16 bits,
 * size of the parameter structure in the lower 14 bits of the
 * upper 16 bits.
 * Encoding the size of the parameter structure in the ioctl request
 * is useful for catching programs compiled with old versions
 * and to avoid overwriting user space outside the user buffer area.
 * The highest 2 bits are reserved for indicating the ``access mode''.
 * NOTE: This limits the max parameter size to 16kB -1 !
 */

> 
>> and glibc is already responsible for ensuring that argument promotion of
>> an int doesn't change the behavior of ioctl() in libc when converting it
>> over to the unsigned long syscall semantics expected by the kernel.

So a more precise wording of this is:

glibc is already responsible for converting the 'unsigned long int' of
the user declaration back into the 'unsigned int' that the kernel
expects for the second argument.  The third argument (when present) is
generally treated as a pointer (of size appropriate for the
architecture).  Although there _might_ be an ioctl that uses it directly
as an integer instead of dereferencing it as a pointer, those would be
the exceptions to the rule.
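
The 32-bit request encoding quoted above can be checked with a small C
sketch. It assumes a Linux system with the kernel UAPI headers installed
and uses the _IO() and _IOC_*() macros from <linux/ioctl.h>. The values
0xAE and 0x41 are what I believe <linux/kvm.h> uses for KVMIO and
KVM_CREATE_VCPU; they are repeated here under DEMO_ names only so the
example does not need the KVM header, and should be treated as an
assumption. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <linux/ioctl.h>   /* _IO(), _IOC_DIR(), _IOC_TYPE(), _IOC_NR(), _IOC_SIZE() */

/* Assumed values, mirroring KVMIO / KVM_CREATE_VCPU in <linux/kvm.h>. */
#define DEMO_KVMIO            0xAE
#define DEMO_KVM_CREATE_VCPU  _IO(DEMO_KVMIO, 0x41)

/* Hypothetical helper type: the four fields packed into a request number. */
struct demo_ioctl_fields {
    unsigned int dir;   /* access mode (highest bits) */
    unsigned int type;  /* "magic" byte identifying the subsystem */
    unsigned int nr;    /* command number within the subsystem */
    unsigned int size;  /* size of the parameter structure, 0 for _IO() */
};

/* Split a request number into its fields using the kernel's own macros. */
struct demo_ioctl_fields demo_decode_ioctl(unsigned int request)
{
    struct demo_ioctl_fields f;

    f.dir  = _IOC_DIR(request);
    f.type = _IOC_TYPE(request);
    f.nr   = _IOC_NR(request);
    f.size = _IOC_SIZE(request);
    return f;
}

/* Self-check: an _IO() request carries no size and no data direction,
 * and the whole request number fits in 32 bits. */
void demo_ioctl_selfcheck(void)
{
    struct demo_ioctl_fields f = demo_decode_ioctl(DEMO_KVM_CREATE_VCPU);

    assert(f.dir == _IOC_NONE);
    assert(f.type == DEMO_KVMIO);
    assert(f.nr == 0x41);
    assert(f.size == 0);
    assert((unsigned long)DEMO_KVM_CREATE_VCPU <= 0xFFFFFFFFUL);
}
```

Calling demo_ioctl_selfcheck() from main() exercises the checks. Only
the request number is covered here; the extra (third) ioctl() argument
is a separate matter.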
Andreas Färber - Jan. 21, 2013, 1:14 p.m.
On 18.01.2013 18:46, Eric Blake wrote:
> On 01/18/2013 09:40 AM, Eduardo Habkost wrote:
>> On Fri, Jan 18, 2013 at 09:11:42AM -0700, Eric Blake wrote:
>>> On 01/18/2013 07:20 AM, Eduardo Habkost wrote:
>>>>> Could you suggest a text for me to add please?
>>>> 
>>>> "The argument passed to KVM_CREATE_VCPU now has 'unsigned
>>>> long' type instead of 'int', as expected by the Linux ioctl()
>>>> syscall. Maybe an int works on most or all architectures
>>>> supporting KVM, but it is safer to use an appropriate
>>>> 'unsigned long' parameter."
>>> 
>>> Interestingly enough, while the Linux syscall uses 'unsigned
>>> long', the POSIX definition of ioctl() uses 'int'; so the Linux
>>> kernel is already constrained to never use an ioctl value that
>>> doesn't fit within 'int',
>> 
>> Really? What about the ioctl()s that get a pointer as argument
>> on architectures where pointers don't fit in an int?
>> 
>> Do you have a pointer to the POSIX definition you are talking
>> about?
>> 
>> Note that I'm talking about the extra ioctl() argument, not
>> the ioctl() number (that is an unsigned int in the kernel code).
> 
> Okay, now you made me go back and check sources.
> 
> POSIX 2008 says:
> #include <stropts.h>
> int ioctl(int fildes, int request, ... /* arg */);
> 
> Gnulib says this about a bug that it works around:
> @item
> On glibc platforms, the second parameter is of type @code{unsigned long}
> rather than @code{int}.
> 
> But gnulib also suggests using <sys/ioctl.h> instead of the POSIX
> header <stropts.h> for getting ioctl(), because <stropts.h> was
> declared obsolete in POSIX 2008 and was never implemented in
> glibc.
> 
> Sure enough, looking at Fedora 18 /usr/include/sys/ioctl.h, I still
> see:
> extern int ioctl (int __fd, unsigned long int __request, ...) __THROW;
> 
> Meanwhile, you are correct that the kernel defines request as 32
> bits:
> linux.git:include/uapi/asm-generic/ioctl.h
> /* ioctl command encoding: 32 bits total, command in lower 16 bits,
>  * size of the parameter structure in the lower 14 bits of the
>  * upper 16 bits.
>  * Encoding the size of the parameter structure in the ioctl request
>  * is useful for catching programs compiled with old versions
>  * and to avoid overwriting user space outside the user buffer area.
>  * The highest 2 bits are reserved for indicating the ``access mode''.
>  * NOTE: This limits the max parameter size to 16kB -1 !
>  */
> 
>> 
>>> and glibc is already responsible for ensuring that argument
>>> promotion of an int doesn't change the behavior of ioctl() in
>>> libc when converting it over to the unsigned long syscall
>>> semantics expected by the kernel.
> 
> So a more precise wording of this is:
> 
> glibc is already responsible for converting the 'unsigned long
> int' of the user declaration back into the 'unsigned int' that the
> kernel expects for the second argument.  The third argument (when
> present), is generally treated as a pointer (of size appropriate
> for the architecture).  Although there _might_ be an ioctl that
> uses it directly as an integer instead of dereferencing it as a
> pointer, those would be the exceptions to the rule.

So ... do we have a conclusion what to put into the commit message? :)

It looks to me as if kvm-all.c:kvm_vm_ioctl() is using void*. I like
unsigned long but maybe uintptr_t would be more correct then?

Or should kvm_vm_ioctl() be fixed to use something else instead?
Eric's int would be a semantic change for the 64-bit platforms, no?

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Eric Blake - Jan. 21, 2013, 2:35 p.m.
On 01/21/2013 06:14 AM, Andreas Färber wrote:
>> glibc is already responsible for converting the 'unsigned long
>> int' of the user declaration back into the 'unsigned int' that the
>> kernel expects for the second argument.  The third argument (when
>> present), is generally treated as a pointer (of size appropriate
>> for the architecture).  Although there _might_ be an ioctl that
>> uses it directly as an integer instead of dereferencing it as a
>> pointer, those would be the exceptions to the rule.
> 
> So ... do we have a conclusion what to put into the commit message? :)
> 
> It looks to me as if kvm-all.c:kvm_vm_ioctl() is using void*. I like
> unsigned long but maybe uintptr_t would be more correct then?

uintptr_t feels more correct - the 3rd (vararg) argument through the
ioctl() syscall is always retrieved using the same size as void*.

> 
> Or should kvm_vm_ioctl() be fixed to use something else instead?
> Eric's int would be a semantic change for the 64-bit platforms, no?

My discussion about 'int' vs. 'unsigned long' was in regards to the
second argument KVM_CREATE_VCPU, which your patch does not change
(perhaps my fault for jumping in on a conversation mid-thread without
actually reading your original patch, which I have now done).  That is,
KVM_CREATE_VCPU as a constant is always 32 bits (kernel constraint),
widened out to unsigned long when passed to the glibc function (due to
the glibc signature disagreeing with POSIX), then narrowed back down to
32 bits when forwarded to the kernel syscall.

Meanwhile, your patch is fixing the third argument from 'int' to a wider
type, which is necessary for passing that value through varargs when the
receiving end will retrieve the same argument via a void* variable.
Marcelo Tosatti - Jan. 22, 2013, 1:44 a.m.
On Thu, Jan 17, 2013 at 06:59:30PM -0200, Eduardo Habkost wrote:
> This will allow each architecture to define how the VCPU ID is set on
> the KVM_CREATE_VCPU ioctl call.
> 
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> ---
> Cc: kvm@vger.kernel.org
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Gleb Natapov <gleb@redhat.com>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> 
> Changes v2:
>  - Get CPUState as argument instead of CPUArchState
> ---
>  include/sysemu/kvm.h | 3 +++
>  kvm-all.c            | 2 +-
>  target-i386/kvm.c    | 5 +++++
>  target-ppc/kvm.c     | 5 +++++
>  target-s390x/kvm.c   | 5 +++++
>  5 files changed, 19 insertions(+), 1 deletion(-)

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Eduardo Habkost - Jan. 22, 2013, 3:54 p.m.
On Mon, Jan 21, 2013 at 07:35:22AM -0700, Eric Blake wrote:
> On 01/21/2013 06:14 AM, Andreas Färber wrote:
> >> glibc is already responsible for converting the 'unsigned long
> >> int' of the user declaration back into the 'unsigned int' that the
> >> kernel expects for the second argument.  The third argument (when
> >> present), is generally treated as a pointer (of size appropriate
> >> for the architecture).  Although there _might_ be an ioctl that
> >> uses it directly as an integer instead of dereferencing it as a
> >> pointer, those would be the exceptions to the rule.
> > 
> > So ... do we have a conclusion what to put into the commit message? :)
> > 
> > It looks to me as if kvm-all.c:kvm_vm_ioctl() is using void*. I like
> > unsigned long but maybe uintptr_t would be more correct then?
> 
> uintptr_t feels more correct - the 3rd (vararg) argument through the
> ioctl() syscall is always retrieved using the same size as void*.

Actually, sys_ioctl() always retrieves it as "unsigned long", but
nothing prevents the arch-specific syscall entry code from
translating something of a different type to "unsigned long" before
calling sys_ioctl().

So I guess the only guarantee we have is the Linux ioctl(2) man page,
that says: "The third argument is an untyped pointer to memory. It's
traditionally char *argp (from the days before void * was valid C), and
will be so named for this discussion."

That said, I plan to change the code to cast the argument to (void*) in
the next version.

> 
> > 
> > Or should kvm_vm_ioctl() be fixed to use something else instead?
> > Eric's int would be a semantic change for the 64-bit platforms, no?
> 
> My discussion about 'int' vs. 'unsigned long' was in regards to the
> second argument KVM_CREATE_VCPU, which your patch does not change
> (perhaps my fault for jumping in on a conversation mid-thread without
> actually reading your original patch, which I have now done).  That is,
> KVM_CREATE_VCPU as a constant is always 32 bits (kernel constraint),
> widened out to unsigned long when passed to the glibc function (due to
> the glibc signature disagreeing with POSIX), then narrowed back down to
> 32 bits when forwarded to the kernel syscall.
> 
> Meanwhile, your patch is fixing the third argument from 'int' to a wider
> type, which is necessary for passing that value through varargs when the
> receiving end will retrieve the same argument via a void* variable.

I am confident that "unsigned long" will work properly on all
architectures we care about today, but I also don't know if this is
documented and guaranteed to work on all architectures. Passing an
argument of the documented type (void*) sounds like the right thing to
do.
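
A minimal sketch of why the argument type matters, assuming a wrapper
shaped like kvm_vm_ioctl() that reads its optional argument with
va_arg(ap, void *), as noted earlier in the thread. demo_vm_ioctl() and
demo_create_vcpu() are hypothetical stand-ins, not QEMU functions, and
no real ioctl() is issued. The C standard requires the type named in
va_arg to be compatible with the type actually passed (with narrow
exceptions), so passing a plain int where void * is read is undefined
behavior; the explicit cast makes both sides agree.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>

/* Hypothetical stand-in for a wrapper like kvm_vm_ioctl(): it reads its
 * one optional argument as a void * and, instead of calling ioctl(),
 * returns the value it received so the caller can inspect it. */
uintptr_t demo_vm_ioctl(int request, ...)
{
    va_list ap;
    void *arg;

    (void)request;              /* unused in this sketch */
    va_start(ap, request);
    arg = va_arg(ap, void *);   /* the receiver always reads a void * */
    va_end(ap);
    return (uintptr_t)arg;
}

/* Hypothetical call site: the VCPU ID is an integer, so cast it to the
 * documented argument type before it goes through the varargs call. */
uintptr_t demo_create_vcpu(int cpu_index)
{
    assert(cpu_index >= 0);     /* this sketch assumes a non-negative ID */
    return demo_vm_ioctl(0, (void *)(uintptr_t)cpu_index);
}
```

With this shape, demo_create_vcpu(3) hands the wrapper a pointer-sized
value and gets 3 back. Going through uintptr_t first avoids compiler
warnings about casting an integer of a different size to a pointer.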

Patch

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 22acf91..384ee66 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -196,6 +196,9 @@  int kvm_arch_init(KVMState *s);
 
 int kvm_arch_init_vcpu(CPUState *cpu);
 
+/* Returns VCPU ID to be used on KVM_CREATE_VCPU ioctl() */
+unsigned long kvm_arch_vcpu_id(CPUState *cpu);
+
 void kvm_arch_reset_vcpu(CPUState *cpu);
 
 int kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr);
diff --git a/kvm-all.c b/kvm-all.c
index 6278d61..995220d 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -222,7 +222,7 @@  int kvm_init_vcpu(CPUState *cpu)
 
     DPRINTF("kvm_init_vcpu\n");
 
-    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, cpu->cpu_index);
+    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, kvm_arch_vcpu_id(cpu));
     if (ret < 0) {
         DPRINTF("kvm_create_vcpu failed\n");
         goto err;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 3acff40..5f3f789 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -411,6 +411,11 @@  static void cpu_update_state(void *opaque, int running, RunState state)
     }
 }
 
+unsigned long kvm_arch_vcpu_id(CPUState *cpu)
+{
+    return cpu->cpu_index;
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
     struct {
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 19e9f25..1e544ae 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -384,6 +384,11 @@  static inline void kvm_fixup_page_sizes(PowerPCCPU *cpu)
 
 #endif /* !defined (TARGET_PPC64) */
 
+unsigned long kvm_arch_vcpu_id(CPUState *cpu)
+{
+    return cpu->cpu_index;
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
     PowerPCCPU *cpu = POWERPC_CPU(cs);
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index 6ec5e6d..bd9864c 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -72,6 +72,11 @@  int kvm_arch_init(KVMState *s)
     return 0;
 }
 
+unsigned long kvm_arch_vcpu_id(CPUState *cpu)
+{
+    return cpu->cpu_index;
+}
+
 int kvm_arch_init_vcpu(CPUState *cpu)
 {
     int ret = 0;