diff mbox

[1/4] qemu-user: Impl. setsockopt(SO_BINDTODEVICE)

Message ID 1405091884-29955-2-git-send-email-Joakim.Tjernlund@transmode.se
State New
Headers show

Commit Message

Joakim Tjernlund July 11, 2014, 3:18 p.m. UTC
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
---
 linux-user/syscall.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Peter Maydell July 11, 2014, 3:46 p.m. UTC | #1
On 11 July 2014 16:18, Joakim Tjernlund <Joakim.Tjernlund@transmode.se> wrote:
> Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> ---
>  linux-user/syscall.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index 5a272d3..1380f4e 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -1497,6 +1497,18 @@ set_timeout:
>                  unlock_user_struct(tfprog, optval_addr, 1);
>                  return ret;
>          }
> +       case TARGET_SO_BINDTODEVICE:
> +       {
> +               char *dev_ifname;
> +
> +               if (optlen > IFNAMSIZ)
> +                       return -TARGET_EINVAL;

This needs braces for our coding style; you might like
to run your patches through scripts/checkpatch.pl, which
will warn about this kind of thing.

Also, the kernel implementation handles overlong
lengths via
     if (optlen > IFNAMSIZ - 1) {
         optlen = IFNAMSIZ - 1;
     }

which effectively truncates overlong strings rather
than rejecting them. We should do the same (there will
be complication required to ensure we pass the kernel
a NUL-terminated string even if we don't get one from
the guest; may be easiest to use a local array, given
IFNAMSIZ is only 16).

> +
> +               dev_ifname = lock_user(VERIFY_READ, optval_addr, optlen, 1);
> +               ret = get_errno(setsockopt(sockfd, level, optname, dev_ifname, optlen));

This is passing the target optname through to the
host kernel; you need to set 'optname = SO_BINDTODEVICE;'
(look at the other cases in this switch).

> +               unlock_user (dev_ifname, optval_addr, 0);
> +               return ret;
> +       }
>              /* Options with 'int' argument.  */
>          case TARGET_SO_DEBUG:
>                 optname = SO_DEBUG;
> --
> 1.8.5.5

thanks
-- PMM
Joakim Tjernlund July 11, 2014, 4:07 p.m. UTC | #2
Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/11 17:46:28:

> From: Peter Maydell <peter.maydell@linaro.org>
> To: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>, 
> Cc: QEMU Developers <qemu-devel@nongnu.org>
> Date: 2014/07/11 17:46
> Subject: Re: [Qemu-devel] [PATCH 1/4] qemu-user: Impl. 
setsockopt(SO_BINDTODEVICE)
> 
> On 11 July 2014 16:18, Joakim Tjernlund <Joakim.Tjernlund@transmode.se> 
wrote:
> > Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> > ---
> >  linux-user/syscall.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> > index 5a272d3..1380f4e 100644
> > --- a/linux-user/syscall.c
> > +++ b/linux-user/syscall.c
> > @@ -1497,6 +1497,18 @@ set_timeout:
> >                  unlock_user_struct(tfprog, optval_addr, 1);
> >                  return ret;
> >          }
> > +       case TARGET_SO_BINDTODEVICE:
> > +       {
> > +               char *dev_ifname;
> > +
> > +               if (optlen > IFNAMSIZ)
> > +                       return -TARGET_EINVAL;
> 
> This needs braces for our coding style; you might like
> to run your patches through scripts/checkpatch.pl, which
> will warn about this kind of thing.

Ahh, I figured you had kernel style.

> 
> Also, the kernel implementation handles overlong
> lengths via
>      if (optlen > IFNAMSIZ - 1) {
>          optlen = IFNAMSIZ - 1;
>      }
> 
> which effectively truncates overlong strings rather
> than rejecting them. We should do the same (there will
> be complication required to ensure we pass the kernel
> a NUL-terminated string even if we don't get one from
> the guest; may be easiest to use a local array, given
> IFNAMSIZ is only 16).

hmm, should we not pass through as is to the kernel?
Since we don't copy anything we could just remove this
check and let the kernel decide policy?

> 
> > +
> > +               dev_ifname = lock_user(VERIFY_READ, optval_addr, 
optlen, 1);
> > +               ret = get_errno(setsockopt(sockfd, level, optname, 
dev_ifname, optlen));
> 
> This is passing the target optname through to the
> host kernel; you need to set 'optname = SO_BINDTODEVICE;'
> (look at the other cases in this switch).

Yes, I see. Thanks
Peter Maydell July 11, 2014, 5:02 p.m. UTC | #3
On 11 July 2014 17:07, Joakim Tjernlund <joakim.tjernlund@transmode.se> wrote:
> Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/11 17:46:28:
>> This needs braces for our coding style; you might like
>> to run your patches through scripts/checkpatch.pl, which
>> will warn about this kind of thing.
>
> Ahh, I figured you had kernel style.

No, we're a bit different (and we don't always follow
our own style in existing code, though we try to
maintain it on new changes).

>>
>> Also, the kernel implementation handles overlong
>> lengths via
>>      if (optlen > IFNAMSIZ - 1) {
>>          optlen = IFNAMSIZ - 1;
>>      }
>>
>> which effectively truncates overlong strings rather
>> than rejecting them. We should do the same (there will
>> be complication required to ensure we pass the kernel
>> a NUL-terminated string even if we don't get one from
>> the guest; may be easiest to use a local array, given
>> IFNAMSIZ is only 16).
>
> hmm, should we not pass through as is to the kernel?
> Since we don't copy anything we could just remove this
> check and let the kernel decide policy?

I thought about that, but there's a corner case:
the kernel does the clamping of the optlen before the
copy_from_user(), which means if you have:
 [interface name] [unreadable memory]
and optlen is long enough that optval_addr + optlen
reaches into the unreadable memory, then QEMU will return
EFAULT (whereas the native kernel implementation will
succeed) unless we do the clamping of the optlen ourselves.

Which reminds me that your patch forgot the check
    if (!dev_ifname) {
        return -TARGET_EFAULT;
    }
(lock_user can fail).

thanks
-- PMM
Joakim Tjernlund July 12, 2014, 8:31 a.m. UTC | #4
Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/11 19:02:30:

> From: Peter Maydell <peter.maydell@linaro.org>
> To: Joakim Tjernlund <joakim.tjernlund@transmode.se>, 
> Cc: QEMU Developers <qemu-devel@nongnu.org>
> Date: 2014/07/11 19:02
> Subject: Re: [Qemu-devel] [PATCH 1/4] qemu-user: Impl. 
setsockopt(SO_BINDTODEVICE)
> 
Snip
> 
> >>
> >> Also, the kernel implementation handles overlong
> >> lengths via
> >>      if (optlen > IFNAMSIZ - 1) {
> >>          optlen = IFNAMSIZ - 1;
> >>      }
> >>
> >> which effectively truncates overlong strings rather
> >> than rejecting them. We should do the same (there will
> >> be complication required to ensure we pass the kernel
> >> a NUL-terminated string even if we don't get one from
> >> the guest; may be easiest to use a local array, given
> >> IFNAMSIZ is only 16).
> >
> > hmm, should we not pass through as is to the kernel?
> > Since we don't copy anything we could just remove this
> > check and let the kernel decide policy?
> 
> I thought about that, but there's a corner case:
> the kernel does the clamping of the optlen before the
> copy_from_user(), which means if you have:
>  [interface name] [unreadable memory]
> and optlen is long enough that optval_addr + optlen
> reaches into the unreadable memory, then QEMU will return
> EFAULT (whereas the native kernel implementation will
> succeed) unless we do the clamping of the optlen ourselves.

I can live with that IMHO very minor flaw that I dont think is
going to matter in practice for simplicity and speed of QEMU.
It is your call though, do we go for exact emulation or can we
cut some corners?

> 
> Which reminds me that your patch forgot the check
>     if (!dev_ifname) {
>         return -TARGET_EFAULT;
>     }
> (lock_user can fail).
> 
> thanks
> -- PMM
Peter Maydell July 12, 2014, 9:01 a.m. UTC | #5
On 12 July 2014 09:31, Joakim Tjernlund <joakim.tjernlund@transmode.se> wrote:
> Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/11 19:02:30:
>> I thought about that, but there's a corner case:
>> the kernel does the clamping of the optlen before the
>> copy_from_user(), which means if you have:
>>  [interface name] [unreadable memory]
>> and optlen is long enough that optval_addr + optlen
>> reaches into the unreadable memory, then QEMU will return
>> EFAULT (whereas the native kernel implementation will
>> succeed) unless we do the clamping of the optlen ourselves.
>
> I can live with that IMHO very minor flaw that I dont think is
> going to matter in practice for simplicity and speed of QEMU.
> It is your call though, do we go for exact emulation or can we
> cut some corners?

In this case I would prefer to get it right:
 * it's purely localised to this function
 * it's not all that hard to get right
 * we've already done the hard work of looking at the
   kernel and determining the correct behaviour

SO_BINDTODEVICE is not going to be on any guest's
speed-critical fastpath anyway...

thanks
-- PMM
Joakim Tjernlund July 12, 2014, 9:48 a.m. UTC | #6
Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/12 11:01:55:
> 
> On 12 July 2014 09:31, Joakim Tjernlund <joakim.tjernlund@transmode.se> 
wrote:
> > Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/11 19:02:30:
> >> I thought about that, but there's a corner case:
> >> the kernel does the clamping of the optlen before the
> >> copy_from_user(), which means if you have:
> >>  [interface name] [unreadable memory]
> >> and optlen is long enough that optval_addr + optlen
> >> reaches into the unreadable memory, then QEMU will return
> >> EFAULT (whereas the native kernel implementation will
> >> succeed) unless we do the clamping of the optlen ourselves.
> >
> > I can live with that IMHO very minor flaw that I dont think is
> > going to matter in practice for simplicity and speed of QEMU.
> > It is your call though, do we go for exact emulation or can we
> > cut some corners?
> 
> In this case I would prefer to get it right:
>  * it's purely localised to this function
>  * it's not all that hard to get right
>  * we've already done the hard work of looking at the
>    kernel and determining the correct behaviour
> 
> SO_BINDTODEVICE is not going to be on any guest's
> speed-critical fastpath anyway...

OK, 2 new patches sent

I hope these will all make it to the impending release?
Peter Maydell July 12, 2014, 10:42 a.m. UTC | #7
On 12 July 2014 10:48, Joakim Tjernlund <joakim.tjernlund@transmode.se> wrote:
> Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/12 11:01:55:
> OK, 2 new patches sent

Thanks.

> I hope these will all make it to the impending release?

I'm not sure at this point; they are bug fixes, and reasonably
small ones, but we're in freeze and only a few weeks from
release, and these aren't regressions, so you could argue
either way. Ultimately that's up to the linux-user submaintainer.

thanks
-- PMM
Joakim Tjernlund July 12, 2014, 1:52 p.m. UTC | #8
Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/12 12:42:48:
> 
> On 12 July 2014 10:48, Joakim Tjernlund <joakim.tjernlund@transmode.se> 
wrote:
> > Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/12 11:01:55:
> > OK, 2 new patches sent
> 
> Thanks.
> 
> > I hope these will all make it to the impending release?
> 
> I'm not sure at this point; they are bug fixes, and reasonably
> small ones, but we're in freeze and only a few weeks from
> release, and these aren't regressions, so you could argue
> either way. Ultimately that's up to the linux-user submaintainer.

Fingers crossed the :) I would be great to have DHCP functioning
though.

I just sent v3 of the 2 patches addressing your comments.

Could you look at too?
http://lists.nongnu.org/archive/html/qemu-devel/2014-07/msg01925.html

That one really simple and also needed for dhcp 

 Jocke
Joakim Tjernlund July 12, 2014, 3:13 p.m. UTC | #9
Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/11 19:02:30:
> >
> > hmm, should we not pass through as is to the kernel?
> > Since we don't copy anything we could just remove this
> > check and let the kernel decide policy?
> 
> I thought about that, but there's a corner case:
> the kernel does the clamping of the optlen before the
> copy_from_user(), which means if you have:
>  [interface name] [unreadable memory]
> and optlen is long enough that optval_addr + optlen
> reaches into the unreadable memory, then QEMU will return
> EFAULT (whereas the native kernel implementation will
> succeed) unless we do the clamping of the optlen ourselves.

hmm, this kernel code: 
        if (optlen > IFNAMSIZ - 1)
                optlen = IFNAMSIZ - 1;
        memset(devname, 0, sizeof(devname));

        ret = -EFAULT;
        if (copy_from_user(devname, optval, optlen))
                goto out;

suggests to me that the qemu code could be written:

        case TARGET_SO_BINDTODEVICE:
        {
                char *dev_ifname;

                if (optlen > IFNAMSIZ - 1) {
                    optlen = IFNAMSIZ - 1;
                }
                dev_ifname = lock_user(VERIFY_READ, optval_addr, optlen, 
1);
                if (!dev_ifname) {
                    return -TARGET_EFAULT;
                }
                optname = SO_BINDTODEVICE;
                ret = get_errno(setsockopt(sockfd, level, optname, 
dev_ifname, optlen));
                unlock_user (dev_ifname, optval_addr, 0);
                return ret;
        }

Not a big deal, I just wanted to improve my understanding of the 
kernel/qemu code.

 Jocke
Peter Maydell July 12, 2014, 3:47 p.m. UTC | #10
On 12 July 2014 16:13, Joakim Tjernlund <joakim.tjernlund@transmode.se> wrote:
> Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/11 19:02:30:
>> >
>> > hmm, should we not pass through as is to the kernel?
>> > Since we don't copy anything we could just remove this
>> > check and let the kernel decide policy?
>>
>> I thought about that, but there's a corner case:
>> the kernel does the clamping of the optlen before the
>> copy_from_user(), which means if you have:
>>  [interface name] [unreadable memory]
>> and optlen is long enough that optval_addr + optlen
>> reaches into the unreadable memory, then QEMU will return
>> EFAULT (whereas the native kernel implementation will
>> succeed) unless we do the clamping of the optlen ourselves.
>
> hmm, this kernel code:
>         if (optlen > IFNAMSIZ - 1)
>                 optlen = IFNAMSIZ - 1;
>         memset(devname, 0, sizeof(devname));
>
>         ret = -EFAULT;
>         if (copy_from_user(devname, optval, optlen))
>                 goto out;
>
> suggests to me that the qemu code could be written:
>
>         case TARGET_SO_BINDTODEVICE:
>         {
>                 char *dev_ifname;
>
>                 if (optlen > IFNAMSIZ - 1) {
>                     optlen = IFNAMSIZ - 1;
>                 }
>                 dev_ifname = lock_user(VERIFY_READ, optval_addr, optlen,
> 1);
>                 if (!dev_ifname) {
>                     return -TARGET_EFAULT;
>                 }
>                 optname = SO_BINDTODEVICE;
>                 ret = get_errno(setsockopt(sockfd, level, optname,
> dev_ifname, optlen));
>                 unlock_user (dev_ifname, optval_addr, 0);
>                 return ret;
>         }
>
> Not a big deal, I just wanted to improve my understanding of the
> kernel/qemu code.

That would work with the current kernel implementation, but
the API definition (as described in the socket(7) manpage)
says we should pass a NUL-terminated string to the kernel,
so it's more robust for us to make sure we do so.

thanks
-- PMM
Joakim Tjernlund July 12, 2014, 5:30 p.m. UTC | #11
Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/12 17:47:56:

> From: Peter Maydell <peter.maydell@linaro.org>
> To: Joakim Tjernlund <joakim.tjernlund@transmode.se>, 
> Cc: QEMU Developers <qemu-devel@nongnu.org>
> Date: 2014/07/12 17:48
> Subject: Re: [Qemu-devel] [PATCH 1/4] qemu-user: Impl. 
setsockopt(SO_BINDTODEVICE)
> 
> On 12 July 2014 16:13, Joakim Tjernlund <joakim.tjernlund@transmode.se> 
wrote:
> > Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/11 19:02:30:
> >> >
> >> > hmm, should we not pass through as is to the kernel?
> >> > Since we don't copy anything we could just remove this
> >> > check and let the kernel decide policy?
> >>
> >> I thought about that, but there's a corner case:
> >> the kernel does the clamping of the optlen before the
> >> copy_from_user(), which means if you have:
> >>  [interface name] [unreadable memory]
> >> and optlen is long enough that optval_addr + optlen
> >> reaches into the unreadable memory, then QEMU will return
> >> EFAULT (whereas the native kernel implementation will
> >> succeed) unless we do the clamping of the optlen ourselves.
> >
> > hmm, this kernel code:
> >         if (optlen > IFNAMSIZ - 1)
> >                 optlen = IFNAMSIZ - 1;
> >         memset(devname, 0, sizeof(devname));
> >
> >         ret = -EFAULT;
> >         if (copy_from_user(devname, optval, optlen))
> >                 goto out;
> >
> > suggests to me that the qemu code could be written:
> >
> >         case TARGET_SO_BINDTODEVICE:
> >         {
> >                 char *dev_ifname;
> >
> >                 if (optlen > IFNAMSIZ - 1) {
> >                     optlen = IFNAMSIZ - 1;
> >                 }
> >                 dev_ifname = lock_user(VERIFY_READ, optval_addr, 
optlen,
> > 1);
> >                 if (!dev_ifname) {
> >                     return -TARGET_EFAULT;
> >                 }
> >                 optname = SO_BINDTODEVICE;
> >                 ret = get_errno(setsockopt(sockfd, level, optname,
> > dev_ifname, optlen));
> >                 unlock_user (dev_ifname, optval_addr, 0);
> >                 return ret;
> >         }
> >
> > Not a big deal, I just wanted to improve my understanding of the
> > kernel/qemu code.
> 
> That would work with the current kernel implementation, but
> the API definition (as described in the socket(7) manpage)
> says we should pass a NUL-terminated string to the kernel,
> so it's more robust for us to make sure we do so.

But we emulate user space and we should not "improve" stuff on the way 
should we?
If we do then one could end up with progs that work with qemu but not with
the real kernel.

 Jocke
Peter Maydell July 12, 2014, 5:38 p.m. UTC | #12
On 12 July 2014 18:30, Joakim Tjernlund <joakim.tjernlund@transmode.se> wrote:
> Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/12 17:47:56:
>> That would work with the current kernel implementation, but
>> the API definition (as described in the socket(7) manpage)
>> says we should pass a NUL-terminated string to the kernel,
>> so it's more robust for us to make sure we do so.
>
> But we emulate user space and we should not "improve" stuff on the way
> should we?

No, but in this case we don't improve anything. The unterminated
string from the guest is still truncated as a native kernel would do;
it's just that our implementation of that behaviour doesn't rely
on the specific behaviour of the host kernel in this undocumented
corner case. It's not a big deal either way, but given that the code
is pretty similar either way I prefer the approach I suggest.

thanks
-- PMM
Joakim Tjernlund July 12, 2014, 6:11 p.m. UTC | #13
Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/12 19:38:26:
> 
> On 12 July 2014 18:30, Joakim Tjernlund <joakim.tjernlund@transmode.se> 
wrote:
> > Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/12 17:47:56:
> >> That would work with the current kernel implementation, but
> >> the API definition (as described in the socket(7) manpage)
> >> says we should pass a NUL-terminated string to the kernel,
> >> so it's more robust for us to make sure we do so.
> >
> > But we emulate user space and we should not "improve" stuff on the way
> > should we?
> 
> No, but in this case we don't improve anything. The unterminated
> string from the guest is still truncated as a native kernel would do;
> it's just that our implementation of that behaviour doesn't rely
> on the specific behaviour of the host kernel in this undocumented
> corner case. It's not a big deal either way, but given that the code

Right, thank you for your patience. Now I got a little better 
understanding.

 Jocke
diff mbox

Patch

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 5a272d3..1380f4e 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -1497,6 +1497,18 @@  set_timeout:
                 unlock_user_struct(tfprog, optval_addr, 1);
                 return ret;
         }
+	case TARGET_SO_BINDTODEVICE:
+	{
+		char *dev_ifname;
+
+		if (optlen > IFNAMSIZ)
+			return -TARGET_EINVAL;
+
+		dev_ifname = lock_user(VERIFY_READ, optval_addr, optlen, 1);
+		ret = get_errno(setsockopt(sockfd, level, optname, dev_ifname, optlen));
+		unlock_user (dev_ifname, optval_addr, 0);
+		return ret;
+	}
             /* Options with 'int' argument.  */
         case TARGET_SO_DEBUG:
 		optname = SO_DEBUG;