diff mbox

linux-user: Add binfmt wrapper

Message ID 1405348708-13909-1-git-send-email-Joakim.Tjernlund@transmode.se
State New
Headers show

Commit Message

Joakim Tjernlund July 14, 2014, 2:38 p.m. UTC
The popular binfmt-wrapper patch adds an additional
executable which mangle argv suitable for binfmt flag P.
In a chroot you need the both (statically linked) qemu-$arch
and qemu-$arch-binfmt-wrapper. This is sub optimal and a
better approach is to recognize the -binfmt-wrapper extension
within linux-user(qemu-$arch) and mangle argv there.
This just produces on executable which can be either copied to
the chroot or bind mounted with the appropriate -binfmt-wrapper
suffix.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
---
 linux-user/main.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Comments

Alexander Graf July 14, 2014, 3:21 p.m. UTC | #1
On 14.07.14 16:38, Joakim Tjernlund wrote:
> The popular binfmt-wrapper patch adds an additional
> executable which mangle argv suitable for binfmt flag P.
> In a chroot you need the both (statically linked) qemu-$arch
> and qemu-$arch-binfmt-wrapper. This is sub optimal and a
> better approach is to recognize the -binfmt-wrapper extension
> within linux-user(qemu-$arch) and mangle argv there.
> This just produces on executable which can be either copied to
> the chroot or bind mounted with the appropriate -binfmt-wrapper
> suffix.
>
> Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>

Please make sure to CC Riku on patches like this - he is the linux-user 
maintainer.

> ---
>   linux-user/main.c | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
>
> diff --git a/linux-user/main.c b/linux-user/main.c
> index 71a33c7..212067a 100644
> --- a/linux-user/main.c
> +++ b/linux-user/main.c
> @@ -3828,6 +3828,19 @@ int main(int argc, char **argv, char **envp)
>       int i;
>       int ret;
>       int execfd;
> +    char *binfmt;
> +
> +    i = strlen( argv[0] ) - strlen ( "-binfmt-wrapper" );

The spaces are odd. Did this patch pass checkpatch.pl? Same comment goes 
for almost all function invocations.

> +    binfmt = argv[0] + i;
> +    if (i > 0 && strcmp ( binfmt, "-binfmt-wrapper" ) == 0) {

This magic needs to be documented somewhere. In fact, I find it pretty 
hard to use in real world scenarios. Imagine a distribution - should it 
package every target binary twice? Should it create hardlinks all over?

I think we should try and find better magic :). Looking at the 
binfmt_misc loading code, I think we can cheat a bit. If we pass the 'O' 
flag (open target binary for handler), binfmt_misc will tell us the 
binary fd in AT_EXECFD:

                 NEW_AUX_ENT(AT_EXECFD, bprm->interp_data);

We could then use this as a hint that we were spawned by binfmt_misc 
rather than directly and interpret the first argv as target_argv[0].

Then we can also add the P and O flags to scripts/qemu-binfmt-conf.sh 
and have a solution that works well for everyone.

> +	if (argc < 3 ) {
> +	    fprintf ( stderr, "%s: Please use me through binfmt with P flag\n", argv[0] );
> +	    exit(1);
> +	}
> +	handle_arg_argv0(argv[2]); /* binfmt wrapper */
> +	memmove(&argv[2], &argv[3], (argc-2)*sizeof(argv));

I can't say I'm a big fan of this memmove, but everything else I can 
think of is going to be even uglier.


Alex

> +	argc--;
> +    }
>   
>       module_call_init(MODULE_INIT_QOM);
>
Joakim Tjernlund July 14, 2014, 3:38 p.m. UTC | #2
Alexander Graf <agraf@suse.de> wrote on 2014/07/14 17:21:33:

> From: Alexander Graf <agraf@suse.de>
> To: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>, 
> Cc: qemu-devel@nongnu.org
> Date: 2014/07/14 17:21
> Subject: Re: [PATCH] linux-user: Add binfmt wrapper
> 
> 
> On 14.07.14 16:38, Joakim Tjernlund wrote:
> > The popular binfmt-wrapper patch adds an additional
> > executable which mangle argv suitable for binfmt flag P.
> > In a chroot you need the both (statically linked) qemu-$arch
> > and qemu-$arch-binfmt-wrapper. This is sub optimal and a
> > better approach is to recognize the -binfmt-wrapper extension
> > within linux-user(qemu-$arch) and mangle argv there.
> > This just produces on executable which can be either copied to
> > the chroot or bind mounted with the appropriate -binfmt-wrapper
> > suffix.
> >
> > Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> 
> Please make sure to CC Riku on patches like this - he is the linux-user 
> maintainer.

Doesn't he read the devel list? Anyhow CC:ed

> 
> > ---
> >   linux-user/main.c | 13 +++++++++++++
> >   1 file changed, 13 insertions(+)
> >
> > diff --git a/linux-user/main.c b/linux-user/main.c
> > index 71a33c7..212067a 100644
> > --- a/linux-user/main.c
> > +++ b/linux-user/main.c
> > @@ -3828,6 +3828,19 @@ int main(int argc, char **argv, char **envp)
> >       int i;
> >       int ret;
> >       int execfd;
> > +    char *binfmt;
> > +
> > +    i = strlen( argv[0] ) - strlen ( "-binfmt-wrapper" );
> 
> The spaces are odd. Did this patch pass checkpatch.pl? Same comment goes 

> for almost all function invocations.

ehh, didn't run it through checkpatch.pl. Easy to fix next time.

> 
> > +    binfmt = argv[0] + i;
> > +    if (i > 0 && strcmp ( binfmt, "-binfmt-wrapper" ) == 0) {
> 
> This magic needs to be documented somewhere. In fact, I find it pretty 
> hard to use in real world scenarios. Imagine a distribution - should it 
> package every target binary twice? Should it create hardlinks all over?

How does dists. handle your original binfmt-wrapper? This is not much
different I think. Here you got a choice to create a hardlink or a copy.
Any chroot will only have to bind mount binfmt-wrapper into the chroot or 
lxc container.

> 
> I think we should try and find better magic :). Looking at the 
> binfmt_misc loading code, I think we can cheat a bit. If we pass the 'O' 

> flag (open target binary for handler), binfmt_misc will tell us the 
> binary fd in AT_EXECFD:
> 
>                  NEW_AUX_ENT(AT_EXECFD, bprm->interp_data);
> 
> We could then use this as a hint that we were spawned by binfmt_misc 
> rather than directly and interpret the first argv as target_argv[0].
> 
> Then we can also add the P and O flags to scripts/qemu-binfmt-conf.sh 
> and have a solution that works well for everyone.

What to do with P only then? Seems like most dists uses only P 

> 
> > +   if (argc < 3 ) {
> > +       fprintf ( stderr, "%s: Please use me through binfmt with P 
flag\n", argv[0] );
> > +       exit(1);
> > +   }
> > +   handle_arg_argv0(argv[2]); /* binfmt wrapper */
> > +   memmove(&argv[2], &argv[3], (argc-2)*sizeof(argv));
> 
> I can't say I'm a big fan of this memmove, but everything else I can 
> think of is going to be even uglier.

Me too :)

> > +   argc--;
> > +    }
> > 
> >       module_call_init(MODULE_INIT_QOM);
> > 
>
Alexander Graf July 14, 2014, 3:46 p.m. UTC | #3
On 14.07.14 17:38, Joakim Tjernlund wrote:
> Alexander Graf <agraf@suse.de> wrote on 2014/07/14 17:21:33:
>
>> From: Alexander Graf <agraf@suse.de>
>> To: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>,
>> Cc: qemu-devel@nongnu.org
>> Date: 2014/07/14 17:21
>> Subject: Re: [PATCH] linux-user: Add binfmt wrapper
>>
>>
>> On 14.07.14 16:38, Joakim Tjernlund wrote:
>>> The popular binfmt-wrapper patch adds an additional
>>> executable which mangle argv suitable for binfmt flag P.
>>> In a chroot you need the both (statically linked) qemu-$arch
>>> and qemu-$arch-binfmt-wrapper. This is sub optimal and a
>>> better approach is to recognize the -binfmt-wrapper extension
>>> within linux-user(qemu-$arch) and mangle argv there.
>>> This just produces on executable which can be either copied to
>>> the chroot or bind mounted with the appropriate -binfmt-wrapper
>>> suffix.
>>>
>>> Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
>> Please make sure to CC Riku on patches like this - he is the linux-user
>> maintainer.
> Doesn't he read the devel list? Anyhow CC:ed

He may or may not. Qemu-devel can be pretty high volume :).

>
>>> ---
>>>    linux-user/main.c | 13 +++++++++++++
>>>    1 file changed, 13 insertions(+)
>>>
>>> diff --git a/linux-user/main.c b/linux-user/main.c
>>> index 71a33c7..212067a 100644
>>> --- a/linux-user/main.c
>>> +++ b/linux-user/main.c
>>> @@ -3828,6 +3828,19 @@ int main(int argc, char **argv, char **envp)
>>>        int i;
>>>        int ret;
>>>        int execfd;
>>> +    char *binfmt;
>>> +
>>> +    i = strlen( argv[0] ) - strlen ( "-binfmt-wrapper" );
>> The spaces are odd. Did this patch pass checkpatch.pl? Same comment goes
>> for almost all function invocations.
> ehh, didn't run it through checkpatch.pl. Easy to fix next time.
>
>>> +    binfmt = argv[0] + i;
>>> +    if (i > 0 && strcmp ( binfmt, "-binfmt-wrapper" ) == 0) {
>> This magic needs to be documented somewhere. In fact, I find it pretty
>> hard to use in real world scenarios. Imagine a distribution - should it
>> package every target binary twice? Should it create hardlinks all over?
> How does dists. handle your original binfmt-wrapper? This is not much
> different I think. Here you got a choice to create a hardlink or a copy.
> Any chroot will only have to bind mount binfmt-wrapper into the chroot or
> lxc container.

Yeah, and there are reasons my original approach isn't upstream :).

>
>> I think we should try and find better magic :). Looking at the
>> binfmt_misc loading code, I think we can cheat a bit. If we pass the 'O'
>> flag (open target binary for handler), binfmt_misc will tell us the
>> binary fd in AT_EXECFD:
>>
>>                   NEW_AUX_ENT(AT_EXECFD, bprm->interp_data);
>>
>> We could then use this as a hint that we were spawned by binfmt_misc
>> rather than directly and interpret the first argv as target_argv[0].
>>
>> Then we can also add the P and O flags to scripts/qemu-binfmt-conf.sh
>> and have a solution that works well for everyone.
> What to do with P only then? Seems like most dists uses only P

If a distro uses the P flag it's not using upstream code, so they have 
to deal with their own breakage :). Fortunately the binfmt install 
scripts are usually part of a package too, so they can be updated easily.

If a distro cares a lot about backwards compatibility with their old 
name space, they can still compile the old -binfmt wrapper code and ship it.


Alex
Joakim Tjernlund July 14, 2014, 3:59 p.m. UTC | #4
Alexander Graf <agraf@suse.de> wrote on 2014/07/14 17:46:18:
> 
> 
> On 14.07.14 17:38, Joakim Tjernlund wrote:
> > Alexander Graf <agraf@suse.de> wrote on 2014/07/14 17:21:33:
> >
> >> From: Alexander Graf <agraf@suse.de>
> >> To: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>,
> >> Cc: qemu-devel@nongnu.org
> >> Date: 2014/07/14 17:21
> >> Subject: Re: [PATCH] linux-user: Add binfmt wrapper
> >>
> >>
> >> On 14.07.14 16:38, Joakim Tjernlund wrote:
> >>> The popular binfmt-wrapper patch adds an additional
> >>> executable which mangle argv suitable for binfmt flag P.
> >>> In a chroot you need the both (statically linked) qemu-$arch
> >>> and qemu-$arch-binfmt-wrapper. This is sub optimal and a
> >>> better approach is to recognize the -binfmt-wrapper extension
> >>> within linux-user(qemu-$arch) and mangle argv there.
> >>> This just produces on executable which can be either copied to
> >>> the chroot or bind mounted with the appropriate -binfmt-wrapper
> >>> suffix.
> >>>
> >>> Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> >> Please make sure to CC Riku on patches like this - he is the 
linux-user
> >> maintainer.
> > Doesn't he read the devel list? Anyhow CC:ed
> 
> He may or may not. Qemu-devel can be pretty high volume :).
> 
> >
> >>> ---
> >>>    linux-user/main.c | 13 +++++++++++++
> >>>    1 file changed, 13 insertions(+)
> >>>
> >>> diff --git a/linux-user/main.c b/linux-user/main.c
> >>> index 71a33c7..212067a 100644
> >>> --- a/linux-user/main.c
> >>> +++ b/linux-user/main.c
> >>> @@ -3828,6 +3828,19 @@ int main(int argc, char **argv, char **envp)
> >>>        int i;
> >>>        int ret;
> >>>        int execfd;
> >>> +    char *binfmt;
> >>> +
> >>> +    i = strlen( argv[0] ) - strlen ( "-binfmt-wrapper" );
> >> The spaces are odd. Did this patch pass checkpatch.pl? Same comment 
goes
> >> for almost all function invocations.
> > ehh, didn't run it through checkpatch.pl. Easy to fix next time.
> >
> >>> +    binfmt = argv[0] + i;
> >>> +    if (i > 0 && strcmp ( binfmt, "-binfmt-wrapper" ) == 0) {
> >> This magic needs to be documented somewhere. In fact, I find it 
pretty
> >> hard to use in real world scenarios. Imagine a distribution - should 
it
> >> package every target binary twice? Should it create hardlinks all 
over?
> > How does dists. handle your original binfmt-wrapper? This is not much
> > different I think. Here you got a choice to create a hardlink or a 
copy.
> > Any chroot will only have to bind mount binfmt-wrapper into the chroot 
or
> > lxc container.
> 
> Yeah, and there are reasons my original approach isn't upstream :).

What are those then? Hardly just packaging problem/choise.

> 
> >
> >> I think we should try and find better magic :). Looking at the
> >> binfmt_misc loading code, I think we can cheat a bit. If we pass the 
'O'
> >> flag (open target binary for handler), binfmt_misc will tell us the
> >> binary fd in AT_EXECFD:
> >>
> >>                   NEW_AUX_ENT(AT_EXECFD, bprm->interp_data);
> >>
> >> We could then use this as a hint that we were spawned by binfmt_misc
> >> rather than directly and interpret the first argv as target_argv[0].
> >>
> >> Then we can also add the P and O flags to scripts/qemu-binfmt-conf.sh
> >> and have a solution that works well for everyone.
> > What to do with P only then? Seems like most dists uses only P
> 
> If a distro uses the P flag it's not using upstream code, so they have 
> to deal with their own breakage :). Fortunately the binfmt install 
> scripts are usually part of a package too, so they can be updated 
easily.

scripts/qemu-binfmt-conf.sh does not use any flag currently, I don't think
that works either with current linux-user and choot/lxc

You think everyone feel OK with new defaults like OP ?

> 
> If a distro cares a lot about backwards compatibility with their old 
> name space, they can still compile the old -binfmt wrapper code and ship 
it.
Alexander Graf July 14, 2014, 4 p.m. UTC | #5
On 14.07.14 17:59, Joakim Tjernlund wrote:
> Alexander Graf <agraf@suse.de> wrote on 2014/07/14 17:46:18:
>>
>> On 14.07.14 17:38, Joakim Tjernlund wrote:
>>> Alexander Graf <agraf@suse.de> wrote on 2014/07/14 17:21:33:
>>>
>>>> From: Alexander Graf <agraf@suse.de>
>>>> To: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>,
>>>> Cc: qemu-devel@nongnu.org
>>>> Date: 2014/07/14 17:21
>>>> Subject: Re: [PATCH] linux-user: Add binfmt wrapper
>>>>
>>>>
>>>> On 14.07.14 16:38, Joakim Tjernlund wrote:
>>>>> The popular binfmt-wrapper patch adds an additional
>>>>> executable which mangle argv suitable for binfmt flag P.
>>>>> In a chroot you need the both (statically linked) qemu-$arch
>>>>> and qemu-$arch-binfmt-wrapper. This is sub optimal and a
>>>>> better approach is to recognize the -binfmt-wrapper extension
>>>>> within linux-user(qemu-$arch) and mangle argv there.
>>>>> This just produces on executable which can be either copied to
>>>>> the chroot or bind mounted with the appropriate -binfmt-wrapper
>>>>> suffix.
>>>>>
>>>>> Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
>>>> Please make sure to CC Riku on patches like this - he is the
> linux-user
>>>> maintainer.
>>> Doesn't he read the devel list? Anyhow CC:ed
>> He may or may not. Qemu-devel can be pretty high volume :).
>>
>>>>> ---
>>>>>     linux-user/main.c | 13 +++++++++++++
>>>>>     1 file changed, 13 insertions(+)
>>>>>
>>>>> diff --git a/linux-user/main.c b/linux-user/main.c
>>>>> index 71a33c7..212067a 100644
>>>>> --- a/linux-user/main.c
>>>>> +++ b/linux-user/main.c
>>>>> @@ -3828,6 +3828,19 @@ int main(int argc, char **argv, char **envp)
>>>>>         int i;
>>>>>         int ret;
>>>>>         int execfd;
>>>>> +    char *binfmt;
>>>>> +
>>>>> +    i = strlen( argv[0] ) - strlen ( "-binfmt-wrapper" );
>>>> The spaces are odd. Did this patch pass checkpatch.pl? Same comment
> goes
>>>> for almost all function invocations.
>>> ehh, didn't run it through checkpatch.pl. Easy to fix next time.
>>>
>>>>> +    binfmt = argv[0] + i;
>>>>> +    if (i > 0 && strcmp ( binfmt, "-binfmt-wrapper" ) == 0) {
>>>> This magic needs to be documented somewhere. In fact, I find it
> pretty
>>>> hard to use in real world scenarios. Imagine a distribution - should
> it
>>>> package every target binary twice? Should it create hardlinks all
> over?
>>> How does dists. handle your original binfmt-wrapper? This is not much
>>> different I think. Here you got a choice to create a hardlink or a
> copy.
>>> Any chroot will only have to bind mount binfmt-wrapper into the chroot
> or
>>> lxc container.
>> Yeah, and there are reasons my original approach isn't upstream :).
> What are those then? Hardly just packaging problem/choise.
>
>>>> I think we should try and find better magic :). Looking at the
>>>> binfmt_misc loading code, I think we can cheat a bit. If we pass the
> 'O'
>>>> flag (open target binary for handler), binfmt_misc will tell us the
>>>> binary fd in AT_EXECFD:
>>>>
>>>>                    NEW_AUX_ENT(AT_EXECFD, bprm->interp_data);
>>>>
>>>> We could then use this as a hint that we were spawned by binfmt_misc
>>>> rather than directly and interpret the first argv as target_argv[0].
>>>>
>>>> Then we can also add the P and O flags to scripts/qemu-binfmt-conf.sh
>>>> and have a solution that works well for everyone.
>>> What to do with P only then? Seems like most dists uses only P
>> If a distro uses the P flag it's not using upstream code, so they have
>> to deal with their own breakage :). Fortunately the binfmt install
>> scripts are usually part of a package too, so they can be updated
> easily.
>
> scripts/qemu-binfmt-conf.sh does not use any flag currently, I don't think
> that works either with current linux-user and choot/lxc
>
> You think everyone feel OK with new defaults like OP ?

Yes.


Alex
Peter Maydell July 14, 2014, 4 p.m. UTC | #6
On 14 July 2014 16:59, Joakim Tjernlund <joakim.tjernlund@transmode.se> wrote:
> scripts/qemu-binfmt-conf.sh does not use any flag currently, I don't think
> that works either with current linux-user and choot/lxc

That script is pretty awful and no sane distro is going to use
it anyhow...

-- PMM
Joakim Tjernlund July 14, 2014, 4:08 p.m. UTC | #7
Peter Maydell <peter.maydell@linaro.org> wrote on 2014/07/14 18:00:38:
> 
> On 14 July 2014 16:59, Joakim Tjernlund <joakim.tjernlund@transmode.se> 
wrote:
> > scripts/qemu-binfmt-conf.sh does not use any flag currently, I don't 
think
> > that works either with current linux-user and choot/lxc
> 
> That script is pretty awful and no sane distro is going to use
> it anyhow...

Perhaps, but that script should serve as a template and have all flags 
correct
so it will work with current linux-user.
Joakim Tjernlund July 14, 2014, 4:32 p.m. UTC | #8
Alexander Graf <agraf@suse.de> wrote on 2014/07/14 18:00:35:
> 
> 
> On 14.07.14 17:59, Joakim Tjernlund wrote:
> > Alexander Graf <agraf@suse.de> wrote on 2014/07/14 17:46:18:
> >>
> >> On 14.07.14 17:38, Joakim Tjernlund wrote:
> >>> Alexander Graf <agraf@suse.de> wrote on 2014/07/14 17:21:33:
> >>>
> >>>> From: Alexander Graf <agraf@suse.de>
> >>>> To: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>,
> >>>> Cc: qemu-devel@nongnu.org
> >>>> Date: 2014/07/14 17:21
> >>>> Subject: Re: [PATCH] linux-user: Add binfmt wrapper
> >>>>
> >>>>
> >>>> On 14.07.14 16:38, Joakim Tjernlund wrote:
> >>>>> The popular binfmt-wrapper patch adds an additional
> >>>>> executable which mangle argv suitable for binfmt flag P.
> >>>>> In a chroot you need the both (statically linked) qemu-$arch
> >>>>> and qemu-$arch-binfmt-wrapper. This is sub optimal and a
> >>>>> better approach is to recognize the -binfmt-wrapper extension
> >>>>> within linux-user(qemu-$arch) and mangle argv there.
> >>>>> This just produces on executable which can be either copied to
> >>>>> the chroot or bind mounted with the appropriate -binfmt-wrapper
> >>>>> suffix.
> >>>>>
> >>>>> Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> >>>> Please make sure to CC Riku on patches like this - he is the
> > linux-user
> >>>> maintainer.
> >>> Doesn't he read the devel list? Anyhow CC:ed
> >> He may or may not. Qemu-devel can be pretty high volume :).
> >>
> >>>>> ---
> >>>>>     linux-user/main.c | 13 +++++++++++++
> >>>>>     1 file changed, 13 insertions(+)
> >>>>>
> >>>>> diff --git a/linux-user/main.c b/linux-user/main.c
> >>>>> index 71a33c7..212067a 100644
> >>>>> --- a/linux-user/main.c
> >>>>> +++ b/linux-user/main.c
> >>>>> @@ -3828,6 +3828,19 @@ int main(int argc, char **argv, char 
**envp)
> >>>>>         int i;
> >>>>>         int ret;
> >>>>>         int execfd;
> >>>>> +    char *binfmt;
> >>>>> +
> >>>>> +    i = strlen( argv[0] ) - strlen ( "-binfmt-wrapper" );
> >>>> The spaces are odd. Did this patch pass checkpatch.pl? Same comment
> > goes
> >>>> for almost all function invocations.
> >>> ehh, didn't run it through checkpatch.pl. Easy to fix next time.
> >>>
> >>>>> +    binfmt = argv[0] + i;
> >>>>> +    if (i > 0 && strcmp ( binfmt, "-binfmt-wrapper" ) == 0) {
> >>>> This magic needs to be documented somewhere. In fact, I find it
> > pretty
> >>>> hard to use in real world scenarios. Imagine a distribution - 
should
> > it
> >>>> package every target binary twice? Should it create hardlinks all
> > over?
> >>> How does dists. handle your original binfmt-wrapper? This is not 
much
> >>> different I think. Here you got a choice to create a hardlink or a
> > copy.
> >>> Any chroot will only have to bind mount binfmt-wrapper into the 
chroot
> > or
> >>> lxc container.
> >> Yeah, and there are reasons my original approach isn't upstream :).
> > What are those then? Hardly just packaging problem/choise.
> >
> >>>> I think we should try and find better magic :). Looking at the
> >>>> binfmt_misc loading code, I think we can cheat a bit. If we pass 
the
> > 'O'
> >>>> flag (open target binary for handler), binfmt_misc will tell us the
> >>>> binary fd in AT_EXECFD:
> >>>>
> >>>>                    NEW_AUX_ENT(AT_EXECFD, bprm->interp_data);
> >>>>
> >>>> We could then use this as a hint that we were spawned by 
binfmt_misc
> >>>> rather than directly and interpret the first argv as 
target_argv[0].
> >>>>
> >>>> Then we can also add the P and O flags to 
scripts/qemu-binfmt-conf.sh
> >>>> and have a solution that works well for everyone.
> >>> What to do with P only then? Seems like most dists uses only P
> >> If a distro uses the P flag it's not using upstream code, so they 
have
> >> to deal with their own breakage :). Fortunately the binfmt install
> >> scripts are usually part of a package too, so they can be updated
> > easily.
> >
> > scripts/qemu-binfmt-conf.sh does not use any flag currently, I don't 
think
> > that works either with current linux-user and choot/lxc
> >
> > You think everyone feel OK with new defaults like OP ?
> 
> Yes.

hmm, with current qemu it works to boot a LXC with just O flag.
Why would we then want to complicate things by adding OP which
then requires some version of my patch?

 Jocke
Alexander Graf July 14, 2014, 4:34 p.m. UTC | #9
On 14.07.14 18:32, Joakim Tjernlund wrote:
> Alexander Graf <agraf@suse.de> wrote on 2014/07/14 18:00:35:
> You think everyone feel OK with new defaults like OP ?
>> Yes.
> hmm, with current qemu it works to boot a LXC with just O flag.
> Why would we then want to complicate things by adding OP which
> then requires some version of my patch?

How does current QEMU boot anything with the 0 flag? It doesn't know how 
to handle it and will misinterpret the argument, no?


Alex
Joakim Tjernlund July 14, 2014, 4:45 p.m. UTC | #10
Alexander Graf <agraf@suse.de> wrote on 2014/07/14 18:34:34:
> 
> 
> On 14.07.14 18:32, Joakim Tjernlund wrote:
> > Alexander Graf <agraf@suse.de> wrote on 2014/07/14 18:00:35:
> > You think everyone feel OK with new defaults like OP ?
> >> Yes.
> > hmm, with current qemu it works to boot a LXC with just O flag.
> > Why would we then want to complicate things by adding OP which
> > then requires some version of my patch?
> 
> How does current QEMU boot anything with the 0 flag? It doesn't know how 

> to handle it and will misinterpret the argument, no?

my ppc gentoo LXC container boots just fine, why I don't know. I see a 
difference though:

with just O:
root         1  0.1  0.0 4138016 5736 ?        Ss   18:37   0:00 
/usr/bin/qemu-ppc-binfmt-wrapper /sbin/init
root        79  0.0  0.0 4138016 5792 ?        Ss   18:37   0:00 
/usr/bin/qemu-ppc-binfmt-wrapper /usr/sbin/sshd
root       293  0.0  0.0 4137952 4072 ?        Ss   18:37   0:00 
/usr/bin/qemu-ppc-binfmt-wrapper /bin/busybox u
root       334  0.3  0.0 4138016 5964 tty1     Ss+  18:37   0:00 
/usr/bin/qemu-ppc-binfmt-wrapper /sbin/agetty 3
root       335  3.0  0.0 4138048 7068 console  Ss   18:37   0:00 
/usr/bin/qemu-ppc-binfmt-wrapper /bin/login -- 
root       336  3.6  0.0 4138016 8916 console  S    18:37   0:00 
/usr/bin/qemu-ppc-binfmt-wrapper /bin/bash
root       340  0.0  0.0 4138016 6340 ?        R+   Jul10   0:00 /bin/ps 
uax

vs. with OP:
root         1  0.2  0.0 4138016 5748 ?        Ss   18:40   0:00 
/usr/bin/qemu-ppc-binfmt-wrapper /sbin/init /sb
root        79  0.0  0.0 4137984 5792 ?        Ss   18:41   0:00 
/usr/bin/qemu-ppc-binfmt-wrapper /usr/sbin/sshd
root       293  0.0  0.0 4137984 4064 ?        Ss   18:41   0:00 
/usr/bin/qemu-ppc-binfmt-wrapper /bin/busybox /
root       334  0.2  0.0 4138016 7664 tty1     Ss+  18:41   0:00 
/usr/bin/qemu-ppc-binfmt-wrapper /sbin/agetty /
root       335  2.1  0.0 4138048 8512 console  Ss   18:41   0:00 
/usr/bin/qemu-ppc-binfmt-wrapper /bin/login /bi
root       336  3.6  0.0 4138016 7932 console  S    18:41   0:00 
/usr/bin/qemu-ppc-binfmt-wrapper /bin/bash -bas
root       340  0.0  0.0 4138016 8384 ?        R+   Jul10   0:00 /bin/ps 
ps uax

So the argv is different but the small gentoo image survies that, just by 
chance I suppose :)

 Jocke
Joakim Tjernlund July 14, 2014, 4:51 p.m. UTC | #11
Alexander Graf <agraf@suse.de> wrote on 2014/07/14 18:34:34:
> 
> 
> On 14.07.14 18:32, Joakim Tjernlund wrote:
> > Alexander Graf <agraf@suse.de> wrote on 2014/07/14 18:00:35:
> > You think everyone feel OK with new defaults like OP ?
> >> Yes.
> > hmm, with current qemu it works to boot a LXC with just O flag.
> > Why would we then want to complicate things by adding OP which
> > then requires some version of my patch?
> 
> How does current QEMU boot anything with the 0 flag? It doesn't know how 

> to handle it and will misinterpret the argument, no?

Playing some, one could possibly do both:

-    if (i > 0 && strcmp(binfmt, "-binfmt-wrapper") == 0) {
+    execfd = qemu_getauxval(AT_EXECFD);
+    if (execfd > 0 || i > 0 && strcmp(binfmt, "-binfmt-wrapper") == 0) {
Alexander Graf July 14, 2014, 4:54 p.m. UTC | #12
On 14.07.14 18:51, Joakim Tjernlund wrote:
> Alexander Graf <agraf@suse.de> wrote on 2014/07/14 18:34:34:
>>
>> On 14.07.14 18:32, Joakim Tjernlund wrote:
>>> Alexander Graf <agraf@suse.de> wrote on 2014/07/14 18:00:35:
>>> You think everyone feel OK with new defaults like OP ?
>>>> Yes.
>>> hmm, with current qemu it works to boot a LXC with just O flag.
>>> Why would we then want to complicate things by adding OP which
>>> then requires some version of my patch?
>> How does current QEMU boot anything with the 0 flag? It doesn't know how
>> to handle it and will misinterpret the argument, no?
> Playing some, one could possibly do both:
>
> -    if (i > 0 && strcmp(binfmt, "-binfmt-wrapper") == 0) {
> +    execfd = qemu_getauxval(AT_EXECFD);
> +    if (execfd > 0 || i > 0 && strcmp(binfmt, "-binfmt-wrapper") == 0) {

0 is a valid fd :). And yes, this would work, but I don't see why we 
should introduce the -binfmt-wrapper logic to upstream QEMU. It's never 
been there. And the AT_EXECFD evaluation is a lot cleaner.

While we're at it - should we also pass the C flag to binfmt_misc?


Alex
Joakim Tjernlund July 14, 2014, 5:01 p.m. UTC | #13
Alexander Graf <agraf@suse.de> wrote on 2014/07/14 18:54:27:
> 
> 
> On 14.07.14 18:51, Joakim Tjernlund wrote:
> > Alexander Graf <agraf@suse.de> wrote on 2014/07/14 18:34:34:
> >>
> >> On 14.07.14 18:32, Joakim Tjernlund wrote:
> >>> Alexander Graf <agraf@suse.de> wrote on 2014/07/14 18:00:35:
> >>> You think everyone feel OK with new defaults like OP ?
> >>>> Yes.
> >>> hmm, with current qemu it works to boot a LXC with just O flag.
> >>> Why would we then want to complicate things by adding OP which
> >>> then requires some version of my patch?
> >> How does current QEMU boot anything with the 0 flag? It doesn't know 
how
> >> to handle it and will misinterpret the argument, no?
> > Playing some, one could possibly do both:
> >
> > -    if (i > 0 && strcmp(binfmt, "-binfmt-wrapper") == 0) {
> > +    execfd = qemu_getauxval(AT_EXECFD);
> > +    if (execfd > 0 || i > 0 && strcmp(binfmt, "-binfmt-wrapper") == 
0) {
> 
> 0 is a valid fd :). And yes, this would work, but I don't see why we 

Not according to qemu:
    execfd = qemu_getauxval(AT_EXECFD);
    if (execfd == 0) {
        execfd = open(filename, O_RDONLY);
        if (execfd < 0) {
            printf("Error while loading %s: %s\n", filename, 
strerror(errno));
            _exit(1);
        }
    }

> should introduce the -binfmt-wrapper logic to upstream QEMU. It's never 
> been there. And the AT_EXECFD evaluation is a lot cleaner.

It is, it is just an option to those who want to use P only(for security 
reasons).
Then there is a built-in fallback which is much easier to use that the 
current external program.
I can do both, first the execfd and the other ontop, then you choose :)

> 
> While we're at it - should we also pass the C flag to binfmt_misc?

That should be a user/distribution choise, not sure what should be default 
in
QEMU.
Joakim Tjernlund July 14, 2014, 5:08 p.m. UTC | #14
Alexander Graf <agraf@suse.de> wrote on 2014/07/14 18:54:27:
> 
> 0 is a valid fd :). And yes, this would work, but I don't see why we 
> should introduce the -binfmt-wrapper logic to upstream QEMU. It's never 
> been there. And the AT_EXECFD evaluation is a lot cleaner.
> 
> While we're at it - should we also pass the C flag to binfmt_misc?

While I have your attention :)
Is there no way to pass options to the bin wrapper, defining env. varibles
in LXC/chroot which the wrapper picks up?
Alexander Graf July 14, 2014, 5:14 p.m. UTC | #15
On 14.07.14 19:08, Joakim Tjernlund wrote:
> Alexander Graf <agraf@suse.de> wrote on 2014/07/14 18:54:27:
>> 0 is a valid fd :). And yes, this would work, but I don't see why we
>> should introduce the -binfmt-wrapper logic to upstream QEMU. It's never
>> been there. And the AT_EXECFD evaluation is a lot cleaner.
>>
>> While we're at it - should we also pass the C flag to binfmt_misc?
> While I have your attention :)
> Is there no way to pass options to the bin wrapper, defining env. varibles
> in LXC/chroot which the wrapper picks up?

Unfortunately I don't think there is a consistent way for that - unless 
you want to do a wrapper binary again.


Alex
Riku Voipio July 15, 2014, 2:12 p.m. UTC | #16
On Mon, Jul 14, 2014 at 05:38:49PM +0200, Joakim Tjernlund wrote:
> Alexander Graf <agraf@suse.de> wrote on 2014/07/14 17:21:33:
> > On 14.07.14 16:38, Joakim Tjernlund wrote:
> > > The popular binfmt-wrapper patch adds an additional
> > > executable which mangle argv suitable for binfmt flag P.
> > > In a chroot you need the both (statically linked) qemu-$arch
> > > and qemu-$arch-binfmt-wrapper. This is sub optimal and a
> > > better approach is to recognize the -binfmt-wrapper extension
> > > within linux-user(qemu-$arch) and mangle argv there.
> > > This just produces on executable which can be either copied to
> > > the chroot or bind mounted with the appropriate -binfmt-wrapper
> > > suffix.
> > >
> > > Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> > 
> > Please make sure to CC Riku on patches like this - he is the linux-user 
> > maintainer.
> 
> Doesn't he read the devel list? Anyhow CC:ed

I do - but CC gets directly to my inbox while qemu-devel goes to an
folder.

I take from this discussion, that this patch has been superceded by the
Patch at: http://patchwork.ozlabs.org/patch/369770/ ?

Riku

> > 
> > > ---
> > >   linux-user/main.c | 13 +++++++++++++
> > >   1 file changed, 13 insertions(+)
> > >
> > > diff --git a/linux-user/main.c b/linux-user/main.c
> > > index 71a33c7..212067a 100644
> > > --- a/linux-user/main.c
> > > +++ b/linux-user/main.c
> > > @@ -3828,6 +3828,19 @@ int main(int argc, char **argv, char **envp)
> > >       int i;
> > >       int ret;
> > >       int execfd;
> > > +    char *binfmt;
> > > +
> > > +    i = strlen( argv[0] ) - strlen ( "-binfmt-wrapper" );
> > 
> > The spaces are odd. Did this patch pass checkpatch.pl? Same comment goes 
> 
> > for almost all function invocations.
> 
> ehh, didn't run it through checkpatch.pl. Easy to fix next time.
> 
> > 
> > > +    binfmt = argv[0] + i;
> > > +    if (i > 0 && strcmp ( binfmt, "-binfmt-wrapper" ) == 0) {
> > 
> > This magic needs to be documented somewhere. In fact, I find it pretty 
> > hard to use in real world scenarios. Imagine a distribution - should it 
> > package every target binary twice? Should it create hardlinks all over?
> 
> How does dists. handle your original binfmt-wrapper? This is not much
> different I think. Here you got a choice to create a hardlink or a copy.
> Any chroot will only have to bind mount binfmt-wrapper into the chroot or 
> lxc container.
> 
> > 
> > I think we should try and find better magic :). Looking at the 
> > binfmt_misc loading code, I think we can cheat a bit. If we pass the 'O' 
> 
> > flag (open target binary for handler), binfmt_misc will tell us the 
> > binary fd in AT_EXECFD:
> > 
> >                  NEW_AUX_ENT(AT_EXECFD, bprm->interp_data);
> > 
> > We could then use this as a hint that we were spawned by binfmt_misc 
> > rather than directly and interpret the first argv as target_argv[0].
> > 
> > Then we can also add the P and O flags to scripts/qemu-binfmt-conf.sh 
> > and have a solution that works well for everyone.
> 
> What to do with P only then? Seems like most dists uses only P 
> 
> > 
> > > +   if (argc < 3 ) {
> > > +       fprintf ( stderr, "%s: Please use me through binfmt with P 
> flag\n", argv[0] );
> > > +       exit(1);
> > > +   }
> > > +   handle_arg_argv0(argv[2]); /* binfmt wrapper */
> > > +   memmove(&argv[2], &argv[3], (argc-2)*sizeof(argv));
> > 
> > I can't say I'm a big fan of this memmove, but everything else I can 
> > think of is going to be even uglier.
> 
> Me too :)
> 
> > > +   argc--;
> > > +    }
> > > 
> > >       module_call_init(MODULE_INIT_QOM);
> > > 
> > 
>
Joakim Tjernlund July 15, 2014, 2:39 p.m. UTC | #17
Riku Voipio <riku.voipio@iki.fi> wrote on 2014/07/15 16:12:26:
> 
> On Mon, Jul 14, 2014 at 05:38:49PM +0200, Joakim Tjernlund wrote:
> > Alexander Graf <agraf@suse.de> wrote on 2014/07/14 17:21:33:
> > > On 14.07.14 16:38, Joakim Tjernlund wrote:
> > > > The popular binfmt-wrapper patch adds an additional
> > > > executable which mangle argv suitable for binfmt flag P.
> > > > In a chroot you need the both (statically linked) qemu-$arch
> > > > and qemu-$arch-binfmt-wrapper. This is sub optimal and a
> > > > better approach is to recognize the -binfmt-wrapper extension
> > > > within linux-user(qemu-$arch) and mangle argv there.
> > > > This just produces on executable which can be either copied to
> > > > the chroot or bind mounted with the appropriate -binfmt-wrapper
> > > > suffix.
> > > >
> > > > Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> > > 
> > > Please make sure to CC Riku on patches like this - he is the 
linux-user 
> > > maintainer.
> > 
> > Doesn't he read the devel list? Anyhow CC:ed
> 
> I do - but CC gets directly to my inbox while qemu-devel goes to an
> folder.
> 
> I take from this discussion, that this patch has been superceded by the
> Patch at: http://patchwork.ozlabs.org/patch/369770/ ?

It is superceded but by v2 I sent a litte while ago:
http://patchwork.ozlabs.org/patch/369995/

 Jocke
Joakim Tjernlund July 15, 2014, 3:11 p.m. UTC | #18
Riku Voipio <riku.voipio@iki.fi> wrote on 2014/07/15 16:12:26:
> On Mon, Jul 14, 2014 at 05:38:49PM +0200, Joakim Tjernlund wrote:
> > Alexander Graf <agraf@suse.de> wrote on 2014/07/14 17:21:33:
> > > On 14.07.14 16:38, Joakim Tjernlund wrote:
> > > > The popular binfmt-wrapper patch adds an additional
> > > > executable which mangle argv suitable for binfmt flag P.
> > > > In a chroot you need the both (statically linked) qemu-$arch
> > > > and qemu-$arch-binfmt-wrapper. This is sub optimal and a
> > > > better approach is to recognize the -binfmt-wrapper extension
> > > > within linux-user(qemu-$arch) and mangle argv there.
> > > > This just produces on executable which can be either copied to
> > > > the chroot or bind mounted with the appropriate -binfmt-wrapper
> > > > suffix.
> > > >
> > > > Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> > > 
> > > Please make sure to CC Riku on patches like this - he is the 
linux-user 
> > > maintainer.
> > 
> > Doesn't he read the devel list? Anyhow CC:ed
> 
> I do - but CC gets directly to my inbox while qemu-devel goes to an
> folder.
> 
> I take from this discussion, that this patch has been superceded by the
> Patch at: http://patchwork.ozlabs.org/patch/369770/ ?

BTW, any chance qemu binfmt could fixup the ps output from within a 
container:
 jocke-ppc2 ~ # ps uaxww
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.1  0.0 4138016 7600 ?        Ss   17:02   0:00 
/usr/bin/qemu-ppc /sbin/init /sbin/init
root        79  0.0  0.0 4138016 5792 ?        Ss   17:02   0:00 
/usr/bin/qemu-ppc /usr/sbin/sshd /usr/sbin/sshd
root       293  0.0  0.0 4137952 4072 ?        Ss   17:02   0:00 
/usr/bin/qemu-ppc /bin/busybox /bin/busybox udhcpc -x hostname:jocke-ppc2 
--interface=eth0 --now --script=/lib/netifrc/sh/udhcpc-hook.sh 
--pidfile=/var/run/udhcpc-eth0.pid
root       334  0.3  0.0 4138016 5964 tty1     Ss+  17:02   0:00 
/usr/bin/qemu-ppc /sbin/agetty /sbin/agetty 38400 tty1 linux
root       335  3.1  0.0 4138048 7064 console  Ss   17:02   0:00 
/usr/bin/qemu-ppc /bin/login /bin/login -- root
root       336  3.3  0.0 4138016 9764 console  S    17:02   0:00 
/usr/bin/qemu-ppc /bin/bash -bash
root       340  0.0  0.0 4138016 6336 ?        R+   Jul10   0:00 /bin/ps 
ps uaxww

As you can see, qemu-ppc is visible.
Riku Voipio July 16, 2014, 6:54 a.m. UTC | #19
On Tue, Jul 15, 2014 at 05:11:48PM +0200, Joakim Tjernlund wrote:
> Riku Voipio <riku.voipio@iki.fi> wrote on 2014/07/15 16:12:26:
> > On Mon, Jul 14, 2014 at 05:38:49PM +0200, Joakim Tjernlund wrote:
> > > Alexander Graf <agraf@suse.de> wrote on 2014/07/14 17:21:33:
> > > > On 14.07.14 16:38, Joakim Tjernlund wrote:
> > > > > The popular binfmt-wrapper patch adds an additional
> > > > > executable which mangle argv suitable for binfmt flag P.
> > > > > In a chroot you need the both (statically linked) qemu-$arch
> > > > > and qemu-$arch-binfmt-wrapper. This is sub optimal and a
> > > > > better approach is to recognize the -binfmt-wrapper extension
> > > > > within linux-user(qemu-$arch) and mangle argv there.
> > > > > This just produces on executable which can be either copied to
> > > > > the chroot or bind mounted with the appropriate -binfmt-wrapper
> > > > > suffix.
> > > > >
> > > > > Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> > > > 
> > > > Please make sure to CC Riku on patches like this - he is the 
> linux-user 
> > > > maintainer.
> > > 
> > > Doesn't he read the devel list? Anyhow CC:ed
> > 
> > I do - but CC gets directly to my inbox while qemu-devel goes to an
> > folder.
> > 
> > I take from this discussion, that this patch has been superceded by the
> > Patch at: http://patchwork.ozlabs.org/patch/369770/ ?
> 
> BTW, any chance qemu binfmt could fixup the ps output from within a 
> container:
>  jocke-ppc2 ~ # ps uaxww
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root         1  0.1  0.0 4138016 7600 ?        Ss   17:02   0:00 
> /usr/bin/qemu-ppc /sbin/init /sbin/init
> root        79  0.0  0.0 4138016 5792 ?        Ss   17:02   0:00 
> /usr/bin/qemu-ppc /usr/sbin/sshd /usr/sbin/sshd
> root       293  0.0  0.0 4137952 4072 ?        Ss   17:02   0:00 
> /usr/bin/qemu-ppc /bin/busybox /bin/busybox udhcpc -x hostname:jocke-ppc2 
> --interface=eth0 --now --script=/lib/netifrc/sh/udhcpc-hook.sh 
> --pidfile=/var/run/udhcpc-eth0.pid
> root       334  0.3  0.0 4138016 5964 tty1     Ss+  17:02   0:00 
> /usr/bin/qemu-ppc /sbin/agetty /sbin/agetty 38400 tty1 linux
> root       335  3.1  0.0 4138048 7064 console  Ss   17:02   0:00 
> /usr/bin/qemu-ppc /bin/login /bin/login -- root
> root       336  3.3  0.0 4138016 9764 console  S    17:02   0:00 
> /usr/bin/qemu-ppc /bin/bash -bash
> root       340  0.0  0.0 4138016 6336 ?        R+   Jul10   0:00 /bin/ps 
> ps uaxww
> 
> As you can see, qemu-ppc is visible. 

This isn't something binfmt could do. ps uses /proc/$pid/exe to map the
right binary. However, qemu already fixes /proc/self/exe to point to
right file - as you can see from the last line where "ps uaxww" doesn't
have qemu shown. So it would be possible to make do_open() in syscall.c do
similar mapping for /proc/$pid/exe. 

Exactly how and when the qemu processes should be hidden within qemu
needs careful thought thou. A naive approach would check if
/proc/$pid/exe points to same binary as /proc/self/exe, hiding at least
same architecture qemu processes.

Riku
Joakim Tjernlund July 16, 2014, 7:22 a.m. UTC | #20
Riku Voipio <riku.voipio@iki.fi> wrote on 2014/07/16 08:54:45:
> 
> On Tue, Jul 15, 2014 at 05:11:48PM +0200, Joakim Tjernlund wrote:
> > Riku Voipio <riku.voipio@iki.fi> wrote on 2014/07/15 16:12:26:
> > > On Mon, Jul 14, 2014 at 05:38:49PM +0200, Joakim Tjernlund wrote:
> > > > Alexander Graf <agraf@suse.de> wrote on 2014/07/14 17:21:33:
> > > > > On 14.07.14 16:38, Joakim Tjernlund wrote:
> > > > > > The popular binfmt-wrapper patch adds an additional
> > > > > > executable which mangle argv suitable for binfmt flag P.
> > > > > > In a chroot you need the both (statically linked) qemu-$arch
> > > > > > and qemu-$arch-binfmt-wrapper. This is sub optimal and a
> > > > > > better approach is to recognize the -binfmt-wrapper extension
> > > > > > within linux-user(qemu-$arch) and mangle argv there.
> > > > > > This just produces on executable which can be either copied to
> > > > > > the chroot or bind mounted with the appropriate 
-binfmt-wrapper
> > > > > > suffix.
> > > > > >
> > > > > > Signed-off-by: Joakim Tjernlund 
<Joakim.Tjernlund@transmode.se>
> > > > > 
> > > > > Please make sure to CC Riku on patches like this - he is the 
> > linux-user 
> > > > > maintainer.
> > > > 
> > > > Doesn't he read the devel list? Anyhow CC:ed
> > > 
> > > I do - but CC gets directly to my inbox while qemu-devel goes to an
> > > folder.
> > > 
> > > I take from this discussion, that this patch has been superceded by 
the
> > > Patch at: http://patchwork.ozlabs.org/patch/369770/ ?
> > 
> > BTW, any chance qemu binfmt could fixup the ps output from within a 
> > container:
> >  jocke-ppc2 ~ # ps uaxww
> > USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME 
COMMAND
> > root         1  0.1  0.0 4138016 7600 ?        Ss   17:02   0:00 
> > /usr/bin/qemu-ppc /sbin/init /sbin/init
> > root        79  0.0  0.0 4138016 5792 ?        Ss   17:02   0:00 
> > /usr/bin/qemu-ppc /usr/sbin/sshd /usr/sbin/sshd
> > root       293  0.0  0.0 4137952 4072 ?        Ss   17:02   0:00 
> > /usr/bin/qemu-ppc /bin/busybox /bin/busybox udhcpc -x 
hostname:jocke-ppc2 
> > --interface=eth0 --now --script=/lib/netifrc/sh/udhcpc-hook.sh 
> > --pidfile=/var/run/udhcpc-eth0.pid
> > root       334  0.3  0.0 4138016 5964 tty1     Ss+  17:02   0:00 
> > /usr/bin/qemu-ppc /sbin/agetty /sbin/agetty 38400 tty1 linux
> > root       335  3.1  0.0 4138048 7064 console  Ss   17:02   0:00 
> > /usr/bin/qemu-ppc /bin/login /bin/login -- root
> > root       336  3.3  0.0 4138016 9764 console  S    17:02   0:00 
> > /usr/bin/qemu-ppc /bin/bash -bash
> > root       340  0.0  0.0 4138016 6336 ?        R+   Jul10   0:00 
/bin/ps 
> > ps uaxww
> > 
> > As you can see, qemu-ppc is visible. 
> 
> This isn't something binfmt could do. ps uses /proc/$pid/exe to map the

Why not? I think it is the perfekt place to do it in Linux binfmt code. 
All
the other interp(ELF ld.so and normal bash scripts) do it. Fixing this 
with
no support from Linux amounts to hacks like:
    /*
     * Munge our argv so it will look like it would
     * if started without linux-user
     */
    if (execfd > 0) {
        unsigned long len;
        char *p = argv[0];

        for (i = 0; i < target_argc; i++, p += len) {
            len = strlen(target_argv[i]) + 1;
            memcpy(p, target_argv[i], len);
        }
        len = *envp - p;
        memset(p, 0, len);
    }
and it is still not perfekt, apps like ps and top still does
not work.
Linux binfmt code should at least pass a correct argv to us

BTW,
    You forgot:
[PATCH 4/4] ppc: remove excessive logging
Joakim Tjernlund July 16, 2014, 11:38 a.m. UTC | #21
Riku Voipio <riku.voipio@iki.fi> wrote on 2014/07/16 08:54:45:
> 
> On Tue, Jul 15, 2014 at 05:11:48PM +0200, Joakim Tjernlund wrote:
> > Riku Voipio <riku.voipio@iki.fi> wrote on 2014/07/15 16:12:26:
> > > On Mon, Jul 14, 2014 at 05:38:49PM +0200, Joakim Tjernlund wrote:
> > > > Alexander Graf <agraf@suse.de> wrote on 2014/07/14 17:21:33:
> > > > > On 14.07.14 16:38, Joakim Tjernlund wrote:
> > > > > > The popular binfmt-wrapper patch adds an additional
> > > > > > executable which mangle argv suitable for binfmt flag P.
> > > > > > In a chroot you need the both (statically linked) qemu-$arch
> > > > > > and qemu-$arch-binfmt-wrapper. This is sub optimal and a
> > > > > > better approach is to recognize the -binfmt-wrapper extension
> > > > > > within linux-user(qemu-$arch) and mangle argv there.
> > > > > > This just produces on executable which can be either copied to
> > > > > > the chroot or bind mounted with the appropriate 
-binfmt-wrapper
> > > > > > suffix.
> > > > > >
> > > > > > Signed-off-by: Joakim Tjernlund 
<Joakim.Tjernlund@transmode.se>
> > > > > 
> > > > > Please make sure to CC Riku on patches like this - he is the 
> > linux-user 
> > > > > maintainer.
> > > > 
> > > > Doesn't he read the devel list? Anyhow CC:ed
> > > 
> > > I do - but CC gets directly to my inbox while qemu-devel goes to an
> > > folder.
> > > 
> > > I take from this discussion, that this patch has been superceded by 
the
> > > Patch at: http://patchwork.ozlabs.org/patch/369770/ ?
> > 
> > BTW, any chance qemu binfmt could fixup the ps output from within a 
> > container:
> >  jocke-ppc2 ~ # ps uaxww
> > USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME 
COMMAND
> > root         1  0.1  0.0 4138016 7600 ?        Ss   17:02   0:00 
> > /usr/bin/qemu-ppc /sbin/init /sbin/init
> > root        79  0.0  0.0 4138016 5792 ?        Ss   17:02   0:00 
> > /usr/bin/qemu-ppc /usr/sbin/sshd /usr/sbin/sshd
> > root       293  0.0  0.0 4137952 4072 ?        Ss   17:02   0:00 
> > /usr/bin/qemu-ppc /bin/busybox /bin/busybox udhcpc -x 
hostname:jocke-ppc2 
> > --interface=eth0 --now --script=/lib/netifrc/sh/udhcpc-hook.sh 
> > --pidfile=/var/run/udhcpc-eth0.pid
> > root       334  0.3  0.0 4138016 5964 tty1     Ss+  17:02   0:00 
> > /usr/bin/qemu-ppc /sbin/agetty /sbin/agetty 38400 tty1 linux
> > root       335  3.1  0.0 4138048 7064 console  Ss   17:02   0:00 
> > /usr/bin/qemu-ppc /bin/login /bin/login -- root
> > root       336  3.3  0.0 4138016 9764 console  S    17:02   0:00 
> > /usr/bin/qemu-ppc /bin/bash -bash
> > root       340  0.0  0.0 4138016 6336 ?        R+   Jul10   0:00 
/bin/ps 
> > ps uaxww
> > 
> > As you can see, qemu-ppc is visible. 
> 
> This isn't something binfmt could do. ps uses /proc/$pid/exe to map the
> right binary. However, qemu already fixes /proc/self/exe to point to
> right file - as you can see from the last line where "ps uaxww" doesn't
> have qemu shown. So it would be possible to make do_open() in syscall.c 
do
> similar mapping for /proc/$pid/exe. 
> 
> Exactly how and when the qemu processes should be hidden within qemu
> needs careful thought thou. A naive approach would check if
> /proc/$pid/exe points to same binary as /proc/self/exe, hiding at least
> same architecture qemu processes.

Took a look at do_open(), what horror to read what you do there.
How on earth is this supposed to work?
You don't separate normal qemu invocation from binfmt, only adjust
"self" stuff, always only remove one entry(binfmt P flag is 2 entries)

Reasonably this hiding should only be performed when started by binfmt?
What are the other uses for ?

 Jocke
diff mbox

Patch

diff --git a/linux-user/main.c b/linux-user/main.c
index 71a33c7..212067a 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -3828,6 +3828,19 @@  int main(int argc, char **argv, char **envp)
     int i;
     int ret;
     int execfd;
+    char *binfmt;
+
+    i = strlen( argv[0] ) - strlen ( "-binfmt-wrapper" );
+    binfmt = argv[0] + i;
+    if (i > 0 && strcmp ( binfmt, "-binfmt-wrapper" ) == 0) {
+	if (argc < 3 ) {
+	    fprintf ( stderr, "%s: Please use me through binfmt with P flag\n", argv[0] );
+	    exit(1);
+	}
+	handle_arg_argv0(argv[2]); /* binfmt wrapper */
+	memmove(&argv[2], &argv[3], (argc-2)*sizeof(argv));
+	argc--;
+    }
 
     module_call_init(MODULE_INIT_QOM);