[8/8] net: Implement socketat.

Message ID m1bp7oq1u8.fsf@fess.ebiederm.org
State RFC, archived
Delegated to: David Miller

Commit Message

Eric W. Biederman Sept. 23, 2010, 8:51 a.m. UTC
Add a system call for creating sockets in a specified network namespace.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 net/socket.c |   26 ++++++++++++++++++++++++--
 1 files changed, 24 insertions(+), 2 deletions(-)

Comments

Pavel Emelyanov Sept. 23, 2010, 8:56 a.m. UTC | #1
On 09/23/2010 12:51 PM, Eric W. Biederman wrote:
> 
> Add a system call for creating sockets in a specified network namespace.

What for?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
jamal Sept. 23, 2010, 11:19 a.m. UTC | #2
On Thu, 2010-09-23 at 12:56 +0400, Pavel Emelyanov wrote:
> On 09/23/2010 12:51 PM, Eric W. Biederman wrote:
> > 
> > Add a system call for creating sockets in a specified network namespace.
> 
> What for?

I can see many uses if my understanding is correct..
ex, from mother namespace:
fdx = open socket at namespace blah
from mother namespace, read/write/poll fdx 
(eg add route with netlink socket)
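To make the idea concrete, here is a minimal sketch of talking rtnetlink over an fd without caring which namespace the caller is in. It is a read-only stand-in for the "add route" case: it dumps the links and counts them. The function name count_links() is made up for illustration, and how the fd was obtained (socketat, setns + socket, fd passing) is deliberately left out.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/*
 * Hypothetical helper: count the network devices visible through the
 * rtnetlink socket nlfd.  The answer depends on the namespace the socket
 * was created in, not on the namespace of the calling process.
 * Returns the number of links, or -1 on error.
 */
int count_links(int nlfd)
{
        struct {
                struct nlmsghdr nh;
                struct ifinfomsg ifi;
        } req;
        /* The union keeps the receive buffer aligned for struct nlmsghdr. */
        union {
                struct nlmsghdr nh;
                char raw[32768];
        } buf;
        int count = 0;

        memset(&req, 0, sizeof(req));
        req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(req.ifi));
        req.nh.nlmsg_type = RTM_GETLINK;
        req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
        req.nh.nlmsg_seq = 1;
        req.ifi.ifi_family = AF_UNSPEC;

        if (send(nlfd, &req, req.nh.nlmsg_len, 0) < 0)
                return -1;

        /* A dump arrives as one or more datagrams ending in NLMSG_DONE. */
        for (;;) {
                struct nlmsghdr *nh;
                ssize_t n = recv(nlfd, buf.raw, sizeof(buf.raw), 0);
                int len;

                if (n <= 0)
                        return -1;
                len = (int)n;
                for (nh = &buf.nh; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
                        if (nh->nlmsg_type == NLMSG_DONE)
                                return count;
                        if (nh->nlmsg_type == NLMSG_ERROR)
                                return -1;
                        if (nh->nlmsg_type == RTM_NEWLINK)
                                count++;
                }
        }
}
```

Called on a socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE) fd, this reports the devices of whichever namespace that socket belongs to.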

cheers,
jamal

Pavel Emelyanov Sept. 23, 2010, 11:33 a.m. UTC | #3
On 09/23/2010 03:19 PM, jamal wrote:
> On Thu, 2010-09-23 at 12:56 +0400, Pavel Emelyanov wrote:
>> On 09/23/2010 12:51 PM, Eric W. Biederman wrote:
>>>
>>> Add a system call for creating sockets in a specified network namespace.
>>
>> What for?
> 
> I can see many uses if my understanding is correct..
> ex, from mother namespace:
> fdx = open socket at namespace blah
> from mother namespace, read/write/poll fdx 
> (eg add route with netlink socket)

This particular usecase is unneeded once you have the "enter" ability.

> cheers,
> jamal
> 
> 

jamal Sept. 23, 2010, 11:40 a.m. UTC | #4
On Thu, 2010-09-23 at 15:33 +0400, Pavel Emelyanov wrote:

> This particular usecase is unneeded once you have the "enter" ability.

Is that cheaper from a syscall count/cost?
i.e do I have to enter every time i want to write/read this fd?
How does poll/select work in that enter scenario?

cheers,
jamal

Pavel Emelyanov Sept. 23, 2010, 11:53 a.m. UTC | #5
On 09/23/2010 03:40 PM, jamal wrote:
> On Thu, 2010-09-23 at 15:33 +0400, Pavel Emelyanov wrote:
> 
>> This particular usecase is unneeded once you have the "enter" ability.
> 
> Is that cheaper from a syscall count/cost?

Why does it matter? You said that the usage scenario was to
add routes to a container. If I do 2 syscalls instead of 1, is
it THAT much worse?

> i.e do I have to enter every time i want to write/read this fd?

No - you enter once, create a socket and do whatever you need
within the entered namespace.

> How does poll/select work in that enter scenario?

Just like it used to before the enter.

> cheers,
> jamal
> 
> 

jamal Sept. 23, 2010, 12:11 p.m. UTC | #6
On Thu, 2010-09-23 at 15:53 +0400, Pavel Emelyanov wrote:

> Why does it matter? You told, that the usage scenario was to
> add routes to container. If I do 2 syscalls instead of 1, is
> it THAT worse?
> 

Anything to do with socket IO that requires namespace awareness
applies for usage; it could be a tcp/udp/etc. socket. If it doesn't
make any difference performance-wise using one scheme vs the other
to write/read heavy messages, then I don't see an issue and socketat
is redundant.

If I was to pick blindly, I would say whatever approach has
fewer syscalls is better, even if it is just a one-time "slow" path
thing. I could create a scenario which would make it bad
to have more syscalls.

But there's also the simplicity aspect in doing:
fdx = socketat namespace foo
use fdx for read/write/poll into foo without any wrapper code.
Vs
enter foo
fdx = socket ..
read/write fdx
leave foo.

> Just like it used to before the enter.
> 

So if i enter foo, get a fdx, leave foo i can use it in
ns0 as if it was in ns0?

cheers,
jamal

Pavel Emelyanov Sept. 23, 2010, 12:34 p.m. UTC | #7
On 09/23/2010 04:11 PM, jamal wrote:
> On Thu, 2010-09-23 at 15:53 +0400, Pavel Emelyanov wrote:
> 
>> Why does it matter? You told, that the usage scenario was to
>> add routes to container. If I do 2 syscalls instead of 1, is
>> it THAT worse?
>>
> 
> Anything to do with socket IO that requires namespace awareness
> applies for usage; it could be tcp/udp/etc socket. If it doesnt
> make any difference performance wise using one scheme vs other
> to write/read heavy messages then i dont see an issue and socketat
> is redundant.

That's my point: unless we know why we would need it,
we don't need it.

Eric, please clarify, what is the need for creating a socket in a foreign
net namespace?
David Lamparter Sept. 23, 2010, 2:54 p.m. UTC | #8
On Thu, Sep 23, 2010 at 04:34:37PM +0400, Pavel Emelyanov wrote:
> On 09/23/2010 04:11 PM, jamal wrote:
> > On Thu, 2010-09-23 at 15:53 +0400, Pavel Emelyanov wrote:
> > 
> >> Why does it matter? You told, that the usage scenario was to
> >> add routes to container. If I do 2 syscalls instead of 1, is
> >> it THAT worse?
> >>
> > 
> > Anything to do with socket IO that requires namespace awareness
> > applies for usage; it could be tcp/udp/etc socket. If it doesnt
> > make any difference performance wise using one scheme vs other
> > to write/read heavy messages then i dont see an issue and socketat
> > is redundant.
> 
> That's what my point is about - unless we know why would we need it
> we don't need it.
> 
> Eric, please clarify, what is the need in creating a socket in foreign
> net namespace?

Hmm. If you somewhere get the fd to a socket from another namespace, it
definitely does work (I'm currently implementing my "socketat" with fd
passing through AF_UNIX sockets, so i know it works), so the

  setns(other...)
  fd = socket(...)
  setns(orig...)

sequence would certainly work. However, other things might happen
in between, like a signal (imagine AIO in particular). While
signals are user-controllable (and therefore to be managed/excluded by
the user), we need to consider whether there are other problems with doing this
as a sequence.

If there are no other problematic conditions with this, socketat should
probably be moved to a user library.


-David

Eric W. Biederman Sept. 23, 2010, 3 p.m. UTC | #9
Pavel Emelyanov <xemul@parallels.com> writes:

> On 09/23/2010 04:11 PM, jamal wrote:
>> On Thu, 2010-09-23 at 15:53 +0400, Pavel Emelyanov wrote:
>> 
>>> Why does it matter? You told, that the usage scenario was to
>>> add routes to container. If I do 2 syscalls instead of 1, is
>>> it THAT worse?
>>>
>> 
>> Anything to do with socket IO that requires namespace awareness
>> applies for usage; it could be tcp/udp/etc socket. If it doesnt
>> make any difference performance wise using one scheme vs other
>> to write/read heavy messages then i dont see an issue and socketat
>> is redundant.
>
> That's what my point is about - unless we know why would we need it
> we don't need it.
>
> Eric, please clarify, what is the need in creating a socket in foreign
> net namespace?

Strictly speaking, with setns() you can implement this functionality
in userspace, e.g.:

int socketat(int nsfd, int domain, int type, int protocol)
{
        int sk;

        setns(0, nsfd);
        sk = socket(domain, type, protocol);
        setns(0, default_nsfd);

        return sk;
}

The major difference is that socketat in userspace suffers
from races, with signals etc.
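For comparison, here is a hedged sketch of how a userspace version might narrow (not eliminate) the signal race by blocking signals around the namespace switch. It is an illustration, not the proposed syscall. Assumptions: the name socketat_user() is invented; the original namespace is re-opened from /proc/self/ns/net on every call; the setns() wrapper that glibc ships takes the descriptor first, unlike the setns(0, nsfd) order in the snippet above; and the program is single-threaded, since sigprocmask() is only specified for that case.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical userspace socketat(): returns a socket fd, or -1 on error. */
int socketat_user(int nsfd, int domain, int type, int protocol)
{
        sigset_t all, old;
        int orig, sk = -1, saved_errno;

        /* Remember the namespace we start in so we can switch back. */
        orig = open("/proc/self/ns/net", O_RDONLY | O_CLOEXEC);
        if (orig < 0)
                return -1;

        /* Keep signal handlers from running while we are "away". */
        sigfillset(&all);
        sigprocmask(SIG_BLOCK, &all, &old);

        if (setns(nsfd, CLONE_NEWNET) < 0) {
                saved_errno = errno;
        } else {
                sk = socket(domain, type, protocol);
                saved_errno = errno;
                if (setns(orig, CLONE_NEWNET) < 0) {
                        /* Stranded in the other namespace: report failure. */
                        saved_errno = errno;
                        if (sk >= 0)
                                close(sk);
                        sk = -1;
                }
        }

        sigprocmask(SIG_SETMASK, &old, NULL);
        close(orig);
        errno = saved_errno;
        return sk;
}
```

Even with signals blocked this costs several syscalls per socket and needs the privilege to call setns() (CAP_SYS_ADMIN on current kernels). The failure path where the second setns() fails shows why an in-kernel version is simpler to reason about.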

The use case is the handful of networking applications
that find that it makes sense to listen to sockets from multiple network
namespaces at once.  Say a home machine that has a vpn into your office
network and the vpn into the office network runs in a different network
namespace so you don't have to worry about address conflicts between
the two networks, the chance of accidentally bridging between them,
and so you can use different dns resolvers for the different networks.

In that scenario it would be nice if I could run some services on both
networks.  Starting two+ copies of the daemons just so they can live
in all of the networks is ok, but in the fullness of time I expect that
there will be daemons that want to optimize things and have sockets in
all of the network namespaces you are connected to.

In a multiple network namespace aware application when it goes to open
a socket it will want to specify which network namespace the socket is
in.  If it is a general listener it will probably be listening for events
in /proc/mounts waiting for extra namespaces to be mounted under a
standard location say: /var/run/netns/<netnsname>/ns.

Once the application receives the event for a new network namespace
showing up, it will want to create a new socket listening for
connections in the new network namespace.

In that scenario none of those network namespaces are foreign, but one
network namespace will be the default and the rest will be non-default
network namespaces.

To support a multiple network namespace aware daemon I need to implement
socketat() somewhere.  So I figured I would see if anyone minded a
trivial, race-free, in-kernel implementation.  To me it is a wart in the
API and I am busily removing warts in the API.

I don't know of any scenarios with other namespaces where there would be
applications that would be native in multiple namespaces.  So I
haven't done any work in that direction.

Eric
Daniel Lezcano Oct. 2, 2010, 9:13 p.m. UTC | #10
On 09/23/2010 01:53 PM, Pavel Emelyanov wrote:
> On 09/23/2010 03:40 PM, jamal wrote:
>    
>> On Thu, 2010-09-23 at 15:33 +0400, Pavel Emelyanov wrote:
>>
>>      
>>> This particular usecase is unneeded once you have the "enter" ability.
>>>        
>> Is that cheaper from a syscall count/cost?
>>      
> Why does it matter? You told, that the usage scenario was to
> add routes to container. If I do 2 syscalls instead of 1, is
> it THAT worse?
>
>    
>> i.e do I have to enter every time i want to write/read this fd?
>>      
> No - you enter once, create a socket and do whatever you need
> withing the enterned namespace.
>    

Just to clarify this point. You enter the namespace, create the socket 
and go back to the initial namespace (or create a new one). Further 
operations can be made against this fd because it is the network 
namespace stored in the sock struct which is used, not the current 
process network namespace which is used at the socket creation only.

We can actually already do that by unsharing and then creating a socket. 
This socket will pin the namespace and can be used as a control socket 
for the namespace (assuming the socket domain will be ok for all the 
operations).
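One way to observe that the namespace travels with the socket rather than with the process is to compare namespace identities. The sketch below rests on two assumptions: that two namespace fds refer to the same namespace when fstat() reports the same st_dev and st_ino, and that the kernel offers the SIOCGSKNS ioctl, which was added long after this thread (Linux 4.10, as far as I know) and needs CAP_NET_ADMIN. Both function names are invented for illustration.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <unistd.h>
#include <linux/sockios.h>

#ifndef SIOCGSKNS
#define SIOCGSKNS 0x894C        /* fallback for older kernel headers */
#endif

/* 1 if both fds refer to the same namespace, 0 if not, -1 on error. */
int same_namespace(int fd_a, int fd_b)
{
        struct stat a, b;

        if (fstat(fd_a, &a) < 0 || fstat(fd_b, &b) < 0)
                return -1;
        return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* 1 if sk lives in the caller's current netns, 0 if not, -1 on error. */
int socket_in_current_netns(int sk)
{
        int sk_ns, cur_ns, ret;

        /* Ask the kernel for an fd on the namespace stored in the socket. */
        sk_ns = ioctl(sk, SIOCGSKNS);
        if (sk_ns < 0)
                return -1;

        cur_ns = open("/proc/self/ns/net", O_RDONLY);
        ret = cur_ns < 0 ? -1 : same_namespace(sk_ns, cur_ns);
        if (cur_ns >= 0)
                close(cur_ns);
        close(sk_ns);
        return ret;
}
```

Given enough privilege, a socket created before unshare(CLONE_NEWNET) should report 0 afterwards, since it stays in the namespace it was created in.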

Jamal, I don't know what kind of application you want to use but if I 
assume you want to create a process controlling 1024 netns, let's try to 
identify what happens with setns and with socketat:

With setns:

     * open /proc/self/ns/net (1)
     * unshare the netns
     * open /proc/self/ns/net (2)
     * setns (1)
     * create a virtual network device
     * move the virtual device to (2) (using the set netns by fd)
     * unshare the netns
     ...

With socketat:

     * open a socket (1)
     * unshare the netns
     * open a netlink with socketat(1) => (2)
     * create a virtual device using (2) (at this point it is init_net_ns)
     * move the virtual device to the current netns (using the set netns 
by pid)
     * open a socket (3)
     * unshare the netns
     ...

We have the same number of file descriptors kept open. Except, with 
setns we can bind mount the directory somewhere, that will pin the 
namespace and then we can close the /proc/self/ns/net file descriptors 
and reopen them later.

If your application has to do a lot of specific network processing, 
during its life cycle, in different namespaces, the socketat syscall 
will be better because it will reduce the number of syscalls but at the 
cost of keeping the file descriptors open (potentially a big number). 
Otherwise, setns should fit your needs.



>> How does poll/select work in that enter scenario?
>>      
> Just like it used to before the enter.
>
>    
>> cheers,
>> jamal
>>
>>
>>      

jamal Oct. 3, 2010, 1:44 p.m. UTC | #11
Hi Daniel,

Thanks for clarifying this ..

On Sat, 2010-10-02 at 23:13 +0200, Daniel Lezcano wrote:
> Just to clarify this point. You enter the namespace, create the socket
> and go back to the initial namespace (or create a new one). Further 
> operations can be made against this fd because it is the network 
> namespace stored in the sock struct which is used, not the current 
> process network namespace which is used at the socket creation only.
> 
> We can actually already do that by unsharing and then create a
> socket. 
> This socket will pin the namespace and can be used as a control socket
> for the namespace (assuming the socket domain will be ok for all the 
> operations).
>
> Jamal, I don't know what kind of application you want to use but if I 
> assume you want to create a process controlling 1024 netns, 

At the moment i am looking at 8K on a Nehalem with lots of RAM. They
will mostly be created at startup but some could be created afterwards.
Each will have its own netdevs etc. also created at startup (and some
other config that may happen later). 
Because startup time may accumulate, it is clearly important to me
to pick whatever scheme that reduces the number of calls...

> let's try to identificate what happen with setns and with socketat :
> 
> With setns:
> 
>      * open /proc/self/ns/net (1)
>      * unshare the netns
>      * open /proc/self/ns/net (2)
>      * setns (1)
>      * create a virtual network device
>      * move the virtual device to (2) (using the set netns by fd)
>      * unshare the netns
>      ...
> 
> With socketat:
> 
>      * open a socket (1)
>      * unshare the netns
>      * open a netlink with socketat(1) => (2)
>      * create a virtual device using (2) (at this point it is
> init_net_ns)
>      * move the virtual device to the current netns (using the set
> netns 
> by pid)
>      * open a socket (3)
>      * unshare the netns
>      ...
> 
> We have the same number of file descriptors kept opened. Except, with 
> setns we can bind mount the directory somewhere, that will pin the 
> namespace and then we can close the /proc/self/ns/net file descriptors
> and reopen them later.
> 

Ok, so a wrapper such as: create_socket_on(namespaceid)
will generally need fewer system calls with socketat()

> If your application has to do a lot of specific network processing, 
> during its life cycle, in different namespaces, the socketat syscall 
> will be better because it will reduce the number of syscalls but at
> the cost of keeping the file descriptors opened (potentially a big
> number). Otherwise, setns should fit your needs.

Makes sense. 

One thing still confuses me...
The app control point is in namespace0. I still want to be able to
"boot" namespaces first and maybe a few seconds later do a socketat()...
and create devices, tcp sockets etc. I suspect create_ns(namespace-name)
would involve:
     * open /proc/self/ns/net (namespace-name)
     * unshare the netns
Is this correct?

cheers,
jamal

Daniel Lezcano Oct. 4, 2010, 10:13 a.m. UTC | #12
On 10/03/2010 03:44 PM, jamal wrote:
> Hi Daniel,
>
> Thanks for clarifying this ..
>
> On Sat, 2010-10-02 at 23:13 +0200, Daniel Lezcano wrote:
>    
>> Just to clarify this point. You enter the namespace, create the socket
>> and go back to the initial namespace (or create a new one). Further
>> operations can be made against this fd because it is the network
>> namespace stored in the sock struct which is used, not the current
>> process network namespace which is used at the socket creation only.
>>
>> We can actually already do that by unsharing and then create a
>> socket.
>> This socket will pin the namespace and can be used as a control socket
>> for the namespace (assuming the socket domain will be ok for all the
>> operations).
>>
>> Jamal, I don't know what kind of application you want to use but if I
>> assume you want to create a process controlling 1024 netns,
>>      
> At the moment i am looking at 8K on a Nehalem with lots of RAM. They
> will mostly be created at startup but some could be created afterwards.
> Each will have its own netdevs etc. also created at startup (and some
> other config that may happen later).
> Because startup time may accumulate, it is clearly important to me
> to pick whatever scheme that reduces the number of calls...
>    

8K ! whow ! :)


>> let's try to identificate what happen with setns and with socketat :
>>
>> With setns:
>>
>>       * open /proc/self/ns/net (1)
>>       * unshare the netns
>>       * open /proc/self/ns/net (2)
>>       * setns (1)
>>       * create a virtual network device
>>       * move the virtual device to (2) (using the set netns by fd)
>>       * unshare the netns
>>       ...
>>
>> With socketat:
>>
>>       * open a socket (1)
>>       * unshare the netns
>>       * open a netlink with socketat(1) =>  (2)
>>       * create a virtual device using (2) (at this point it is
>> init_net_ns)
>>       * move the virtual device to the current netns (using the set
>> netns
>> by pid)
>>       * open a socket (3)
>>       * unshare the netns
>>       ...
>>
>> We have the same number of file descriptors kept opened. Except, with
>> setns we can bind mount the directory somewhere, that will pin the
>> namespace and then we can close the /proc/self/ns/net file descriptors
>> and reopen them later.
>>
>>      
> Ok, so a wrapper such as: create_socket_on(namespaceid)
> will have generally less system calls with socketat()
>    

Yes, I think so.

>> If your application has to do a lot of specific network processing,
>> during its life cycle, in different namespaces, the socketat syscall
>> will be better because it will reduce the number of syscalls but at
>> the cost of keeping the file descriptors opened (potentially a big
>> number). Otherwise, setns should fit your needs.
>>      
> Makes sense.
>
> One thing still confuses me...
> The app control point is in namespace0. I still want to be able to
> "boot" namespaces first and maybe a few seconds later do a socketat()...
> and create devices, tcp sockets etc. I suspect create_ns(namespace-name)
> would involve:
>       * open /proc/self/ns/net (namespace-name)
>       * unshare the netns
> Is this correct?
>    

Maybe I am misunderstanding, but if you are trying to save some syscalls, you 
should use socketat only and keep the app control namespace0 socket for it. 
The process will be in the last netns you unshared (you could use 
one setns syscall here to return to namespace0).

     (1) socketat  :
         * pros : 1 syscall to create a socket
         * cons : a file descriptor per namespace, namespace is only 
manageable via a socket

     (2) setns :
         * pros : namespace is fully manageable with a generic code
         * cons : 2 syscalls (or 3 if we want to return to the initial 
netns) to create a socket (setns + socket [ + setns ]), a file descriptor 
per namespace

     (3) setns + bind mount :
         * pros : no file descriptor needs to be kept open
         * cons : startup longer, (unshare + mount --bind), 4 syscalls 
to create a socket in the namespace (open, setns, socket, close), (may 
be 5 syscalls if we want to return to the initial netns).
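The four-syscall path of scheme (3) could look roughly like the sketch below. The helper name socket_in_pinned_ns() is made up, nspath stands for wherever the namespace was bind mounted, and the glibc setns(fd, nstype) wrapper is assumed. The optional fifth call (returning to the initial netns) is omitted on purpose, so the calling thread stays in the target namespace afterwards.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a socket in the namespace pinned at nspath.
 * Returns the socket fd, or -1 with errno set.
 */
int socket_in_pinned_ns(const char *nspath, int domain, int type, int protocol)
{
        int nsfd, sk, saved_errno;

        nsfd = open(nspath, O_RDONLY | O_CLOEXEC);      /* 1: open   */
        if (nsfd < 0)
                return -1;

        if (setns(nsfd, CLONE_NEWNET) < 0) {            /* 2: setns  */
                saved_errno = errno;
                close(nsfd);
                errno = saved_errno;
                return -1;
        }

        sk = socket(domain, type, protocol);            /* 3: socket */
        saved_errno = errno;
        close(nsfd);                                    /* 4: close  */
        errno = saved_errno;
        return sk;
}
```

No descriptor stays open between calls here; the bind mount alone pins the namespace, which is the trade-off listed under (3).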

Depending on the scheme you choose, the startup will be:

     (1) socketat :
          * open /proc/self/ns/net (one time to 'save' and pin the 
initial netns)
         and then

         int create_ns(void)
         {
             unshare(CLONE_NEWNET);
             return socket(...);
         }

         and,

          for (i = 0; i < 8192; i++)
                  mynsfd[i] = create_ns();

     (2) setns :
          * open /proc/self/ns/net (one time to 'save' and pin the 
initial netns)
           and then

         int create_ns(void)
         {
             unshare(CLONE_NEWNET);
             return open("/proc/self/ns/net", O_RDONLY);
         }

         and,

         for (i = 0; i < 8192; i++)
               mynsfd[i] = create_ns();

     (3) setns + mount :

          * open /proc/self/ns/net (one time to 'save' and pin the 
initial netns)
           and then

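             int create_ns(const char *nspath)
             {
                unshare(CLONE_NEWNET);
                /* create an empty file to serve as the mount point */
                close(creat(nspath, 0444));
                /* bind mount the new netns onto it to pin it */
                return mount("/proc/self/ns/net", nspath, NULL,
                             MS_BIND, NULL);
             }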

             for (i  = 0; i < 8192; i++)
                     create_ns(mynspath[i]);

Hope that helps.

   -- Daniel
Eric W. Biederman Oct. 4, 2010, 7:07 p.m. UTC | #13
jamal <hadi@cyberus.ca> writes:

> One thing still confuses me...
> The app control point is in namespace0. I still want to be able to
> "boot" namespaces first and maybe a few seconds later do a socketat()...
> and create devices, tcp sockets etc. I suspect create_ns(namespace-name)
> would involve:
>      * open /proc/self/ns/net (namespace-name)
>      * unshare the netns
> Is this correct?

Almost.

create should be:
        * verify namespace-name is not already in use
        * mkdir -p /var/run/netns/<namespace-name>
        * unshare the netns
        * mount --bind /proc/self/ns/net /var/run/netns/<namespace-name>
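In C, those steps might be sketched as follows. Several things here are assumptions rather than part of the proposal: the function name netns_create(), the rule that a name must be a single path component, and the use of an empty regular file as the mount target. On current kernels a namespace file has to be bind mounted onto a non-directory, which is also how iproute2's "ip netns add" does it, so "name already in use" is detected here with O_EXCL on that file.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a new netns and pin it at <dir>/<name>.
 * Returns 0 on success, -1 with errno set on failure.  On success the
 * calling thread is left inside the new namespace.
 */
int netns_create(const char *dir, const char *name)
{
        char path[PATH_MAX];
        int fd, n, saved_errno;

        /* Assumption: a name is exactly one path component. */
        if (name[0] == '\0' || strchr(name, '/') ||
            !strcmp(name, ".") || !strcmp(name, "..")) {
                errno = EINVAL;
                return -1;
        }
        n = snprintf(path, sizeof(path), "%s/%s", dir, name);
        if (n < 0 || n >= (int)sizeof(path)) {
                errno = ENAMETOOLONG;
                return -1;
        }

        /* One level of "mkdir -p"; an existing directory is fine. */
        if (mkdir(dir, 0755) < 0 && errno != EEXIST)
                return -1;

        /* O_EXCL fails with EEXIST if the name is already in use. */
        fd = open(path, O_RDONLY | O_CREAT | O_EXCL | O_CLOEXEC, 0444);
        if (fd < 0)
                return -1;
        close(fd);

        /* If mount() fails after unshare(), we still stay in the new netns. */
        if (unshare(CLONE_NEWNET) < 0 ||
            mount("/proc/self/ns/net", path, "none", MS_BIND, NULL) < 0) {
                saved_errno = errno;
                unlink(path);
                errno = saved_errno;
                return -1;
        }
        return 0;
}
```

A caller that wants to stay in its original namespace would save an fd on /proc/self/ns/net before calling this and setns() back to it afterwards.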

Are you talking about replacing something that used to use the Linux
VRF patches that are floating around?

Eric
jamal Oct. 15, 2010, 12:30 p.m. UTC | #14
Eric et al,

Did these patches make it in? I was looking at
two Davem net trees and I don't see them.

cheers,
jamal

jamal Oct. 26, 2010, 8:52 p.m. UTC | #15
Eric,

Ping?
If you are too busy to push these in maybe have
someone clueful like Daniel help out submitting? I think it
should probably be reasonable to leave out the socketat
patch initially if it is deemed controversial..

cheers,
jamal

On Fri, 2010-10-15 at 08:30 -0400, jamal wrote:
> Eric et al,
> 
> Did these patches make it in? I was looking at
> two Davem net trees and i dont see them.
> 
> cheers,
> jamal
> 


Eric W. Biederman Oct. 27, 2010, 12:27 a.m. UTC | #16
jamal <hadi@cyberus.ca> writes:

> Eric,
>
> Ping?
> If you are too busy to push these in maybe have
> someone clueful like Daniel help out submitting? I think it
> should probably be reasonable to leave out the sockeat
> patch initially if it is deemed controversial..

This merge cycle I am too busy, and my patches did not make it into
linux-next before the merge window.

Everything except socketat seems non-controversial.  It makes sense to
postpone socketat a little bit until we start converting applications
and there is a little real-world experience about what is needed.

I anticipate some time freeing up in the next couple of weeks so I
should be ready for the next merge window.

Eric
Patch

diff --git a/net/socket.c b/net/socket.c
index 2270b94..1116f3c 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1269,7 +1269,7 @@  int sock_create_kern(int family, int type, int protocol, struct socket **res)
 }
 EXPORT_SYMBOL(sock_create_kern);
 
-SYSCALL_DEFINE3(socket, int, family, int, type, int, protocol)
+static int do_socket(struct net *net, int family, int type, int protocol)
 {
 	int retval;
 	struct socket *sock;
@@ -1289,7 +1289,7 @@  SYSCALL_DEFINE3(socket, int, family, int, type, int, protocol)
 	if (SOCK_NONBLOCK != O_NONBLOCK && (flags & SOCK_NONBLOCK))
 		flags = (flags & ~SOCK_NONBLOCK) | O_NONBLOCK;
 
-	retval = sock_create(family, type, protocol, &sock);
+	retval = __sock_create(net, family, type, protocol, &sock, 0);
 	if (retval < 0)
 		goto out;
 
@@ -1306,6 +1306,28 @@  out_release:
 	return retval;
 }
 
+SYSCALL_DEFINE3(socket, int, family, int, type, int, protocol)
+{
+	return do_socket(current->nsproxy->net_ns, family, type, protocol);
+}
+
+SYSCALL_DEFINE4(socketat, int, fd, int, family, int, type, int, protocol)
+{
+	struct net *net;
+	int retval;
+
+	if (fd == -1) {
+		net = get_net(current->nsproxy->net_ns);
+	} else {
+		net = get_net_ns_by_fd(fd);
+		if (IS_ERR(net))
+			return  PTR_ERR(net);
+	}
+	retval = do_socket(net, family, type, protocol);
+	put_net(net);
+	return retval;
+}
+
 /*
  *	Create a pair of connected sockets.
  */