diff mbox

ipv4: tcp: security_sk_alloc() needed for unicast_sock

Message ID 1344547743.31104.582.camel@edumazet-glaptop
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet Aug. 9, 2012, 9:29 p.m. UTC
On Thu, 2012-08-09 at 16:06 -0400, Eric Paris wrote:
> NAK.
> 
> I personally think commit be9f4a44e7d41cee should be reverted until it
> is fixed.  Let me explain what all I believe it broke and how.
> 

Suggesting to revert this commit while we have known working fixes is a
bit of strange reaction.

I understand you are upset, but I believe we tried to fix it.

> Old callchain of the creation of the 'equivalent' socket previous to
> the patch in question just for reference:
> 
>     inet_ctl_sock_create
>       sock_create_kern
>         __sock_create
>           pf->create (inet_create)
>             sk_alloc
>               sk_prot_alloc
>                 security_sk_alloc()
> 
> 
> This WAS working properly.  All of it. 

Nobody denies it. But acknowledge my patch uncovered a fundamental
issue.

What kind of 'security module' can decide to let RST packets being sent
or not, on a global scale ? (one socket for the whole machine)

smack_sk_alloc_security() uses smk_of_current() : What can be the
meaning of smk_of_current() in the context of 'kernel' sockets...

Your patch tries to maintain this status quo.

In fact I suggest the following one liner patch, unless you can really
demonstrate what can be the meaning of providing a fake socket for these
packets.

This mess only happened because ip_append_data()/ip_push_pending_frames()
are so complex and use an underlying socket.

But this socket should not be ever used outside of its scope.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Casey Schaufler Aug. 9, 2012, 9:53 p.m. UTC | #1
On 8/9/2012 2:29 PM, Eric Dumazet wrote:
> On Thu, 2012-08-09 at 16:06 -0400, Eric Paris wrote:
>> NAK.
>>
>> I personally think commit be9f4a44e7d41cee should be reverted until it
>> is fixed.  Let me explain what all I believe it broke and how.
>>
> Suggesting to revert this commit while we have known working fixes is a
> bit of strange reaction.

A couple of potential short term workarounds have been identified,
but no one is happy with them for the long term. That does not
qualify as a "working fix" in engineering terms.

> I understand you are upset, but I believe we tried to fix it.
>
>> Old callchain of the creation of the 'equivalent' socket previous to
>> the patch in question just for reference:
>>
>>     inet_ctl_sock_create
>>       sock_create_kern
>>         __sock_create
>>           pf->create (inet_create)
>>             sk_alloc
>>               sk_prot_alloc
>>                 security_sk_alloc()
>>
>>
>> This WAS working properly.  All of it. 
> Nobody denies it. But acknowledge my patch uncovered a fundamental
> issue.
>
> What kind of 'security module' can decide to let RST packets being sent
> or not, on a global scale ? (one socket for the whole machine)

The short answer is "any security module that wants to".

And before we go any further, I'm a little surprised that
SELinux doesn't do this already.

>
> smack_sk_alloc_security() uses smk_of_current() : What can be the
> meaning of smk_of_current() in the context of 'kernel' sockets...

Yes, and all of it's callers - to date - have had an appropriate
value of current. It is using the API in the way it is supposed to.
It is assuming a properly formed socket. You want to give it a
cobbled together partial socket structure without task context.
Your predecessor did not have this problem.

>
> Your patch tries to maintain this status quo.
>
> In fact I suggest the following one liner patch, unless you can really
> demonstrate what can be the meaning of providing a fake socket for these
> packets.
>
> This mess only happened because ip_append_data()/ip_push_pending_frames()
> are so complex and use an underlying socket.
>
> But this socket should not be ever used outside of its scope.
>
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 76dde25..ec410e0 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -1536,6 +1536,7 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr,
>  			  arg->csumoffset) = csum_fold(csum_add(nskb->csum,
>  								arg->csum));
>  		nskb->ip_summed = CHECKSUM_NONE;
> +		skb_orphan(nskb);
>  		skb_set_queue_mapping(nskb, skb_get_queue_mapping(skb));
>  		ip_push_pending_frames(sk, &fl4);
>  	}
>
>
>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Aug. 9, 2012, 10:05 p.m. UTC | #2
On Thu, 2012-08-09 at 14:53 -0700, Casey Schaufler wrote:
> On 8/9/2012 2:29 PM, Eric Dumazet wrote:

> > smack_sk_alloc_security() uses smk_of_current() : What can be the
> > meaning of smk_of_current() in the context of 'kernel' sockets...
> 
> Yes, and all of it's callers - to date - have had an appropriate
> value of current. It is using the API in the way it is supposed to.
> It is assuming a properly formed socket. You want to give it a
> cobbled together partial socket structure without task context.
> Your predecessor did not have this problem.

My predecessor ? You mean before the patch ?

tcp socket was preallocated by at kernel boot time.

What is the 'user' owning this socket ?

You guys focus on an implementation detail of TCP stack.
You should never use this fake socket.

I repeat : There are no true socket for these control packets.

If you want them, then you'll have to add fields in timewait socket.

I can decide to rewrite the whole thing just building a TCP packet on
its own, and send it without any fake socket.

Some guy 15 years ago tried to reuse some high level functions, able to
build super packets and use sophisticated tricks, while we only want so
send a 40 or 60 bytes packet.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Casey Schaufler Aug. 9, 2012, 10:26 p.m. UTC | #3
On 8/9/2012 3:05 PM, Eric Dumazet wrote:
> On Thu, 2012-08-09 at 14:53 -0700, Casey Schaufler wrote:
>> On 8/9/2012 2:29 PM, Eric Dumazet wrote:
>>> smack_sk_alloc_security() uses smk_of_current() : What can be the
>>> meaning of smk_of_current() in the context of 'kernel' sockets...
>> Yes, and all of it's callers - to date - have had an appropriate
>> value of current. It is using the API in the way it is supposed to.
>> It is assuming a properly formed socket. You want to give it a
>> cobbled together partial socket structure without task context.
>> Your predecessor did not have this problem.
> My predecessor ? You mean before the patch ?
>
> tcp socket was preallocated by at kernel boot time.
>
> What is the 'user' owning this socket ?
>
> You guys focus on an implementation detail of TCP stack.
> You should never use this fake socket.
>
> I repeat : There are no true socket for these control packets.
>
> If you want them, then you'll have to add fields in timewait socket.
>
> I can decide to rewrite the whole thing just building a TCP packet on
> its own, and send it without any fake socket.
>
> Some guy 15 years ago tried to reuse some high level functions, able to
> build super packets and use sophisticated tricks, while we only want so
> send a 40 or 60 bytes packet.

OK, fine. You have an optimization. I'm good with that. Just don't
expect that the entire software stack you are taking advantage of
is going to change to accommodate your special case.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Aug. 9, 2012, 11:38 p.m. UTC | #4
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 09 Aug 2012 23:29:03 +0200

> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 76dde25..ec410e0 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -1536,6 +1536,7 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr,
>  			  arg->csumoffset) = csum_fold(csum_add(nskb->csum,
>  								arg->csum));
>  		nskb->ip_summed = CHECKSUM_NONE;
> +		skb_orphan(nskb);
>  		skb_set_queue_mapping(nskb, skb_get_queue_mapping(skb));
>  		ip_push_pending_frames(sk, &fl4);
>  	}
> 

This is definitely the best fix, please submit this formally.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 76dde25..ec410e0 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1536,6 +1536,7 @@  void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr,
 			  arg->csumoffset) = csum_fold(csum_add(nskb->csum,
 								arg->csum));
 		nskb->ip_summed = CHECKSUM_NONE;
+		skb_orphan(nskb);
 		skb_set_queue_mapping(nskb, skb_get_queue_mapping(skb));
 		ip_push_pending_frames(sk, &fl4);
 	}