[net-next,v3,2/2] netlink: specify netlink packet direction for nlmon

Message ID 1387805756-21121-3-git-send-email-dborkman@redhat.com
State Accepted, archived
Delegated to: David Miller

Commit Message

Daniel Borkmann Dec. 23, 2013, 1:35 p.m. UTC
In order to facilitate development of netlink protocol dissectors,
fill the unused field skb->pkt_type of the cloned skb with a hint
of the address space of the new owner (receiver) socket, in the
notion of "to kernel" or "to user".

At the time we invoke __netlink_deliver_tap_skb(), we already have
set the new skb owner via netlink_skb_set_owner_r(), so we can use
that for netlink_is_kernel() probing.

In normal PF_PACKET network traffic, this field denotes if the
packet is destined for us (PACKET_HOST), if it's broadcast
(PACKET_BROADCAST), etc.

As only 3 bits are reserved for this field, we reuse the value (= 6)
of PACKET_FASTROUTE: it is _not used_ or supported anywhere in the
kernel, and packets of that type were never exposed to user space,
so there are no conflicting users. As requested, this seems to be
the only way to keep both new PACKET_* values from overlapping with
the existing ones and therefore device agnostic.
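
To illustrate the 3-bit constraint, here is a minimal user-space
sketch. struct mock_skb and store_pkt_type() are hypothetical names
that model only a 3-bit pkt_type field; they are not kernel code.
Values 6 and 7 are the last two that fit, and anything larger is
truncated to its low 3 bits:

```c
/* Simplified stand-in for the pkt_type field (assumption: only the
 * 3-bit width matters here; the real struct sk_buff is far larger). */
struct mock_skb {
	unsigned int pkt_type:3;
};

/* Hypothetical helper: store a value in the 3-bit field and read it
 * back, so the effect of the limited width becomes visible. */
static unsigned int store_pkt_type(unsigned int value)
{
	struct mock_skb skb = { 0 };

	/* Unsigned bit-field assignment keeps only the low 3 bits. */
	skb.pkt_type = value;
	return skb.pkt_type;
}
```

With this model, store_pkt_type(6) and store_pkt_type(7) round-trip
unchanged, while store_pkt_type(8) reads back as 0, which is why no
value beyond 7 is available for a new packet type.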

By using those two flags for netlink skbs on nlmon devices, they
can be made available and picked up via sll_pkttype (previously
unused in netlink context) in struct sockaddr_ll. We now have
these two directions:

 - PACKET_USER (= 6)    ->  to user space
 - PACKET_KERNEL (= 7)  ->  to kernel space

Partial `ip a` example strace for sa_family=AF_NETLINK with
detected nl msg direction:

syscall:                     direction:
sendto(3,  ...) = 40         /* to kernel */
recvmsg(3, ...) = 3404       /* to user */
recvmsg(3, ...) = 1120       /* to user */
recvmsg(3, ...) = 20         /* to user */
sendto(3,  ...) = 40         /* to kernel */
recvmsg(3, ...) = 168        /* to user */
recvmsg(3, ...) = 144        /* to user */
recvmsg(3, ...) = 20         /* to user */

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>
---
 v1->v2:
  - let PACKET_* values not overlap as requested by Dave
 v2->v3:
  - fixed typo in comment spotted by Nicolas, thanks

 include/uapi/linux/if_packet.h | 4 +++-
 net/netlink/af_netlink.c       | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)
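
As a rough sketch of how a dissector might consume the new hint from
user space: the helpers below are hypothetical (nl_direction() and
recv_nl_frame() are names made up for illustration), and fd is
assumed to be an AF_PACKET socket already bound to an nlmon device.
The fallback defines only cover headers that predate this patch and
use the values given above.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

/* Fallbacks for older headers; values as introduced by this patch. */
#ifndef PACKET_USER
#define PACKET_USER	6
#endif
#ifndef PACKET_KERNEL
#define PACKET_KERNEL	7
#endif

/* Hypothetical helper: map sll_pkttype to a direction label. */
static const char *nl_direction(unsigned char pkttype)
{
	switch (pkttype) {
	case PACKET_USER:
		return "to user";
	case PACKET_KERNEL:
		return "to kernel";
	default:
		return "unknown";
	}
}

/*
 * Hypothetical helper: receive one frame and report its direction.
 * Assumes fd is an AF_PACKET socket bound to an nlmon device.
 * *dir is only written when recvfrom() succeeds.
 */
static ssize_t recv_nl_frame(int fd, void *buf, size_t len,
			     const char **dir)
{
	struct sockaddr_ll sll;
	socklen_t slen = sizeof(sll);
	ssize_t n;

	n = recvfrom(fd, buf, len, 0, (struct sockaddr *)&sll, &slen);
	if (n >= 0)
		*dir = nl_direction(sll.sll_pkttype);
	return n;
}
```

A capture loop would call recv_nl_frame() repeatedly and tag each
netlink message with the returned label before dissecting it.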

Comments

Nicolas Dichtel Dec. 23, 2013, 5:46 p.m. UTC | #1
On 23/12/2013 14:35, Daniel Borkmann wrote:
> diff --git a/include/uapi/linux/if_packet.h b/include/uapi/linux/if_packet.h
> index e9d844c..06e2a28 100644
> --- a/include/uapi/linux/if_packet.h
> +++ b/include/uapi/linux/if_packet.h
> @@ -26,8 +26,10 @@ struct sockaddr_ll {
>   #define PACKET_MULTICAST	2		/* To group		*/
>   #define PACKET_OTHERHOST	3		/* To someone else 	*/
>   #define PACKET_OUTGOING		4		/* Outgoing of any type */
> -/* These ones are invisible by user level */
>   #define PACKET_LOOPBACK		5		/* MC/BRD frame looped back */
> +#define PACKET_USER		6		/* To user space	*/
> +#define PACKET_KERNEL		7		/* To kernel space	*/
> +/* Unused, PACKET_FASTROUTE and PACKET_LOOPBACK are invisible to user space */
>   #define PACKET_FASTROUTE	6		/* Fastrouted frame	*/
Sorry to insist, I'm just trying to understand. Why not remove the
definition of PACKET_FASTROUTE?
Or use a name like PACKET_NL_USER to document the difference between
both cases?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Borkmann Dec. 23, 2013, 5:54 p.m. UTC | #2
On 12/23/2013 06:46 PM, Nicolas Dichtel wrote:
> On 23/12/2013 14:35, Daniel Borkmann wrote:
>> diff --git a/include/uapi/linux/if_packet.h b/include/uapi/linux/if_packet.h
>> index e9d844c..06e2a28 100644
>> --- a/include/uapi/linux/if_packet.h
>> +++ b/include/uapi/linux/if_packet.h
>> @@ -26,8 +26,10 @@ struct sockaddr_ll {
>>   #define PACKET_MULTICAST    2        /* To group        */
>>   #define PACKET_OTHERHOST    3        /* To someone else     */
>>   #define PACKET_OUTGOING        4        /* Outgoing of any type */
>> -/* These ones are invisible by user level */
>>   #define PACKET_LOOPBACK        5        /* MC/BRD frame looped back */
>> +#define PACKET_USER        6        /* To user space    */
>> +#define PACKET_KERNEL        7        /* To kernel space    */
>> +/* Unused, PACKET_FASTROUTE and PACKET_LOOPBACK are invisible to user space */
>>   #define PACKET_FASTROUTE    6        /* Fastrouted frame    */
> Sorry to insist, I'm just trying to understand. Why not remove the
> definition of PACKET_FASTROUTE?
> Or use a name like PACKET_NL_USER to document the difference between
> both cases?

It's now used by nl, but as we have purely generic names, I simply wanted
to comply with that.

We could remove it entirely, as was already proposed in 2008 [1], if
you see any value in that. Ultimately it's up to Dave, and if he likes,
I'll be happy to send a patch that removes this define.

Best,

Daniel

  [1] http://lists.openwall.net/netdev/2008/05/07/19
David Miller Dec. 31, 2013, 6:50 p.m. UTC | #3
From: Daniel Borkmann <dborkman@redhat.com>
Date: Mon, 23 Dec 2013 18:54:31 +0100

> We could remove it entirely, as was already proposed in 2008 [1], if
> you see any value in that. Ultimately it's up to Dave, and if he
> likes, I'll be happy to send a patch that removes this define.

Removing user visible defines can break source builds, for example
someone building string tables or auto-generating things to facilitate
accessing these values from languages other than C.

It's harmless, since nobody semantically expects anything of it, but
we have to keep it around.
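
The kind of user-space code this refers to might look like the
following sketch. The table and pkt_type_name() are hypothetical,
invented for illustration and not taken from any real project; the
point is only that such code references PACKET_FASTROUTE by name.

```c
#include <stddef.h>
#include <linux/if_packet.h>

/* Hypothetical name table, indexed by the PACKET_* value. */
static const char *const pkt_type_names[] = {
	[PACKET_HOST]		= "host",
	[PACKET_BROADCAST]	= "broadcast",
	[PACKET_MULTICAST]	= "multicast",
	[PACKET_OTHERHOST]	= "otherhost",
	[PACKET_OUTGOING]	= "outgoing",
	[PACKET_LOOPBACK]	= "loopback",
	/* This entry stops compiling if the define is removed. */
	[PACKET_FASTROUTE]	= "fastroute",
};

/* Hypothetical lookup with a bounds check for unknown values. */
static const char *pkt_type_name(unsigned int type)
{
	size_t n = sizeof(pkt_type_names) / sizeof(pkt_type_names[0]);

	if (type >= n || !pkt_type_names[type])
		return "unknown";
	return pkt_type_names[type];
}
```

Such a table carries no semantic expectation about fastrouted frames,
yet it would fail to build against headers that drop the define.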
Daniel Borkmann Jan. 1, 2014, 4:16 a.m. UTC | #4
On 12/31/2013 07:50 PM, David Miller wrote:
> From: Daniel Borkmann <dborkman@redhat.com>
> Date: Mon, 23 Dec 2013 18:54:31 +0100
>
>> We could remove it entirely, as was already proposed in 2008 [1], if
>> you see any value in that. Ultimately it's up to Dave, and if he
>> likes, I'll be happy to send a patch that removes this define.
>
> Removing user visible defines can break source builds, for example
> someone building string tables or auto-generating things to facilitate
> accessing these values from languages other than C.
>
> It's harmless, since nobody semantically expects anything of it, but
> we have to keep it around.

Ok, that's fine by me.

Thanks for applying, Dave, and a happy new year!

Best,

Daniel

Patch

diff --git a/include/uapi/linux/if_packet.h b/include/uapi/linux/if_packet.h
index e9d844c..06e2a28 100644
--- a/include/uapi/linux/if_packet.h
+++ b/include/uapi/linux/if_packet.h
@@ -26,8 +26,10 @@  struct sockaddr_ll {
 #define PACKET_MULTICAST	2		/* To group		*/
 #define PACKET_OTHERHOST	3		/* To someone else 	*/
 #define PACKET_OUTGOING		4		/* Outgoing of any type */
-/* These ones are invisible by user level */
 #define PACKET_LOOPBACK		5		/* MC/BRD frame looped back */
+#define PACKET_USER		6		/* To user space	*/
+#define PACKET_KERNEL		7		/* To kernel space	*/
+/* Unused, PACKET_FASTROUTE and PACKET_LOOPBACK are invisible to user space */
 #define PACKET_FASTROUTE	6		/* Fastrouted frame	*/
 
 /* Packet socket options */
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 56e09d8..3f75f1c 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -204,6 +204,8 @@  static int __netlink_deliver_tap_skb(struct sk_buff *skb,
 	if (nskb) {
 		nskb->dev = dev;
 		nskb->protocol = htons((u16) sk->sk_protocol);
+		nskb->pkt_type = netlink_is_kernel(sk) ?
+				 PACKET_KERNEL : PACKET_USER;
 
 		ret = dev_queue_xmit(nskb);
 		if (unlikely(ret > 0))