diff mbox

[net-next,v2,4/7] net: packet: use skb->dev as storage for skb orig len instead of skb->cb[]

Message ID 1424977624-649-5-git-send-email-eyal.birger@gmail.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Eyal Birger Feb. 26, 2015, 7:07 p.m. UTC
As part of an effort to move skb->dropcount to skb->cb[], 4 bytes
of additional room are needed in skb->cb[] in packet sockets.

Store the skb original length in skb->dev instead of skb->cb[] for
this purpose.

Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
---
 net/packet/af_packet.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Comments

David Miller Feb. 28, 2015, 7:21 p.m. UTC | #1
From: Eyal Birger <eyal.birger@gmail.com>
Date: Thu, 26 Feb 2015 21:07:01 +0200

> As part of an effort to move skb->dropcount to skb->cb[], 4 bytes
> of additional room are needed in skb->cb[] in packet sockets.
> 
> Store the skb original length in skb->dev instead of skb->cb[] for
> this purpose.
> 
> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>

I'm a little confused, why is this even needed?

packet_skb_cb is 24 bytes by my calculations, which is much
smaller than the cb[] size which is 48 bytes.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eyal Birger Feb. 28, 2015, 7:39 p.m. UTC | #2
On Sat, Feb 28, 2015 at 9:21 PM, David Miller <davem@davemloft.net> wrote:
> From: Eyal Birger <eyal.birger@gmail.com>
> Date: Thu, 26 Feb 2015 21:07:01 +0200
>
>> As part of an effort to move skb->dropcount to skb->cb[], 4 bytes
>> of additional room are needed in skb->cb[] in packet sockets.
>>
>> Store the skb original length in skb->dev instead of skb->cb[] for
>> this purpose.
>>
>> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
>
> I'm a little confused, why is this even needed?
>
> packet_skb_cb is 24 bytes by my calculations, which is much
> smaller than the cb[] size which is 48 bytes.

Note the BUILD_BUG_ON in packet_rcv().

packet_skb_cb may contain an address as large as MAX_ADDR_LEN (32)
Therefore the required space is sizeof(packet_skb_cb) + MAX_ADDR_LEN - 8
which is 48 bytes before this change.

Regards,
Eyal.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Feb. 28, 2015, 8:08 p.m. UTC | #3
From: Eyal Birger <eyal.birger@gmail.com>
Date: Sat, 28 Feb 2015 21:39:34 +0200

> On Sat, Feb 28, 2015 at 9:21 PM, David Miller <davem@davemloft.net> wrote:
>> From: Eyal Birger <eyal.birger@gmail.com>
>> Date: Thu, 26 Feb 2015 21:07:01 +0200
>>
>>> As part of an effort to move skb->dropcount to skb->cb[], 4 bytes
>>> of additional room are needed in skb->cb[] in packet sockets.
>>>
>>> Store the skb original length in skb->dev instead of skb->cb[] for
>>> this purpose.
>>>
>>> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
>>
>> I'm a little confused, why is this even needed?
>>
>> packet_skb_cb is 24 bytes by my calculations, which is much
>> smaller than the cb[] size which is 48 bytes.
> 
> Note the BUILD_BUG_ON in packet_rcv().
> 
> packet_skb_cb may contain an address as large as MAX_ADDR_LEN (32)
> Therefore the required space is sizeof(packet_skb_cb) + MAX_ADDR_LEN - 8
> which is 48 bytes before this change.

So let's take a step back.

What link layer we support has 32-byte hardware addresses?

If the answer is lower than 32, we should use that value instead.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eyal Birger Feb. 28, 2015, 8:38 p.m. UTC | #4
On Sat, Feb 28, 2015 at 10:08 PM, David Miller <davem@davemloft.net> wrote:
> From: Eyal Birger <eyal.birger@gmail.com>
> Date: Sat, 28 Feb 2015 21:39:34 +0200
>
>> On Sat, Feb 28, 2015 at 9:21 PM, David Miller <davem@davemloft.net> wrote:
>>> From: Eyal Birger <eyal.birger@gmail.com>
>>> Date: Thu, 26 Feb 2015 21:07:01 +0200
>>>
>>>> As part of an effort to move skb->dropcount to skb->cb[], 4 bytes
>>>> of additional room are needed in skb->cb[] in packet sockets.
>>>>
>>>> Store the skb original length in skb->dev instead of skb->cb[] for
>>>> this purpose.
>>>>
>>>> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
>>>
>>> I'm a little confused, why is this even needed?
>>>
>>> packet_skb_cb is 24 bytes by my calculations, which is much
>>> smaller than the cb[] size which is 48 bytes.
>>
>> Note the BUILD_BUG_ON in packet_rcv().
>>
>> packet_skb_cb may contain an address as large as MAX_ADDR_LEN (32)
>> Therefore the required space is sizeof(packet_skb_cb) + MAX_ADDR_LEN - 8
>> which is 48 bytes before this change.
>
> So let's take a step back.
>
> What link layer we support has 32-byte hardware addresses?

The largest one I've been hinted of is INFINIBAND_ALEN which is 20 bytes.

> If the answer is lower than 32, we should use that value instead.

Won't that value become the 'real' MAX_ADDR_LEN in that case?

My concern is that any value I pick based on the existing implementations
would need to be adjusted come a protocol with a larger address length.

Also, I would have to assert the available amount of space at run-time as
the address length is provided by a call to dev_parse_header() instead of
using a build-time assert.

Regards,
Eyal.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Feb. 28, 2015, 9 p.m. UTC | #5
From: Eyal Birger <eyal.birger@gmail.com>
Date: Sat, 28 Feb 2015 22:38:04 +0200

> My concern is that any value I pick based on the existing implementations
> would need to be adjusted come a protocol with a larger address length.

Then we need a method that requires protocols to register their
address length in a manner that will allow us to validate that
limit at compile time.

This is not rocket science.

Right now we're proposing to do utterly stupid shit like encoding
integers in device pointers in the sk_buff, when that is absolutely
not necessary at all.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eyal Birger March 1, 2015, 5:09 a.m. UTC | #6
On Sat, Feb 28, 2015 at 11:00 PM, David Miller <davem@davemloft.net> wrote:
> From: Eyal Birger <eyal.birger@gmail.com>
> Date: Sat, 28 Feb 2015 22:38:04 +0200
>
>> My concern is that any value I pick based on the existing implementations
>> would need to be adjusted come a protocol with a larger address length.
>
> Then we need a method that requires protocols to register their
> address length in a manner that will allow us to validate that
> limit at compile time.
>

Sorry to reiterate this, but such validation will inherently become the
actual limit for hardware addresses.
So, it would be equivalent to changing the in-kernel definition of
MAX_ADDR_SIZE.

Is this something to be considered?

> This is not rocket science.
>
> Right now we're proposing to do utterly stupid shit like encoding
> integers in device pointers in the sk_buff, when that is absolutely
> not necessary at all.

Ok. Another suggestion I received was to delay the preparation of the full
sockaddr_ll until it is needed, and store the skb original length in the first
two fields (sll_protocol and sll_family) as they can be derived later on from
the skb.

IMHO, It would still be somewhat of a hack though.

Would that approach be considered?

Regards,
Eyal.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller March 1, 2015, 5:31 a.m. UTC | #7
From: Eyal Birger <eyal.birger@gmail.com>
Date: Sun, 1 Mar 2015 07:09:54 +0200

> Ok. Another suggestion I received was to delay the preparation of the full
> sockaddr_ll until it is needed, and store the skb original length in the first
> two fields (sll_protocol and sll_family) as they can be derived later on from
> the skb.
> 
> IMHO, It would still be somewhat of a hack though.
> 
> Would that approach be considered?

I think it's better than mangling skb->dev
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eyal Birger March 1, 2015, 5:35 a.m. UTC | #8
On Sun, Mar 1, 2015 at 7:31 AM, David Miller <davem@davemloft.net> wrote:
> From: Eyal Birger <eyal.birger@gmail.com>
> Date: Sun, 1 Mar 2015 07:09:54 +0200
>
>> Ok. Another suggestion I received was to delay the preparation of the full
>> sockaddr_ll until it is needed, and store the skb original length in the first
>> two fields (sll_protocol and sll_family) as they can be derived later on from
>> the skb.
>>
>> IMHO, It would still be somewhat of a hack though.
>>
>> Would that approach be considered?
>
> I think it's better than mangling skb->dev

Ok. Will submit a v3.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 9c28cec..9d571bc 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -216,7 +216,6 @@  static void prb_fill_vlan_info(struct tpacket_kbdq_core *,
 static void packet_flush_mclist(struct sock *sk);
 
 struct packet_skb_cb {
-	unsigned int origlen;
 	union {
 		struct sockaddr_pkt pkt;
 		struct sockaddr_ll ll;
@@ -224,6 +223,7 @@  struct packet_skb_cb {
 };
 
 #define PACKET_SKB_CB(__skb)	((struct packet_skb_cb *)((__skb)->cb))
+#define PACKET_SKB_ORIGLEN(__skb) ((unsigned long *)&(__skb)->dev)
 
 #define GET_PBDQC_FROM_RB(x)	((struct tpacket_kbdq_core *)(&(x)->prb_bdqc))
 #define GET_PBLOCK_DESC(x, bid)	\
@@ -1757,7 +1757,7 @@  static int packet_rcv(struct sk_buff *skb, struct net_device *dev,
 	struct packet_sock *po;
 	u8 *skb_head = skb->data;
 	int skb_len = skb->len;
-	unsigned int snaplen, res;
+	unsigned int snaplen, origlen, res;
 
 	if (skb->pkt_type == PACKET_LOOPBACK)
 		goto drop;
@@ -1825,13 +1825,13 @@  static int packet_rcv(struct sk_buff *skb, struct net_device *dev,
 
 	sll->sll_halen = dev_parse_header(skb, sll->sll_addr);
 
-	PACKET_SKB_CB(skb)->origlen = skb->len;
+	origlen = skb->len;
 
 	if (pskb_trim(skb, snaplen))
 		goto drop_n_acct;
 
 	skb_set_owner_r(skb, sk);
-	skb->dev = NULL;
+	*PACKET_SKB_ORIGLEN(skb) = origlen;
 	skb_dst_drop(skb);
 
 	/* drop conntrack reference */
@@ -3006,7 +3006,7 @@  static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
 		aux.tp_status = TP_STATUS_USER;
 		if (skb->ip_summed == CHECKSUM_PARTIAL)
 			aux.tp_status |= TP_STATUS_CSUMNOTREADY;
-		aux.tp_len = PACKET_SKB_CB(skb)->origlen;
+		aux.tp_len = *PACKET_SKB_ORIGLEN(skb);
 		aux.tp_snaplen = skb->len;
 		aux.tp_mac = 0;
 		aux.tp_net = skb_network_offset(skb);