diff mbox

[net] net: sched: shrink struct qdisc_skb_cb to 28 bytes

Message ID 1411052525.7106.269.camel@edumazet-glaptop2.roam.corp.google.com
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet Sept. 18, 2014, 3:02 p.m. UTC
From: Eric Dumazet <edumazet@google.com>

We cannot make struct qdisc_skb_cb bigger without impacting IPoIB,
or increasing skb->cb[] size.

Commit e0f31d849867 ("flow_keys: Record IP layer protocol in
skb_flow_dissect()") broke IPoIB.

Only current offender is sch_choke, and this one do not need an
absolutely precise flow key.

If we store 17 bytes of flow key, its more than enough. (Its the actual
size of flow_keys if it was a packed structure, but we might add new
fields at the end of it later)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: e0f31d849867 ("flow_keys: Record IP layer protocol in skb_flow_dissect()")
---
 include/net/sch_generic.h |    3 ++-
 net/sched/sch_choke.c     |   18 ++++++++++++++----
 2 files changed, 16 insertions(+), 5 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Stephen Hemminger Sept. 18, 2014, 4:26 p.m. UTC | #1
On Thu, 18 Sep 2014 08:02:05 -0700
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> From: Eric Dumazet <edumazet@google.com>
> 
> We cannot make struct qdisc_skb_cb bigger without impacting IPoIB,
> or increasing skb->cb[] size.
> 
> Commit e0f31d849867 ("flow_keys: Record IP layer protocol in
> skb_flow_dissect()") broke IPoIB.
> 
> Only current offender is sch_choke, and this one do not need an
> absolutely precise flow key.
> 
> If we store 17 bytes of flow key, its more than enough. (Its the actual
> size of flow_keys if it was a packed structure, but we might add new
> fields at the end of it later)
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Can we add BUILD_BUG to stop next time something smacks this.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Sept. 18, 2014, 4:32 p.m. UTC | #2
On Thu, 2014-09-18 at 09:26 -0700, Stephen Hemminger wrote:
> On Thu, 18 Sep 2014 08:02:05 -0700
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > From: Eric Dumazet <edumazet@google.com>
> > 
> > We cannot make struct qdisc_skb_cb bigger without impacting IPoIB,
> > or increasing skb->cb[] size.
> > 
> > Commit e0f31d849867 ("flow_keys: Record IP layer protocol in
> > skb_flow_dissect()") broke IPoIB.
> > 
> > Only current offender is sch_choke, and this one do not need an
> > absolutely precise flow key.
> > 
> > If we store 17 bytes of flow key, its more than enough. (Its the actual
> > size of flow_keys if it was a packed structure, but we might add new
> > fields at the end of it later)
> > 
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> 
> Can we add BUILD_BUG to stop next time something smacks this.

I though we had.

Maybe IPoIB lacks one.

Or, do you have an idea ?


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Or Gerlitz Sept. 18, 2014, 9:28 p.m. UTC | #3
On Thu, Sep 18, 2014 at 7:32 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2014-09-18 at 09:26 -0700, Stephen Hemminger wrote:
>> On Thu, 18 Sep 2014 08:02:05 -0700
>> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>
>> > From: Eric Dumazet <edumazet@google.com>
>> >
>> > We cannot make struct qdisc_skb_cb bigger without impacting IPoIB,
>> > or increasing skb->cb[] size.
>> >
>> > Commit e0f31d849867 ("flow_keys: Record IP layer protocol in
>> > skb_flow_dissect()") broke IPoIB.
>> >
>> > Only current offender is sch_choke, and this one do not need an
>> > absolutely precise flow key.
>> >
>> > If we store 17 bytes of flow key, its more than enough. (Its the actual
>> > size of flow_keys if it was a packed structure, but we might add new
>> > fields at the end of it later)
>> >
>> > Signed-off-by: Eric Dumazet <edumazet@google.com>
>>
>> Can we add BUILD_BUG to stop next time something smacks this.
>
> I though we had.
>
> Maybe IPoIB lacks one.
>
> Or, do you have an idea ?

Nope, we currently don't have such build time check, I saw you posted
one and I will be able to test it Sunday.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Doug Ledford Sept. 18, 2014, 10:29 p.m. UTC | #4
On 09/18/2014 11:02 AM, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> We cannot make struct qdisc_skb_cb bigger without impacting IPoIB,
> or increasing skb->cb[] size.
>
> Commit e0f31d849867 ("flow_keys: Record IP layer protocol in
> skb_flow_dissect()") broke IPoIB.
>
> Only current offender is sch_choke, and this one do not need an
> absolutely precise flow key.
>
> If we store 17 bytes of flow key, its more than enough. (Its the actual
> size of flow_keys if it was a packed structure, but we might add new
> fields at the end of it later)
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Fixes: e0f31d849867 ("flow_keys: Record IP layer protocol in skb_flow_dissect()")

I've installed this patch on my cluster and it resolves the problem.

Tested-by/Acked-by: Doug Ledford <dledford@redhat.com>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Or Gerlitz Sept. 19, 2014, 12:07 p.m. UTC | #5
On Fri, Sep 19, 2014 at 1:29 AM, Doug Ledford <dledford@xsintricity.com> wrote:
> On 09/18/2014 11:02 AM, Eric Dumazet wrote:
>> From: Eric Dumazet <edumazet@google.com>
>> We cannot make struct qdisc_skb_cb bigger without impacting IPoIB,
>> or increasing skb->cb[] size.
>>
>> Commit e0f31d849867 ("flow_keys: Record IP layer protocol in
>> skb_flow_dissect()") broke IPoIB.
>>
>> Only current offender is sch_choke, and this one do not need an
>> absolutely precise flow key.
>>
>> If we store 17 bytes of flow key, its more than enough. (Its the actual
>> size of flow_keys if it was a packed structure, but we might add new
>> fields at the end of it later)
>>
>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>> Fixes: e0f31d849867 ("flow_keys: Record IP layer protocol in
>> skb_flow_dissect()")

> I've installed this patch on my cluster and it resolves the problem.
> Tested-by/Acked-by: Doug Ledford <dledford@redhat.com>

Thanks Eric/Doug!

Dave - just to make sure, this is for net, as the regression was
introduced in 3.17-rc1.

Or.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Sept. 22, 2014, 6:22 p.m. UTC | #6
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 18 Sep 2014 08:02:05 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> We cannot make struct qdisc_skb_cb bigger without impacting IPoIB,
> or increasing skb->cb[] size.
> 
> Commit e0f31d849867 ("flow_keys: Record IP layer protocol in
> skb_flow_dissect()") broke IPoIB.
> 
> Only current offender is sch_choke, and this one do not need an
> absolutely precise flow key.
> 
> If we store 17 bytes of flow key, its more than enough. (Its the actual
> size of flow_keys if it was a packed structure, but we might add new
> fields at the end of it later)
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Fixes: e0f31d849867 ("flow_keys: Record IP layer protocol in skb_flow_dissect()")

Applied, thanks Eric.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Sept. 22, 2014, 6:38 p.m. UTC | #7
From: Doug Ledford <dledford@xsintricity.com>
Date: Thu, 18 Sep 2014 18:29:31 -0400

> Tested-by/Acked-by: Doug Ledford <dledford@redhat.com>

Automated tools do not understand this, please explicitly give
separate Tested-by: and Acked-by: tags in the future.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index a3cfb8ebeb53..620e086c0cbe 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -231,7 +231,8 @@  struct qdisc_skb_cb {
 	unsigned int		pkt_len;
 	u16			slave_dev_queue_mapping;
 	u16			_pad;
-	unsigned char		data[24];
+#define QDISC_CB_PRIV_LEN 20
+	unsigned char		data[QDISC_CB_PRIV_LEN];
 };
 
 static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
diff --git a/net/sched/sch_choke.c b/net/sched/sch_choke.c
index ed30e436128b..fb666d1e4de3 100644
--- a/net/sched/sch_choke.c
+++ b/net/sched/sch_choke.c
@@ -133,10 +133,16 @@  static void choke_drop_by_idx(struct Qdisc *sch, unsigned int idx)
 	--sch->q.qlen;
 }
 
+/* private part of skb->cb[] that a qdisc is allowed to use
+ * is limited to QDISC_CB_PRIV_LEN bytes.
+ * As a flow key might be too large, we store a part of it only.
+ */
+#define CHOKE_K_LEN min_t(u32, sizeof(struct flow_keys), QDISC_CB_PRIV_LEN - 3)
+
 struct choke_skb_cb {
 	u16			classid;
 	u8			keys_valid;
-	struct flow_keys	keys;
+	u8			keys[QDISC_CB_PRIV_LEN - 3];
 };
 
 static inline struct choke_skb_cb *choke_skb_cb(const struct sk_buff *skb)
@@ -163,22 +169,26 @@  static u16 choke_get_classid(const struct sk_buff *skb)
 static bool choke_match_flow(struct sk_buff *skb1,
 			     struct sk_buff *skb2)
 {
+	struct flow_keys temp;
+
 	if (skb1->protocol != skb2->protocol)
 		return false;
 
 	if (!choke_skb_cb(skb1)->keys_valid) {
 		choke_skb_cb(skb1)->keys_valid = 1;
-		skb_flow_dissect(skb1, &choke_skb_cb(skb1)->keys);
+		skb_flow_dissect(skb1, &temp);
+		memcpy(&choke_skb_cb(skb1)->keys, &temp, CHOKE_K_LEN);
 	}
 
 	if (!choke_skb_cb(skb2)->keys_valid) {
 		choke_skb_cb(skb2)->keys_valid = 1;
-		skb_flow_dissect(skb2, &choke_skb_cb(skb2)->keys);
+		skb_flow_dissect(skb2, &temp);
+		memcpy(&choke_skb_cb(skb2)->keys, &temp, CHOKE_K_LEN);
 	}
 
 	return !memcmp(&choke_skb_cb(skb1)->keys,
 		       &choke_skb_cb(skb2)->keys,
-		       sizeof(struct flow_keys));
+		       CHOKE_K_LEN);
 }
 
 /*