
[net,v2] rps: Correct wrong skb_flow_limit check when enable RPS

Message ID 1525990182-12042-1-git-send-email-gfree.wind@vip.163.com
State Changes Requested, archived
Delegated to: David Miller
Series [net,v2] rps: Correct wrong skb_flow_limit check when enable RPS

Commit Message

Gao Feng May 10, 2018, 10:09 p.m. UTC
From: Gao Feng <gfree.wind@vip.163.com>

The skb flow limit is implemented independently for each CPU. In the
current code, skb_flow_limit obtains the softnet_data with this_cpu_ptr,
but when RPS is enabled the target CPU of enqueue_to_backlog may not be
the current CPU. As a result, skb_flow_limit checks the stats of the
current CPU while the skb is about to be appended to the backlog queue
of another CPU, which is not the expected behavior.

Now pass the softnet_data as a parameter to keep the two consistent.

Fixes: 99bbc7074190 ("rps: selective flow shedding during softnet overflow")
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
---
 v2: Add Fixes tag per Eric, and enhance the commit log
 v1: initial version

 net/core/dev.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)
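
For context, here is a simplified, paraphrased sketch of the path the commit
message describes (function names follow net/core/dev.c of this era; this is
not the literal kernel source):

/* Paraphrased sketch, not the in-tree code. */
static int netif_rx_internal(struct sk_buff *skb)
{
	struct rps_dev_flow voidflow, *rflow = &voidflow;
	int cpu;

	/* With RPS enabled, the flow hash may steer the skb to a remote CPU. */
	cpu = get_rps_cpu(skb->dev, skb, &rflow);
	if (cpu < 0)
		cpu = smp_processor_id();

	/* enqueue_to_backlog() queues onto per_cpu(softnet_data, cpu)... */
	return enqueue_to_backlog(skb, cpu, &rflow->last_qtail);
}

/* ...while skb_flow_limit() accounted against this_cpu_ptr(&softnet_data),
 * i.e. the CPU running the enqueue, not the CPU whose backlog the skb is
 * about to join.  That is the inconsistency described above.
 */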

Comments

Willem de Bruijn May 11, 2018, 9:47 p.m. UTC | #1
On Thu, May 10, 2018 at 6:09 PM,  <gfree.wind@vip.163.com> wrote:
> From: Gao Feng <gfree.wind@vip.163.com>
>
> The skb flow limit is implemented independently for each CPU. In the
> current code, skb_flow_limit obtains the softnet_data with this_cpu_ptr,
> but when RPS is enabled the target CPU of enqueue_to_backlog may not be
> the current CPU. As a result, skb_flow_limit checks the stats of the
> current CPU while the skb is about to be appended to the backlog queue
> of another CPU, which is not the expected behavior.
>
> Now pass the softnet_data as a parameter to keep the two consistent.
>
> Fixes: 99bbc7074190 ("rps: selective flow shedding during softnet overflow")
> Signed-off-by: Gao Feng <gfree.wind@vip.163.com>

See also the discussion on v1 of this patch.

The merits of moving the flow_limit state from the irq cpu to the rps
cpu can be argued, but the existing behavior is intentional and correct,
so this should not be applied to net and backported to stable
branches.

My bad for reviving the discussion in the v1 thread while v2 was
already pending, sorry.
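
For readers unfamiliar with the state being discussed: sd->flow_limit is a
small per-CPU history table, allocated for each CPU set in
/proc/sys/net/core/flow_limit_cpu_bitmap, and the two positions above differ
only in which CPU's copy is consulted. A minimal illustrative sketch (the
variable names here are made up for clarity, not kernel identifiers):

/* Existing behavior: account and shed on the CPU doing the enqueue work,
 * i.e. the irq/current CPU.
 */
struct softnet_data *sd_irq = this_cpu_ptr(&softnet_data);

/* Behavior proposed by this patch: account against the RPS target CPU whose
 * backlog queue the skb is about to join.
 */
struct softnet_data *sd_rps = &per_cpu(softnet_data, cpu);

/* Either way, skb_flow_limit() walks sd->flow_limit, a per-CPU history of
 * recent flow hashes, and starts dropping packets of a flow that fills more
 * than half of that history once the backlog is half full.
 */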

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index af0558b..0f98eff 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3883,18 +3883,15 @@  static int rps_ipi_queued(struct softnet_data *sd)
 int netdev_flow_limit_table_len __read_mostly = (1 << 12);
 #endif
 
-static bool skb_flow_limit(struct sk_buff *skb, unsigned int qlen)
+static bool skb_flow_limit(struct softnet_data *sd, struct sk_buff *skb, unsigned int qlen)
 {
 #ifdef CONFIG_NET_FLOW_LIMIT
 	struct sd_flow_limit *fl;
-	struct softnet_data *sd;
 	unsigned int old_flow, new_flow;
 
 	if (qlen < (netdev_max_backlog >> 1))
 		return false;
 
-	sd = this_cpu_ptr(&softnet_data);
-
 	rcu_read_lock();
 	fl = rcu_dereference(sd->flow_limit);
 	if (fl) {
@@ -3938,7 +3935,7 @@  static int enqueue_to_backlog(struct sk_buff *skb, int cpu,
 	if (!netif_running(skb->dev))
 		goto drop;
 	qlen = skb_queue_len(&sd->input_pkt_queue);
-	if (qlen <= netdev_max_backlog && !skb_flow_limit(skb, qlen)) {
+	if (qlen <= netdev_max_backlog && !skb_flow_limit(sd, skb, qlen)) {
 		if (qlen) {
 enqueue:
 			__skb_queue_tail(&sd->input_pkt_queue, skb);