From patchwork Fri Oct 23 03:29:42 2020
X-Patchwork-Submitter: Josh Don
X-Patchwork-Id: 1386491
X-Patchwork-Delegate: davem@davemloft.net
Date: Thu, 22 Oct 2020 20:29:42 -0700
Message-Id: <20201023032944.399861-1-joshdon@google.com>
Subject: [PATCH 1/3] sched: better handling for busy polling loops
From: Josh Don
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    "David S. Miller", Jakub Kicinski
Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Paolo Bonzini, Eric Dumazet, linux-kernel@vger.kernel.org,
    netdev@vger.kernel.org, kvm@vger.kernel.org, Josh Don, Xi Wang

Busy polling loops in the kernel, such as network socket poll and kvm halt
polling, have performance problems related to process scheduler load
accounting.

Both of these busy polling examples are opportunistic - they relinquish the
cpu if another thread is ready to run. This design, however, doesn't extend
well to multiprocessor load balancing. The scheduler still sees the busy
polling cpu as 100% busy and will be less likely to put another thread on
that cpu. In other words, if all cores are 100% utilized, with some of them
running real workloads and others running busy polling loops, newly woken
threads will not prefer the busy polling cpus. System-wide throughput and
latency may suffer.

This change allows the scheduler to detect busy polling cpus so that they
can be considered more frequently for wake up balancing.

This change also disables preemption for the duration of the busy polling
loop. This is important, as it ensures that if a polling thread decides to
end its poll to relinquish the cpu to another thread, it will actually exit
the busy loop and potentially block. When it later becomes runnable, it
will have the opportunity to find an idle cpu via wakeup cpu selection.
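
For reference, a minimal sketch of the intended calling pattern follows;
poll_for_work() and block_until_work() are illustrative placeholders only,
not functions added by this patch:

        prepare_to_busy_poll();  /* cpu now reported by polling_cpu(); preemption off */
        while (continue_busy_poll()) {  /* false once another task is runnable or resched is needed */
                if (poll_for_work()) {
                        end_busy_poll(true);  /* re-enables preemption, may reschedule */
                        return;
                }
                cpu_relax();
        }
        end_busy_poll(false);  /* no immediate resched; lets the poller block instead */
        block_until_work();
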
Suggested-by: Xi Wang
Signed-off-by: Josh Don
Signed-off-by: Xi Wang
---
 include/linux/sched.h |  5 +++
 kernel/sched/core.c   | 94 +++++++++++++++++++++++++++++++++++++++++++
 kernel/sched/fair.c   | 25 ++++++++----
 kernel/sched/sched.h  |  2 +
 4 files changed, 119 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index afe01e232935..80ef477e5a87 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1651,6 +1651,7 @@ extern int can_nice(const struct task_struct *p, const int nice);
 extern int task_curr(const struct task_struct *p);
 extern int idle_cpu(int cpu);
 extern int available_idle_cpu(int cpu);
+extern int polling_cpu(int cpu);
 extern int sched_setscheduler(struct task_struct *, int, const struct sched_param *);
 extern int sched_setscheduler_nocheck(struct task_struct *, int, const struct sched_param *);
 extern void sched_set_fifo(struct task_struct *p);
@@ -2048,4 +2049,8 @@ int sched_trace_rq_nr_running(struct rq *rq);
 
 const struct cpumask *sched_trace_rd_span(struct root_domain *rd);
 
+extern void prepare_to_busy_poll(void);
+extern int continue_busy_poll(void);
+extern void end_busy_poll(bool allow_resched);
+
 #endif
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2d95dc3f4644..2783191d0bd4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5107,6 +5107,24 @@ int available_idle_cpu(int cpu)
 	return 1;
 }
 
+/**
+ * polling_cpu - is a given CPU currently running a thread in a busy polling
+ * loop that could be preempted if a new thread were to be scheduled?
+ * @cpu: the CPU in question.
+ *
+ * Return: 1 if the CPU is currently polling. 0 otherwise.
+ */
+int polling_cpu(int cpu)
+{
+#ifdef CONFIG_SMP
+	struct rq *rq = cpu_rq(cpu);
+
+	return unlikely(rq->busy_polling);
+#else
+	return 0;
+#endif
+}
+
 /**
  * idle_task - return the idle task for a given CPU.
  * @cpu: the processor in question.
@@ -7191,6 +7209,7 @@ void __init sched_init(void)
 
 		rq_csd_init(rq, &rq->nohz_csd, nohz_csd_func);
 #endif
+		rq->busy_polling = 0;
 #endif /* CONFIG_SMP */
 		hrtick_rq_init(rq);
 		atomic_set(&rq->nr_iowait, 0);
@@ -7417,6 +7436,81 @@ void ia64_set_curr_task(int cpu, struct task_struct *p)
 
 #endif
 
+/*
+ * Calling this function before entering a preemptible busy polling loop will
+ * help the scheduler make better load balancing decisions. Wake up balance
+ * will treat the polling cpu as idle.
+ *
+ * Preemption is disabled inside this function and re-enabled in
+ * end_busy_poll(), thus the polling loop must periodically check
+ * continue_busy_poll().
+ *
+ * REQUIRES: prepare_to_busy_poll(), continue_busy_poll(), and end_busy_poll()
+ * must be used together.
+ */
+void prepare_to_busy_poll(void)
+{
+	struct rq __maybe_unused *rq = this_rq();
+	unsigned long __maybe_unused flags;
+
+	/* Preemption will be reenabled by end_busy_poll() */
+	preempt_disable();
+
+#ifdef CONFIG_SMP
+	raw_spin_lock_irqsave(&rq->lock, flags);
+	/* preemption disabled; only one thread can poll at a time */
+	WARN_ON_ONCE(rq->busy_polling);
+	rq->busy_polling++;
+	raw_spin_unlock_irqrestore(&rq->lock, flags);
+#endif
+}
+EXPORT_SYMBOL(prepare_to_busy_poll);
+
+int continue_busy_poll(void)
+{
+	if (!single_task_running())
+		return 0;
+
+	/* Important that we check this, since preemption is disabled */
+	if (need_resched())
+		return 0;
+
+	return 1;
+}
+EXPORT_SYMBOL(continue_busy_poll);
+
+/*
+ * Restore any state modified by prepare_to_busy_poll(), including re-enabling
+ * preemption.
+ *
+ * @allow_resched: If true, this potentially calls schedule() as part of
+ * enabling preemption. A busy poll loop can use false in order to have an
+ * opportunity to block before rescheduling.
+ */
+void end_busy_poll(bool allow_resched)
+{
+#ifdef CONFIG_SMP
+	struct rq *rq = this_rq();
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&rq->lock, flags);
+	BUG_ON(!rq->busy_polling); /* not paired with prepare() */
+	rq->busy_polling--;
+	raw_spin_unlock_irqrestore(&rq->lock, flags);
+#endif
+
+	/*
+	 * preemption needs to be kept disabled between prepare_to_busy_poll()
+	 * and end_busy_poll().
+	 */
+	BUG_ON(preemptible());
+	if (allow_resched)
+		preempt_enable();
+	else
+		preempt_enable_no_resched();
+}
+EXPORT_SYMBOL(end_busy_poll);
+
 #ifdef CONFIG_CGROUP_SCHED
 /* task_group_lock serializes the addition/removal of task groups */
 static DEFINE_SPINLOCK(task_group_lock);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1a68a0536add..58e525c74cc6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5460,6 +5460,11 @@ static int sched_idle_cpu(int cpu)
 {
 	return sched_idle_rq(cpu_rq(cpu));
 }
+
+static int sched_idle_or_polling_cpu(int cpu)
+{
+	return sched_idle_cpu(cpu) || polling_cpu(cpu);
+}
 #endif
 
 /*
@@ -5880,6 +5885,7 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
 	u64 latest_idle_timestamp = 0;
 	int least_loaded_cpu = this_cpu;
 	int shallowest_idle_cpu = -1;
+	int found_polling = 0;
 	int i;
 
 	/* Check if we have any choice: */
@@ -5914,10 +5920,14 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
 				shallowest_idle_cpu = i;
 			}
 		} else if (shallowest_idle_cpu == -1) {
+			int polling = polling_cpu(i);
+
 			load = cpu_load(cpu_rq(i));
-			if (load < min_load) {
+			if ((polling == found_polling && load < min_load) ||
+			    (polling && !found_polling)) {
 				min_load = load;
 				least_loaded_cpu = i;
+				found_polling = polling;
 			}
 		}
 	}
@@ -6085,7 +6095,7 @@ static int select_idle_smt(struct task_struct *p, int target)
 	for_each_cpu(cpu, cpu_smt_mask(target)) {
 		if (!cpumask_test_cpu(cpu, p->cpus_ptr))
 			continue;
-		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
+		if (available_idle_cpu(cpu) || sched_idle_or_polling_cpu(cpu))
 			return cpu;
 	}
 
@@ -6149,7 +6159,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 	for_each_cpu_wrap(cpu, cpus, target) {
 		if (!--nr)
 			return -1;
-		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
+		if (available_idle_cpu(cpu) || sched_idle_or_polling_cpu(cpu))
 			break;
 	}
 
@@ -6179,7 +6189,7 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
 	for_each_cpu_wrap(cpu, cpus, target) {
 		unsigned long cpu_cap = capacity_of(cpu);
 
-		if (!available_idle_cpu(cpu) && !sched_idle_cpu(cpu))
+		if (!available_idle_cpu(cpu) && !sched_idle_or_polling_cpu(cpu))
 			continue;
 		if (task_fits_capacity(p, cpu_cap))
 			return cpu;
@@ -6223,14 +6233,14 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 	}
 
 symmetric:
-	if (available_idle_cpu(target) || sched_idle_cpu(target))
+	if (available_idle_cpu(target) || sched_idle_or_polling_cpu(target))
 		return target;
 
 	/*
 	 * If the previous CPU is cache affine and idle, don't be stupid:
 	 */
 	if (prev != target && cpus_share_cache(prev, target) &&
-	    (available_idle_cpu(prev) || sched_idle_cpu(prev)))
+	    (available_idle_cpu(prev) || sched_idle_or_polling_cpu(prev)))
 		return prev;
 
 	/*
@@ -6252,7 +6262,8 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 	if (recent_used_cpu != prev &&
 	    recent_used_cpu != target &&
 	    cpus_share_cache(recent_used_cpu, target) &&
-	    (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu)) &&
+	    (available_idle_cpu(recent_used_cpu) ||
+	     sched_idle_or_polling_cpu(recent_used_cpu)) &&
 	    cpumask_test_cpu(p->recent_used_cpu, p->cpus_ptr)) {
 		/*
 		 * Replace recent_used_cpu with prev as it is a potential
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 28709f6b0975..45de468d0ffb 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1003,6 +1003,8 @@ struct rq {
 
 	/* This is used to determine avg_idle's max value */
 	u64			max_idle_balance_cost;
+
+	unsigned int		busy_polling;
 #endif /* CONFIG_SMP */
 
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING

From patchwork Fri Oct 23 03:29:43 2020
X-Patchwork-Submitter: Josh Don
X-Patchwork-Id: 1386492
X-Patchwork-Delegate: davem@davemloft.net
Date: Thu, 22 Oct 2020 20:29:43 -0700
In-Reply-To: <20201023032944.399861-1-joshdon@google.com>
Message-Id: <20201023032944.399861-2-joshdon@google.com>
References: <20201023032944.399861-1-joshdon@google.com>
Subject: [PATCH 2/3] kvm: better handling for kvm halt polling
From: Josh Don
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    "David S. Miller", Jakub Kicinski
Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Paolo Bonzini, Eric Dumazet, linux-kernel@vger.kernel.org,
    netdev@vger.kernel.org, kvm@vger.kernel.org, Josh Don, Xi Wang

Add the new functions prepare_to_busy_poll() and friends to
kvm_vcpu_block(). The busy polling cpu will be considered an idle target
during wake up balancing.

cpu_relax() is also added to the polling loop to improve the performance
of other hw threads sharing the busy polling core.

Suggested-by: Xi Wang
Signed-off-by: Josh Don
Signed-off-by: Xi Wang
---
 virt/kvm/kvm_main.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index cf88233b819a..8f818f0fc979 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2772,7 +2772,9 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
 
 		++vcpu->stat.halt_attempted_poll;
+		prepare_to_busy_poll(); /* also disables preemption */
 		do {
+			cpu_relax();
 			/*
 			 * This sets KVM_REQ_UNHALT if an interrupt
 			 * arrives.
@@ -2781,10 +2783,12 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 				++vcpu->stat.halt_successful_poll;
 				if (!vcpu_valid_wakeup(vcpu))
 					++vcpu->stat.halt_poll_invalid;
+				end_busy_poll(false);
 				goto out;
 			}
 			poll_end = cur = ktime_get();
-		} while (single_task_running() && ktime_before(cur, stop));
+		} while (continue_busy_poll() && ktime_before(cur, stop));
+		end_busy_poll(false);
 	}
 
 	prepare_to_rcuwait(&vcpu->wait);

From patchwork Fri Oct 23 03:29:44 2020
X-Patchwork-Submitter: Josh Don
X-Patchwork-Id: 1386493
X-Patchwork-Delegate: davem@davemloft.net
Date: Thu, 22 Oct 2020 20:29:44 -0700
In-Reply-To: <20201023032944.399861-1-joshdon@google.com>
Message-Id: <20201023032944.399861-3-joshdon@google.com>
References: <20201023032944.399861-1-joshdon@google.com>
Subject: [PATCH 3/3] net: better handling for network busy poll
From: Josh Don
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    "David S. Miller", Jakub Kicinski
Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Paolo Bonzini, Eric Dumazet, linux-kernel@vger.kernel.org,
    netdev@vger.kernel.org, kvm@vger.kernel.org, Josh Don, Xi Wang

Add the new functions prepare_to_busy_poll() and friends to
napi_busy_loop(). The busy polling cpu will be considered an idle target
during wake up balancing.
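
For context (not part of this patch): napi_busy_loop() only runs for
sockets that have opted in to busy polling, for example via the
SO_BUSY_POLL socket option or the net.core.busy_read sysctl. A rough
userspace sketch, subject to the usual sysctl and privilege requirements:

        #include <sys/socket.h>

        /* Opt a socket in to busy polling; 'usecs' is the per-read poll budget. */
        static int enable_busy_poll(int fd, int usecs)
        {
                return setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL,
                                  &usecs, sizeof(usecs));
        }

        /* e.g. enable_busy_poll(fd, 50) before recv()/epoll on a UDP or TCP socket */
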
Suggested-by: Xi Wang
Signed-off-by: Josh Don
Signed-off-by: Xi Wang
---
 net/core/dev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 266073e300b5..4fb4ae4b27fc 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6476,7 +6476,7 @@ void napi_busy_loop(unsigned int napi_id,
 	if (!napi)
 		goto out;
 
-	preempt_disable();
+	prepare_to_busy_poll(); /* disables preemption */
 	for (;;) {
 		int work = 0;
 
@@ -6509,10 +6509,10 @@ void napi_busy_loop(unsigned int napi_id,
 		if (!loop_end || loop_end(loop_end_arg, start_time))
 			break;
 
-		if (unlikely(need_resched())) {
+		if (unlikely(!continue_busy_poll())) {
 			if (napi_poll)
 				busy_poll_stop(napi, have_poll_lock);
-			preempt_enable();
+			end_busy_poll(true);
 			rcu_read_unlock();
 			cond_resched();
 			if (loop_end(loop_end_arg, start_time))
@@ -6523,7 +6523,7 @@ void napi_busy_loop(unsigned int napi_id,
 	}
 	if (napi_poll)
 		busy_poll_stop(napi, have_poll_lock);
-	preempt_enable();
+	end_busy_poll(true);
 out:
 	rcu_read_unlock();
 }