From patchwork Mon Dec 16 04:02:04 2019
X-Patchwork-Submitter: Li RongQing
X-Patchwork-Id: 1210066
X-Patchwork-Delegate: dsahern@gmail.com
From: "Li,Rongqing"
To: Yunsheng Lin, Jesper Dangaard Brouer
Cc: Saeed Mahameed, ilias.apalodimas@linaro.org, jonathan.lemon@gmail.com,
    netdev@vger.kernel.org, mhocko@kernel.org, peterz@infradead.org,
    Greg Kroah-Hartman, bhelgaas@google.com, linux-kernel@vger.kernel.org,
    Björn Töpel
Subject: Re: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE condition
Date: Mon, 16 Dec 2019 04:02:04 +0000
In-Reply-To: <15be326d-1811-329c-424c-6dd22b0604a8@huawei.com>
X-Mailing-List: netdev@vger.kernel.org

> -----Original Message-----
> From: Yunsheng Lin [mailto:linyunsheng@huawei.com]
> Sent: December 16, 2019 9:51
> To: Jesper Dangaard Brouer
> Cc: Li,Rongqing; Saeed Mahameed;
>     ilias.apalodimas@linaro.org; jonathan.lemon@gmail.com;
>     netdev@vger.kernel.org; mhocko@kernel.org; peterz@infradead.org;
>     Greg Kroah-Hartman; bhelgaas@google.com; linux-kernel@vger.kernel.org;
>     Björn Töpel
> Subject: Re: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE
> condition
>
> On 2019/12/13 16:48, Jesper Dangaard Brouer wrote:
> > You are basically saying that the NUMA check should be moved to
> > allocation time, as it is running on the RX-CPU (NAPI). And eventually,
> > after some time, the pages will come from the correct NUMA node.
> >
> > I think we can do that, and only affect the semi-fast-path.
> > We just need to handle that pages in the ptr_ring that are recycled
> > can be from the wrong NUMA node. In __page_pool_get_cached(), when
> > consuming pages from the ptr_ring (__ptr_ring_consume_batched), we
> > can evict pages from the wrong NUMA node.
>
> Yes, that's workable.
>
> > For the pool->alloc.cache we either accept that it will eventually be
> > emptied (it is only in a 100% XDP_DROP workload that it will continue
> > to reuse the same pages), or we simply clear the pool->alloc.cache
> > when calling page_pool_update_nid().
>
> Simply clearing the pool->alloc.cache when calling page_pool_update_nid()
> seems better.
>

How about the code below? The driver can configure p.nid to any node; it
will be adjusted during NAPI polling, so IRQ migration will not be a
problem, but it does add a check to the hot path.

Thanks

-Li

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index a6aefe989043..4374a6239d17 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -108,6 +108,10 @@ static struct page *__page_pool_get_cached(struct page_pool *pool)
 		if (likely(pool->alloc.count)) {
 			/* Fast-path */
 			page = pool->alloc.cache[--pool->alloc.count];
+
+			if (unlikely(READ_ONCE(pool->p.nid) != numa_mem_id()))
+				WRITE_ONCE(pool->p.nid, numa_mem_id());
+
 			return page;
 		}
 		refill = true;
@@ -155,6 +159,10 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 	if (pool->p.order)
 		gfp |= __GFP_COMP;
+
+	if (unlikely(READ_ONCE(pool->p.nid) != numa_mem_id()))
+		WRITE_ONCE(pool->p.nid, numa_mem_id());
+
 	/* FUTURE development:
 	 *
 	 * Current slow-path essentially falls back to single page
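
[Editorial note: for comparison, the other option discussed above, clearing
pool->alloc.cache from page_pool_update_nid(), could look roughly like the
sketch below. This is only an illustration and not part of the patch in this
mail: it assumes the code sits in net/core/page_pool.c, that
page_pool_update_nid() is called from the same softirq/NAPI context that owns
the unlocked pool->alloc cache, and that the existing static
__page_pool_return_page() helper is available to hand pages back to the page
allocator.]

/* Illustrative sketch only, not the patch above: flush the (unlocked)
 * alloc cache when the pool's preferred NUMA node changes, so pages
 * cached for the old node are released instead of being reused.
 * Assumes the caller runs in the softirq/NAPI context that owns
 * pool->alloc, as required for every other user of that cache.
 */
void page_pool_update_nid(struct page_pool *pool, int new_nid)
{
	struct page *page;

	WRITE_ONCE(pool->p.nid, new_nid);

	/* Return cached pages to the page allocator; the next slow-path
	 * refill will allocate fresh pages from new_nid.
	 */
	while (pool->alloc.count) {
		page = pool->alloc.cache[--pool->alloc.count];
		__page_pool_return_page(pool, page);
	}
}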