[{"id":3678990,"web_url":"http://patchwork.ozlabs.org/comment/3678990/","msgid":"<5dab98de-542a-83d3-cdc9-5898022d6f32@ssi.bg>","list_archive_url":null,"date":"2026-04-18T17:55:39","subject":"Re: [PATCH net 3/3] ipvs: fix the spin_lock usage for RT build","submitter":{"id":2825,"url":"http://patchwork.ozlabs.org/api/people/2825/","name":"Julian Anastasov","email":"ja@ssi.bg"},"content":"Hello,\n\nOn Wed, 15 Apr 2026, Julian Anastasov wrote:\n\n> syzbot reports for sleeping function called from invalid context [1].\n> The recently added code for resizable hash tables uses\n> hlist_bl bit locks in combination with spin_lock for\n> the connection fields (cp->lock).\n> \n> Fix the following problems:\n> \n> * avoid using spin_lock(&cp->lock) under locked bit lock\n> because it sleeps on PREEMPT_RT\n> \n> * as the recent changes call ip_vs_conn_hash() only for newly\n> allocated connection, the spin_lock can be removed there because\n> the connection is still not linked to table and does not need\n> cp->lock protection.\n> \n> * the lock can be removed also from ip_vs_conn_unlink() where we\n> are the last connection user.\n> \n> * the last place that is fixed is ip_vs_conn_fill_cport()\n> where the locks can be reordered to follow the RT rules.\n> \n> [1]:\n> BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48\n> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 16, name: ktimers/0\n> preempt_count: 2, expected: 0\n> RCU nest depth: 3, expected: 3\n> 8 locks held by ktimers/0/16:\n>  #0: ffffffff8de5f260 (local_bh){.+.+}-{1:3}, at: __local_bh_disable_ip+0x3c/0x420 kernel/softirq.c:163\n>  #1: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: __local_bh_disable_ip+0x3c/0x420 kernel/softirq.c:163\n>  #2: ffff8880b8826360 (&base->expiry_lock){+...}-{3:3}, at: spin_lock include/linux/spinlock_rt.h:45 [inline]\n>  #2: ffff8880b8826360 (&base->expiry_lock){+...}-{3:3}, at: timer_base_lock_expiry kernel/time/timer.c:1502 [inline]\n>  #2: ffff8880b8826360 (&base->expiry_lock){+...}-{3:3}, at: __run_timer_base+0x120/0x9f0 kernel/time/timer.c:2384\n>  #3: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:300 [inline]\n>  #3: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]\n>  #3: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: __rt_spin_lock kernel/locking/spinlock_rt.c:50 [inline]\n>  #3: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0x1e0/0x400 kernel/locking/spinlock_rt.c:57\n>  #4: ffffc90000157a80 ((&cp->timer)){+...}-{0:0}, at: call_timer_fn+0xd4/0x5e0 kernel/time/timer.c:1745\n>  #5: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:300 [inline]\n>  #5: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]\n>  #5: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: ip_vs_conn_unlink net/netfilter/ipvs/ip_vs_conn.c:315 [inline]\n>  #5: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: ip_vs_conn_expire+0x257/0x2390 net/netfilter/ipvs/ip_vs_conn.c:1260\n>  #6: ffffffff8de5f260 (local_bh){.+.+}-{1:3}, at: __local_bh_disable_ip+0x3c/0x420 kernel/softirq.c:163\n>  #7: ffff888068d4c3f0 (&cp->lock#2){+...}-{3:3}, at: spin_lock include/linux/spinlock_rt.h:45 [inline]\n>  #7: ffff888068d4c3f0 (&cp->lock#2){+...}-{3:3}, at: ip_vs_conn_unlink net/netfilter/ipvs/ip_vs_conn.c:324 [inline]\n>  #7: ffff888068d4c3f0 (&cp->lock#2){+...}-{3:3}, at: ip_vs_conn_expire+0xd4a/0x2390 net/netfilter/ipvs/ip_vs_conn.c:1260\n> Preemption disabled at:\n> [<ffffffff898a6358>] bit_spin_lock include/linux/bit_spinlock.h:38 [inline]\n> [<ffffffff898a6358>] hlist_bl_lock+0x18/0x110 include/linux/list_bl.h:149\n> CPU: 0 UID: 0 PID: 16 Comm: ktimers/0 Tainted: G        W    L      syzkaller #0 PREEMPT_{RT,(full)}\n> Tainted: [W]=WARN, [L]=SOFTLOCKUP\n> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/18/2026\n> Call Trace:\n>  <TASK>\n>  dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120\n>  __might_resched+0x329/0x480 kernel/sched/core.c:9162\n>  __rt_spin_lock kernel/locking/spinlock_rt.c:48 [inline]\n>  rt_spin_lock+0xc2/0x400 kernel/locking/spinlock_rt.c:57\n>  spin_lock include/linux/spinlock_rt.h:45 [inline]\n>  ip_vs_conn_unlink net/netfilter/ipvs/ip_vs_conn.c:324 [inline]\n>  ip_vs_conn_expire+0xd4a/0x2390 net/netfilter/ipvs/ip_vs_conn.c:1260\n>  call_timer_fn+0x192/0x5e0 kernel/time/timer.c:1748\n>  expire_timers kernel/time/timer.c:1799 [inline]\n>  __run_timers kernel/time/timer.c:2374 [inline]\n>  __run_timer_base+0x6a3/0x9f0 kernel/time/timer.c:2386\n>  run_timer_base kernel/time/timer.c:2395 [inline]\n>  run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2405\n>  handle_softirqs+0x1de/0x6d0 kernel/softirq.c:622\n>  __do_softirq kernel/softirq.c:656 [inline]\n>  run_ktimerd+0x69/0x100 kernel/softirq.c:1151\n>  smpboot_thread_fn+0x541/0xa50 kernel/smpboot.c:160\n>  kthread+0x388/0x470 kernel/kthread.c:436\n>  ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158\n>  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245\n>  </TASK>\n> \n> Reported-by: syzbot+504e778ddaecd36fdd17@syzkaller.appspotmail.com\n> Fixes: 2fa7cc9c7025 (\"ipvs: switch to per-net connection table\")\n> Signed-off-by: Julian Anastasov <ja@ssi.bg>\n\n\tAccording to Sashiko, this patch needs more\nwork, I'll send new patchset version when I'm ready...\n\npw-bot: changes-requested\n\n> ---\n>  net/netfilter/ipvs/ip_vs_conn.c | 49 ++++++++++++++-------------------\n>  1 file changed, 21 insertions(+), 28 deletions(-)\n> \n> diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c\n> index 84a4921a7865..cf19dc06c65d 100644\n> --- a/net/netfilter/ipvs/ip_vs_conn.c\n> +++ b/net/netfilter/ipvs/ip_vs_conn.c\n> @@ -267,27 +267,20 @@ static inline int ip_vs_conn_hash(struct ip_vs_conn *cp)\n>  \t\thash_key2 = hash_key;\n>  \t\tuse2 = false;\n>  \t}\n> +\n>  \tconn_tab_lock(t, cp, hash_key, hash_key2, use2, true /* new_hash */,\n>  \t\t      &head, &head2);\n> -\tspin_lock(&cp->lock);\n> -\n> -\tif (!(cp->flags & IP_VS_CONN_F_HASHED)) {\n> -\t\tcp->flags |= IP_VS_CONN_F_HASHED;\n> -\t\tWRITE_ONCE(cp->hn0.hash_key, hash_key);\n> -\t\tWRITE_ONCE(cp->hn1.hash_key, hash_key2);\n> -\t\trefcount_inc(&cp->refcnt);\n> -\t\thlist_bl_add_head_rcu(&cp->hn0.node, head);\n> -\t\tif (use2)\n> -\t\t\thlist_bl_add_head_rcu(&cp->hn1.node, head2);\n> -\t\tret = 1;\n> -\t} else {\n> -\t\tpr_err(\"%s(): request for already hashed, called from %pS\\n\",\n> -\t\t       __func__, __builtin_return_address(0));\n> -\t\tret = 0;\n> -\t}\n>  \n> -\tspin_unlock(&cp->lock);\n> +\tcp->flags |= IP_VS_CONN_F_HASHED;\n> +\tWRITE_ONCE(cp->hn0.hash_key, hash_key);\n> +\tWRITE_ONCE(cp->hn1.hash_key, hash_key2);\n> +\trefcount_inc(&cp->refcnt);\n> +\thlist_bl_add_head_rcu(&cp->hn0.node, head);\n> +\tif (use2)\n> +\t\thlist_bl_add_head_rcu(&cp->hn1.node, head2);\n> +\n>  \tconn_tab_unlock(head, head2);\n> +\tret = 1;\n>  \n>  \t/* Schedule resizing if load increases */\n>  \tif (atomic_read(&ipvs->conn_count) > t->u_thresh &&\n> @@ -321,7 +314,6 @@ static inline bool ip_vs_conn_unlink(struct ip_vs_conn *cp)\n>  \n>  \tconn_tab_lock(t, cp, hash_key, hash_key2, use2, false /* new_hash */,\n>  \t\t      &head, &head2);\n> -\tspin_lock(&cp->lock);\n>  \n>  \tif (cp->flags & IP_VS_CONN_F_HASHED) {\n>  \t\t/* Decrease refcnt and unlink conn only if we are last user */\n> @@ -334,7 +326,6 @@ static inline bool ip_vs_conn_unlink(struct ip_vs_conn *cp)\n>  \t\t}\n>  \t}\n>  \n> -\tspin_unlock(&cp->lock);\n>  \tconn_tab_unlock(head, head2);\n>  \n>  \trcu_read_unlock();\n> @@ -637,6 +628,7 @@ void ip_vs_conn_fill_cport(struct ip_vs_conn *cp, __be16 cport)\n>  \tstruct ip_vs_conn_hnode *hn;\n>  \tu32 hash_key, hash_key_new;\n>  \tstruct ip_vs_conn_param p;\n> +\tbool changed = false;\n>  \tint ntbl;\n>  \tint dir;\n>  \n> @@ -709,9 +701,7 @@ void ip_vs_conn_fill_cport(struct ip_vs_conn *cp, __be16 cport)\n>  \t\tgoto retry;\n>  \t}\n>  \n> -\tspin_lock(&cp->lock);\n> -\tif ((cp->flags & IP_VS_CONN_F_NO_CPORT) &&\n> -\t    (cp->flags & IP_VS_CONN_F_HASHED)) {\n> +\tif (cp->flags & IP_VS_CONN_F_NO_CPORT) {\n>  \t\t/* We do not recalc hash_key_r under lock, we assume the\n>  \t\t * parameters in cp do not change, i.e. cport is\n>  \t\t * the only possible change.\n> @@ -726,19 +716,22 @@ void ip_vs_conn_fill_cport(struct ip_vs_conn *cp, __be16 cport)\n>  \t\t\thlist_bl_del_rcu(&hn->node);\n>  \t\t\thlist_bl_add_head_rcu(&hn->node, head_new);\n>  \t\t}\n> -\t\tif (!dir) {\n> -\t\t\tatomic_dec(&ipvs->no_cport_conns[af_id]);\n> -\t\t\tcp->flags &= ~IP_VS_CONN_F_NO_CPORT;\n> -\t\t\tcp->cport = cport;\n> -\t\t}\n> +\t\tif (!dir)\n> +\t\t\tchanged = true;\n>  \t}\n> -\tspin_unlock(&cp->lock);\n>  \n>  \tif (head != head2)\n>  \t\thlist_bl_unlock(head2);\n>  \thlist_bl_unlock(head);\n>  \twrite_seqcount_end(&t->seqc[hash_key & t->seqc_mask]);\n>  \tpreempt_enable_nested();\n> +\tif (changed) {\n> +\t\tatomic_dec(&ipvs->no_cport_conns[af_id]);\n> +\t\tspin_lock(&cp->lock);\n> +\t\tcp->flags &= ~IP_VS_CONN_F_NO_CPORT;\n> +\t\tcp->cport = cport;\n> +\t\tspin_unlock(&cp->lock);\n> +\t}\n>  \tspin_unlock_bh(&t->lock[hash_key & t->lock_mask].l);\n>  \tif (dir--)\n>  \t\tgoto next_dir;\n> -- \n> 2.53.0\n\nRegards\n\n--\nJulian Anastasov <ja@ssi.bg>","headers":{"Return-Path":"\n <netfilter-devel+bounces-12019-incoming=patchwork.ozlabs.org@vger.kernel.org>","X-Original-To":["incoming@patchwork.ozlabs.org","netfilter-devel@vger.kernel.org"],"Delivered-To":"patchwork-incoming@legolas.ozlabs.org","Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (4096-bit key;\n unprotected) header.d=ssi.bg header.i=@ssi.bg header.a=rsa-sha256\n header.s=ssi header.b=kODIVuRz;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=172.232.135.74; helo=sto.lore.kernel.org;\n envelope-from=netfilter-devel+bounces-12019-incoming=patchwork.ozlabs.org@vger.kernel.org;\n receiver=patchwork.ozlabs.org)","smtp.subspace.kernel.org;\n\tdkim=pass (4096-bit key) header.d=ssi.bg header.i=@ssi.bg header.b=\"kODIVuRz\"","smtp.subspace.kernel.org;\n arc=none smtp.client-ip=193.238.174.39","smtp.subspace.kernel.org;\n dmarc=pass (p=reject dis=none) header.from=ssi.bg","smtp.subspace.kernel.org;\n spf=pass smtp.mailfrom=ssi.bg"],"Received":["from sto.lore.kernel.org (sto.lore.kernel.org [172.232.135.74])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4fyfZ42LCpz1yDF\n\tfor <incoming@patchwork.ozlabs.org>; Sun, 19 Apr 2026 03:56:12 +1000 (AEST)","from smtp.subspace.kernel.org (conduit.subspace.kernel.org\n [100.90.174.1])\n\tby sto.lore.kernel.org (Postfix) with ESMTP id D3266300E2AF\n\tfor <incoming@patchwork.ozlabs.org>; Sat, 18 Apr 2026 17:56:08 +0000 (UTC)","from localhost.localdomain (localhost.localdomain [127.0.0.1])\n\tby smtp.subspace.kernel.org (Postfix) with ESMTP id 8F71030E85C;\n\tSat, 18 Apr 2026 17:56:06 +0000 (UTC)","from mx.ssi.bg (mx.ssi.bg [193.238.174.39])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby smtp.subspace.kernel.org (Postfix) with ESMTPS id AD4D822A80D;\n\tSat, 18 Apr 2026 17:56:00 +0000 (UTC)","from mx.ssi.bg (localhost [127.0.0.1])\n\tby mx.ssi.bg (Potsfix) with ESMTP id 370CB21186;\n\tSat, 18 Apr 2026 20:55:52 +0300 (EEST)","from box.ssi.bg (box.ssi.bg [193.238.174.46])\n\tby mx.ssi.bg (Potsfix) with ESMTPS;\n\tSat, 18 Apr 2026 20:55:50 +0300 (EEST)","from ja.ssi.bg (unknown [213.16.62.126])\n\tby box.ssi.bg (Potsfix) with ESMTPSA id 4D2AD603CB;\n\tSat, 18 Apr 2026 20:55:49 +0300 (EEST)","from localhost.localdomain (localhost.localdomain [127.0.0.1])\n\tby ja.ssi.bg (8.18.1/8.18.1) with ESMTP id 63IHtd8K031330;\n\tSat, 18 Apr 2026 20:55:41 +0300"],"ARC-Seal":"i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;\n\tt=1776534964; cv=none;\n b=F4qNdb6qTJtBQcav14xXIjEDwZnj++5CWktYtyVFiv4ystKlLtFBuNpDoJ0bPQSH1W/dKfvQPj2YGFfl8UbzCRj+rru/Ojpx9rUxszCo9ZT1/cBtajOGX1iVC/XgiisNawWs0y8EunOLE0sW/jiCyw67F5yqoPk0g4hV18gk9EQ=","ARC-Message-Signature":"i=1; a=rsa-sha256; d=subspace.kernel.org;\n\ts=arc-20240116; t=1776534964; c=relaxed/simple;\n\tbh=0hq0tMHEz4C12C9QP6VN9owVx7l1dH0Z8ycYdij2GxA=;\n\th=Date:From:To:cc:Subject:In-Reply-To:Message-ID:References:\n\t MIME-Version:Content-Type;\n b=JslRw8s1GCojJbGEdrnva0xZJv81bu84Ik481j8QXoTB17moGhm9H9MWEhvnqOlaJZwJtVzQt132MtSDTwzNIwplNl38mamlWDH36Z67o2VEpYNSBjWVxi7lzo6h1tAipRJsJ28ZfPe7uJWS+bdKrP0CX2CdRiStzVtAleLeQbk=","ARC-Authentication-Results":"i=1; smtp.subspace.kernel.org;\n dmarc=pass (p=reject dis=none) header.from=ssi.bg;\n spf=pass smtp.mailfrom=ssi.bg;\n dkim=pass (4096-bit key) header.d=ssi.bg header.i=@ssi.bg header.b=kODIVuRz;\n arc=none smtp.client-ip=193.238.174.39","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=ssi.bg; h=cc:cc\n\t:content-type:content-type:date:from:from:in-reply-to:message-id\n\t:mime-version:references:reply-to:subject:subject:to:to; s=ssi;\n\t bh=XhosEcUWIxPrLwPY7iKLmGNz9xVeBQ3zl8QN7RbSQzQ=; b=kODIVuRzR/fq\n\tQkYcByXQJcEmzEu1uUQUn7gCYbY7JEr48wIAxMaRpa4NbRPJmfMQ+12zaft1mWCY\n\tmGpKnesD7Kigfn/bmmXEr0npU7mRFvespEaGrnzgaBvyddWhIWPQ3NXcfE0j8sGN\n\tD9RVbW0It51NSfX/VicgXAfYR9KcdmCp0MGNdGjFIpzS1l50Rf7n9V4hjgVWeYlH\n\t3no6mvVgKUuT9RLZvMewAv2cE6AONhVgCkwv6JfSImzRGI+HCujijP4RM5Wq4CPL\n\thM2d0O/7dLkozN7hcuj9v3ZFwa2+0hQL9tOhBiwr+MN3Kg2jVbdzbirOObV4Idah\n\tgkpESESIIA7VZkS5VemdUKTz4WXLo43szNndUOgG4Edqg033lSa9HoCRt+aUpN/O\n\tJEVnETsRpJYgXLpj7KmCr+zXubJ8tV5q88X5oKddTXj+ilXoJHNFkLLm9pTtuFDM\n\taKys4MpqQRKevN1UZOleMdCbldm9TzXwNHw3PrgR3T/p97O9SNX9G8hAym0QNlAP\n\t2pW7ibkTJu4F4sXl+Br6Q7ZbdG6eYVJJQm7a6FJQQM8WIInXlcf8FP86lYpCjdPK\n\tngPIw6/t6Y9AiWeBDlywRdjLSGBqrzakyu+J/eMU36QBJ19WQD5AVngOkgFJpvqc\n\tfO2BvUE29k0P9pFXAyEyIYmluBriOfU=","Date":"Sat, 18 Apr 2026 20:55:39 +0300 (EEST)","From":"Julian Anastasov <ja@ssi.bg>","To":"Simon Horman <horms@verge.net.au>","cc":"Pablo Neira Ayuso <pablo@netfilter.org>, Florian Westphal <fw@strlen.de>,\n        lvs-devel@vger.kernel.org, netfilter-devel@vger.kernel.org","Subject":"Re: [PATCH net 3/3] ipvs: fix the spin_lock usage for RT build","In-Reply-To":"<20260415200216.79699-4-ja@ssi.bg>","Message-ID":"<5dab98de-542a-83d3-cdc9-5898022d6f32@ssi.bg>","References":"<20260415200216.79699-1-ja@ssi.bg>\n <20260415200216.79699-4-ja@ssi.bg>","Precedence":"bulk","X-Mailing-List":"netfilter-devel@vger.kernel.org","List-Id":"<netfilter-devel.vger.kernel.org>","List-Subscribe":"<mailto:netfilter-devel+subscribe@vger.kernel.org>","List-Unsubscribe":"<mailto:netfilter-devel+unsubscribe@vger.kernel.org>","MIME-Version":"1.0","Content-Type":"text/plain; charset=US-ASCII"}}]