From patchwork Tue Feb 13 20:42:21 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Philip Cox
X-Patchwork-Id: 1898489
From: Philip Cox
To: kernel-team@lists.ubuntu.com
Subject: [SRU][jammy][PATCH 1/1] fs/address_space: add alignment padding for
 i_mmap and i_mmap_rwsem to mitigate false sharing
Date: Tue, 13 Feb 2024 15:42:21 -0500
Message-Id: <20240213204221.661294-2-philip.cox@canonical.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240213204221.661294-1-philip.cox@canonical.com>
References: <20240213204221.661294-1-philip.cox@canonical.com>

From: "Zhu, Lipeng"

BugLink: https://bugs.launchpad.net/bugs/2053069

When running UnixBench/Shell Scripts, we observed heavy false sharing
between accesses to i_mmap and i_mmap_rwsem.

UnixBench/Shell Scripts is a typical load/execute command test scenario
that concurrently launches, executes, and exits a large number of shell
commands. Many processes invoke vma_interval_tree_remove(), which
touches "i_mmap"; the call stack:

    ----vma_interval_tree_remove
        |----unlink_file_vma
        |    free_pgtables
        |    |----exit_mmap
        |    |    mmput
        |    |    |----begin_new_exec
        |    |    |    load_elf_binary
        |    |    |    bprm_execve

Meanwhile, many processes touch 'i_mmap_rwsem' to acquire the semaphore
in order to access 'i_mmap'. In the existing 'address_space' layout,
'i_mmap' and 'i_mmap_rwsem' sit in the same cacheline.

This patch places i_mmap and i_mmap_rwsem in separate cache lines to
avoid this false sharing problem.

With this patch, based on kernel v6.4.0, the UnixBench score on an
Intel Sapphire Rapids 112C/224T platform improves by ~5.3%, and the
perf c2c tool shows that the false sharing is resolved as expected:
the symbol vma_interval_tree_remove disappears from cache line 0 after
this change.
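[Illustration, not part of the patch] A minimal user-space sketch of the
pattern described above. All identifiers in it are made up for this
example: 'tree_root' stands in for i_mmap, 'lock' for i_mmap_rwsem, the
two worker threads for the interval-tree and rwsem users, and the padded
'split_lines' layout stands in for the field reordering the patch
actually does.

/*
 * NOT kernel code: toy model of two hot fields sharing (or not sharing)
 * a 64-byte cache line.  Build with: gcc -O2 -pthread
 */
#include <pthread.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdio.h>

#define CACHELINE 64
#define ITERS (10 * 1000 * 1000L)

/* Before: with a line-aligned object, both fields share one 64B line. */
struct shared_line {
	unsigned long tree_root;      /* stand-in for i_mmap        */
	pthread_mutex_t lock;         /* stand-in for i_mmap_rwsem  */
};

/*
 * After: the two fields are kept on different cache lines.  The real
 * patch achieves the separation by reordering fields rather than by
 * adding padding (see the v1->v2 note below); the effect is the same.
 */
struct split_lines {
	unsigned long tree_root;
	char pad[CACHELINE - sizeof(unsigned long)];
	pthread_mutex_t lock;
};

static alignas(CACHELINE) struct split_lines mapping = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
};

/* Stand-in for the vma_interval_tree_remove() callers: writes the tree. */
static void *tree_worker(void *unused)
{
	(void)unused;
	for (long i = 0; i < ITERS; i++)
		__atomic_store_n(&mapping.tree_root, (unsigned long)i,
				 __ATOMIC_RELAXED);   /* gcc/clang builtin */
	return NULL;
}

/* Stand-in for the rwsem acquire/release traffic on the lock's line. */
static void *lock_worker(void *unused)
{
	(void)unused;
	for (long i = 0; i < ITERS; i++) {
		pthread_mutex_lock(&mapping.lock);
		pthread_mutex_unlock(&mapping.lock);
	}
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	printf("shared layout: tree_root@%zu lock@%zu\n",
	       offsetof(struct shared_line, tree_root),
	       offsetof(struct shared_line, lock));
	printf("split  layout: tree_root@%zu lock@%zu\n",
	       offsetof(struct split_lines, tree_root),
	       offsetof(struct split_lines, lock));

	pthread_create(&a, NULL, tree_worker, NULL);
	pthread_create(&b, NULL, lock_worker, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}

Because both toy structs use the same member names, switching the
declaration of 'mapping' between the two layouts and running the binary
under 'perf c2c record' / 'perf c2c report' (the same tool used for the
numbers below) is one way to observe the contention difference.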
Baseline:
=================================================
      Shared Cache Line Distribution Pareto
=================================================
-------------------------------------------------------------
    0     3729     5791        0        0  0xff19b3818445c740
-------------------------------------------------------------
  3.27%  3.02%  0.00%  0.00%  0x18  0  1  0xffffffffa194403b   604   483   389   692  203  [k] vma_interval_tree_insert   [kernel.kallsyms]  vma_interval_tree_insert+75    0  1
  4.13%  3.63%  0.00%  0.00%  0x20  0  1  0xffffffffa19440a2   553   413   415   962  215  [k] vma_interval_tree_remove   [kernel.kallsyms]  vma_interval_tree_remove+18    0  1
  2.04%  1.35%  0.00%  0.00%  0x28  0  1  0xffffffffa219a1d6  1210   855   460  1229  222  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+678  0  1
  0.62%  1.85%  0.00%  0.00%  0x28  0  1  0xffffffffa219a1bf   762   329   577   527  198  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+655  0  1
  0.48%  0.31%  0.00%  0.00%  0x28  0  1  0xffffffffa219a58c  1677  1476   733  1544  224  [k] down_write                 [kernel.kallsyms]  down_write+28                  0  1
  0.05%  0.07%  0.00%  0.00%  0x28  0  1  0xffffffffa219a21d  1040   819   689    33   27  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+749  0  1
  0.00%  0.05%  0.00%  0.00%  0x28  0  1  0xffffffffa17707db     0  1005   786  1373  223  [k] up_write                   [kernel.kallsyms]  up_write+27                    0  1
  0.00%  0.02%  0.00%  0.00%  0x28  0  1  0xffffffffa219a064     0   233   778    32   30  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+308  0  1
 33.82% 34.10%  0.00%  0.00%  0x30  0  1  0xffffffffa1770945   779   495   534  6011  224  [k] rwsem_spin_on_owner        [kernel.kallsyms]  rwsem_spin_on_owner+53         0  1
 17.06% 15.28%  0.00%  0.00%  0x30  0  1  0xffffffffa1770915   593   438   468  2715  224  [k] rwsem_spin_on_owner        [kernel.kallsyms]  rwsem_spin_on_owner+5          0  1
  3.54%  3.52%  0.00%  0.00%  0x30  0  1  0xffffffffa2199f84   881   601   583  1421  223  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+84   0  1

With this change:
-------------------------------------------------------------
    0      556      838        0        0  0xff2780d7965d2780
-------------------------------------------------------------
  0.18%  0.60%  0.00%  0.00%   0x8  0  1  0xffffffffafff27b8   503   453   569    14   13  [k] do_dentry_open             [kernel.kallsyms]  do_dentry_open+456             0  1
  0.54%  0.12%  0.00%  0.00%   0x8  0  1  0xffffffffaffc51ac   510   199   428    15   12  [k] hugepage_vma_check         [kernel.kallsyms]  hugepage_vma_check+252         0  1
  1.80%  2.15%  0.00%  0.00%  0x18  0  1  0xffffffffb079a1d6  1778   799   343   215  136  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+678  0  1
  0.54%  1.31%  0.00%  0.00%  0x18  0  1  0xffffffffb079a1bf   547   296   528    91   71  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+655  0  1
  0.72%  0.72%  0.00%  0.00%  0x18  0  1  0xffffffffb079a58c  1479  1534   676   288  163  [k] down_write                 [kernel.kallsyms]  down_write+28                  0  1
  0.00%  0.12%  0.00%  0.00%  0x18  0  1  0xffffffffafd707db     0  2381   744   282  158  [k] up_write                   [kernel.kallsyms]  up_write+27                    0  1
  0.00%  0.12%  0.00%  0.00%  0x18  0  1  0xffffffffb079a064     0   239   518     6    6  [k] rwsem_down_write_slowpath  [kernel.kallsyms]  rwsem_down_write_slowpath+308  0  1
 46.58% 47.02%  0.00%  0.00%  0x20  0  1  0xffffffffafd70945   704   403   499  1137  219  [k] rwsem_spin_on_owner        [kernel.kallsyms]  rwsem_spin_on_owner+53         0  1
 23.92% 25.78%  0.00%  0.00%  0x20  0  1  0xffffffffafd70915   558   413   500   542  185  [k] rwsem_spin_on_owner        [kernel.kallsyms]  rwsem_spin_on_owner+5          0  1

v1->v2: change padding to exchange fields.

Link: https://lkml.kernel.org/r/20230716145653.20122-1-lipeng.zhu@intel.com
Signed-off-by: Lipeng Zhu
Reviewed-by: Tim Chen
Cc: Alexander Viro
Cc: Christian Brauner
Cc: Yu Ma
Signed-off-by: Andrew Morton
(cherry picked from commit aee79d4e5271cee4ffa89ed830189929a6272eb8)
Signed-off-by: Philip Cox
---
 include/linux/fs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index e56fd1705122..31b880777554 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -468,11 +468,11 @@ struct address_space {
 	atomic_t		nr_thps;
 #endif
 	struct rb_root_cached	i_mmap;
-	struct rw_semaphore	i_mmap_rwsem;
 	unsigned long		nrpages;
 	pgoff_t			writeback_index;
 	const struct address_space_operations *a_ops;
 	unsigned long		flags;
+	struct rw_semaphore	i_mmap_rwsem;
 	errseq_t		wb_err;
 	spinlock_t		private_lock;
 	struct list_head	private_list;
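[Illustration, not part of the patch] For anyone eyeballing the layout
consequence of the hunk above, a toy like the following shows how one
can check which 64-byte line each member starts on after the reorder.
Everything in it is a simplified stand-in: 'leading_fields' collapses
the fields before i_mmap into one blob whose size is an assumption,
i_mmap_rwsem is a sized blob rather than a real struct rw_semaphore,
and the real offsets vary with kernel config, so only the relative
placement is meaningful. 'pahole -C address_space vmlinux' can be used
to inspect the real layout on a given build.

/* NOT kernel code: stand-in mirror of the reordered tail of address_space. */
#include <stddef.h>
#include <stdio.h>

#define CACHELINE 64

/* 16 bytes on 64-bit: mirrors struct rb_root_cached (two pointers). */
struct rb_root_cached_standin {
	void *rb_root;
	void *rb_leftmost;
};

struct address_space_tail_standin {
	char leading_fields[40];              /* assumption: i_pages, gfp_mask, ... */
	struct rb_root_cached_standin i_mmap;
	unsigned long nrpages;
	unsigned long writeback_index;        /* pgoff_t is unsigned long */
	const void *a_ops;
	unsigned long flags;
	char i_mmap_rwsem[40];                /* moved after 'flags' by the patch */
};

#define LINE_OF(type, member) (offsetof(type, member) / CACHELINE)

int main(void)
{
	printf("i_mmap       starts on 64B line %zu (offset %zu)\n",
	       LINE_OF(struct address_space_tail_standin, i_mmap),
	       offsetof(struct address_space_tail_standin, i_mmap));
	printf("i_mmap_rwsem starts on 64B line %zu (offset %zu)\n",
	       LINE_OF(struct address_space_tail_standin, i_mmap_rwsem),
	       offsetof(struct address_space_tail_standin, i_mmap_rwsem));
	return 0;
}

In this simplified mirror, putting i_mmap_rwsem straight after i_mmap
(the old ordering) makes both members start on the same 64-byte line,
while the reordered layout above pushes i_mmap_rwsem onto the next line,
matching the intent described in the commit message.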