
[PATCHv2 0/3] Make cache-object aware of L3 siblings by parsing "ibm,thread-groups" property

Message ID: 20210728175607.591679-1-parth@linux.ibm.com

Message

Parth Shah July 28, 2021, 5:56 p.m. UTC
Changes from v1 -> v2:
- Based on Gautham's comments, use a separate thread_group_l3_cache_map
  and modify parsing code to build cache_map for L3. This makes the
  cache_map building code isolated from the parsing code.
v1 can be found at:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-June/230680.html

On a POWER10 big-core system, the L3 cache reported by sysfs contains all
the CPUs in the big-core.

grep . /sys/devices/system/cpu/cpu0/cache/index*/shared_cpu_list
/sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_list:0,2,4,6
/sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_list:0,2,4,6
/sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_list:0,2,4,6
/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list:0-7

In the above example, CPU 0 sees CPUs 0-7 in its L3 (index3) cache, which
is incorrect because only the CPUs in a small core share the L3 cache.
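
One way to spot this mismatch from user space is to turn each
shared_cpu_list string into a bitmask and compare the L2 and L3 masks.
The sketch below is illustrative only: parse_cpu_list() and
cache_masks_differ() are hypothetical helper names, not part of this
series, and the parser assumes CPU numbers below 64.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a sysfs CPU list such as "0,2,4,6" or "0-7" into a bitmask.
 * Assumption for this sketch: only CPUs 0..63 are representable;
 * higher CPU numbers are silently ignored.
 */
uint64_t parse_cpu_list(const char *s)
{
    uint64_t mask = 0;

    for (;;) {
        char *end;
        unsigned long lo = strtoul(s, &end, 10);
        unsigned long hi = lo;

        if (end == s)
            break;                      /* no number where one was expected */

        if (*end == '-') {              /* range of the form "lo-hi" */
            const char *p = end + 1;

            hi = strtoul(p, &end, 10);
            if (end == p)
                break;                  /* malformed range */
        }

        for (unsigned long c = lo; c <= hi && c < 64; c++)
            mask |= UINT64_C(1) << c;

        if (*end != ',')
            break;                      /* end of list (NUL or newline) */
        s = end + 1;
    }
    return mask;
}

/* Return 1 if the L2 and L3 sharing masks differ, 0 otherwise. */
int cache_masks_differ(const char *l2_list, const char *l3_list)
{
    return parse_cpu_list(l2_list) != parse_cpu_list(l3_list);
}
```

With the values shown above, cache_masks_differ("0,2,4,6", "0-7")
returns 1, which flags the discrepancy between index2 and index3.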

The "ibm,thread-groups" device-tree property uses the property value "2"
to indicate that the CPUs in a thread group share both the L2 and L3
caches. This patch-set uses that value to report the correct L3 topology
through the cache-object.
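
As a rough illustration of how such a lookup can work, the sketch below
walks a flat array of 32-bit values laid out as
[property, nr_groups, threads_per_group, thread ids...], with several
such entries possibly concatenated. That layout is an assumption taken
from the description in arch/powerpc/kernel/smp.c; the function name
thread_group_siblings() and the 64-thread limit are hypothetical and
are not the code in this series.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 64  /* sketch limit: one 64-bit mask */

/*
 * Return a bitmask of the threads grouped together with `thread` for the
 * given property value (e.g. 2 for shared L2/L3), or 0 if the property or
 * the thread is not found or the array is truncated.
 */
uint64_t thread_group_siblings(const uint32_t *tg, size_t len,
                               uint32_t property, uint32_t thread)
{
    size_t i = 0;

    while (i + 3 <= len) {
        uint32_t prop = tg[i];
        uint32_t nr_groups = tg[i + 1];
        uint32_t per_group = tg[i + 2];
        size_t nr_ids = (size_t)nr_groups * per_group;
        const uint32_t *ids = &tg[i + 3];

        if (nr_ids > len - (i + 3))
            return 0;                   /* truncated array */

        if (prop == property) {
            for (uint32_t g = 0; g < nr_groups; g++) {
                const uint32_t *grp = ids + (size_t)g * per_group;
                uint64_t mask = 0;
                int found = 0;

                for (uint32_t t = 0; t < per_group; t++) {
                    if (grp[t] >= MAX_THREADS)
                        continue;       /* outside the sketch's mask */
                    mask |= UINT64_C(1) << grp[t];
                    if (grp[t] == thread)
                        found = 1;
                }
                if (found)
                    return mask;        /* the group containing `thread` */
            }
            return 0;                   /* thread not in any group */
        }
        i += 3 + nr_ids;                /* skip to the next entry */
    }
    return 0;
}
```

For an illustrative array
{1,2,4,8,10,12,14,9,11,13,15, 2,2,4,8,10,12,14,9,11,13,15}, looking up
property 2 for thread 8 yields a mask with bits 8, 10, 12 and 14 set,
matching the small-core grouping shown in the listings that follow.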

After applying this patch-set, the topology looks like:
$> ppc64_cpu --smt=8
$> grep . /sys/devices/system/cpu/cpu[89]/cache/*/shared_cpu_list
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list:8,10,12,14
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list:8,10,12,14
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list:8,10,12,14
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list:8,10,12,14
/sys/devices/system/cpu/cpu9/cache/index0/shared_cpu_list:9,11,13,15
/sys/devices/system/cpu/cpu9/cache/index1/shared_cpu_list:9,11,13,15
/sys/devices/system/cpu/cpu9/cache/index2/shared_cpu_list:9,11,13,15
/sys/devices/system/cpu/cpu9/cache/index3/shared_cpu_list:9,11,13,15


$> ppc64_cpu --smt=4
$> grep . /sys/devices/system/cpu/cpu[89]/cache/*/shared_cpu_list
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list:8,10
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list:8,10
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list:8,10
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list:8,10
/sys/devices/system/cpu/cpu9/cache/index0/shared_cpu_list:9,11
/sys/devices/system/cpu/cpu9/cache/index1/shared_cpu_list:9,11
/sys/devices/system/cpu/cpu9/cache/index2/shared_cpu_list:9,11
/sys/devices/system/cpu/cpu9/cache/index3/shared_cpu_list:9,11

$> ppc64_cpu --smt=2
$> grep . /sys/devices/system/cpu/cpu[89]/cache/*/shared_cpu_list
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list:8
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list:8
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list:8
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list:8
/sys/devices/system/cpu/cpu9/cache/index0/shared_cpu_list:9
/sys/devices/system/cpu/cpu9/cache/index1/shared_cpu_list:9
/sys/devices/system/cpu/cpu9/cache/index2/shared_cpu_list:9
/sys/devices/system/cpu/cpu9/cache/index3/shared_cpu_list:9

$> ppc64_cpu --smt=1
$> grep . /sys/devices/system/cpu/cpu[89]/cache/*/shared_cpu_list
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list:8
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list:8
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list:8
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list:8

Patches Organization:
=====================
This patch-set is based on top of v5.14-rc2.

- Patch 1-2: Add functionality to introduce awareness for
"ibm,thread-groups". Original (not merged) posted version can be found at:
https://lore.kernel.org/linuxppc-dev/1611041780-8640-1-git-send-email-ego@linux.vnet.ibm.co
- Patch 3: Use existing L2 cache_map to detect L3 cache siblings


Gautham R. Shenoy (2):
  powerpc/cacheinfo: Lookup cache by dt node and thread-group id
  powerpc/cacheinfo: Remove the redundant get_shared_cpu_map()

Parth Shah (1):
  powerpc/smp: Use existing L2 cache_map cpumask to find L3 cache
    siblings

 arch/powerpc/include/asm/smp.h  |   6 ++
 arch/powerpc/kernel/cacheinfo.c | 124 ++++++++++++++++----------------
 arch/powerpc/kernel/smp.c       |  70 ++++++++++++------
 3 files changed, 115 insertions(+), 85 deletions(-)

Comments

Michael Ellerman Aug. 18, 2021, 1:38 p.m. UTC | #1
On Wed, 28 Jul 2021 23:26:04 +0530, Parth Shah wrote:
> Changes from v1 -> v2:
> - Based on Gautham's comments, use a separate thread_group_l3_cache_map
>   and modify parsing code to build cache_map for L3. This makes the
>   cache_map building code isolated from the parsing code.
> v1 can be found at:
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-June/230680.html
> 
> [...]

Applied to powerpc/next.

[1/3] powerpc/cacheinfo: Lookup cache by dt node and thread-group id
      https://git.kernel.org/powerpc/c/a4bec516b9c0823d7e2bb8c8928c98b535cf9adf
[2/3] powerpc/cacheinfo: Remove the redundant get_shared_cpu_map()
      https://git.kernel.org/powerpc/c/69aa8e078545bc14d84a8b4b3cb914ac8f9f280e
[3/3] powerpc/smp: Use existing L2 cache_map cpumask to find L3 cache siblings
      https://git.kernel.org/powerpc/c/e9ef81e1079b0c4c374fba0f9affa7129c7c913b

cheers